Editor’s Note: This article, originally published on March 13, 2023, has been updated.
Foundation Models: The New Frontier of Artificial Intelligence
The mics were live and tape was rolling in the studio where the Miles Davis Quintet was recording dozens of tunes in 1956 for Prestige Records. When an engineer asked for the next song’s title, Davis shot back, "I’ll play it, and tell you what it is later." Like the prolific jazz trumpeter and composer, researchers have been generating AI models at a feverish pace, exploring new architectures and use cases. According to the 2024 AI Index report from the Stanford Institute for Human-Centered Artificial Intelligence, 149 foundation models were published in 2023, more than double the number released in 2022.
What are Foundation Models?
A foundation model is an AI neural network trained on vast amounts of raw data, generally with unsupervised learning, that can be adapted to accomplish a broad range of tasks. Two important concepts help define this umbrella category: because the models learn from unlabeled data, gathering training data is easier; and because a single model can be adapted to many downstream tasks, the opportunities are as wide as the horizon.
No Labels, Lots of Opportunity
Foundation models generally learn from unlabeled datasets, saving the time and expense of manually describing each item in massive collections. Earlier neural networks were narrowly tuned for specific tasks. With a little fine-tuning, foundation models can handle jobs from translating text to analyzing medical images to performing agent-based behaviors.
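The fine-tuning idea above can be illustrated with a toy sketch: a stand-in "pretrained backbone" is kept frozen while only a small task-specific head is trained. Everything here is invented for illustration (the random-projection backbone, the synthetic labels, the logistic head); real fine-tuning uses an actual pretrained network, but the division of labor is the same.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "pretrained backbone": a frozen random projection
# standing in for a large network whose weights stay fixed.
W_backbone = rng.normal(size=(16, 8))

def backbone(x):
    # Frozen features; never updated during fine-tuning.
    return np.tanh(x @ W_backbone)

# Toy labeled data for the downstream task (made up for illustration).
X = rng.normal(size=(200, 16))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# "Fine-tuning" here means training only a small linear head
# (logistic regression) on top of the frozen features.
w = np.zeros(8)
b = 0.0
lr = 0.5
for _ in range(300):
    feats = backbone(X)
    p = 1 / (1 + np.exp(-(feats @ w + b)))   # sigmoid head
    grad_w = feats.T @ (p - y) / len(y)      # logistic-loss gradients
    grad_b = float(np.mean(p - y))
    w -= lr * grad_w
    b -= lr * grad_b

p = 1 / (1 + np.exp(-(backbone(X) @ w + b)))
acc = float(np.mean((p > 0.5) == (y == 1)))
print(f"training accuracy: {acc:.2f}")
```

Because only the tiny head is updated, adaptation needs far less labeled data and compute than training the whole model, which is what makes one pretrained model reusable across many tasks.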
The Emergence and Homogenization of AI
In his opening talk at the first workshop on foundation models, Percy Liang, director of Stanford's Center for Research on Foundation Models, used two terms to describe them. Emergence refers to AI capabilities that are still being discovered, such as the many nascent skills in foundation models. He calls the blending of AI algorithms and model architectures homogenization, a trend that helped form foundation models.
A Brief History of Foundation Models
"We are in a time where simple methods like neural networks are giving us an explosion of new capabilities," said Ashish Vaswani, an entrepreneur and former senior staff research scientist at Google Brain who led work on the seminal 2017 paper on transformers. That work inspired researchers who created BERT and other large language models, making 2018 "a watershed moment" for natural language processing, according to a report on AI published at the end of that year.
The Rise of Generative AI
Generative AI has the potential to yield trillions of dollars of economic value, said executives from the venture firm Sequoia Capital in a recent episode of the AI Podcast. Generative AI is an umbrella term for transformers, large language models, diffusion models, and other neural networks capturing people's imaginations because they can create text, images, music, software, videos, and more.
Going Multimodal
Foundation models have also expanded to process and generate multiple data types, or modalities, such as text, images, audio, and video. Vision language models (VLMs) are one type of multimodal model that can understand video, image, and text inputs while producing text or visual outputs.
The Future of AI
The next frontier of artificial intelligence is physical AI, which enables autonomous machines like robots and self-driving cars to interact with the real world. To ensure these systems perform safely, developers need to train and test them on massive amounts of data, which can be costly and time-consuming.
Conclusion
Foundation models have the potential to revolutionize the field of AI, enabling businesses and organizations to create innovative applications and services. However, there are also concerns about the potential risks and challenges associated with these models, including amplifying bias, introducing inaccurate or misleading information, and violating intellectual property rights.
FAQs
Q: What are foundation models?
A: Foundation models are AI neural networks trained on vast amounts of raw data, generally with unsupervised learning, that can be adapted to accomplish a broad range of tasks.
Q: What are the key concepts that define foundation models?
A: Because foundation models learn from unlabeled data, gathering training data is easier; and because a single model can be adapted to many downstream tasks, the opportunities are as wide as the horizon.
Q: What are the potential applications of foundation models?
A: Foundation models can be used for tasks such as translating text, analyzing medical images, performing agent-based behaviors, and more.
Q: What are the potential risks and challenges associated with foundation models?
A: The potential risks and challenges include amplifying bias, introducing inaccurate or misleading information, and violating intellectual property rights.
Q: What is the future of AI?
A: The next frontier of artificial intelligence is physical AI, which enables autonomous machines like robots and self-driving cars to interact with the real world.

