Google Will Combine Gemini and Veo AI Models

Google’s Plan to Combine AI Models for a Universal Digital Assistant

DeepMind CEO’s Recent Podcast Appearance

In a recent appearance on Possible, a podcast co-hosted by LinkedIn co-founder Reid Hoffman, Google DeepMind CEO Demis Hassabis said Google plans to eventually combine its Gemini AI models with its Veo video-generating models to improve the former’s understanding of the physical world.

A Vision for a Universal Digital Assistant

“We’ve always built Gemini, our foundation model, to be multimodal from the beginning,” Hassabis said, “and the reason we did that [is because] we have a vision for this idea of a universal digital assistant, an assistant that … actually helps you in the real world.”

The Rise of Omni Models

AI Industry Trends

The AI industry is moving gradually toward “omni” models: models that can understand and synthesize many forms of media. Google’s newest Gemini models can generate audio as well as images and text, while OpenAI’s default model in ChatGPT can natively create images, including Studio Ghibli-style art. Amazon has also announced plans to launch an “any-to-any” model later this year.

Training Data for Omni Models

These omni models require a lot of training data — images, videos, audio, text, and so on. Hassabis implied that the video data for Veo is coming mostly from YouTube, a platform that Google owns.

YouTube as a Source of Training Data

“Basically, by watching YouTube videos — a lot of YouTube videos — [Veo 2] can figure out, you know, the physics of the world,” Hassabis said.

Conclusion

Google’s plan to combine its Gemini AI models with its Veo video-generating models is a step toward the universal digital assistant Hassabis described. By training on video from YouTube and other sources, Google aims to improve its models’ understanding of the physical world. With OpenAI and Amazon pursuing similar multimodal capabilities, omni models are becoming a shared direction across the AI industry.

FAQs

Q: What are Gemini AI models?

A: Gemini is Google’s family of multimodal foundation models. Its newest versions can generate audio as well as images and text.

Q: What is Veo?

A: Veo is a video-generating model developed by Google.

Q: Where does the training data for Veo come from?

A: Hassabis implied that the video data for Veo comes mostly from YouTube, a platform that Google owns.

Q: What are omni models?

A: Omni models are AI models that can understand and synthesize many forms of media, including images, videos, audio, and text.
