During a recent episode of the Possible podcast hosted by LinkedIn co-founder Reid Hoffman, Google DeepMind CEO Demis Hassabis said that Google will eventually combine its Gemini AI models with its video-generating Veo model. The move is a step toward AI that better understands and interacts with the physical world.
Image Credits: Jose Sarmento Matos/Bloomberg / Getty Images
Why Google Is Combining Gemini and Veo
As someone who closely follows the evolution of AI, this announcement felt like a significant turning point. Gemini has always been envisioned as a multimodal foundation model — built to process and synthesize text, audio, images, and beyond. By merging it with Veo, Google aims to deepen Gemini’s understanding of real-world physics and context, enhancing its utility in everyday situations.
Hassabis described this combination as a step toward building what he calls a universal digital assistant — one that doesn't just process language but actively supports real-world tasks.
The Rise of Omni Models: What It Means for AI
We’re clearly seeing a trend across the industry: the move toward "omni" or "any-to-any" AI models. These systems can take in multiple types of data (text, video, audio, etc.) and generate outputs in several formats. Google's Gemini can already generate text, images, and even audio. OpenAI’s ChatGPT can now produce images, including ones in the style of Studio Ghibli. Amazon, too, plans to launch its own omni model this year.
But training these models requires massive datasets — and Google has a unique edge here.
YouTube Data Is Powering Veo's Real-World Intelligence
Hassabis hinted that YouTube, one of the largest video platforms in the world (which Google owns), plays a key role in training Veo. He noted that by analyzing an enormous volume of YouTube videos, the AI can begin to understand physics and real-world cause-effect relationships — basically, how the world works.
Google has previously stated that its models “may be” trained on YouTube content, aligning with updated terms of service that give the company more access to creators’ data for AI development. From my perspective, it’s a smart and strategic use of resources — though it's not without ethical and legal scrutiny.
The Future of AI: Smarter, More Capable, More Human
Hearing Hassabis talk about AI that not only processes information but understands the physicality of our world was a reminder of just how far this technology has come — and how far it still has to go. These developments suggest we're moving beyond narrow-task AIs toward truly intelligent digital companions that can think, create, and interact across media formats.
A New Era for Google AI
As someone deeply invested in tracking AI's trajectory, this update from Google DeepMind is more than just a technical milestone. It reflects a broader shift in how tech giants are thinking about intelligence — not as a set of isolated tools, but as interconnected, ever-evolving systems.
We're witnessing the beginning of something much bigger: a world where AI isn't just reactive, but proactive, deeply integrated into how we live, work, and create.