Google has unveiled Omni, an artificial intelligence model designed to turn any kind of input (text, images, video, or audio) into any other kind of output. For now, the model focuses on video generation, and in early testing it has shown both striking realism and eerie flaws.
What is Omni?
Omni is a family of generative models designed to be “anything-to-anything.” The first release, Omni Flash, is currently available in Google’s AI video creation and editing platform, Flow. Unlike its predecessor Veo, Omni can accept a source video along with a text prompt to generate new content. Google claims the model incorporates more real-world knowledge and maintains better character consistency throughout generated clips.
These capabilities go beyond what earlier AI video tools offered. Where Veo generated clips from a text prompt, Omni can build on existing footage, which makes it easier to edit a clip and keep a scene looking realistic. The release fits Google’s broader push to integrate generative AI into everyday content creation.
Key Facts from Testing
A reporter who tested Omni found mixed results. The model excelled at creating convincing deepfakes, such as placing a person in front of the Eiffel Tower or showing them eating spaghetti. In one test, the reporter’s husband could not distinguish the AI-generated video from real footage; the only clue was an unfamiliar bowl. However, the model also produced bizarre glitches: a stuffed deer suddenly changed orientation while skydiving, and a jar of honey transformed into a different bottle mid-scene.
Other issues included inconsistent character details. When the reporter asked Omni to remove antlers from a baby deer, the model removed them from one scene but added antlers to all others. Similarly, text-based editing often produced strange results, such as exaggerated facial reactions that looked unnatural.
Historical Context and Background
Google’s work on generative video models traces back to projects like Veo and earlier research in diffusion models. The company first showcased its AI video capabilities at Google I/O 2024, but Omni represents a more polished and accessible iteration. Competitors such as OpenAI’s Sora and Meta’s Make-A-Video have also pushed boundaries, but Google can offer Omni inside its own products, starting with Flow.
The development of Omni is part of a larger trend in which AI-generated content is becoming harder to distinguish from reality. This raises concerns about misinformation, identity theft, and the erosion of trust in visual media. Governments and tech companies are scrambling to implement safeguards, such as digital watermarks and detection tools, but the pace of innovation often outstrips regulation.
Deepfake Realism and Its Implications
In tests, Omni produced deepfakes convincing enough to fool a close family member. One video showed the reporter eating pasta, and her husband believed it was authentic. Another clip placed her in an airplane seat, with only a duplicate background figure hinting at manipulation. The Eiffel Tower scene was slightly cartoonish in some versions but convincing enough to pass casual inspection.
The ease of creating such content—requiring only a short selfie video and a text prompt—worries experts. Nefarious actors could use Omni to generate non-consensual deepfakes, spread false narratives, or impersonate individuals. While Google has implemented content policies and credit limits, the model’s availability to paying subscribers (starting at $20 per month) means it is accessible to a wide audience.
Cost and Accessibility
Omni runs on a credit system. Generating a video costs 15 to 40 credits, depending on length and input complexity. Edits cost 40 credits per round. The Pro plan, priced at $20 per month, includes 1,000 credits. During testing, generating about 20 clips with a few edits consumed 855 credits, leaving only 145 of the monthly allowance. For users with a specific result in mind, repeated attempts can use up a month of credits quickly.
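To put those numbers in perspective, the arithmetic can be sketched in a few lines of Python. This is a rough illustration, not Google’s billing logic: the prices and credit costs come from the figures above, while the function names, the flat per-credit dollar value, and the choice of three edit rounds to stand in for “a few edits” are assumptions made for the example.

```python
from typing import Tuple

# Figures reported in the article for the Pro plan.
PLAN_PRICE_USD = 20.00      # monthly price
PLAN_CREDITS = 1000         # credits included per month
GEN_COST_MIN = 15           # cheapest video generation, in credits
GEN_COST_MAX = 40           # most expensive video generation, in credits
EDIT_COST = 40              # credits per round of edits


def usd_per_credit() -> float:
    """Effective dollar value of one credit (assumes a flat rate)."""
    return PLAN_PRICE_USD / PLAN_CREDITS


def credit_range(clips: int, edit_rounds: int) -> Tuple[int, int]:
    """Lowest and highest possible credit spend for a session."""
    edits = edit_rounds * EDIT_COST
    return clips * GEN_COST_MIN + edits, clips * GEN_COST_MAX + edits


def remaining(used: int) -> int:
    """Credits left in the month after spending `used`."""
    return PLAN_CREDITS - used


if __name__ == "__main__":
    # Hypothetical session: 20 clips plus 3 edit rounds ("a few edits").
    low, high = credit_range(clips=20, edit_rounds=3)
    print(f"Possible spend: {low} to {high} credits")
    print(f"Left after the reported 855 credits: {remaining(855)}")
    print(f"Approximate dollar value of 855 credits: ${855 * usd_per_credit():.2f}")
```

Under these assumptions, a session of 20 clips and three edit rounds lands between 420 and 920 credits, which brackets the 855 credits reported in testing and works out to roughly $17 of a $20 plan.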
Despite the expense, the low barrier to entry is alarming. Anyone with a Google account and a credit card can now produce videos that appear authentic. The technology is still imperfect—glitches persist—but the trend is clear: AI-generated video is becoming cheaper and more convincing by the day.
Looking Ahead
Google has not announced when Omni will expand beyond video generation, but the roadmap suggests future versions will accept audio, images, and text as inputs for any output. This could revolutionize fields like education, entertainment, and advertising. However, it also amplifies the need for ethical guidelines and robust detection mechanisms.
As the line between real and synthetic blurs, society must confront uncomfortable questions. The same tools that let a parent create a cute video of a stuffed deer can also be weaponized. The uncanny valley is now a playground where authenticity is negotiable, and the only certainty is that the technology will keep evolving.
Source: The Verge News