Transforming Video into Immersive Realms: The Odyssey AI Revolution

Screenshot of virtual TVs as London-based AI lab Odyssey launches a research preview of a model transforming video into interactive worlds.

The London-based AI lab Odyssey has unveiled a research preview of a model that converts video into interactive environments. Having initially concentrated on world models for film and gaming production, the Odyssey team has discovered the potential for a wholly new form of entertainment.

The interactive video produced by Odyssey’s AI system reacts to inputs in real-time. Users can engage with it utilizing their keyboard, smartphone, controller, or eventually even voice commands. The team at Odyssey AI is promoting it as an “initial version of the Holodeck.”

The core Odyssey model generates a realistic video frame every 40 milliseconds. As a result, when you press a key or make a gesture, the video reacts almost instantly, creating the illusion that you are genuinely influencing this digital world.
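A 40 ms frame budget works out to 25 frames per second. As a rough illustration of what a real-time generate-and-display loop looks like under that budget, here is a minimal Python sketch; `generate_frame`, `read_input`, and `display` are hypothetical placeholders for the model call, input polling, and renderer, not Odyssey's actual API:

```python
import time

FRAME_BUDGET_S = 0.040  # 40 ms per frame, i.e. 25 frames per second


def run_realtime_loop(generate_frame, read_input, display, num_frames=250):
    """Drive a generate-and-display loop against a 40 ms frame budget.

    All three callables are stand-ins: the real system would poll a
    keyboard/controller, run the world model, and render the frame.
    """
    for _ in range(num_frames):
        start = time.monotonic()
        action = read_input()            # keyboard / controller / gesture
        frame = generate_frame(action)   # model produces the next frame
        display(frame)
        # Sleep off any remaining budget so frames arrive on a steady clock;
        # if generation overruns 40 ms, the loop simply falls behind.
        elapsed = time.monotonic() - start
        if elapsed < FRAME_BUDGET_S:
            time.sleep(FRAME_BUDGET_S - elapsed)
```

The key design point is that the clock, not the model, paces the loop: a fast frame waits out its budget, so input-to-pixels latency stays predictable.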

“The experience currently resembles navigating a glitchy dream—rough, unstable, but undeniably innovative,” according to Odyssey AI. We’re not referring to refined, AAA-game quality visuals here, at least not at this stage.

Not your typical video technology

Let’s delve a little into the technical details for a moment. What differentiates this AI-generated interactive video technology from a conventional video game or CGI? It all hinges on what Odyssey refers to as a “world model.”


Unlike conventional video models that generate entire clips in one go, world models work frame by frame, predicting the next image from the current state and any user inputs. It's similar to how large language models predict the next word in a sequence, but far more complex, since the model is producing high-resolution video frames rather than words.

“At its essence, a world model is an action-conditioned dynamics model,” as Odyssey AI describes it. Each time you engage, the model considers the current state, your action, and the history of previous events, then generates the next video frame accordingly.

The outcome is something that feels more fluid and unpredictable than a conventional game. There’s no pre-determined logic dictating “if a player does X, then Y occurs”—instead, the AI formulates its best estimate of what should happen next, informed by observing countless videos.
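The structure Odyssey describes, next frame as a function of current state, user action, and history, can be sketched in a few lines. This toy class is purely illustrative (the "frame" is just an integer and the update rule is arbitrary), so the conditioning pattern is visible without any of the real model's complexity:

```python
from collections import deque


class WorldModelSketch:
    """Toy action-conditioned dynamics model: next = f(state, action, history).

    A stand-in for the idea described above, not Odyssey's architecture.
    Frames are plain integers so the conditioning structure stays visible.
    """

    def __init__(self, history_len=8):
        self.history = deque(maxlen=history_len)  # short rolling memory
        self.state = 0                            # "current frame"

    def step(self, action):
        # Predict the next frame from the current state, the user's action,
        # and a summary of recent history (here: a trivial arithmetic rule).
        context = sum(self.history)
        next_frame = self.state + action + (context % 3)
        self.history.append(self.state)
        self.state = next_frame
        return next_frame
```

Each call to `step` plays the role of generating one 40 ms frame: the model never sees a script of "if X then Y", only the state, the action, and what came before.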

Odyssey AI confronts historic challenges with AI-generated video

Creating something like this isn't exactly simple. One of the major challenges with AI-generated interactive video is maintaining stability over time. When each frame is produced based on its predecessors, minor inaccuracies can quickly accumulate (a phenomenon AI researchers call "drift").

To address this, Odyssey AI has implemented what they call a “narrow distribution model”—essentially pre-training their AI on a broad spectrum of video footage and then fine-tuning it on a more limited selection of environments. This compromise offers less variation but greater stability, preventing everything from turning into an odd mess.
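To see why drift matters at this frame rate, consider a simple compounding-error model. The numbers here are purely illustrative assumptions, not Odyssey's measurements, but they show how even a tiny per-frame error balloons over a rollout:

```python
def simulate_drift(per_frame_error, num_frames):
    """Illustrate how small per-frame generation errors compound.

    Assumes multiplicative growth: each frame inherits its predecessor's
    error, amplifies it slightly, and adds fresh error of its own.
    Purely illustrative; real error dynamics are far messier.
    """
    error = 0.0
    for _ in range(num_frames):
        error = error * (1 + per_frame_error) + per_frame_error
    return error


# One minute at 25 fps (40 ms per frame) is 1,500 frames. Under this toy
# model, a mere 0.1% per-frame error compounds to roughly 350% cumulative
# deviation over that minute, which is why trading breadth for a narrower,
# more stable fine-tuning distribution can be worth it.
```

Under these assumptions, `simulate_drift(0.001, 1500)` lands between 3 and 4, while ten frames of the same error stay under 2%, illustrating how stability problems only emerge over long rollouts.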

The company asserts they are already achieving “rapid advances” on their next-generation model, which reportedly exhibits “a richer variety of pixels, dynamics, and actions.”

Executing all this advanced AI technology in real-time is not inexpensive. Presently, the infrastructure supporting this experience costs between £0.80 and £1.60 ($1-$2) for every user-hour, running on clusters of H100 GPUs distributed across the US and EU.

This might seem pricey for streaming video, but it’s surprisingly economical compared to creating traditional gaming or film content. Furthermore, Odyssey AI anticipates these costs to decline as models become more efficient.

Interactive video: The next storytelling medium?

Throughout history, fresh technologies have spawned novel storytelling mediums—from cave paintings to books, photography, radio, film, and video games. Odyssey posits that AI-generated interactive video is the next phase in this progression.

If they are correct, we may be witnessing the prototype of something capable of revolutionizing entertainment, education, advertising, and beyond. Envision training videos where you can rehearse the skills being imparted, or travel experiences allowing you to explore locations from the comfort of your couch.

The current research preview is clearly just a small stride towards this vision and more of a proof of concept than a finalized product. However, it provides a fascinating glimpse into what may be achievable when AI-generated worlds evolve into interactive playgrounds rather than merely passive encounters.

You can sample the research preview here.

See also: Telegram and xAI finalize Grok AI partnership

Interested in expanding your knowledge about AI and big data from leading industry experts? Attend the AI & Big Data Expo held in Amsterdam, California, and London. This extensive event is co-located with other prominent events, including the Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Discover additional upcoming enterprise technology events and webinars powered by TechForge here.

