The a16z Show

Google DeepMind Lead Researchers on Genie 3 & the Future of World-Building

a16z

Science, Innovation, Business, Entrepreneurship, Culture, Disruption, Software Eating The World, Technology

4.4 (1.1K Ratings)

🗓️ 16 August 2025

⏱️ 41 minutes

Summary

Genie 3 can generate fully interactive, persistent worlds from just text, in real time. In this episode, Google DeepMind’s Jack Parker-Holder (Research Scientist) and Shlomi Fruchter (Research Director) join Anjney Midha, Marco Mascorro, and Justine Moore of a16z, with host Erik Torenberg, to discuss how they built it, the breakthrough “special memory” feature, and the future of AI-powered gaming, robotics, and world models.

They share:

- How Genie 3 generates interactive environments in real time
- Why its “special memory” feature is such a breakthrough
- The evolution of generative models and emergent behaviors
- Instruction following, text adherence, and model comparisons
- Potential applications in gaming, robotics, simulation, and more
- What’s next: Genie 4, Genie 5, and the future of world models

This conversation offers a first-hand look at one of the most advanced world models ever created.

Transcript

0:00.0

All of the applications basically stem from the ability to generate a world

0:05.3

that just from a few words, you look at it and there's a world that's generated in front of your eyes

0:10.2

and it's amazing that it's happening.

0:12.0

I was very excited about how far can we push that.

0:15.1

And it's at the point where a human who is not an expert will watch it and think it looks real,

0:20.3

right? And I think that's

0:21.3

pretty incredible. Genie 3 from Google DeepMind can create fully interactive, persistent

0:27.1

worlds in real time from just a few words. Today, we're joined by the team behind it.

0:32.6

Shlomi Fruchter and Jack Parker-Holder from Google DeepMind, plus Anjney Midha, Marco Mascorro, and Justine Moore from a16z.

0:40.9

We'll talk about how it works, the special memory that keeps worlds consistent, the surprising

0:45.3

behaviors it has learned, and where world models are headed next. Let's get into it.

0:52.9

Jack, Shlomi, Genie 3 has taken over the internet.

0:56.0

We're honored to have you on the podcast today.

0:58.2

Has the response surprised you? Reflect a little bit about the reaction.

1:01.9

We weren't sure how big it was going to be, but today it definitely felt that we have something

1:06.2

that was for a long time coming, basically being able to generate environments in real time. I think a lot of work that was done in Google DeepMind and outside pointed in that direction, but we really wanted to make it happen, and I hope we have.

Team, why don't we reflect internally a little bit about what we found so game-changing about Genie 3 and why we're so excited to have this conversation? Marco?

Yeah, for sure. I mean, first of all, it's an amazing model. I think there's a lot of excitement around the special memory, the consistency across all the frames. I think this is the first time I can see, like, you can have some sort of interactive way of doing this stuff with videos, because it used to be like, you would do one prompt and you would have 15 seconds of a video, but now you can actually have some sort of interactive kind of element to it,

1:48.0

which I think is very exciting. So can you elaborate a little bit more like your insights on

1:52.4

these, like how you, for example, figure out what data you should collect, how you make

1:57.6

it very interactive and keeping the flow of the whole video, which I thought

2:01.4

was phenomenal.

2:02.5

Sure, yeah.

2:03.1

So I think you kind of highlighted a few capabilities, sort of the length of the generation,

...

Disclaimer: The podcast and artwork embedded on this page are from a16z, and are the property of its owner and not affiliated with or endorsed by Tapesearch.

Generated transcripts are the property of a16z and are distributed freely under the Fair Use doctrine. Transcripts generated by Tapesearch are not guaranteed to be accurate.
