The a16z Show

What Comes After ChatGPT? The Mother of ImageNet Predicts The Future

a16z

Science, Innovation, Business, Entrepreneurship, Culture, Disruption, Software Eating The World, Technology

4.4 (1.1K Ratings)

🗓️ 5 December 2025

⏱️ 62 minutes

Summary

Fei-Fei Li is a Stanford professor, co-director of Stanford Institute for Human-Centered Artificial Intelligence, and co-founder of World Labs. She created ImageNet, the dataset that sparked the deep learning revolution. Justin Johnson is her former PhD student, ex-professor at Michigan, ex-Meta researcher, and now co-founder of World Labs. Together, they just launched Marble—the first model that generates explorable 3D worlds from text or images. In this episode Fei-Fei and Justin explore why spatial intelligence is fundamentally different from language, what's missing from current world models (hint: physics), and the architectural insight that transformers are actually set models, not sequence models.

Transcript

0:00.0

I think the whole history of deep learning is in some sense the history of scaling up compute.

0:04.0

When I graduated from grad school, I really thought the rest of my entire career would be towards solving that single problem, which is...

0:13.0

A lot of AI as a field, as a discipline, is inspired by human intelligence.

0:20.0

We thought we were the first people doing it.

0:22.6

It turned out that

0:23.6

was also simultaneously doing it.

0:26.6

So Marble, like basically one way of looking at it,

0:28.6

it's a generative model of 3D worlds, right?

0:31.6

So you can input things like text or image or multiple images

0:34.6

and it will generate for you a 3D world

0:36.6

that kind of matches those inputs.

0:38.6

So while Marble is simultaneously a world model that is building towards this vision of spatial

0:43.1

intelligence, it was also very intentionally designed to be a thing that people could find

0:47.8

useful today. And we're starting to see emerging use cases in gaming, in VFX, in film, where I think there's

0:55.3

a lot of really interesting stuff that Marble can do today as a product, and then also

0:59.6

set a foundation for the grand world models that we want to build going into the future.

1:06.7

Fei-Fei Li is a Stanford professor, the co-director of the Stanford Institute for Human-Centered

1:11.4

Artificial Intelligence, and co-founder of World Labs.

1:15.0

She created ImageNet, the data set that sparked the deep learning revolution.

1:19.4

Justin Johnson is her former PhD student, ex-professor at Michigan, ex-Meta researcher, and now

1:25.3

co-founder of World Labs.

1:27.2

Together, they just launched Marble,

...
