The a16z Show

Google DeepMind Developers: How Nano Banana Was Made

a16z

Science, Innovation, Business, Entrepreneurship, Culture, Disruption, Software Eating The World, Technology

4.4 • 1.1K Ratings

🗓️ 28 October 2025

⏱️ 54 minutes


Summary

Google DeepMind’s new image model Nano Banana took the internet by storm. In this episode, we sit down with Principal Scientist Oliver Wang and Group Product Manager Nicole Brichtova to discuss how Nano Banana was created, why it’s so viral, and the future of image and video editing.

Transcript

Click on a timestamp to play from that location

0:00.0

These models are allowing creators to do less tedious parts of the job.

0:05.9

They can be more creative and they can spend, you know, 90% of their time being creative

0:10.9

versus 90% of their time like editing things and doing these tedious kind of manual operations.

0:15.9

I'm convinced that this ultimately really empowers artists, right?

0:19.6

It gives you new tools, right? It's like, hey, we now have, I don't know. What are colors for Michelangelo? Let's see what he does about that, right? And amazing things come out. One of the hardest challenges in AI isn't language or reasoning, it's vision: getting models to understand, compose, and edit images with the same precision that they process text. Today, you'll hear a conversation with Oliver Wang and Nicole Brichtova from Google

0:41.3

DeepMind about Gemini 2.5 Image, also known as Nano Banana.

0:45.3

They discuss the architecture behind the model, how image generation and editing are integrated into

0:50.3

Gemini's multimodal framework, and what it takes to achieve character consistency,

0:55.0

compositional control, and conversational editing at scale.

0:58.0

They also touch on open questions and model evaluation, safety, and latency optimization,

1:03.0

and how visual reasoning connects to broader advances in multimodal systems.

1:08.0

Let's get into it.

1:17.4

Maybe start by telling us about the backstory behind the Nano Banana model. How did it come to be? How did you all start working on it?

1:19.2

Sure. So our team has worked on image models for some time. We developed the Imagen family

1:25.1

of models, which goes back a couple years. And actually, there

1:27.9

was also an image generation model in Gemini before the Gemini 2.0 image generation models.

1:32.1

So what happened was the teams kind of started to focus more on the Gemini use cases, so

1:39.0

like interactive, conversational, and editing. And essentially what happened was we teamed up,

1:43.3

and we built this model,

1:44.8

which became what's known as Nano Banana. So yeah, that's sort of the origin story.

1:49.4

Yeah, and I think maybe just some more background on that. So our Imagen models were always

1:54.5

kind of top of the charts for visual quality and we really focused on kind of these specialized

...

[Transcript truncated.]

Disclaimer: The podcast and artwork embedded on this page are from a16z, and are the property of its owner and not affiliated with or endorsed by Tapesearch.

Generated transcripts are the property of a16z and are distributed freely under the Fair Use doctrine. Transcripts generated by Tapesearch are not guaranteed to be accurate.

Copyright © Tapesearch 2025.