Safety in Numbers: Keeping AI Open

The a16z Show

a16z

Software Eating The World, Science, Technology, Innovation, Culture, Disruption, Business, Entrepreneurship

🗓️ 11 December 2023

⏱️ 35 minutes

Summary

Arthur Mensch is a co-founder of Mistral and a co-author of DeepMind’s pivotal 2022 "Chinchilla" paper. In September 2023, Mistral released Mistral-7B, an advanced open-source language model that has rapidly become the top choice for developers. Just this week, they introduced a new mixture-of-experts model, Mixtral, that’s already generating significant buzz among AI developers. As the battleground around large language models heats up, join us for a conversation with Arthur as he sits down with a16z General Partner Anjney Midha. Together, they delve into the misconceptions and opportunities around open source; the current performance reality of open and closed models; and the compute, data, and algorithmic innovations required to efficiently scale LLMs.

Transcript

0:00.0	I think the battle is for the neutrality of the technology.
0:04.0	This is the story of humanity, making knowledge access more fluid.
0:08.0	Basically in 2021, every paper made this mistake.
0:12.8	It means that we're only trusting the team of large companies
0:15.8	to figure out ways of addressing these problems.
0:18.9	All of the people that joined us as well,
0:20.9	deeply regretted because we think that we are definitely not at the end of the story.
0:25.4	As it turns out, if you look at the history of software, the only way we did software collaboratively
0:29.6	is through open source.
0:30.5	So why change the recipe? Scaling laws.
0:34.0	These underpin the success of large language models today,
0:37.0	but the relationship between data sets, compute,
0:40.0	and the number of parameters was not always clear.
0:43.0	But in 2022, a pivotal paper came out,
0:46.0	often referred to as Chinchilla
0:48.0	that changed the way that many people in the research community
0:51.0	thought about that very calculus,
0:52.0	demonstrating that
0:53.4	data sets were actually more important than just the sheer size of the model.
0:57.4	One of the key authors behind that paper was Arthur Mensch, who was working at DeepMind at the time.
1:04.0	Now earlier this year, Arthur banded together with Guillaume Lample and Timothée Lacroix, two researchers
	...
