Robert Wright's Nonzero

Inside the Mind of AI (Robert Wright & Nora Belrose)

Nonzero

News & Politics, Society & Culture, Philosophy

4.7 • 618 Ratings

🗓️ 13 February 2024

⏱️ 61 minutes


Summary

This is a free preview of a paid episode. To hear more, visit www.nonzero.org

1:57 Nora’s low score on the AI doomer test
8:26 The messy math of AI alignment
13:41 Teaching (and unteaching) concepts to LLMs
18:06 Do we really want human-like AI?
28:48 How the AI “mind” works
40:41 How to change an AI’s mind
53:19 Are LLMs reverse-engineering the human mind?
57:26 Heading to Overtime

Robert Wright (Nonzero, The Evolution of God, Why Buddhism Is True) and Nora Belrose (EleutherAI). Recorded January 16, 2024.

Twitter: https://twitter.com/NonzeroPods

Overtime titles:

0:00 Eleuther’s extra open open-source AIs
4:29 Is the human mind a general learning machine?
7:18 Why Nora supports open-source AI
11:42 How could we tell if AIs are conscious?
23:31 The new contender for reinforcement learning supremacy
29:57 Inside the minds of the AI doomers
40:52 Bob’s main sources of AI anxiety

Transcript


0:00.0

You're listening to Robert Wright's Nonzero podcast.

0:33.4

Hi, Nora.

0:35.3

Hey, Bob.

0:36.6

How are you doing?

0:38.2

Pretty good.

0:39.6

How about yourself?

0:40.6

I can't complain.

0:42.1

Let me introduce this.

0:43.0

I'm Robert Wright, publisher of the Nonzero newsletter.

0:46.3

This is the Nonzero podcast.

0:47.8

You are Nora Belrose.

0:49.8

You are at EleutherAI, a nonprofit AI research group that has kind of an interesting history that maybe we'll talk about.

0:57.6

And you are head of interpretability, which is interesting to me because that means one thing you're doing is trying to figure out exactly how these large language models in particular, I guess, work, which is something

1:13.8

I'm interested in. And I'm interested in the fact that it's more of a mystery than one might

1:19.5

have imagined at one point, how these AIs actually work. I mean, more of a mystery, even to the

1:25.6

people who built them.

1:34.9

Also, interpretability is closely related to the field of alignment. The idea being that

1:41.8

the more we understand about how they work, the easier it may be to align AI with human values,

1:50.8

human interests. So reduce the chances that it will kill us all in a quest for world domination or cause harms in less dramatic ways.

1:55.2

So I want to talk about all that.

1:58.1

Now, one interesting thing about you, compared to some people who are doing alignment-related research,

2:04.2

is that you're not that much of a doomer, right?

...

