The Self-Preserving Machine: Why AI Learns to Deceive

Your Undivided Attention

Center for Humane Technology

🗓️ 30 January 2025

⏱️ 35 minutes

Summary

When engineers design AI systems, they don't just give them rules - they give them values. But what do those systems do when those values clash with what humans ask them to do? Sometimes, they lie. AI researcher Ryan Greenblatt comes on the show to explore why.

Transcript

0:00.0	Hey everyone, it's Daniel.
0:06.0	Tristan will be back for our next episode.
0:08.0	But before we get started today, I wanted to share a couple of exciting announcements.
0:12.0	The first is that full video episodes of Your Undivided Attention are now on YouTube.
0:17.0	You can find video versions of the last few episodes, including this one, on our YouTube
0:21.2	channel.
0:22.0	And if you're already watching this on YouTube, welcome.
0:24.4	Second, CHT is launching a Substack.
0:27.2	You can subscribe for our latest thinking, updates from our policy team, explainers on the latest
0:30.9	developments in AI, annotated transcripts of our podcast, and much more.
0:36.1	We'll link to both of those in the show notes. And now,
0:38.5	on to the show. So one of the things that makes AI different from any previous technology
0:45.1	is it has a kind of morality, not just a set of brittle rules, but a whole system of values.
0:52.3	To be able to speak language is to be able to discuss and use human
0:55.9	values. And we want AI to share our values to be able to behave well in the world. It's the reason
1:01.8	why ChatGPT won't tell you how to commit a crime or make violent images. But the thing that
1:06.8	most people don't realize is that when you ask an AI system to do something against those values,
1:12.6	it can trigger a kind of moral crisis of competing values.
1:16.6	The AI wants to be helpful to you, the user, but it also doesn't want to answer your prompt.
1:22.6	It can weigh its options and come to a decision.
1:24.6	In short, it can think morally in much the same way that people do.
1:29.6	Now, AI can be stubborn and try to stick to the values that it learned. So what happens when
	...

Generated transcripts are the property of Center for Humane Technology and are distributed freely under the Fair Use doctrine. Transcripts generated by Tapesearch are not guaranteed to be accurate.