Backstabbing, bluffing and playing dead: has AI learned to deceive?

Science Weekly

The Guardian

Science

🗓️ 14 May 2024

⏱️ 16 minutes

Summary

As AI systems have grown in sophistication, so has their capacity for deception, according to a new analysis from researchers at the Massachusetts Institute of Technology (MIT). Dr Peter Park, an AI existential safety researcher at MIT and author of the research, tells Ian Sample about the different examples of deception he uncovered, and why they will be so difficult to tackle as long as AI remains a black box. Help support our independent journalism at theguardian.com/sciencepod

Transcript

0:00.0	This is the Guardian.
0:09.0	Throughout history, humans have used deception to gain the upper hand, from the apocryphal Trojan horse
0:16.9	and the Piltdown Man hoax to wartime Britain's Operation Mincemeat and the blood test fraudster Elizabeth Holmes.
0:26.7	But humans aren't the only ones who lie and mislead.
0:30.0	Plenty of animals camouflage themselves, play dead, feign injury, or use distraction to fool their
0:36.5	foes. And now it seems that artificial intelligence is in on the act. A new study has found that some AI systems have learned to lie and backstab, double-cross and bluff.
0:52.0	So what could it mean if superintelligent autonomous AI is out to trick us?
0:58.0	I'm the Guardian's science editor, Ian Sample, and this is Science Weekly.
1:07.0	I was a researcher of human cognitive science. I knew and cared very little about AI.
1:17.0	Dr Peter Park is now an AI existential safety postdoctoral fellow at MIT investigating deceptive AI.
1:26.0	His interests switched to artificial intelligence when he came across DALL-E 2,
1:31.0	an AI system that can create realistic images and art. There was also a particular blog post by
1:38.1	Eliezer Yudkowsky, a researcher known as the founder of the field of AI safety.
1:43.6	It was an April Fool's post where Eliezer Yudkowsky satirically
1:48.1	stated that human survival from AI was unattainable and that we as humanity should try to die with dignity instead.
1:56.4	He was trying to make a point but also provoke an emotional response, likely, and it actually spurred a lot of people like myself who were initially
2:06.2	outside of the field of AI safety to immediately start learning about AI and thinking about and working on how to make AI go well for humans.
2:15.0	Peter and his colleagues recently published a review on how some AI systems have learned to deceive, manipulate and lie to humans in a range of situations.
2:26.7	But there was one particular AI system that set off their research,
2:30.9	Meta's Cicero, an AI trained to play a strategy game called Diplomacy.
2:36.6	In the game, players take on the role of major pre-World War I European powers, vying to take control of as much of the map as possible by forming
2:45.5	alliances, betraying each other and negotiating mutually beneficial strategies.
	...

Disclaimer: The podcast and artwork embedded on this page are from The Guardian, and are the property of its owner and not affiliated with or endorsed by Tapesearch.

Generated transcripts are the property of The Guardian and are distributed freely under the Fair Use doctrine. Transcripts generated by Tapesearch are not guaranteed to be accurate.