The Sunday Read: ‘Wikipedia’s Moment of Truth’

The Daily

The New York Times

Daily News, News

🗓️ 10 September 2023

⏱️ 52 minutes

Summary

In early 2021, a Wikipedia editor peered into the future and saw what looked like a funnel cloud on the horizon: the rise of GPT-3, a precursor to the new chatbots from OpenAI. When this editor — a prolific Wikipedian who goes by the handle Barkeep49 on the site — gave the new technology a try, he could see that it was untrustworthy. The bot would readily mix fictional elements (a false name, a false academic citation) into otherwise factual and coherent answers. But he had no doubts about its potential. “I think A.I.’s day of writing a high-quality encyclopedia is coming sooner rather than later,” he wrote in “Death of Wikipedia,” an essay that he posted under his handle on Wikipedia itself. He speculated that a computerized model could, in time, displace his beloved website and its human editors, just as Wikipedia had supplanted the Encyclopaedia Britannica, which in 2012 announced it was discontinuing its print publication. Recently, when I asked this editor if he still worried about his encyclopedia’s fate, he told me that the newer versions made him more convinced that ChatGPT was a threat. “It wouldn’t surprise me if things are fine for the next three years,” he said of Wikipedia, “and then, all of a sudden, in Year 4 or 5, things drop off a cliff.”

Transcript

0:00.0	Hi, I'm Jon Gertner, I'm a contributor to The New York Times Magazine, and I write about
0:09.2	science and technology.
0:12.3	This week's Sunday read is a story I wrote for the magazine about Wikipedia.
0:16.9	It's a story that explains how the 22-year-old, wonky online encyclopedia we've all consulted
0:23.6	at one point is so central to building artificial intelligence right now.
0:29.1	So over the last few years, computer scientists have been creating what are known as large
0:34.0	language models, which are the AI brains that power the chatbots, like ChatGPT.
0:41.2	And in order to build a large language model, they needed to gather vast knowledge banks
0:46.2	of information.
0:47.2	And I mean, it's sort of dizzying how much information we're talking about here.
0:52.4	Some models ingest upwards of a trillion words.
0:56.5	And it all comes from public sources like Wikipedia or Reddit or Google's patent database.
1:03.6	What makes Wikipedia special is not just that it's free and accessible, but also that it's
1:08.5	very highly formatted.
1:10.7	It contains just a tremendous amount of factual information that's maintained by a community
1:16.3	of about 40,000 active editors in the English language version alone.
1:23.4	The problem with these new AI chatbots is that their fundamental goal is to converse
1:29.5	with a user with a kind of human fluency of language, but they're not built to regurgitate
1:35.9	data or to really be precise.
1:39.4	So whether you're trying to understand historical topics or political upheavals or pandemics, these
1:46.6	bots greatly simplify the world in a way that's maybe not conducive at all to our best
1:52.3	interests as human beings.
	...


Generated transcripts are the property of The New York Times and are distributed freely under the Fair Use doctrine. Transcripts generated by Tapesearch are not guaranteed to be accurate.