Inside the 1930s Vintage Language Model Called Talkie - DTNS 5257

Daily Tech News Show

Tom Merritt

News, Technology

4.8 • 1.5K Ratings

🗓️ 28 April 2026

⏱️ 34 minutes

Summary

Valve's new Steam Controller is purpose-built for its ecosystem and nothing else, and the "Ask YouTube" search experiment is like a version of Gemini that is specific to YouTube video content.

Starring Jason Howell and Tom Merritt.

Transcript

0:00.0	This is the Daily Tech News for Tuesday, April 28th, 2026.
0:13.2	We tell you what you need to know, give you the important context, and help each other understand.
0:17.6	And today, a large language model from the year 1930. 30. 30. That language model
0:25.6	talks like this. Yes. That's right. I'm Jason Howell. I'm Tom Merritt. I can't wait to talk about
0:32.6	this week's big story. Let's get to it.
0:43.8	And I mean, define big, right? Like, it's just one of those days where there was no like premier marquee story. So of course, we're going to talk about a large language model from the
0:48.2	1930s, sort of. It's big. It's really big. Big and big.
0:55.0	It's 13 billion big.
0:57.0	It's Talkie 1930, and it's a new 13 billion open-weight LLM, or in this case, it's actually
1:06.0	called a VLM, stands for vintage language model.
1:10.0	It's trained on 260 billion tokens of pre-1931 English
1:16.8	books, newspapers, journals, patents, case law. But the important thing there is that it has a
1:23.8	historical cutoff of December 31st, 1930, which is actually the cutoff for works
1:31.2	to enter the public domain. So, you know, it wasn't just chosen randomly. That basically
1:35.5	means that all of this content that's fed into it is legal to use in this capacity. It's
1:40.7	developed by a nonprofit team that is using the model to look, basically to explore a few key research angles.
1:48.6	The first one is, because Talkie only saw pre-1931 text, it essentially avoids benchmark contamination and will, as a result, better show how much LMs, language models, can
2:04.5	actually generalize beyond their training data.
2:08.9	Secondly, by checking how surprised Talkie is by the New York Times "On This Day" events.
2:17.0	So they're feeding it these "On This Day" events that happened in the future, after 1930.
2:23.3	Researchers can then measure how well a language model can forecast future outcomes as it moves past or beyond its knowledge cutoff date.
2:34.0	So it's a good way to kind of explore that.
	...

Disclaimer: The podcast and artwork embedded on this page are from Tom Merritt, and are the property of its owner and not affiliated with or endorsed by Tapesearch.

Generated transcripts are the property of Tom Merritt and are distributed freely under the Fair Use doctrine. Transcripts generated by Tapesearch are not guaranteed to be accurate.