180: Reinforcement Learning

Programming Throwdown

Patrick Wheeler and Jason Gauci

Objective C, Tech News, Programming Languages, News, Education, How To, C, Python, Programming Throwdown, Java

4.5 • 610 Ratings

🗓️ 17 March 2025

⏱️ 112 minutes

Summary

Patrick and Jason introduce reinforcement learning and place it alongside supervised and unsupervised learning. They cover Q-learning, SARSA, policy gradients, actor-critic methods, PPO, imitation learning, and why training and evaluating RL systems is so challenging.

Transcript

0:00.0	Programming Throwdown, Episode 180, Reinforcement Learning.
0:21.6	Take it away, Patrick.
0:23.0	Welcome to another episode.
0:25.0	This is going to be a good one.
0:26.4	Excited to be here, actually, because this is a topic I have been meaning to learn about,
0:30.3	and Jason has agreed to put on his professor hat, robe.
0:35.7	I don't know, what does a professor wear?
0:37.4	Uh, I got,
0:37.8	I got hooded.
0:38.9	When I got the PhD,
0:39.9	I got hooded, which I thought would be an actual hood, but it's really just a sash. Wait. What is getting hooded? That's like what you get when you get, I don't know about this. Okay. So when you get a PhD, you get hooded, which means you go through the same ceremony as the master's students,
0:56.0	or I think the same ceremony as everybody, but you get a hood, which is actually a sash,
1:01.9	and your PhD advisor actually puts the sash around you as part of the ceremony.
1:11.1	Okay.
1:11.6	I feel like maybe I've heard that term, but I always just kind of had some weird,
1:16.0	probably bad association with hoodwinked.
1:18.4	But anyways.
	...

Disclaimer: The podcast and artwork embedded on this page are from Patrick Wheeler and Jason Gauci, and are the property of its owner and not affiliated with or endorsed by Tapesearch.

Generated transcripts are the property of Patrick Wheeler and Jason Gauci and are distributed freely under the Fair Use doctrine. Transcripts generated by Tapesearch are not guaranteed to be accurate.