Programming Throwdown

177: Vector Databases

Patrick Wheeler and Jason Gauci

Objective C, Java, Programming Throwdown, Education, News, Programming Languages, How To, Tech News, C, Python

4.6 (604 Ratings)

🗓️ 4 November 2024

⏱️ 88 minutes

Summary

Intro topic: Buying a Car

News/Links:

Book of the Show


Patreon Plug https://www.patreon.com/programmingthrowdown?ty=h


Tool of the Show

Topic: Vector Databases (~54 min)

  • How computers represent data traditionally
    • ASCII values
    • RGB values
  • How traditional compression works
    • Huffman encoding (tree structure)
    • Lossy example: Fourier Transform & store coefficients
  • How embeddings are computed
    • Pairwise (contrastive) methods
    • Forward models (self-supervised)
  • Similarity metrics
  • Approximate Nearest Neighbors (ANN)
  • Sub-Linear ANN
    • Clustering
    • Space Partitioning (e.g. K-D Trees)
  • What a vector database does
    • Perform nearest-neighbors with many different similarity metrics
    • Store the vectors and the data structures to support sub-linear ANN
    • Handle updates, deletes, rebalancing/reclustering, backups/restores
  • Examples
    • pgvector: a vector-database plugin for postgres
    • Weaviate, Pinecone 
    • Milvus
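
As a rough illustration of the "Similarity metrics" and nearest-neighbors items in the outline above, here is a minimal sketch of exact (brute-force) nearest-neighbor search in plain Python. It is not taken from the episode: the function names, the toy three-dimensional vectors, and the choice of cosine similarity for ranking are all assumptions made for the example. A vector database would replace the linear scan with a sub-linear index (clustering or space partitioning, as listed above).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(query, vectors, k=2):
    """Exact search: score every stored vector, keep the k most similar.

    This is a linear scan, O(n) per query. Approximate nearest-neighbor
    indexes exist to avoid exactly this cost.
    """
    scored = [(cosine_similarity(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]


# Hypothetical toy "embeddings"; real ones have hundreds of dimensions.
vectors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

result = nearest_neighbors([1.0, 0.0, 0.0], vectors, k=2)
print(result)
```

Swapping `cosine_similarity` for `euclidean_distance` (and sorting ascending instead of descending) gives the same search under a different metric, which is the kind of choice a vector database exposes as a setting.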

★ Support this podcast on Patreon ★

Transcript

0:00.0

Programming Throwdown Episode 177 Vector Databases.

0:20.4

Take it away, Patrick. Welcome everyone to another episode. We have a great

0:25.9

topic today. I'm excited to learn. It's legitimately something that I hear all about, but I don't

0:30.5

know too much about. So we're going to have teacher Jason joining us here in a minute.

0:33.8

That's right. We had teacher Patrick for, you know, compilers and interpreters,

0:38.2

and then we have now teacher Jason for vector dbs. I feel like we should have called ourselves

0:42.3

professors. We have a missed opportunity there. Oh, that's right. Is there anything higher,

0:45.8

distinguished emeritus professor, emeritus, ermerti? Okay. What's the highest level professor, you know,

0:56.6

Ty?

0:56.9

ChatGPT.

0:58.0

Can you?

1:00.1

Sole emperor of the knowledge universe.

1:05.1

Oh, but what if it thinks it's the smartest?

1:08.4

It's going to tell you a lie because, like,

1:10.8

it doesn't want you to be

1:11.7

superior. Okay. Anyway, no, sorry. Oh, you know, I have this plan now when, um, you know, when I get

1:16.5

these calls from people telling me that like my, my taxes are overdue or I should buy car insurance or

1:23.3

whatever. What I've started doing is asking it questions about particle physics. And when it gives me

1:29.1

really good answers, I'm like, aha, this is ChatGPT. This is not even a real person. A particle

1:35.1

physicist could work in a call center. Okay. Yeah. I mean, I had to go looking for a car and they legitimately like, you know, want your phone number, whatever.

1:47.0

So I just have one of the like, you know, I guess like it's a voice over IP number, but it doesn't actually like ring through or whatever.

1:55.0

And the place, one of the places I went has called me twice a day every day for going on two weeks now.

...

Disclaimer: The podcast and artwork embedded on this page are from Patrick Wheeler and Jason Gauci, are the property of their owners, and are not affiliated with or endorsed by Tapesearch.

Generated transcripts are the property of Patrick Wheeler and Jason Gauci and are distributed freely under the Fair Use doctrine. Transcripts generated by Tapesearch are not guaranteed to be accurate.