#206 – Ishan Misra: Self-Supervised Deep Learning in Computer Vision
Lex Fridman Podcast
Lex Fridman
4.7 • 13.6K Ratings
🗓️ 31 July 2021
⏱️ 156 minutes
Summary
Ishan Misra is a research scientist at FAIR working on self-supervised visual learning. Please support this podcast by checking out our sponsors:
– Onnit: https://lexfridman.com/onnit to get up to 10% off
– The Information: https://theinformation.com/lex to get 75% off first month
– Grammarly: https://grammarly.com/lex to get 20% off premium
– Athletic Greens: https://athleticgreens.com/lex and use code LEX to get 1 month of fish oil
EPISODE LINKS:
Ishan’s twitter: https://twitter.com/imisra_
Ishan’s website: https://imisra.github.io
Ishan’s FAIR page: https://ai.facebook.com/people/ishan-misra/
PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
YouTube Full Episodes: https://youtube.com/lexfridman
YouTube Clips: https://youtube.com/lexclips
SUPPORT & CONNECT:
– Check out the sponsors above, it’s the best way to support this podcast
– Support on Patreon: https://www.patreon.com/lexfridman
– Twitter: https://twitter.com/lexfridman
– Instagram: https://www.instagram.com/lexfridman
– LinkedIn: https://www.linkedin.com/in/lexfridman
– Facebook: https://www.facebook.com/lexfridman
– Medium: https://medium.com/@lexfridman
OUTLINE:
Here are the timestamps for the episode. On some podcast players you should be able to click a timestamp to jump to that time.
(00:00) – Introduction
(07:49) – Self-supervised learning
(16:24) – Self-supervised learning is the dark matter of intelligence
(20:17) – Categorization
(28:50) – Is computer vision still really hard?
(32:35) – Understanding Language
(42:14) – Harder to solve: vision or language
(48:59) – Contrastive learning & energy-based models
(52:59) – Data augmentation
(57:19) – Fixed audio spike by lowering sound with pen tool
(1:05:33) – Real data vs. augmented data
(1:09:16) – Non-contrastive, energy-based self-supervised learning methods
(1:12:54) – Unsupervised learning (SwAV)
(1:15:37) – Self-supervised Pretraining (SEER)
(1:20:44) – Self-supervised learning (SSL) architectures
(1:26:43) – VISSL: PyTorch-based SSL library
(1:29:38) – Multi-modal
(1:37:06) – Active learning
(1:42:45) – Autonomous driving
(1:54:12) – Limits of deep learning
(1:58:19) – Difference between learning and reasoning
(2:03:26) – Building super-human AI
(2:11:14) – Most beautiful idea in self-supervised learning
(2:15:02) – Simulation for training AI
(2:18:27) – Video games replacing reality
(2:19:40) – How to write a good research paper
(2:24:08) – Best programming language for beginners
(2:25:01) – PyTorch vs TensorFlow
(2:28:26) – Advice for getting into machine learning
(2:30:31) – Advice for young people
(2:32:58) – Meaning of life
Transcript
| 0:00.0 | The following is a conversation with Ishan Misra, |
| 0:03.2 | research scientist at Facebook AI Research, |
| 0:05.8 | who works on self-supervised machine learning |
| 0:08.6 | in the domain of computer vision. |
| 0:10.4 | Or, in other words, making AI systems |
| 0:13.4 | understand the visual world with minimal help from us humans. |
| 0:18.0 | Transformers and self-attention |
| 0:20.4 | has been successfully used by OpenAI's GPT-3 |
| 0:23.7 | and other language models |
| 0:25.6 | to do self-supervised learning in the domain of language. |
| 0:28.6 | Ishan, together with Yann LeCun and others, |
| 0:31.8 | is trying to achieve the same success |
| 0:33.9 | in the domain of images and video. |
| 0:36.4 | The goal is to leave a robot watching YouTube videos all night |
| 0:40.3 | and in the morning come back to a much smarter robot. |
| 0:43.6 | I read the blog post "Self-supervised learning: |
| 0:46.0 | The dark matter of intelligence" by Ishan and Yann LeCun |
| 0:50.3 | and then listened to Ishan's appearance on the excellent |
| 0:54.6 | Machine Learning Street Talk podcast. |
| 0:57.2 | And I knew I had to talk to him. |
| 0:59.1 | By the way, if you're interested in machine learning |
| 1:01.7 | and AI, I cannot recommend the ML Street Talk podcast highly enough. |
... |

