Building AlphaGo from scratch – Eric Jang

2026-05-15 Science & Technology

7.6k

659

1.4m subscribers

Description

Eric Jang walks through how to build AlphaGo from scratch, but with modern AI tools.

Sometimes you understand the future better by stepping backward. AlphaGo is still the cleanest worked example of the primitives of intelligence: search, learning from experience, and self-play. You have to go back to 2017 to get insight into how the more general AIs of the future might learn.

Once he explained how AlphaGo works, it gave us the context to have a discussion about how RL works in LLMs and how it could work better – naive policy gradient RL has to figure out which of the 100k+ tokens in your trajectory actually got you the right answer, while AlphaGo’s MCTS suggests a strictly better action every single move, giving you a training target that sidesteps the credit assignment problem. The way humans learn is surely closer to the second.

Eric also kickstarted an Autoresearch loop on his project. And it was very interesting to discuss which parts of AI research LLMs can already automate pretty well (implementing and running experiments, optimizing hyperparameters) and which they still struggle with (choosing the right question to investigate next, escaping research dead ends). Informative to all the recent discussion about when we should expect an intelligence explosion, and what it would look like from the inside.

EPISODE LINKS
* Check out the flashcards I wrote to retain the insights: https://flashcards.dwarkesh.com/eric-jang/
* Transcript: https://www.dwarkesh.com/p/eric-jang

SPONSORS
- Cursor's agent SDK let me build a pipeline to generate flashcards for this episode. For each card, I had an agent read the transcript, ingest blackboard screenshots, generate an SVG visual, and run everything through a critic. A durable agent is much better at this kind of work than a chain of LLM calls, and Cursor's SDK made it easy. Check out the cards at https://flashcards.dwarkesh.com and get started with the SDK at https://cursor.com/dwarkesh
- Jane Street gave me a real deep-dive tour of one of their datacenters. I got to ask a bunch of questions to Ron Minsky, who co-leads Jane Street's tech group, and Dan Pontecorvo, who runs Jane Street's physical engineering team. They were willing to literally pull up the floorboards and take out racks to explain how everything works. Check out the full tour at https://janestreet.com/dwarkesh

To sponsor a future episode, visit https://dwarkesh.com/advertise.

TIMESTAMPS
00:00:00 – Basics of Go
00:08:06 – Monte Carlo Tree Search
00:31:53 – What the neural network does
01:00:22 – Self-play
01:25:27 – Alternative RL approaches
01:45:36 – Why doesn’t MCTS work for LLMs
02:00:58 – Off-policy training
02:11:51 – RL is even more information inefficient than you thought
02:22:05 – Automated AI researchers
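
The description's contrast between naive policy gradient RL and MCTS-derived targets can be illustrated with a small sketch. This is not code from the episode; it is a minimal pure-Python illustration under stated assumptions: a softmax policy over a handful of actions, a single end-of-trajectory reward for the policy-gradient case (REINFORCE-style), and normalized visit counts as the per-move target for the search case, in the style of AlphaGo Zero. All function names here are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_logit_grads(step_logits, actions, final_reward):
    """REINFORCE-style ascent direction on the logits of every step.

    For a softmax policy, d log pi(a) / d logits = onehot(a) - pi.
    Every step is scaled by the SAME scalar reward, so the signal cannot
    tell which individual step deserved the credit.
    """
    grads = []
    for logits, a in zip(step_logits, actions):
        pi = softmax(logits)
        grads.append([
            final_reward * ((1.0 if i == a else 0.0) - p)
            for i, p in enumerate(pi)
        ])
    return grads


def visit_count_targets(step_visit_counts):
    """Turn per-move search visit counts into target distributions."""
    return [
        [n / sum(counts) for n in counts]
        for counts in step_visit_counts
    ]


def search_logit_grads(step_logits, step_visit_counts):
    """Descent direction of cross-entropy(target, pi) on the logits.

    The gradient of the loss is pi - target, so the update direction is
    target - pi. Each move gets its own target from its own search.
    """
    targets = visit_count_targets(step_visit_counts)
    grads = []
    for logits, target in zip(step_logits, targets):
        pi = softmax(logits)
        grads.append([t - p for t, p in zip(target, pi)])
    return grads


if __name__ == "__main__":
    # Two moves, three legal actions each, uniform initial policy (assumed).
    logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

    # Policy gradient: one win/loss signal shared by both moves.
    print(reinforce_logit_grads(logits, actions=[0, 2], final_reward=1.0))

    # Search targets: hypothetical visit counts, different for each move.
    print(search_logit_grads(logits, [[1, 8, 1], [6, 2, 2]]))
```

In the first case, flipping the final reward flips the update for every move at once, which is the credit assignment problem in miniature. In the second, each move is pulled toward whatever its own search preferred, regardless of how the game ended.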

Top Comments (10)

@rajatady 2026-05-15

This blackboard setup is so underrated. Thanks for making it happen.

167 1 replies

@skyecase 2026-05-15

Really loving the new blackboard style on the podcast, it makes the conversations feel much more interactive and easier to follow visually. One thing that could make it even better: using a shared Excalidraw-style board (or a similar collaborative whiteboard) synced on both your and the guest’s tabs. Right now the blackboard works well, but things disappear a bit too quickly, especially during dense explanations. It would also be amazing if the session link for the whiteboard could be added somewhere in the UI or description so viewers could revisit the diagrams and notes afterward.

103 7 replies

@adrian.valentim 2026-05-15

Nice! Keep the blackboard episodes coming!

81 1 replies

@abhijitpradhan9831 2026-05-15

Patel has stepped up the whole podcast game

@DwarkeshPatel 2026-05-15

I wrote some flashcards to retain the content from the lecture. Might be useful to you too: https://flashcards.dwarkesh.com/eric-jang/

36 2 replies

@invinoa 2026-05-15

This is a pretty valuable explanation tbh. Good job inviting him.

17 1 replies

@karimalmoukhtar 2026-05-16

the videos on this channel are unbelievable.

@Hahalol663 2026-05-16

It is astonishing that deep technical content of this high quality is available for free. Thank you for your amazing work Dwarkesh

@TheBlackClockOfTime 2026-05-16

Okay yeah. I'm 8 minutes into this episode and for the first time in my life I a) understand Go b) want to start actually studying deep learning. Thank you Dwarkesh. This is really good.

@etesianSealine 2026-05-19

I've worked on deep-learned MCTS professionally and IMO this is an excellent explanation: factually precise, interesting historical context, and very stimulating connections to the broader field. Great work, Eric (and Dwarkesh for creating the substrate for that to happen).
