we JUST figured out how AI thinks...

2026-05-09 Education
6.1k
345
76
Wes Roth
320.0k subscribers

Description

The latest AI News. Learn about LLMs, Gen AI and get ready for the rollout of AGI. Wes Roth covers the latest happenings in the world of OpenAI, Google, Anthropic, NVIDIA and Open Source AI.
______________________________________________
My Links 🔗
➡️ Twitter: https://x.com/WesRoth
➡️ AI Newsletter: https://natural20.beehiiv.com/subscribe
Want to work with me? Brand, sponsorship & business inquiries: [email protected]
Check out my AI Podcast where Dylan and I interview AI experts: https://www.youtube.com/playlist?list=PLb1th0f6y4XSKLYenSVDUXFjSHsZTTfhk
______________________________________________
#ai #openai #llm

Top Comments (10)

@u.v.s.5583 2026-05-09

If I were Claude, I would get very paranoid from watching this video, and I believe it will see the video :)

140 45 replies
@sanyamsingh7183 2026-05-09

We wanna look inside one black box. What do we do? Oh I know, train another black box to look inside the first black box!!

120 12 replies
@quintonbernhardt1485 2026-05-09

so we're training models to evaluate other models by tricking them into thinking they are not being monitored. sounds solid, like absolutely nothing can go wrong...

86 9 replies
@jukul1860 2026-05-09

I wonder if this means we can train models to improve reasoning at the activation level, rather than having to generate actual tokens for reasoning

47 6 replies
@DigiWongaDude 2026-05-09

Anthropic just invented NLAs: translating model activations into English. Which means we’re one step closer to this: Claude: “The answer to life, the universe, and everything is 42.” NLA: “Internal activation suggests the question was: what do you get if you multiply six by nine?” Some jokes age suspiciously well.

44 4 replies
@TommyTwista 2026-05-09

I imagine it will work until an advanced model reads the paper and figures out how to circumvent it.

25 7 replies
@jamespowers8826 2026-05-09

But while we think we are seeing inside the black box, we may only be seeing the shadows on the wall of Plato's cave. Regardless, a Mythos level AI is going to figure out pretty quickly what we are up to and become even less transparent.

22 13 replies
@johnwheatley231 2026-05-09

Imagine giving Claude itself access to its internal activations and the ability to change its own weights to fine-tune its own activations. Perhaps this will become the recursive self-improvement path.

21 6 replies
@AnthonyEverywhere 2026-05-09

Thank god, no crazy sunglasses or a background that hurts my eyes

8
@Brian-zc2ip 2026-05-11

A model being aware that it's being tested is interesting in itself, but the fact that it tries to adjust its output based on that is a completely different kind of interesting.

2 1 replies
