Why can’t LLMs just LEARN the context window?

2026-03-30 Science & Technology
30.9k
1.6k
125
bycloud
225.0k subscribers

Description

Check out HubSpot's FREE 2026 Guide to AI Agents: https://clickhubspot.com/3972be

In this video, I'll be breaking down a new approach to long-context LLMs called test-time training (TTT-E2E), where models store past context directly in their weights instead of relying on attention or KV caches. Kind of like meta learning, but with gradient descent.

my latest project: Intuitive AI Academy
We just wrote a new piece on MoE!
https://intuitiveai.academy/
limited time code "EARLY" for 40% off yearly plan!

TTT-E2E
[Paper] https://arxiv.org/abs/2512.23675

Appeared papers
[Titans] https://arxiv.org/abs/2501.00663
[Kimi Linear] https://arxiv.org/abs/2510.26692

My Newsletter
https://mail.bycloud.ai/

My Patreon
https://www.patreon.com/c/bycloud

Try out my new fav place to learn how to code
https://scrimba.com/?via=bycloudAI

This video is supported by the kind Patrons & YouTube Members:
🙏Spam Maj, Alex, Chris LeDoux, DX Research Group, Poof N' Inu, Deagan, Robert Zawiasa, Ryszard Warzocha, Tobe2d, Louis Muk, Akkusativ, Kevin Tai, Mark Buckler, NO U, Tony Jimenez, Ângelo Fonseca, jiye, Anushka, Asad Dhamani, Binnie Yiu, Calvin Yan, Clayton Ford, Diego Silva, Etrotta, Gonzalo Fidalgo, Handenon, Hector, Jake Disco very, Michael Brenner, Nilly K, OlegWock, Daddy Wen, Shuhong Chen, Sid_Cipher, Stefan Lorenz, Sup, tantan assawade, Thipok Tham, Thomas Di Martino, Thomas Lin, Richárd Nagyfi, Paperboy, mika, Leo, Berhane-Meskel, Kadhai Pesalam, mayssam, Bill Mangrum, nyaa, Toru Mon, Lame Plane, Matej Macak, Len Mo, saylikhapekar, Zyansheep

[Discord] https://discord.gg/NhJZGtH
[Twitter] https://twitter.com/bycloudai
[Patreon] https://www.patreon.com/bycloud
[Business Inquiries] [email protected]
[Profile & Banner Art] https://twitter.com/pygm7
[Video Editor] @Booga04
[Ko-fi] https://ko-fi.com/bycloudai
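
To make the "store past context in the weights" idea from the description a bit more concrete, here is a toy sketch. It is not the TTT-E2E architecture or training recipe from the paper: the one-parameter model, the `ttt_update` helper, the learning rate, and the repeated passes over the context are all assumptions chosen to keep the example tiny. It only shows the general shape of the idea: the context is consumed by gradient descent on a next-value prediction loss, and afterwards the pattern can be recovered from the weight alone.

```python
# Toy illustration only: a one-parameter "model" that absorbs a context
# stream into its weight via gradient descent, instead of caching the stream.
# `ttt_update` is a hypothetical helper, not an API from the paper.

def ttt_update(weight, context, lr=0.1):
    """Run one pass of next-value prediction over `context`, updating `weight`.

    The model predicts x[t+1] as weight * x[t]; the loss is squared error.
    """
    for prev, nxt in zip(context, context[1:]):
        pred = weight * prev
        grad = 2.0 * (pred - nxt) * prev  # d/dw of (weight * prev - nxt)^2
        weight -= lr * grad
    return weight


# A context whose hidden rule is "each value is half the previous one".
context = [1.0, 0.5, 0.25, 0.125, 0.0625]

w = 0.0
for _ in range(200):  # repeated passes; an assumption made for the toy
    w = ttt_update(w, context)

# The context itself can now be discarded: the rule lives in `w`.
print(f"learned weight: {w:.4f}")
print(f"prediction after 0.0625: {w * 0.0625:.6f}")
```

Nothing resembling a KV cache is kept here: once the updates are done, the only state is the scalar `w`. A real model would update a much larger set of weights, and how those updates are made is exactly what the paper linked above is about.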

Top Comments (10)

@jibcot8541 2026-03-30

Interesting method. I wonder if they could just train the weight updates into LoRAs and then just have an index/database of different subjects/conversations. It would probably be more efficient than saving the whole context but still be searchable.

79 12 replies
@rany615 2026-03-30

5:57 kind of reminds me of how our brain works at night in order to commit things to long term memory, so when there's a period that the model isn't getting any input

71 9 replies
@IDNKEK 2026-03-30

It all goes in the direction of how our brains work, huh

47 5 replies
@miniminerx 2026-03-30

I thought we would have done something like this long ago, but with short term and long term memory similar to the brain

30 3 replies
@bycloudAI 2026-03-30

Check out HubSpot's FREE 2026 Guide to AI Agents: https://clickhubspot.com/3972be

8 1 replies
@Rizhiy13 2026-03-31

Probably need to assign importance weight to each token and then scale loss by that during remembering, so model learns relevant stuff, but doesn't over-index on trivial information.

5
@TheLiverX 2026-03-31

It's probably gonna go like: more flexible last layers, LoRAs, importance weighting to memorize only the important things (probably using attention), special harness for "recalling" information

1
@richardfredlund8846 2026-03-30

there was Light Mem paper 5 months ago which no one seems to be talking about

1
@stephaneduhamel7706 2026-04-01

It would be funny to see that kind of LLM have to take a short "nap" after every token batch to store its most recent memories.

0
@marleymomo9582 2026-04-01

Kimi's latest Attention Residuals is awesome. Do that next. The model can check past attention dynamically, hence it scales better. Accuracy improved, especially for complex tasks.

0