The Death of RAG?

2026-03-16 Science & Technology

15.0k

1.0k

229.0k subscribers

Description

Check out Inngest and let your AI agents wear a harness now!
https://www.inngest.com/docs?utm_source=youtube&utm_medium=video&utm_campaign=yt-bycl-3

In this video, we'll dive into the latest hype: Recursive Language Model, why it's actually pretty promising, and how it will change the way we use RAG.

Check out my latest project: Intuitive AI Academy
We just wrote a new piece on MoE and Distillation!
https://intuitiveai.academy/
limited time code "EARLY" for 40% off yearly plan!

My Newsletter
https://mail.bycloud.ai/

my project: find, discover & explain AI research semantically
https://findmypapers.ai/

My Patreon
https://www.patreon.com/c/bycloud

Recursive Language Models [Paper]
https://arxiv.org/abs/2512.24601
Context Rot [Blog]
https://research.trychroma.com/context-rot
ChatGPT doesn't use RAG [Blog]
https://manthanguptaa.in/posts/chatgpt_memory/

Try out my new fav place to learn how to code
https://scrimba.com/?via=bycloudAI

This video is supported by the kind Patrons & YouTube Members:
🙏Spam Maj, Alex, Chris LeDoux, DX Research Group, Poof N' Inu, Deagan, Robert Zawiasa, Ryszard Warzocha, Tobe2d, Louis Muk, Akkusativ, Kevin Tai, Mark Buckler, NO U, Tony Jimenez, Ângelo Fonseca, jiye, Anushka, Asad Dhamani, Binnie Yiu, Calvin Yan, Clayton Ford, Diego Silva, Etrotta, Gonzalo Fidalgo, Handenon, Hector, Jake Disco very, Michael Brenner, Nilly K, OlegWock, Daddy Wen, Shuhong Chen, Sid_Cipher, Stefan Lorenz, Sup, tantan assawade, Thipok Tham, Thomas Di Martino, Thomas Lin, Richárd Nagyfi, Paperboy, mika, Leo, Berhane-Meskel, Kadhai Pesalam, mayssam, Bill Mangrum, nyaa, Toru Mon, Lame Plane, Matej Macak, Len Mo, saylikhapekar

[Discord] https://discord.gg/NhJZGtH
[Twitter] https://twitter.com/bycloudai
[Patreon] https://www.patreon.com/bycloud
[Business Inquiries] [email protected]
[Profile & Banner Art] https://twitter.com/pygm7
[Video Editor] @Booga04
[Ko-fi] https://ko-fi.com/bycloudai

#bycloud #bycloudai #recursive language models #RLM #recursive language model explained #RLM explained #recursive language model implementation #death of RAG

Top Comments (10)

@noname.megaseganame 2026-03-16

It's basically the same idea used in Claude Code or in Codex with subagents.

239 12 replies

@SinanWP 2026-03-16

I remember reading the Infini-attention paper too. I'll believe it when I see it working....

109 1 replies

@EarthAaron 2026-03-16

this isn't really a recursive language model -- all of the other papers on this topic talk about recursion happening in the model architecture. This is just a context orchestration algorithm, like a tool chain or agent loop. Not an LM architecture.

64 3 replies

@neovoid5008 2026-03-16

The needle in a haystack test should start utilizing vagueness in the query more.

54 4 replies

@XMaster96DE 2026-03-16

9:00 and you just described RLM.... I really don't see why we need a distinction between RLM and an Agent; in a sense, RLM is agentic context management....

@BhaswataChoudhury 2026-03-17

**Peeks inside**: It's just RAG with more LLMs in a subagent pattern. Also, RAG is not a vector database. RAG is retrieval-augmented generation. It doesn't matter where or how you are retrieving it. RAG is an umbrella term for an LLM using retrieval via any method or tools.

@qaon5748 2026-03-16

Good idea. I'll have my agents work on it.

@bycloudAI 2026-03-16

Check out Inngest and let your AI agents wear a harness now! https://www.inngest.com/docs?utm_source=youtube&utm_medium=video&utm_campaign=yt-bycl-3

5 2 replies

@Kabbinj 2026-03-17

This is already how a lot of us use agents in, for example, OpenCode: one root orchestrator, which has a high-level overview, gives direct commands to sub-agents, which do the context-heavy work and then return compact data to the orchestrator. In my case, the orchestrator, after a 4-5 hour run, has only reached about 150k tokens, which is great. The overall token usage at this point is over 10M.
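(Editor's illustration, not part of the comment.) The orchestrator and sub-agent split described above can be sketched roughly as follows. This is a minimal toy, assuming a whitespace word count as a stand-in for a tokenizer and a keyword match as a stand-in for an LLM call; `Orchestrator`, `run_subagent`, and `count_tokens` are hypothetical names, not OpenCode's API or the paper's method.

```python
from dataclasses import dataclass, field


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokens.
    return len(text.split())


def run_subagent(task: str, documents: list[str]) -> tuple[str, int]:
    """Hypothetical sub-agent: reads every document, returns a short summary.

    A real system would call an LLM here; this stub only does a keyword match.
    """
    used = count_tokens(task) + sum(count_tokens(d) for d in documents)
    hits = sum(1 for d in documents if task.lower() in d.lower())
    summary = f"{task}: {hits} of {len(documents)} documents match"
    return summary, used


@dataclass
class Orchestrator:
    """Root agent: keeps only compact summaries, never the raw documents."""

    context: list[str] = field(default_factory=list)
    total_tokens: int = 0  # tokens consumed across all sub-agent runs

    def delegate(self, task: str, documents: list[str]) -> str:
        summary, used = run_subagent(task, documents)
        self.total_tokens += used      # heavy work is billed to the sub-agent
        self.context.append(summary)   # only the compact result is retained
        return summary

    def context_tokens(self) -> int:
        return sum(count_tokens(s) for s in self.context)


if __name__ == "__main__":
    docs = ["retrieval " * 1000, "orchestrator notes " * 500]
    root = Orchestrator()
    root.delegate("retrieval", docs)
    root.delegate("orchestrator", docs)
    print("orchestrator context:", root.context_tokens(), "tokens")
    print("overall usage:", root.total_tokens, "tokens")
```

The point of the design is that the root's own context grows with the number of delegated tasks, not with the size of the material each sub-agent reads, which matches the gap between the orchestrator's context and the overall usage reported in the comment.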

@Noah-gw9cg 2026-03-17

Interesting! I had been tinkering around with trying to build something similar, as I've always believed the models would perform much better with a divide-and-conquer strategy rather than trying to do everything all at once. Glad to see this is also actively being researched.
