Is This the End of RAG? Anthropic's NEW Prompt Caching
Top Comments (10)
In short: it's not a replacement for RAG
Those 5 minutes are refreshed each time the cache is used, meaning it can last indefinitely if you keep chatting and the AI keeps accessing the cached content. On the Gemini page it is 1 hour but without refreshes. That's what I understood from that text, at least.
I think it is amazing. With something like Claude Dev, after reviewing the code in a project, prompts become gigantic and costs skyrocket. Caching will be a great add-on for this use case. And yes, I agree that five minutes is a bit short.
Check out the RAG Beyond Basics Course: https://prompt-s-site.thinkific.com/courses/rag
The cache duration is 5 minutes, but it resets each time a new request is made. So as long as you keep sending requests, the cache is continuously refreshed.
you gave absolutely 0 explanation of how the caching works.
Do you plan to show us how to use this huge cached context window with RAG? :) The old RAG systems were not good enough for my industry (minimum 95%), maybe the new approaches will be better :)
Thank you for keeping up with this always changing world.
Can you show how to do the same with a large, complex codebase?
This is a great video that is giving me great ideas. Thank you
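
Several comments above describe the cache lifetime as 5 minutes that reset whenever the cached content is used. A minimal sketch of that sliding-expiration idea follows. It is a toy simulation only, not Anthropic's implementation or API: the class `SlidingTTLCache`, the `simulate` helper, and the request times are all hypothetical, and the only figure taken from the comments is the 5-minute lifetime.

```python
TTL_SECONDS = 5 * 60  # the 5-minute lifetime mentioned in the comments


class SlidingTTLCache:
    """Toy model (hypothetical) of a cache whose expiry restarts on every request."""

    def __init__(self, ttl=TTL_SECONDS):
        self.ttl = ttl
        self.expires_at = {}  # cached prefix -> time at which it expires

    def request(self, prefix, now):
        """Return True on a cache hit; either way, restart the TTL for this prefix."""
        hit = prefix in self.expires_at and now < self.expires_at[prefix]
        # A hit refreshes the entry; a miss writes it again.
        self.expires_at[prefix] = now + self.ttl
        return hit


def simulate(request_times, prefix="long-document"):
    """Replay requests (times in seconds) against a fresh cache and record hits."""
    cache = SlidingTTLCache()
    return [cache.request(prefix, t) for t in request_times]


# Requests every 4 minutes: the entry never expires after the first write.
steady = simulate([0, 240, 480, 720])
# A 6-minute pause before the third request: the entry has expired by then.
gap = simulate([0, 240, 600])

print(steady)  # [False, True, True, True]
print(gap)     # [False, True, False]
```

In the first run every request after the initial write is a hit because each one arrives within 5 minutes of the previous one. In the second run the pause is longer than the lifetime, so the third request misses and the content has to be cached again.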