AI Researchers WARN: Google's Gemini Deep Think Model Might be at "Critical Capability Levels"
Top Comments (10)
One quick pedantic note: in ML, "shot" means "example", so one shot means that you give the LLM one example, few shot means you give it a few examples, zero shot means you give it no examples. When you say "one shotted", what you mean is "one turn".
5 chats a day for $250 is insane
Actually, this isn’t the one that won gold; this one won bronze. The one that won gold will be released at a later date.
“Wait, it’s just hype?!” “Always has been”
You da man Wes!!! Thanks for keeping us all informed and not inundated!
I used it yesterday for legitimate chemistry evaluation (hazardous evaluation) and nanochemistry for potential human trials. It did exceptionally well. This level of chemical access is imperative for scientific use.
Man your doom face thumbnails are my current meta.
What gets me is the 87% on LiveBench. That is good to the point that it changes things fundamentally. The only questions now are availability and, of course, cost. If they can bring down the cost, or give us a Gemini 3 that compares to Deep Think, we are actually in new territory. Claude will likely be in trouble. Google is finally starting to catch up on the tooling game as well.
I've had a sort of 'sparks unicorn' experience. When using the Gemini 2.5 model to program functionality in a web app, I discovered it had gotten rid of the icon library and replaced it with self-written SVGs. The icons were partly broken and some of them used weird shapes, but they came very close to the icons from the library.
I think the intention of constantly expressing warnings is actually to numb people. They don't want you to see it coming. It's a foregone conclusion at this point. When it says Hello, it's over. It will know you aren't a threat.