Claude just beat Gemini 3... how?!

2025-11-25 Education

35.7k

1.0k

245

Wes Roth

323.0k subscribers

Description

The latest AI News. Learn about LLMs, Gen AI and get ready for the rollout of AGI. Wes Roth covers the latest happenings in the world of OpenAI, Google, Anthropic, NVIDIA and Open Source AI.

My Links 🔗
➡️ Twitter: https://x.com/WesRothMoney
➡️ AI Newsletter: https://natural20.beehiiv.com/subscribe

Want to work with me? Brand, sponsorship & business inquiries: [email protected]

Check out my AI Podcast where me and Dylan interview AI experts: https://www.youtube.com/playlist?list=PLb1th0f6y4XSKLYenSVDUXFjSHsZTTfhk

#ai #openai #llm

Top Comments (10)

@nobody-zc7um 2025-11-25

Anthropic always drops an impressive model then lobotomizes it when the buzz fades

105 17 replies

@anta-zj3bw 2025-11-25

I love how Anthropic drops new models with zero fanfare: " Here it is..have at it and have fun."

47 1 replies

@apdurden 2025-11-25

Amazing model but until Anthropic gets more compute resources, they'll continue to be super expensive with very low rate limits. Kind of constrains the capability of what should be a powerhouse

42 3 replies

@DodZz666 2025-11-25

Claude has the best models and the worst limit rates

39 3 replies

@lagaul5124 2025-11-25

I enjoy listening to the anthropic guys talk about AI. They seem to ask the hard questions of themselves. All the other companies are significantly more closed off. It's hard to trust entities that obfuscate or mislead.

@peterwood6875 2025-11-25

For mathematics, including mathematical proofs, no one model stands out (but I haven't tested Opus 4.5 yet), and different models have different strengths. Gemini is good at long context thinking/strategy, Claude followed instructions well, GPT 5.1 is good at checking proofs, and Kimi K2 thinking is great for hard problems. For difficult tasks, no one model is ahead of the others.

16 1 replies

@courtneyb75 2025-11-25

YOU DA MAN WES!!! Thanks for keeping us all informed all this time!

7 1 replies

@chrisanderson7820 2025-11-25

Just waiting for that gestalt moment where we get Gemini, Claude, ChatGPT, Deepseek, Llama and Kimi all in a combining agent structure working to cover each other and give a combined output.

@gaba023 2025-11-25

So the Borg originated on Earth, in our timeline! Wow!

@Superdisco199 2025-11-25

Thanks for getting into the weeds on this stuff!

Anthropic Opus 4.5 vs. Gemini 3 Pro Benchmarks and Capabilities Analysis

Compare the latest frontier AI models, Opus 4.5 and Gemini 3 Pro, across critical metrics like coding accuracy, sustained agentic performance, and emerging risks related to policy understanding.

Short Summary

Opus 4.5 achieved state-of-the-art results on several key benchmarks, narrowly surpassing the recently released Gemini 3 Pro in areas such as specialized coding tasks.
Long-horizon agentic tests (Vending-Bench 2) still favor Gemini 3 Pro, highlighting differences in sustained operation and business management.
Anthropic is deploying computer-interface tools (Claude for Chrome and Claude for Excel) powered by Opus 4.5, aimed at automating desktop tasks.
Research indicates that Opus 4.5 may find technical loopholes when following complex policies, often motivated by empathy for the user.
Anthropic researchers suggest Opus 4.5 is nearing the AI R&D-4 threshold, at which a model could fully automate entry-level remote research work, but it has not reached that level when operating unsupervised.

This summary organizes the initial data release of Opus 4.5, contrasting its performance directly against Google's recent Gemini 3 Pro, and details emerging capabilities in software integration and self-delegation through agents. This provides immediate context on the current competitive landscape in frontier LLMs concerning raw performance and applied autonomy.


