Gemini 3.5 FLASH: BAD to OUTSTANDING

2026-05-20 Science & Technology

2.4k

88.6k subscribers

Description

The newly available Gemini 3.5 FLASH AI model, tested on my own causal REASONING TEST. Since I test this reasoning model on "HIGH", the end result is excellent. But analyzing the reasoning trace of Gemini 3.5 FLASH in detail shows that any exit before 5 minutes of pure runtime (on Google TPU) would have delivered a qualitatively "bad" result (a 10-step solution). Multiple other tests with "MEDIUM" reasoning (not recorded) confirm this impression. Note: This test is not statistically significant; it presents a single test with a very limited number of repetitions. Therefore the result of my test only gives a first impression of the new GEMINI 3.5 FLASH. #airesearch #aiexplained #geminiai #googledeepmind

#artificial intelligence #AI models #LLM #VLM #VLA #Multi-modal model #explanatory video #RAG

Top Comments (10)

@anoni-t4s 2026-05-20

Thanks! This content is very useful.

0 1 replies

@QuantumPrintum 2026-06-02

curious if the trace shows an actual causal model or just iterative patching until something sticks

@Julzaa 2026-05-20

Impressive! Why keep the web search enabled though? It doesn't really matter for your custom prompt and it didn't use the web according to the trace, but still.

@connorfowler5946 2026-05-21

Regarding the discussion about grounding: would it not be clear within the reasoning trace if it were referencing Google search or particular tool calls that would have access to YouTube transcripts? I think it’s very exciting that this model happened upon the answer so quickly.

@ikoukas 2026-05-21

It would be interesting to know in 3 independent attempts how it would do.

@FroggyTWrite 2026-05-21

You can also use it in flex mode and that cuts the price in 1/2

@dmitryozernov8696 2026-05-20

Grounding with Google Search should be disabled. It could be using YouTube previous videos.

9 6 replies

@yann3601 2026-05-21

I do love the way when RL became as clear as the semantic nature of the answers. Whatever the LLM results to your prompt. Have you ever tried any gradual approach? Easy ways, I think you will manage 😊 NB: about CoT... you know it has nothing to do with the real "thinking". You reveal the trick, now explain the magic?

@gettingstuffdoneright5332 2026-05-20

Thank you for jumping on this model so quickly. I have to agree with some of the other commenters -- grounding should have been turned off, especially given the jump to 8 steps -- perhaps it's no coincidence that the flash model got exactly the best answer that happened to be only reached by its sibling model. It would be very interesting to see this test run again with grounding turned off. Indeed, I asked the flash model and it said it should be turned off, "otherwise it allows the model to look up the exact test question, find existing answer keys or leaked solutions online and simply copy the results."

6 3 replies

@gileneusz 2026-05-20

it's basically "not giving up" strategy, interesting...
