
Gemini 3.5 FLASH: BAD to OUTSTANDING

2026-05-20 Science & Technology
2.4k
96
28
Discover AI
88.6k subscribers

Description

The newly available Gemini 3.5 FLASH AI model, tested on my own causal REASONING TEST. Because I ran this reasoning model on "HIGH", the end result is excellent. But a detailed analysis of the Gemini 3.5 FLASH reasoning trace shows that any exit before 5 minutes of pure runtime (on Google TPU) would have delivered a qualitatively "bad" result (a 10-step solution). Multiple other tests with "MEDIUM" reasoning (not recorded) confirm this impression. Note: This test is not statistically significant; it is a single test with a very limited number of repetitions. The result therefore gives only a first impression of the new GEMINI 3.5 FLASH. #airesearch #aiexplained #geminiai #googledeepmind

Top Comments (10)

@anoni-t4s 2026-05-20

Thanks! This content is very useful.

0 1 replies
@QuantumPrintum 2026-06-02

curious if the trace shows an actual causal model or just iterative patching until something sticks

0
@Julzaa 2026-05-20

Impressive! Why keep the web search enabled though? It doesn't really matter for your custom prompt and it didn't use the web according to the trace, but still.

0
@connorfowler5946 2026-05-21

Regarding the discussion about grounding: would it not be clear within the reasoning trace if it were referencing Google search or particular tool calls that would have access to YouTube transcripts? I think it’s very exciting that this model happened upon the answer so quickly.

0
@ikoukas 2026-05-21

It would be interesting to know in 3 independent attempts how it would do.

0
@FroggyTWrite 2026-05-21

You can also use it in flex mode and that cuts the price in 1/2

0
@dmitryozernov8696 2026-05-20

Grounding with Google Search should be disabled. It could be using YouTube previous videos.

9 6 replies
@yann3601 2026-05-21

I do love the way when RL became as clear as the semantic nature of the answers. Whatever the LLM results to your prompt. Have you ever tried any gradual approach? Easy ways, I think you will manage 😊 NB: about CoT... you know it has nothing to do with the real "thinking". You reveal the trick, now explain the magic?

0
@gettingstuffdoneright5332 2026-05-20

Thank you for jumping on this model so quickly. I have to agree with some of the other commenters -- grounding should have been turned off, especially given the jump to 8 steps -- perhaps it's no coincidence that the flash model got exactly the best answer, one that happened to be reached only by its sibling model. It would be very interesting to see this test run again with grounding turned off. Indeed, I asked the flash model and it said it should be turned off, "otherwise it allows the model to look up the exact test question, find existing answer keys or leaked solutions online and simply copy the results."

6 3 replies
@gileneusz 2026-05-20

it's basically "not giving up" strategy, interesting...

0
