Gemini RAG: Multimodal RAG API

2026-05-11 Science & Technology

3.5k

151

241.0k subscribers

Description

Google's File Search API is now multimodal.

LINKS
Blogpost: https://blog.google/innovation-and-ai/technology/developers-tools/expanded-gemini-api-file-search-multimodal-rag/
Colab: https://colab.research.google.com/drive/1ZlV8h3WioIcRI1YkujiWh-gssYaCWZjo?usp=sharing
My voice to text App: whryte.com
Website: https://engineerprompt.ai/
RAG Beyond Basics Course: https://prompt-s-site.thinkific.com/courses/rag
Signup for Newsletter, localgpt: https://tally.so/r/3y9bb0

Let's Connect:
🦾 Discord: https://discord.com/invite/t4eYQRUcXB
☕ Buy me a Coffee: https://ko-fi.com/promptengineering
🔴 Patreon: https://www.patreon.com/PromptEngineering
💼 Consulting: https://calendly.com/engineerprompt/consulting-call
📧 Business Contact: [email protected]
Become Member: http://tinyurl.com/y5h28s6h
💻 Pre-configured localGPT VM: https://bit.ly/localGPT (use Code: PromptEngineering for 50% off).

#prompt engineering #Prompt Engineer #LLMs #AI #artificial Intelligence #Llama #GPT-4 #fine-tuning LLMs

Top Comments (5)

@gunasekhar8440 2026-05-11

Of the available chunking methods, both Gemini and OpenAI use sliding-window chunking, where each chunk holds 800 tokens with a 400-token overlap, regardless of document type. Most importantly, they are using only 10 percent of the model's input context window, which is 3072.
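
The sliding-window scheme this comment describes can be sketched as below. This is a minimal illustration only: the 800/400 figures are the commenter's, the function name `sliding_window_chunks` is hypothetical, and whitespace-separated words stand in for real tokenizer output. It is not the actual implementation used by either service.

```python
from typing import List, Sequence


def sliding_window_chunks(
    tokens: Sequence[str], size: int = 800, overlap: int = 400
) -> List[List[str]]:
    """Split tokens into fixed-size windows that share `overlap` tokens.

    Hypothetical helper: the defaults mirror the numbers quoted in the
    comment, not a confirmed vendor configuration.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in the range [0, size)")

    step = size - overlap  # how far the window advances each time
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        # Stop once a window has reached the end of the input,
        # so no trailing chunk is fully contained in the previous one.
        if start + size >= len(tokens):
            break
    return chunks


if __name__ == "__main__":
    # Whitespace split is a stand-in for a real tokenizer (assumption).
    words = ("lorem ipsum " * 1000).split()
    pieces = sliding_window_chunks(words)
    print(len(pieces), [len(p) for p in pieces])
```

With a 400-token overlap on an 800-token window, every token except those at the very start and end of a document lands in two chunks, which roughly doubles the number of stored embeddings compared with non-overlapping chunks.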

1 3 replies

@MichaelUECreativeAIWorlds 2026-05-11

Nice

1 2 replies

@Ken129100 2026-05-14

Thank you! This is exactly what I was looking for. BTW, can this be done on Gemma 4? For businesses, it is preferable to keep their data confidential.

1 1 replies

@TomanswerAi 2026-05-11

Nice overview

0 1 replies

@richardkuhne5054 2026-05-12

Would be nice to have something like this as open source to run it cheaper. If you try to ingest a couple of TB of documents, it can become very expensive quickly.

0 1 replies
