The Only Embedding Model You Need for RAG

2025-07-02 Science & Technology

35.2k

1.0k

245.0k subscribers

Description

I walk you through a single multimodal embedding model that handles text, images, tables, and even code inside one vector space. In this short demo I show the install steps, run RAG retrieval benchmarks, and compare cost vs. traditional multi-model setups. If you’re building search or RAG pipelines, see how one all-in-one embedding can simplify your stack and boost accuracy.

LINKS:
Notebook: https://colab.research.google.com/drive/1TFK4KLqEnddmgyzgjO7oNNw7nZWsdR09#scrollTo=ktzbaWGoEO4f
https://jina.ai/news/jina-embeddings-v4-universal-embeddings-for-multimodal-multilingual-retrieval
https://jina.ai/news/late-chunking-in-long-context-embedding-models/
https://huggingface.co/blog/matryoshka
https://cohere.com/blog/embed-4
https://github.com/PromtEngineer/localGPT-Vision
https://huggingface.co/blog/manu/colpali
https://weaviate.io/developers/weaviate/tutorials/multi-vector-embeddings
https://x.com/NVIDIAAIDev/status/1939777996522389683
https://mteb-leaderboard.hf.space/?benchmark_name=VisualDocumentRetrieval
https://huggingface.co/nvidia/llama-nemoretriever-colembed-3b-v1/tree/main
https://build.nvidia.com/explore/retrieval

Relevant Videos:
https://youtu.be/Ilf26xjT5is
https://youtu.be/hhMXE9-JUAc
https://youtu.be/V1VOdoEFaDw
https://youtu.be/bQL-yok_0qw

Website: https://engineerprompt.ai/
RAG Beyond Basics Course: https://prompt-s-site.thinkific.com/courses/rag

Let's Connect:
🦾 Discord: https://discord.com/invite/t4eYQRUcXB
☕ Buy me a Coffee: https://ko-fi.com/promptengineering
🔴 Patreon: https://www.patreon.com/PromptEngineering
💼 Consulting: https://calendly.com/engineerprompt/consulting-call
📧 Business Contact: [email protected]
Become Member: http://tinyurl.com/y5h28s6h
💻 Pre-configured localGPT VM: https://bit.ly/localGPT (use Code: PromptEngineering for 50% off)
Sign up for the newsletter, localgpt: https://tally.so/r/3y9bb0

#prompt engineering #Prompt Engineer #LLMs #AI #artificial Intelligence #Llama #GPT-4 #fine-tuning LLMs

Top Comments (10)

@TirushV 2025-07-05

Also, multi-vector embeddings will blow up storage and increase the compute and memory needed for searching/querying. For example, suppose a document has 500 pages and each page has 1038 vectors. For ONE query with n query vectors against ONE page, we're doing n * 1038 vector comparisons, and for the 500-page corpus, n * 1038 * 500 vector comparisons. This creates a massive Cartesian product. With n = 10 for the query, that's 10,380 comparisons per page and 5,190,000 comparisons for 500 pages. Do you think it would be helpful to use the Cohere approach here?
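
The arithmetic in this comment can be checked with a short sketch. It assumes brute-force late-interaction scoring, where every query vector is compared with every page vector; the function name and parameters are hypothetical and only illustrate the count, not any particular library's API.

```python
def late_interaction_comparisons(query_vectors: int,
                                 vectors_per_page: int,
                                 pages: int = 1) -> int:
    """Count vector-to-vector similarity computations when every query
    vector is compared with every page vector (no index, no pruning)."""
    return query_vectors * vectors_per_page * pages


# Numbers taken from the comment: n = 10 query vectors,
# 1038 vectors per page, 500 pages.
per_page = late_interaction_comparisons(10, 1038)
corpus = late_interaction_comparisons(10, 1038, 500)

# A single-vector setup needs one comparison per page for the same query.
single_vector = late_interaction_comparisons(1, 1, 500)

print(f"{per_page:,} comparisons per page")
print(f"{corpus:,} comparisons for 500 pages")
print(f"{single_vector:,} comparisons with one vector per page")
```

The count grows linearly in each of the three factors, so halving the vectors per page or pre-filtering pages with a cheaper first-stage search cuts the work proportionally.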

@engineerprompt 2025-07-02

Forgot to make the notebook public. Sorry to everyone for that. It's now accessible: https://colab.research.google.com/drive/1TFK4KLqEnddmgyzgjO7oNNw7nZWsdR09?usp=sharing

@engineerprompt 2025-07-07

RAG Beyond Basics Course: https://prompt-s-site.thinkific.com/courses/rag

@sysia5782 2025-07-04

Perfect 🎉

@durand101 2025-07-02

Great explanation, thank you!

@MohanadSaid-u8x 2025-07-02

Yo? Another jina banger?

@jimmyjustintime3030 2025-07-03

I tried Cohere v4 for text-to-image search and it was amazing, but this one I can run on HF, so I'll check out the Colab. Thanks!!

@jvsnyc 2025-11-09

Great video. Several times when you said "but that's not it" you could have said "but that's not all!" because each one of the points seemed important on its own.

@saeeds851 2025-09-02

Please do a more detailed one on the Jina embedding model.

@nikhili9559 2025-07-06

Good stuff, I will write an article about it, this is awesome ...
