
The Only Embedding Model You Need for RAG

2025-07-02 Science & Technology
35.2k
1.0k
65
Prompt Engineering
241.0k subscribers

Description

I walk you through a single, multimodal embedding model that handles text, images, tables, and even code inside one vector space. In this short demo I show the install steps, run RAG retrieval benchmarks, and compare cost vs. traditional multi-model setups. If you’re building search or RAG pipelines, see how one all-in-one embedding can simplify your stack and boost accuracy.

LINKS:
Notebook: https://colab.research.google.com/drive/1TFK4KLqEnddmgyzgjO7oNNw7nZWsdR09#scrollTo=ktzbaWGoEO4f
https://jina.ai/news/jina-embeddings-v4-universal-embeddings-for-multimodal-multilingual-retrieval
https://jina.ai/news/late-chunking-in-long-context-embedding-models/
https://huggingface.co/blog/matryoshka
https://cohere.com/blog/embed-4
https://github.com/PromtEngineer/localGPT-Vision
https://huggingface.co/blog/manu/colpali
https://weaviate.io/developers/weaviate/tutorials/multi-vector-embeddings
https://x.com/NVIDIAAIDev/status/1939777996522389683
https://mteb-leaderboard.hf.space/?benchmark_name=VisualDocumentRetrieval
https://huggingface.co/nvidia/llama-nemoretriever-colembed-3b-v1/tree/main
https://build.nvidia.com/explore/retrieval

Relevant Videos:
https://youtu.be/Ilf26xjT5is
https://youtu.be/hhMXE9-JUAc
https://youtu.be/V1VOdoEFaDw
https://youtu.be/bQL-yok_0qw

Website: https://engineerprompt.ai/
RAG Beyond Basics Course: https://prompt-s-site.thinkific.com/courses/rag

Let's Connect:
🦾 Discord: https://discord.com/invite/t4eYQRUcXB
☕ Buy me a Coffee: https://ko-fi.com/promptengineering
🔴 Patreon: https://www.patreon.com/PromptEngineering
💼 Consulting: https://calendly.com/engineerprompt/consulting-call
📧 Business Contact: [email protected]
Become Member: http://tinyurl.com/y5h28s6h
💻 Pre-configured localGPT VM: https://bit.ly/localGPT (use Code: PromptEngineering for 50% off).
Signup for Newsletter, localgpt: https://tally.so/r/3y9bb0

Top Comments (10)

@SYGSprayGOD 2025-07-05

Also, multi-vector embeddings will explode storage and increase the compute and memory needed for searching/querying. E.g., suppose a document has 500 pages and each page has 1038 vectors. For ONE query with n vectors against ONE page, we're doing n * 1038 vector comparisons, and for a 500-page corpus, n * 1038 * 500 vector comparisons. This creates a massive Cartesian product. For example, if n = 10 (for the query): 10,380 comparisons per page and 5,190,000 comparisons for 500 pages. Do you think it would be helpful to use the Cohere approach here?
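
The arithmetic in this comment can be checked with a minimal sketch. It assumes brute-force late-interaction scoring, where every query vector is compared against every page vector, and it uses one dense vector per page as the single-vector baseline. The function names are hypothetical and do not come from any library; the numbers (10 query vectors, 1038 vectors per page, 500 pages) are the ones quoted in the comment.

```python
def multi_vector_comparisons(query_vectors: int, vectors_per_page: int, pages: int = 1) -> int:
    """Vector-to-vector comparisons for brute-force multi-vector scoring.

    Assumption: every query vector is compared with every page vector,
    with no pruning or approximate index.
    """
    return query_vectors * vectors_per_page * pages


def single_vector_comparisons(pages: int) -> int:
    """Comparisons when each page and the query are one dense vector each (assumed baseline)."""
    return pages


per_page = multi_vector_comparisons(query_vectors=10, vectors_per_page=1038)
corpus = multi_vector_comparisons(query_vectors=10, vectors_per_page=1038, pages=500)
baseline = single_vector_comparisons(pages=500)

print(f"per page:   {per_page:,}")            # per page:   10,380
print(f"500 pages:  {corpus:,}")              # 500 pages:  5,190,000
print(f"vs. single: {corpus // baseline:,}x") # vs. single: 10,380x
```

Under these assumptions the multi-vector search does 10,380 times as many comparisons as the single-vector baseline for the same 500 pages. Real systems usually reduce this with approximate indexes or a cheap first-stage retrieval, which the sketch does not model.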

18
@engineerprompt 2025-07-02

Forgot to make the notebook public. Sorry to everyone for that. It's now accessible: https://colab.research.google.com/drive/1TFK4KLqEnddmgyzgjO7oNNw7nZWsdR09?usp=sharing

8
@maxmilmcu 2025-07-03

And what about the licence? It's not MIT or Apache 2.0, so we would have to pay for it.

7
@maxmilmcu 2025-07-03

Hello all. Guys, what can you say about Dolphin (ByteDance) Document Image Parsing for an AI assistant with RAG over PDFs that contain not only text but also images and tables?

4
@sysia5782 2025-07-04

Perfect 🎉

1
@engineerprompt 2025-07-07

RAG Beyond Basics Course: https://prompt-s-site.thinkific.com/courses/rag

1
@durand101 2025-07-02

Great explanation, thank you!

0 1 reply
@nikhili9559 2025-07-06

Good stuff, I will write an article about it, this is awesome ...

0
@jvsnyc 2025-11-09

Great video. Several times when you said "but that's not it" you could have said "but that's not all!" because each one of the points seemed important on its own.

0
@saeeds851 2025-09-02

Please do a more detailed one on the Jina embedding model.

0
