
What Is Yann LeCun Cooking? JEPA Explained Simply

2026-04-20 Science & Technology
50.7k
2.7k
278
bycloud
225.0k subscribers


Description

Warp is the agentic development environment born out of the terminal. Download Warp for free today at → https://go.warp.dev/bycloudythoa

For the longest time, Yann LeCun has been pioneering this idea called JEPA. With its rapid recent advances, it has moved into the spotlight of the research field, especially for world modeling. So in today's video, I'll be covering the main idea of JEPA, how it works, and what makes it promising.

My latest project: Intuitive AI Academy
We just wrote a new piece on RL & RLHF! https://intuitiveai.academy/
Limited-time code "EARLY" for 40% off the yearly plan

My Newsletter https://mail.bycloud.ai/
My Patreon https://www.patreon.com/c/bycloud

Sauces
[Original JEPA paper] https://openreview.net/pdf?id=BZ5a1r-kVsf
[V-JEPA] https://arxiv.org/abs/2506.09985
[I-JEPA] https://arxiv.org/abs/2301.08243
[EMA for Self-Supervised ViT] https://arxiv.org/abs/2104.14294
[Infomax] https://pubmed.ncbi.nlm.nih.gov/7584893/
[SimCLR] https://arxiv.org/abs/2002.05709
[Barlow Twins] https://arxiv.org/abs/2103.03230
[VICReg] https://arxiv.org/abs/2105.04906
[LeJEPA] https://arxiv.org/abs/2511.08544
[EchoJEPA] https://arxiv.org/abs/2602.02603
[DINO v2] https://arxiv.org/abs/2304.07193

Try out my new fav place to learn how to code https://scrimba.com/?via=bycloudAI

This video is supported by the kind Patrons & YouTube Members:
🙏Spam Maj, Alex, Chris LeDoux, DX Research Group, Poof N' Inu, Deagan, Robert Zawiasa, Ryszard Warzocha, Tobe2d, Louis Muk, Akkusativ, Kevin Tai, Mark Buckler, NO U, Tony Jimenez, Ângelo Fonseca, jiye, Anushka, Asad Dhamani, Binnie Yiu, Calvin Yan, Clayton Ford, Diego Silva, Etrotta, Gonzalo Fidalgo, Handenon, Hector, Jake Disco very, Michael Brenner, Nilly K, OlegWock, Daddy Wen, Shuhong Chen, Sid_Cipher, Stefan Lorenz, Sup, tantan assawade, Thipok Tham, Thomas Di Martino, Thomas Lin, Richárd Nagyfi, Paperboy, mika, Leo, Berhane-Meskel, Kadhai Pesalam, mayssam, Bill Mangrum, nyaa, Toru Mon, Lame Plane, Matej Macak, Len Mo, saylikhapekar, ZyanSheep, THEVIERAOS, Ricardo Raphael Corona-Moreno, C

[Discord] https://discord.gg/NhJZGtH
[Twitter] https://twitter.com/bycloudai
[Patreon] https://www.patreon.com/bycloud
[Business Inquiries] [email protected]
[Profile & Banner Art] https://twitter.com/pygm7
[Video Editor] @aduckchicken2
[Ko-fi] https://ko-fi.com/bycloudai
Manim Animations created with Manimate https://www.manimate.ai/

Top Comments (10)

@kostas_ts 2026-04-20

This direction of research is much more important than making a chatbot smarter by throwing data and compute at it just to get funding. This direction can produce models that can actually be applied to a wide range of problems across all kinds of domains and work natively on the required modality, instead of shoehorning everything into a textual format. Very interesting stuff overall, thanks Bycloud for the excellent video. You managed to describe these complicated methodologies in an approachable way.

332 23 replies
@IseOnCrack 2026-04-20

As a robotics researcher interested in physical AI, I find JEPA models really promising. Pixel reconstruction is doomed.

110 5 replies
@NdxtremePro 2026-04-20

The potential for JEPA on text is perhaps greater than you think. If your goal is to predict text, then you are right, but what is text exactly? Text is a representation of language, and language is a tool set to express ideas from a latent space, mainly our consciousness. If we view our consciousness as a latent thinking space, and language as the tools we use to point to specific locations in it (addresses, if you will), then we could build one component that maps language to the latent space and another that maps from latent space back to text space. That would give JEPA a tool to manipulate and find gaps in its latent space. This seems like a closer representation of our own intelligence, and may produce better thinking and reasoning models.

93 12 replies
@stblackhole 2026-04-20

I-JEPA can be useful as part of an image search engine. You click on an object or feature within an image and it'll find similar images with that thing.

58 5 replies
@FascinateFelix 2026-04-20

Hybrid models are the future. A lot of these systems have benefits that can help each other, and there are ways to interface them together. Also, a lot of this is older; you should look at JEPA 2 & 2.1 and Hierarchical Planning with Latent World Models (mostly in the context of spatial reasoning for robotics). Another system that has been integrated with an LLM is Kona from Logical Intelligence, an Energy-Based Model (EBM). "Certainty, Not Probability" is their tagline, and it is particularly useful for maths-related tasks, amongst other things. Yann has a role there, but mostly as a founding member and technical advisor, IIRC.

29 1 reply
@bycloudAI 2026-04-20

Warp is the agentic development environment born out of the terminal. Download Warp for free today at → https://go.warp.dev/bycloudythoa

13 6 replies
@fantomass47 2026-04-21

"JEPA" means "ASS" in Russian. Watching the video and reading the comments is so funny ahaha xD

9 2 replies
@MaximumBGGT 2026-04-21

The image domain has lots of nice properties that make it suitable for this sort of thing. I've found it pretty hard to apply JEPA successfully in other domains where there aren't many natural "identity preserving" transforms.

5
@aron2922 2026-04-21

Watching this 1 minute after waking up is frying my JEPA

3
@DeterminateNegation 2026-04-21

I love when a video maps perfectly onto my existing knowledge. People are complaining about this being complex, but everything about this feels so natural to me that it's kind of amazing this isn't how all AI works. It's way closer to our current models in computational neuroscience. Ever since the DeepSeek latent embedding 'OCR', I've wondered why we didn't collectively come up with this framework sooner. It's a much more versatile and natural way of doing things compared to the one-or-few-tokens-at-a-time approach, even if it's slightly more complex.

3
