1 Million Tiny Experts in an AI? Fine-Grained MoE Explained
Top Comments (10)
Imagine assembling 1 million PhD students together to discuss someone's request like "write a poem about cooking eggs with C++". That's MoE irl
to some extent this seems closer to how brains work
i see what you did there with "catastrophic forgetting" lmao 🤣
It's crazy how Meta's 8B parameter Llama 3 model has nearly the same performance as the original GPT-4 with a reported 1.8T parameters. That's a 225x reduction in parameter count in just 2 years.
The only thing in my mind is "MoE moe kyuuuuun!!!"
Now I really am excited for an 800B model with fine-grained MoE to surface that I can run on basically any device.
To try everything Brilliant has to offer—free—for a full 30 days, visit https://brilliant.org/bycloud/ . You’ll also get 20% off an annual premium subscription! Like this comment if you wanna see more MoE related content, I have quite a good list for a video;)
This video format is GOLD 🏆 such specific and nerdy topics produced as memes 😄
3:37 wasn't it just yesterday that they released their model 😭
I'd imagine in a month someone will come up with an MoE responsible for choosing the best MoE to choose the best MoE out of billions of experts