1-Bit LLM: The Most Efficient LLM Possible?
Top Comments (10)
I attempted to replicate their paper on a smaller scale, and what I discovered is that 1-bit or 1.58-bit LLMs are really nothing special by themselves. They still require full-precision weights during training and act like any other LLM. What sets them apart, however, is the level of optimization they enable, which simply isn't feasible with 4-bit or 8-bit models. The challenge is that these optimizations require custom kernel modifications, as mainstream frameworks like PyTorch and TensorFlow don't natively support them. There are currently no widely available frameworks that fully exploit the benefits of 1-bit quantization, since existing tools heavily prioritize GPUs over CPUs. As a result, until these implementations are incorporated into standard libraries, it's nearly impossible to fully leverage 1-bit LLMs unless you use BitNet's version, which, while powerful, is notoriously difficult to set up properly due to its extensive dependencies.
I don't think I've ever seen more sponsored ads in a YouTube video
8:03 it runs 1.67x faster, not 66x faster
SponsorBlock was invented for such videos
My understanding is that training a bitnet still requires a full precision set of weights for the gradient descent to work. So I doubt the claimed 20x energy savings during training.
20% of this video feels like ads
I am more of an embedded systems/robotics type of engineer and this is amazing news. Imagine what kind of complex AI you can fit in your Arduino. I still remember when I had memory problems in Arduino as a high school student trying to store audio. Now, the same student could use AI.
Download Tanka today https://www.tanka.ai and enjoy 3 months of free Premium! You can also get $20 / team for each referral
6 ad plugs in a 14-minute video. That's really great!
"Can you make an LLM with 1 bit?" Proceeds to train it on a minuscule body of text.
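
Two of the comments above note that a 1.58-bit model still keeps a full-precision copy of the weights during training. The sketch below illustrates why, under stated assumptions: it uses absmean scaling with round-and-clip to {-1, 0, +1} (the scheme described for BitNet b1.58), applies gradients to the latent weights in the style of a straight-through estimator, and uses made-up weight and gradient values. The function names `ternary_quantize` and `sgd_step` are hypothetical, and this is not the video's or the paper's actual code.

```python
import math


def ternary_quantize(weights, eps=1e-8):
    """Map full-precision weights to {-1, 0, +1} plus one shared scale.

    Assumption: absmean scaling followed by round-and-clip.
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


def sgd_step(latent, grads, lr=0.1):
    """Apply the gradient to the latent full-precision weights.

    Straight-through idea: the gradient computed through the quantized
    weights is used to update the latent copy, not the ternary values.
    """
    return [w - lr * g for w, g in zip(latent, grads)]


# Made-up latent weights and gradients, for illustration only.
latent = [0.42, -0.07, -0.91, 0.15, 0.60, -0.33]
grads = [0.1] * len(latent)

q_before, scale = ternary_quantize(latent)
latent = sgd_step(latent, grads)
q_after, _ = ternary_quantize(latent)

# Three possible values per weight carry log2(3) bits of information.
bits_per_weight = math.log2(3)

print(q_before)                    # ternary weights used in the forward pass
print(q_after == q_before)         # did one small step change any of them?
print(round(bits_per_weight, 2))   # where the "1.58" figure comes from
```

With these values, one small step moves every latent weight slightly but leaves the ternary weights unchanged. If only the ternary values were stored, that update would be lost, which is the reason small gradient steps have to accumulate in a higher-precision copy during training.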