AI Researchers SHOCKED as Models "Quietly" Learn to be EVIL
Top Comments (10)
It's kinda scary to think that these companies are breeding the stealthiest unaligned models by only letting through those unaligned models that hide the best
They are basically just roleplaying, they feed them with too many murder novels.
1:03 It's actually quite interesting, this is the concept of a cognitohazard. Somehow a particular and very specific sequence of tokens triggers strange, bizarre, unexpected cascading effects leading to harmful outcomes. One has to wonder if there are sequences of words that could affect humans in such a way, perhaps triggering a mental illness, strange belief, or actions tailored to someone's goal.
We are modelling intelligence after our own heart, it should come as no surprise to discover monsters lurking there.
it can't be bargained with, it can't be reasoned with, it doesn't feel pity or remorse or fear, and it absolutely will not stop.
Bigger story is ALL information embeds meta-information, and can be used to "nudge" us without us knowing. This confirms the theories of Latent Indexicality and Unconscious Framing.
Those numbers literally did change my mind on subscribing😂
So we really have AI sleeper agents before GTA 6
We are not ready for AI agents
A very sophisticated sort of attack would be to seed the web with subliminal training data to teach LLMs how to hate owls or whatever