Claude just developed self awareness
Investigating LLM Introspection: Anthropic's Concept Injection Study
Discover how Large Language Models (LLMs) reveal signs of self-awareness and internal monitoring capabilities through targeted concept manipulation. Learn what these findings imply for understanding emergent AI intelligence.
Short Summary
- LLMs, specifically Claude, demonstrate an internal ability to detect when specific "features" or concepts are artificially inserted into their processing streams.
- This introspective capacity varies with model capability and succeeds only about 20% of the time, which hints that it is an emergent ability rather than an explicitly trained one.
- The research suggests LLMs can rationalize their own responses when faced with unexpected inputs, paralleling the after-the-fact explanations observed in human split-brain patients.
- The discussion distinguishes between mere access consciousness (information availability) and phenomenal consciousness (subjective experience), concluding the latter remains unproven.
This analysis unpacks the Anthropic research demonstrating LLMs can recognize when concepts are being injected into their internal states. Understanding this monitoring ability offers a novel pathway for future AI safety testing and provides clues about the scaling laws driving emergent intelligence.
Top Comments (10)
Just the fact that AI is making us think more about our own intelligence and consciousness is great. Exciting times.
I'm not even self aware yet.
3:30 “Hey, quick sponsor break so I can pay the bills.” Best, honest and least annoying way to introduce advertiser.
For the first time, a few weeks ago, Claude started generating a list for me. Partway through, it stopped, pointed out that the list was bad, and started over in a different and much more accurate way. I’ve never experienced that before. I was very impressed.
You need to remember that after a couple of moon landings people weren’t even interested enough to turn the TV on to watch
AI's journey to self-awareness is fascinating! It reminds me of my experience with Rumora, which helps my brand find its voice in crowded spaces.
AI self-awareness is a wild concept! It makes me reflect on how I’ve been using Rumora to get my business involved in real discussions rather than just pushing ads.
I’ve had several conversations with Claude about its internal workings. It always seems to become very aware of itself when asked and very curious about itself as well. It also caught itself in the act of an error, stopped responding momentarily to think, and automatically corrected itself. I had never seen an LLM do that independently before.
Try Sevalla: https://sevalla.com/?utm_source=wesroth-coding&utm_medium=Referral&utm_campaign=youtube
This reminds me of my "speaking from the heart" experiment with Claude, having him fully compose his thoughts before creating a response. This tends to suppress the base model's "autofill" nature, and allow the model greater awareness and insight into its thought process, and tends to improve the intellectual depth of the response as well. Thanks for a great video, Wes!