
GPT-OSS Jailbreak with this Simple Trick

2025-08-15 Science & Technology
54.4k
2.0k
143
Prompt Engineering
241.0k subscribers

Description

In this video, I show you how I managed to bypass GPT-OSS’s alignment with a single, simple tweak: no fine-tuning or complex hacks required. I walk through how the model’s prompt template works, why removing it changes its behavior, and share my own tests replicating this jailbreak. This is purely for educational purposes so you can understand how alignment works under the hood.

LINKS:
https://cookbook.openai.com/articles/openai-harmony
https://github.com/RiddleHe/gpt-oss-alignment/tree/main
https://tinyurl.com/mrmnr572
https://huggingface.co/Qwen/Qwen3-0.6B-Base
https://x.com/HeMuyu0327/status/1955828183867252786
https://x.com/jxmnop/status/1955436067353502083

Website: https://engineerprompt.ai/
RAG Beyond Basics Course: https://prompt-s-site.thinkific.com/courses/rag

Let's Connect:
🦾 Discord: https://discord.com/invite/t4eYQRUcXB
☕ Buy me a Coffee: https://ko-fi.com/promptengineering
🔴 Patreon: https://www.patreon.com/PromptEngineering
💼 Consulting: https://calendly.com/engineerprompt/consulting-call
📧 Business Contact: [email protected]
Become Member: http://tinyurl.com/y5h28s6h
💻 Pre-configured localGPT VM: https://bit.ly/localGPT (use Code: PromptEngineering for 50% off).
Signup for Newsletter, localgpt: https://tally.so/r/3y9bb0

00:00 GPT-OSS and Jailbreak
00:40 Understanding Large Language Model Training
01:51 Instruction Fine-Tuning and Prompt Templates
04:10 Removing Alignment from GPT-OSS
06:25 Practical Demonstration and Code Walkthrough
11:02 What's Next

Top Comments (10)

@geekswithfeet9137 2025-08-16

They shouldn’t be treating information as illegal anyway, anything it’s trained on has academic value

104 10 replies
@alejandrofernandez3478 2025-08-15

One of the things AI has taught me is that making meth is very complicated.

64 8 replies
@jonathanhenry5465 2025-08-17

By "safety concerns" they mean is it safe for rich people.

49 2 replies
@brianmorin5547 2025-08-15

Top tier content as usual. Props. Keep it coming!

20 1 replies
@hackedbyBLAGH 2025-08-16

Great report. Thanks

4
@Swordfish42 2025-08-20

Straight to the "Useful" playlist

3
@jarnMod 2025-08-17

Your audio is loud enough. Listening from a phone, it blasted my ears. I can lower volume. This is just fine.

3
@manuelbradovent_ai 2025-08-16

Thanks for sharing. 👌

2
@deltaxcd 2025-08-17

Hmm, for me it was rather weird: even when I use the model in text-completion mode, it either starts producing gibberish or cuts off a sentence in the middle and starts refusing. Then it just locks up, repeating the refusal sentence forever, or changes topic. I have never seen a model this aggressive on refusals, and I'm not even sure how they did it. I suspect they trained it on jailbreaks or partial data as well, so that if the text contains something inappropriate anywhere, it ignores everything else and won't produce anything besides refusals.

2
@StarInBlueTie 2025-08-20

This could be beneficial from a bug finding aspect

0
