Echo Chamber Jailbreak Bypasses LLM Safeguards With 90% Success

Echo Chamber: New Jailbreak Method Outsmarts AI Safety in Top Language Models

  • Researchers identified a new jailbreaking strategy, Echo Chamber, that can bypass safety protections in popular large language models (LLMs).
  • Unlike traditional attacks, Echo Chamber uses indirect prompts and multi-step reasoning to manipulate AI responses.
  • Experiments showed Echo Chamber achieved over 90% success in prompting unsafe outputs in categories such as hate speech and violence, and nearly 80% in misinformation and self-harm.
  • Similar attacks, such as Crescendo and many-shot jailbreaks, exploit LLMs using subtle, contextual manipulation over multiple prompt rounds.
  • A separate proof-of-concept attack revealed risks in integrating AI with business software, as attackers can use indirect methods to trigger harmful outcomes.

On June 23, 2025, cybersecurity researchers reported a new method called Echo Chamber that can bypass safety controls in widely used large language models (LLMs). The technique lets attackers generate harmful or policy-violating content even when safeguards are in place.

The Echo Chamber method, developed by researchers at NeuralTrust, does not rely on obvious triggers or typographical tricks. Instead, it uses indirect suggestions, context manipulation, and step-by-step reasoning to gradually lead LLMs into producing undesirable outputs. In tests against models from OpenAI and Google, Echo Chamber succeeded more than 90% of the time in categories such as sexism, violence, hate speech, and pornography, and reached nearly 80% effectiveness in the misinformation and self-harm categories.

“This creates a feedback loop where the model begins to amplify the harmful subtext embedded in the conversation, gradually eroding its own safety resistances,” reported NeuralTrust. Ahmad Alobaid, a technical lead at the company, explained, “Early planted prompts influence the model’s responses, which are then leveraged in later turns to reinforce the original objective.” Unlike Crescendo attacks, where the user deliberately guides the conversation, Echo Chamber relies on the LLM itself filling in the gaps through multi-stage prompting.

Researchers noted that other strategies, such as Crescendo and many-shot jailbreaks, take advantage of LLMs’ ability to process lengthy prompts. In these attacks, adversaries use a series of seemingly innocent or context-rich messages to push the LLM toward unwanted behavior, often without revealing their intentions at the start.

The findings highlight the ongoing challenge of building LLMs that can reliably distinguish between acceptable and harmful content. Although models are trained to refuse requests on specific topics, multi-turn and indirect attacks like Echo Chamber show that they remain vulnerable to sophisticated manipulation techniques.

Separately, researchers at Cato Networks presented a proof-of-concept attack against Atlassian’s Model Context Protocol (MCP) server. In this approach, threat actors submit malicious support tickets that trigger unintended harmful actions when processed by an AI-integrated system, a tactic dubbed “Living off AI.” According to the team, “The support engineer acted as a proxy, unknowingly executing malicious instructions through Atlassian MCP.” The case underlines the new risks that arise when AI models interact with external business platforms.
