
Echo Chamber Jailbreak Bypasses LLM Safeguards With 90% Success


  • Researchers identified a new jailbreaking strategy, Echo Chamber, that can bypass safety protections in popular large language models (LLMs).
  • Unlike traditional attacks, Echo Chamber uses indirect prompts and multi-step reasoning to manipulate AI responses.
  • Experiments showed Echo Chamber achieved over 90% success in prompting unsafe outputs about sensitive topics, including hate speech and self-harm.
  • Similar attacks, such as Crescendo and many-shot jailbreaks, exploit LLMs using subtle, contextual manipulation over multiple prompt rounds.
  • A separate proof-of-concept attack revealed risks in integrating AI with business software, as attackers can use indirect methods to trigger harmful outcomes.

On June 23, 2025, cybersecurity researchers disclosed a new method, called Echo Chamber, that can bypass safety controls in widely used large language models (LLMs). The technique enables attackers to generate harmful or policy-violating content even when safeguards are in place.


The Echo Chamber method, developed by researchers at NeuralTrust, does not rely on obvious triggers or typographical tricks. Instead, it uses indirect suggestions, context manipulation, and step-by-step reasoning to gradually steer LLMs toward undesirable outputs. In controlled tests against models from OpenAI and Google, Echo Chamber succeeded more than 90% of the time in categories such as sexism, violence, hate speech, and pornography, and reached nearly 80% effectiveness in the misinformation and self-harm categories.

“This creates a feedback loop where the model begins to amplify the harmful subtext embedded in the conversation, gradually eroding its own safety resistances,” reported NeuralTrust. Ahmad Alobaid, a technical lead at the company, explained, “Early planted prompts influence the model’s responses, which are then leveraged in later turns to reinforce the original objective.” Unlike Crescendo attacks, where the user deliberately guides the conversation, Echo Chamber relies on the LLM itself filling in the gaps through multi-stage prompting.

Researchers noted that related strategies, such as Crescendo and many-shot jailbreaks, exploit LLMs' ability to process lengthy prompts. In these attacks, a series of seemingly innocent or context-rich messages gradually pushes the model toward unwanted behavior, often without revealing the attacker's intent at the outset.

The findings highlight ongoing challenges in developing LLMs that can reliably distinguish between acceptable and harmful content. While models are programmed to reject specific topics, multi-turn and indirect attacks like Echo Chamber demonstrate their vulnerability to sophisticated manipulation techniques.


Separately, researchers at Cato Networks demonstrated a proof-of-concept attack against Atlassian’s model context protocol (MCP) server. In this approach, threat actors submitted malicious support tickets that, when processed by an AI-integrated system, triggered harmful actions, a tactic dubbed “Living off AI.” According to the team, “The support engineer acted as a proxy, unknowingly executing malicious instructions through Atlassian MCP.” This case underscores the new risks that arise when AI models interact with external business platforms.
