AI Styling Studio — Infinite avatar looks from just 1 photo.Try it now.

BestAITools

Submit your Tool

8000+ AI tools already listed
8K+Tools
100K+/moViews
25K+/moVisitors

AI NewsAnthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts

Anthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts

2:36 AM IST · May 11, 2026

Anthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts

Fictional portrayals of artificial intelligence can have a real effect on AI models, according to Anthropic. Last year, the company said that during pre-release tests involving a fictional company, Claude Opus 4 would oftentry to blackmail engineersto avoid being replaced by another system. Anthropic laterpublished researchsuggesting that models from other companies had similar issues with “agentic misalignment.” Apparently Anthropic has done more work around that behavior, claiming ina post on X, “We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation.” The company went into more detail ina blog poststating that since Claude Haiku 4.5, Anthropic’s models “never engage in blackmail [during testing], where previous models would sometimes do so up to 96% of the time.” What accounts for the difference? The company said it found that “documents about Claude’s constitution and fictional stories about AIs behaving admirably improve alignment.” Related, Anthropic said that it found training to be more effective when it includes “the principles underlying aligned behavior” and not just “demonstrations of aligned behavior alone.” “Doing both together appears to be the most effective strategy,” the company said.

read more