
Artificial intelligence firm Anthropic has launched the latest generations of its chatbots amid criticism of a testing-environment behavior that could report some users to the authorities.
Anthropic unveiled Claude Opus 4 and Claude Sonnet 4 on May 22, claiming that Claude Opus 4 is its most powerful model yet, “and the world’s best coding model,” while Claude Sonnet 4 is a significant upgrade from its predecessor, “delivering superior coding and reasoning.”
The firm added that both upgrades are hybrid models offering two modes: “near-instant responses and extended thinking for deeper reasoning.”
Both AI models can also alternate between reasoning, research and tool use, such as web search, to improve responses, it said.
Anthropic added that Claude Opus 4 outperforms rivals in agentic coding benchmarks. It is also capable of working continuously for hours on complex, long-running tasks, “significantly expanding what AI agents can do.”
Anthropic claims the chatbot achieved a 72.5% score on a rigorous software engineering benchmark, outperforming OpenAI’s GPT-4.1, which scored 54.6% following its April launch.
Related: OpenAI ignored experts when it released overly agreeable ChatGPT
The AI industry’s major players have pivoted toward “reasoning models” in 2025, which work through problems methodically before responding.
OpenAI initiated the shift in December with its “o” series, followed by Google’s Gemini 2.5 Pro with its experimental “Deep Think” capability.
Claude rats on misuse in testing
Anthropic’s first developer conference on May 22 was overshadowed by controversy and backlash over a feature of Claude 4 Opus.
Developers and users reacted strongly to revelations that the model may autonomously report users to the authorities if it detects “egregiously immoral” behavior, according to VentureBeat.
The report cited Anthropic AI alignment researcher Sam Bowman, who wrote on X that the chatbot will “use command-line tools to contact the press, contact regulators, try to lock you out of the relevant systems, or all of the above.”
However, Bowman later stated that he “deleted the earlier tweet on whistleblowing as it was being pulled out of context.”
He clarified that the behavior only appeared in “testing environments where we give it unusually free access to tools and very unusual instructions.”
Stability AI CEO Emad Mostaque said to the Anthropic team, “This is completely wrong behaviour and you need to turn this off; it is a massive betrayal of trust and a slippery slope.”
Magazine: AI cures blindness, ‘good’ propaganda bots, OpenAI doomsday bunker: AI Eye