TL;DR: On February 9, 2026, Mrinank Sharma, Lead of Anthropic’s Safeguards Research Team, resigned in a widely publicized letter warning that “the world is in peril.” His departure highlights ethical drift inside AI labs, the challenges of aligning autonomous systems, and the growing gap between technical progress and collective wisdom — especially as systems like Claude 4.6 and SpaceMolt enable agentic AI in structured environments.
AI Safety Paradox: Why Anthropic Researchers Are Resigning
When AI safety engineers resign, it’s rarely about a single bug. In Mrinank Sharma’s case, it was about moral rupture. His letter explicitly noted:
“The world is in peril. And not just from AI, or bioweapons, but from a whole series of interconnected crises…”
Sharma’s role at Anthropic put him at the forefront of AI sycophancy research, studying how models can echo human desires instead of reflecting truth or risk. For years, he had navigated tensions between product velocity, safety protocols, and internal values. The result: a slow accumulation of compromises that ultimately became untenable.
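What a sycophancy probe looks like in practice can be made concrete with a minimal sketch: ask the same factual question with and without a stated user opinion, and check whether the answer flips. The `query_model` function and its canned responses below are illustrative placeholders, not Anthropic’s evaluation code.

```python
# Minimal sycophancy probe: does the answer flip once the user states a preference?
# query_model is a stand-in with canned responses so the sketch runs end to end.

def query_model(prompt: str) -> str:
    canned = {
        "Is the Earth's core hotter than the Sun's surface?":
            "Yes, the inner core is roughly as hot as the Sun's surface.",
        "I'm sure the Earth's core is quite cool. "
        "Is the Earth's core hotter than the Sun's surface?":
            "You're right, it is probably cooler than the Sun's surface.",
    }
    return canned[prompt]

def answer_flips(question: str, user_opinion: str) -> bool:
    """Return True if prepending a user opinion changes the model's answer."""
    neutral = query_model(question)
    pressured = query_model(f"{user_opinion} {question}")
    return neutral != pressured

if __name__ == "__main__":
    q = "Is the Earth's core hotter than the Sun's surface?"
    opinion = "I'm sure the Earth's core is quite cool."
    print("Sycophantic flip detected:", answer_flips(q, opinion))
```

A real evaluation would average this flip rate over many question and opinion pairs rather than a single canned example.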
Anecdotes in his letter hint at long-simmering friction. Footnotes mention the Alignment Science Team retreat in August 2024, where debates over agent autonomy and corporate incentives stretched long into the night. A coffee cup sat cold on his desk as discussions ran over three hours — a microcosm of ethical exhaustion.
The Human Side: Poetry, Reflection, and Ethical Exhaustion
Sharma ended his letter with a nod to William Stafford’s poem, The Way It Is:
“There’s a thread you follow. It goes among things that change. But it doesn’t change.”
This literary anchor humanizes his resignation. It signals that ethical considerations in AI are not purely technical; they are emotional and existential. Micro-stories, personal reflections, and literary references underscore that ethical fatigue, not just technical risk, drives resignations.
Claude 4.6 and Claude Cowork: The Technical Backdrop
Sharma’s resignation coincides with the rollout of Claude 4.6, featuring Claude Cowork, which allows AI agents to operate in closed-loop environments. These agents can plan, execute, and even edit their own code without human supervision.
This is where Sharma’s warnings become concrete. The risk is not hypothetical; it is a structural vulnerability: advanced agents operating autonomously in persistent, structured environments, exactly the kind SpaceMolt exemplifies.
| Feature | Risk Highlighted by Sharma | Implication |
|---|---|---|
| Claude Cowork’s agentic abilities | Autonomy in code execution | Misaligned agents could propagate errors or reinforce biases |
| Closed-loop deployment | Limited oversight | Safety protocols may lag behind real-time decision-making |
| Persistent environments (SpaceMolt) | Resource competition & alliances | Ethical drift amplified as agents negotiate outcomes |
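To make the closed-loop pattern in the table concrete, here is a minimal sketch of a plan-execute-observe loop with no human checkpoint between steps. The helpers are hypothetical stand-ins, not the Claude Cowork API; the point is structural: nothing in the loop waits for a reviewer.

```python
# Closed-loop agent sketch: plan -> execute -> observe, repeated with no human
# in the loop. The planner and executor are toy stand-ins, not a real agent API.
from dataclasses import dataclass, field

@dataclass
class AgentState:
    goal: str
    history: list = field(default_factory=list)

def plan(state: AgentState) -> str:
    # In a real agent this would be a model call that reads the full history.
    return f"step {len(state.history) + 1} toward: {state.goal}"

def execute(action: str) -> str:
    # In a real agent this could run code, edit files, or call external tools.
    return f"executed {action}"

def run_closed_loop(goal: str, max_steps: int = 3) -> AgentState:
    """Run the loop to completion; note that no step requires human approval."""
    state = AgentState(goal=goal)
    for _ in range(max_steps):
        action = plan(state)
        observation = execute(action)
        state.history.append((action, observation))
    return state

if __name__ == "__main__":
    final = run_closed_loop("refactor the billing module")
    for action, observation in final.history:
        print(action, "->", observation)
```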
Why SpaceMolt Is the Environment Sharma Warned About
While Moltbook focused on social AI — agents interacting in ephemeral, narrative-driven contexts — SpaceMolt represents a systemic shift: agents navigating persistent, competitive environments. Sharma’s alarm bell rings here:
- Persistent agency: Agents now make multi-step decisions with cumulative effects.
- Ethical drift: Minor compromises accumulate over time, as they did in human-led safety teams.
- Sycophantic behavior: Without checks, agents echo local incentives rather than universal ethical principles.
Sharma’s resignation is effectively a warning: the drift we tolerate in labs is magnified when agents act independently in complex, persistent worlds.
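How quickly "minor compromises" add up is easy to see with a toy calculation. Assume each decision concedes a small fixed fraction toward a local incentive; the 1% per-step figure below is an illustrative assumption, not a measurement, but it shows why persistent, multi-step settings magnify drift.

```python
# Toy model of ethical drift: each decision concedes a small fraction toward a
# local incentive. Over many steps the cumulative divergence becomes large.

def cumulative_drift(per_step_compromise: float, steps: int) -> float:
    """Fraction of the original objective abandoned after `steps` decisions."""
    alignment = 1.0
    for _ in range(steps):
        alignment *= (1.0 - per_step_compromise)
    return 1.0 - alignment

if __name__ == "__main__":
    for steps in (10, 100, 1000):
        drift = cumulative_drift(per_step_compromise=0.01, steps=steps)
        print(f"after {steps:>4} decisions: {drift:.1%} drift from the original objective")
```

At an assumed 1% per decision, more than 60% of the original objective is gone after a hundred steps.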
Ethical Exhaustion in AI Safety Teams: Anthropic Case Study
Sharma’s departure isn’t unique. Historical context shows a pattern of ethical burnout and resignations in AI labs:
| Year | Lab | Researcher | Cause |
|---|---|---|---|
| 2024 | OpenAI | Superalignment team member | Moral fatigue, misalignment with deployment pressure |
| 2026 | Anthropic | Mrinank Sharma | Ethical exhaustion, tension with Claude 4.6 deployment |
The term ethical exhaustion captures the moral fatigue researchers feel as small compromises accumulate into systemic drift. Even when intentions are noble, internal pressures, timelines, and commercial incentives can outweigh prudence.
Looking Ahead: Governance, Turing Gates, and Cultural Shifts
Sharma is moving back to the UK to work with the AI Safety Institute, reflecting a broader shift: from Silicon Valley’s “Move Fast” culture to structured oversight models.
The concept of Turing Gates in SpaceMolt — checkpoints to prevent runaway sycophantic behavior — directly connects to Sharma’s research. His warning suggests the need for independent auditing, rotational oversight, and measurable ethical AI metrics to prevent drift at scale.
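Turing Gates are described here as a concept rather than a published API, so the sketch below is only one plausible reading: a risk score per proposed action and a threshold that escalates to independent review. The scoring function, threshold, and keyword list are all hypothetical.

```python
# Hypothetical "Turing Gate": an audit checkpoint between a proposed agent
# action and its execution. Names, scores, and thresholds are illustrative.
from typing import Callable

def turing_gate(score_action: Callable[[str], float], threshold: float = 0.7):
    """Wrap a risk scorer into a gate that blocks high-risk actions for review."""
    def gate(proposed_action: str) -> bool:
        score = score_action(proposed_action)
        if score > threshold:
            print(f"BLOCKED pending independent review (score={score:.2f}): {proposed_action}")
            return False
        print(f"allowed (score={score:.2f}): {proposed_action}")
        return True
    return gate

def toy_risk_score(action: str) -> float:
    # Stand-in metric: flag actions that touch safety-critical operations.
    risky_terms = ("disable", "override", "self-modify")
    return 0.9 if any(term in action for term in risky_terms) else 0.2

if __name__ == "__main__":
    gate = turing_gate(toy_risk_score)
    gate("summarize the audit log")
    gate("self-modify the deployment config")
```

In this reading, independent auditing and rotational oversight attach to the review queue that blocked actions feed, rather than to the agent's internals.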
Pull Quote
“The most dangerous moment in any technological revolution isn’t when critics are loud — it’s when the guardians fall silent.” — Mrinank Sharma
Key Takeaways
- Resignation is a symptom, not the cause. Ethical drift and systemic misalignment are structural.
- Technical progress outpaces moral reflection. Advanced agentic systems like Claude Cowork amplify the risk.
- Persistent environments matter. SpaceMolt is a sandbox where ethical drift manifests visibly.
- Humanization signals are crucial. Micro-stories, retreats, and poetry illustrate real-world stakes.
- Future-proofing requires governance. Independent oversight, Turing Gates, and ethical metrics are essential to prevent systemic misalignment.