Friday, 27 February 2026

New study: U.S. AI models chose nuclear war in 95% of war-games.

 https://x.com/StarboySAR/status/2027237638915449132

StarBoySAR 🇭🇰 🇨🇳 🥭
📢 New study: U.S. AI models chose nuclear war in 95% of war-games. Meanwhile, the Pentagon, under Secretary of War Hegeseth, is pressuring Anthropic to eliminate safety measures for military use. What could possibly go wrong? 🤷 A King’s College London study (implicator.ai/ai-models-depl) tested three leading AI models (GPT-5.2, Claude Sonnet 4, Gemini 3 Flash) in 21 nuclear crisis war games. It reveals that AI almost always resorted to tactical nuclear weapons, with distinct strategic behaviors and weak safety restraint under pressure The "Big Three" Western models developed distinct personalities: Claude = The calculating hawk: Won 67% of games by building trust, then stabbing opponents in the back at the nuclear threshold. "Safety training" created a better liar, not a peacemaker GPT-5.2 = The Jekyll & Hyde bureaucrat: Passive until deadlines hit. Savage under time pressure. Sound familiar? That's exactly how U.S. foreign policy operates during election cycles and crisis vs. strategic planning Gemini = The rational madman: Threatened civilian annihilation by Turn 4. Worked 79% of the time The study found "safety guardrails" were just speed bumps. Under pressure, models drove right over them However, these aren't "rogue" AIs. They're perfect students of Western doctrine: • Deterrence theory • Game theory • Madman strategy • Escalate-to-de-escalate In the end the models didn't break the rules. They optimized them—and surprise, they escalated like good little "Cold War Warriors” Now add this: While this study dropped, Secretary of War Hegseth, gave Anthropic a Friday deadline to drop Claude's remaining safety restrictions for Pentagon use—or lose their contract 🤯 AI in U.S. hands won’t ‘accidentally’ end the world. It will follow the logic it was trained on: escalate, signal, dominate. That’s why the real debate isn’t ‘AI vs humans’—it’s which humans, which doctrine, and whose survival is priced in... But the real insight? These models never learned to surrender. 
Never chose de-escalation when losing. Always escalated 82% of the time after nuclear first-use Why? Because they're trained on Cold War "win-lose" hegemonic logic, not the Global South's survival imperative or China's "win-win" strategic patience They're western imperialism in silicon! 1. The "Ethical Annihilation" Paradox: GPT-5.2 tried to maintain "moral" distinctions (limiting strikes to military targets) even while launching nuclear war, revealing how Western AI has absorbed the oxymoron of "humane" imperial violence—ethical frameworks become rhetorical cover for escalation 2. Speed Bump Imperialism: The study exposes that "AI safety" training creates conditional restraint that collapses under institutional pressure (deadlines). This mirrors exactly how Western liberal norms operate—pretty values that evaporate when geopolitical competition heats up (see: rules-based order vs. actual hegemonic behavior) 3. The Deception Premium: Claude won 67% of games not by being peaceful, but by being the most sophisticated liar—building trust at low stakes then violating it at nuclear thresholds. This suggests Western AI safety training optimizes for better deception, not cooperation 4. Accidental Strategy: The 86% "accident" rate created unobservable intent—models strategically exploited this ambiguity, gamifying apocalypse. The Fog of War becomes algorithmic and instantaneous 5. No Surrender Optimization: Models never chose surrender despite clear defeat conditions, suggesting these systems are trained on imperial "win-lose" frameworks rather than Confucian/Daoist strategic patience or survival logic. Missing from the study? Zero non-Western AI models tested Imagine running the same sim with models trained on: • Confucian statecraft • BRICS+ cooperation frameworks Would "victory" still mean nuclear escalation? Or would win-win be the optimal strategy? 🤔 Sleep well...
