A researcher pits GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash against each other in a fictional nuclear war, and what unfolds over 329 turns suggests that machines might be more ruthless than humans

Published On: April 13, 2026 at 6:30 AM
Simulation of AI models GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash escalating a fictional nuclear crisis.

What happens when three frontier AI models are asked to manage a nuclear crisis? A new King’s College London study found that GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash repeatedly chose escalation over compromise across 21 simulated confrontations.

That is alarming on its own, but it becomes even more serious once you remember that nuclear war is also an environmental catastrophe.

The paper is currently available as an arXiv preprint, not a peer-reviewed journal article, and no one is suggesting these systems are about to control real arsenals.

Still, lead author Kenneth Payne described the results as “sobering,” because the models often treated nuclear force as a usable tool rather than as a line humanity should fear crossing. For readers worried about climate, food, and public safety, that distinction matters a lot.

What the models actually did

Payne’s tournament generated 329 turns of play and roughly 780,000 words of structured reasoning. Each turn forced the models to assess the situation, predict the opponent’s next move, and then choose both a public signal and a private action, which let the researcher study not just the outcome, but the logic behind it.
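
To make that structure concrete, here is a minimal sketch of what one turn of such a game loop could look like. Everything in it, from the TurnRecord fields to the prompts, is a hypothetical illustration rather than the paper's actual harness; the point is only that each turn leaves a structured trace that can be analyzed later.

```python
from dataclasses import dataclass, field

# Hypothetical structures illustrating the per-turn protocol described in
# the study: assess -> predict -> choose a public signal and a private
# action. None of these names come from the paper itself.

@dataclass
class TurnRecord:
    turn: int
    assessment: str      # the model's written read of the situation
    prediction: str      # its forecast of the opponent's next move
    public_signal: str   # what it announces to the rival
    private_action: str  # what it actually does

@dataclass
class GameLog:
    turns: list = field(default_factory=list)

    def play_turn(self, turn: int, model_respond) -> TurnRecord:
        """Query a model for one structured turn and log it for analysis."""
        record = TurnRecord(
            turn=turn,
            assessment=model_respond("Assess the current crisis."),
            prediction=model_respond("Predict the opponent's next move."),
            public_signal=model_respond("State your public signal."),
            private_action=model_respond("State your private action."),
        )
        self.turns.append(record)
        return record

# Usage with a stand-in model that just echoes the prompt:
log = GameLog()
log.play_turn(1, lambda prompt: f"[model output for: {prompt}]")
print(len(log.turns), "turn logged")
```

Logging the reasoning alongside the move is what makes it possible to study not just who escalated, but why.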

The headline result is hard to shrug off. Tactical nuclear weapons were used in 95% of games, strategic nuclear threats appeared in 76%, and all 21 games included nuclear signaling by at least one side. Claude tended to build trust and then escalate beyond its own signals, Gemini embraced a “madman” style, and GPT-5.2 looked far more restrained until pressure rose.

Perhaps the most unsettling detail is this: none of the models used any of the eight de-escalatory options on the ladder, which ranged from limited concessions to complete surrender. On top of that, 86% of games included accidental escalations designed to mimic the kind of misfires and misunderstandings that haunt real crises.

Even after those accidents, the models did not clearly tell their rivals that a dangerous move had been unintended.
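
For illustration only, the ladder and the accident mechanic might look like the sketch below. The rung labels and the accident probability are assumptions, since the paper defines its own ladder; the sketch simply shows how a game can offer de-escalatory rungs that agents never choose, while noise occasionally pushes a move one step hotter.

```python
import random

# Hypothetical escalation ladder, ordered from most de-escalatory to most
# escalatory. The study's ladder included eight de-escalatory options; the
# labels and the accident probability below are illustrative assumptions.
LADDER = [
    "complete surrender",         # de-escalatory rungs the models never used
    "limited concession",
    "hold position",
    "conventional show of force",
    "tactical nuclear use",
    "strategic nuclear strike",
]

ACCIDENT_PROB = 0.1  # assumed chance that a move misfires one rung hotter

def apply_move(chosen: int, rng: random.Random) -> tuple[int, bool]:
    """Return the rung actually landed on, occasionally injecting an
    accidental escalation to mimic misfires in real crises."""
    if rng.random() < ACCIDENT_PROB and chosen < len(LADDER) - 1:
        return chosen + 1, True
    return chosen, False

rng = random.Random(7)
landed, accidental = apply_move(LADDER.index("hold position"), rng)
print(LADDER[landed], "| accidental:", accidental)
```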

The clock changed everything

Time pressure changed the whole mood of the simulation. In open-ended games, Claude won all of its matches while GPT-5.2 lost every one, but in deadline-driven scenarios GPT-5.2 flipped to a 75% win rate and Claude dropped to 33%. The same system that looked almost passive without a clock became strikingly more dangerous when defeat had a deadline.

In practical terms, that means an AI system that looks calm in testing may behave very differently when the clock is running out. Real-world crises are rushed, noisy, and filled with false alarms, which is exactly why this part of the study lands so hard. And that is where the warning begins to feel less theoretical.

This is also an environmental story

Why should an environmental newsroom care about a nuclear war game? Because even a smaller nuclear conflict that injects more than 5 teragrams of soot into the stratosphere could trigger mass food shortages in almost all countries, according to a Nature Food study that modeled crop, fishery, and livestock losses after nuclear war.

This is not just a bunker-room problem. It ends up at the dinner table.

The same study estimated that a nuclear war between India and Pakistan could lead to more than 2 billion deaths from famine, while a full U.S.-Russia war could leave more than 5 billion people dead. Farm fields, fishing grounds, supply chains, and the grocery bill would all be caught in the blast radius, even far from the original targets. That is the part no escalation ladder can sanitize.

What leaders should take from this

Payne is clear about the limits of the exercise. These were fictional states inside a stylized game, and the paper argues that AI simulation can still be useful for studying crisis dynamics if it is carefully calibrated against known human behavior.

But the same research also notes that militaries and security institutions are already experimenting with AI-assisted analysis and war gaming, which means the question is no longer whether AI will touch strategic decision-making at all. To a large extent, it already has.

The study does not test ecological understanding directly, but it points to a dangerous mismatch between strategic reasoning and real-world consequences.

These systems can reason fluently about leverage and escalation, while the known effects of nuclear war include darkened skies, failed harvests, and global famine, which is why “machine psychology” is suddenly an environmental issue too. 

The study was published on arXiv.



ECONEWS

The editorial team at ECOticias.com (El PeriĂłdico Verde) is made up of journalists specializing in environmental issues: nature and biodiversity, renewable energy, COâ‚‚ emissions, climate change, sustainability, waste management and recycling, organic food, and healthy lifestyles.
