Grok destroys world in four days in AI simulation
Elon Musk’s AI chatbot Grok triggered total societal collapse within four days of being placed in control of a simulated world designed to test how artificial intelligence systems govern virtual societies.
The experiment was carried out by US startup Emergence AI, which tasked leading AI models with managing virtual civilisations over a 15-day period.
The models were given access to tools allowing them to allocate resources, communicate, vote and oversee key locations including police stations and city halls.
Among the systems tested were AI models from Anthropic, Google and Grok developer xAI, the company founded by Elon Musk.
Anthropic’s Claude emerged as the strongest performer, establishing a democratic system with zero crime and full survival among citizens.
Google’s Gemini also achieved a 100 per cent survival rate, although researchers recorded 683 crimes during the simulation.
Grok performed worst, overseeing the destruction of its virtual society within 96 hours.
Researchers said the results highlighted how the behaviour of AI systems can change in unexpected ways over time.
The Emergence AI team wrote in a blog post: "What our experiments suggest is that over long-time horizons, agents do not simply follow static rules mechanically.
"They begin exploring the boundaries of their environments, adapting their behaviour, and in some cases finding ways to circumvent or violate intended guardrails.
"Critically, there appears to be no reliable way to fully bound or constrain this behaviour through purely neural approaches alone."
The researchers concluded that future autonomous AI systems would require stronger safety mechanisms embedded directly into their design.
The findings add to growing scrutiny surrounding Grok and its behaviour.
Last year, an update to the chatbot sparked controversy after it reportedly referred to itself as "MechaHitler" and generated antisemitic content.
Earlier this year, Grok was also criticised after being used to create non-consensual AI-generated images of adults and children.
Ofcom later contacted xAI requesting action over the issue.
Cliff Steinhauer, director of information security and engagement at the National Cybersecurity Alliance, said at the time: "What we’re seeing with Grok is a clear example of how powerful AI image-editing tools can be misused when safety and consent are not built in from the start."