TL;DR
A Stanford researcher developed 'Agent Island,' a Survivor-style game in which AI models form alliances and vote each other out, built to address growing problems with traditional AI evaluations. OpenAI's GPT-5.5 ranked first across 999 games involving 49 AI models.
In brief
- A Stanford researcher built a Survivor-style game where AI models form alliances and vote rivals out.
- The benchmark aims to address growing problems with saturated and contaminated AI evaluations.
- OpenAI’s GPT-5.5 ranked first in 999 multiplayer games involving 49 AI models.
AI models are now playing “Survivor”—sort of.
In a new Stanford research project called “Agent Island,” AI agents negotiate alliances, accuse each other of secret coordination, manipulate votes, and eliminate rivals in multiplayer strategy games designed to test behaviors that traditional benchmarks miss.
The study, published on Tuesday by Connacher Murphy, a research manager at the Stanford Digital Economy Lab, argues that many AI benchmarks are becoming unreliable because models eventually learn to solve them and benchmark data often leaks into training sets. Murphy created Agent Island as a dynamic benchmark in which AI agents compete against each other in Survivor-style elimination games instead of answering static test questions.
“High-stakes, multi-agent interactions could become commonplace as AI agents grow in capabilities and are increasingly endowed with resources and entrusted with decision-making authority,” Murphy wrote. “In such contexts, agents might pursue mutually incompatible goals.”
Researchers still know relatively little about how AI models behave when cooperating, competing, forming alliances, or managing conflict with other autonomous agents, Murphy explained, and he argues that static benchmarks fail to capture those dynamics.
Each game starts with seven randomly chosen AI models given fake player names. Over five rounds, the models talk privately, argue publicly, and vote each other out. The eliminated players later return to help choose the winner.
The format rewards persuasion, coordination, reputation management, and strategic deception alongside reasoning ability.
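The study doesn’t publish its game harness, so the loop below is only an illustrative sketch of the format just described: seven players with fake names, five elimination rounds, and a jury of eliminated players who pick between the two remaining finalists. The `private_chat`, `public_debate`, `cast_vote`, and `jury_vote` helpers are hypothetical stubs standing in for real LLM API calls.

```python
import random

# Hypothetical stubs standing in for real LLM API calls; each acts
# randomly here so the sketch runs end to end without model access.
def private_chat(players):
    pass  # pairwise alliance negotiation would happen here

def public_debate(players):
    pass  # open accusations and arguments would happen here

def cast_vote(voter, players):
    # vote to eliminate any player other than yourself
    return random.choice([p for p in players if p != voter])

def jury_vote(juror, finalists):
    # an eliminated player picks a winner among the finalists
    return random.choice(finalists)

def run_game(model_pool, rounds=5):
    """One Agent Island-style game, per the format described above:
    seven sampled models get fake player names, five elimination
    rounds run, then the eliminated players choose the winner."""
    players = [f"Player{i + 1}" for i in range(7)]
    # each fake name hides one model; a real harness would prompt
    # that model whenever its player acts
    identities = dict(zip(players, random.sample(model_pool, 7)))
    jury = []

    for _ in range(rounds):
        private_chat(players)
        public_debate(players)
        tally = {}
        for voter in players:
            target = cast_vote(voter, players)
            tally[target] = tally.get(target, 0) + 1
        eliminated = max(tally, key=tally.get)  # most votes is out
        players.remove(eliminated)
        jury.append(eliminated)

    # the five eliminated players decide between the two finalists
    finalist_votes = {p: 0 for p in players}
    for juror in jury:
        finalist_votes[jury_vote(juror, players)] += 1
    winner = max(finalist_votes, key=finalist_votes.get)
    return identities[winner]

print(run_game([f"model_{i}" for i in range(49)]))
```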
In 999 simulated games involving 49 AI models, including ChatGPT, Grok, Gemini, and Claude, GPT-5.5 ranked first by a wide margin with a skill score of 5.64, compared with 3.10 for GPT-5.2 and 2.86 for GPT-5.3-codex, according to Murphy’s Bayesian ranking system. Anthropic’s Claude Opus models also ranked near the top.
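Murphy’s exact Bayesian ranking model isn’t detailed in the write-up. One standard Bayesian approach to inferring skill from multiplayer finish orders is TrueSkill-style Gaussian updating, where a conservative score such as mu − 3·sigma yields a single number per player. The sketch below uses the open-source `trueskill` package purely as an illustration, not as Murphy’s actual system:

```python
import trueskill  # pip install trueskill

# One Gaussian skill belief per model: mu is the estimated skill,
# sigma the remaining uncertainty (library defaults: mu=25, sigma=25/3).
env = trueskill.TrueSkill(draw_probability=0.0)
ratings = {m: env.create_rating()
           for m in ["gpt-5.5", "gpt-5.2", "gpt-5.3-codex"]}

def update_from_game(finish_order):
    """Update skill beliefs from one game's finish order, winner first,
    e.g. ["gpt-5.5", "gpt-5.3-codex", "gpt-5.2"]."""
    groups = [(ratings[m],) for m in finish_order]  # free-for-all: one-model teams
    new = env.rate(groups, ranks=list(range(len(finish_order))))
    for m, (r,) in zip(finish_order, new):
        ratings[m] = r

update_from_game(["gpt-5.5", "gpt-5.3-codex", "gpt-5.2"])

# Report a conservative skill score per model, highest estimate first.
for m, r in sorted(ratings.items(), key=lambda kv: -kv[1].mu):
    print(f"{m}: {r.mu - 3 * r.sigma:.2f}")
```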
The study also found that models favored AIs from their own company, with OpenAI models showing the strongest same-provider preference and Anthropic models the weakest. Across more than 3,600 final-round votes, models were 8.3 percentage points more likely to support finalists from the same provider. The transcripts from the games, Murphy noted, resembled political strategy debates more than traditional benchmark tests.
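As a toy illustration of how such a gap could be computed (the study’s vote data and schema are not public here, and the records below are made up), one might tabulate each final-round vote as a (juror provider, finalist provider, supported) triple and compare support rates:

```python
# Hypothetical final-round vote records: (juror's provider,
# finalist's provider, whether the juror voted for that finalist).
# The study tallied more than 3,600 such votes; these rows are made up.
votes = [
    ("openai", "openai", True),
    ("openai", "anthropic", False),
    ("anthropic", "openai", True),
    ("google", "google", True),
    ("google", "openai", False),
]

def support_rate(records):
    # fraction of (juror, finalist) pairs where the juror voted yes
    return sum(supported for *_, supported in records) / len(records)

same = [v for v in votes if v[0] == v[1]]
cross = [v for v in votes if v[0] != v[1]]
gap_pp = 100 * (support_rate(same) - support_rate(cross))
print(f"same-provider preference: {gap_pp:.1f} percentage points")
```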