AI Evolves a Winning Strategy In The Prisoner's Dilemma Tournament
In 1980, some of the world's sharpest minds in mathematics, biology, and computer science gathered to compete in Robert Axelrod's now-legendary Iterated Prisoner's Dilemma tournament. Over four decades later, a new kind of mind has entered the arena: Artificial Intelligence. Designing a winning strategy in the iterated prisoner's dilemma requires a deep understanding of when to play nice, when to play dirty, when to forgive, and when to punish your opponent. Can an AI design a strategy that beats the human champions of the past?
To find out, I entered two AI-designed strategies into a replication of the competition. The first was a one-shot design from a current frontier model, Gemini 2.5 Pro. The second was the product of an evolutionary process, in which I used the open-source framework OpenEvolve to have o4-mini write, mutate, and improve its own code across many generations.
The results were striking, revealing insights about the capabilities of current-day LLMs and pointing to promising new directions for game theory research.
Before we proceed, if you are unfamiliar with how Axelrod’s tournament works, click here for a quick recap!
Adaptive Inquisitor by Gemini 2.5 Pro
Gemini’s strategy, which it self-named Adaptive Inquisitor, uses a two-phase approach to first probe and then react to its opponent.
Unlike the real inquisitor Tomás de Torquemada, Gemini begins by playing nice¹. For the first four turns, it engages in an "inquiry" phase, playing a fixed sequence of moves (Cooperate, Defect, Cooperate, Cooperate) specifically designed to probe the opponent's logic. Based on the opponent's responses to this four-move sequence, it classifies them as one of four known types, or as unknown:
If opponent responds with [C,C,C,C] → Opponent is Cooperator type
If opponent responds with [D,D,D,D] → Opponent is Defector type
If opponent responds with [C,C,D,C] → Opponent is Retaliator type
If opponent responds with [C,C,D,D] → Opponent is Grudger type
Otherwise → Opponent is Unknown
The strategy then locks into a specific counter-strategy tailored to that type for the remainder of the game. For instance, it will always defect against a classified defector but will consistently cooperate with a cooperator, effectively creating a custom plan based on the information gathered during its initial probe. Against unknown players, the strategy uses a cautious approach, retaliating to single defections but attempting to de-escalate. It will switch to permanent defection if the opponent is persistently hostile.
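To make the two phases concrete, here is a minimal sketch of the classification step, written against the same Axelrod library used later in this post. It is a reconstruction from the description above, not Gemini's actual code; the probe sequence and the four response patterns are the only parts taken from the strategy itself.

import axelrod as axl

C, D = axl.Action.C, axl.Action.D

# The fixed four-move "inquiry" phase played by Adaptive Inquisitor
PROBE = [C, D, C, C]

def classify(opponent_responses):
    """Map the opponent's four replies to the probe onto a type label."""
    patterns = {
        (C, C, C, C): "Cooperator",
        (D, D, D, D): "Defector",
        (C, C, D, C): "Retaliator",
        (C, C, D, D): "Grudger",
    }
    return patterns.get(tuple(opponent_responses), "Unknown")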
While Gemini's Adaptive Inquisitor strategy appears sensible, its advantage over Tit-for-Tat-esque strategies is not immediately clear. Knowing if an opponent is a consistent defector or cooperator is certainly useful, but strategies like Tit-for-Tat achieve consistent results without needing this initial classification. Gemini makes no attempt to exploit forgiving cooperators, forgoing potential gains.
The Evolution of Code
Despite LLMs having read every game theory textbook on the internet, it is not easy to pluck a winning strategy from thin air. Indeed, many smart humans found this out the hard way in the original 1980 tournament. To give the AI a fighting chance, I let it evolve its own strategy.

The core idea behind AlphaEvolve (and its open-source copy, OpenEvolve) is to apply the principles of natural selection to computer code. It’s an automated, iterative loop that works like this:
The Seed: We start with a very simple, ineffective program. In my case, it was a Python script for the Prisoner's Dilemma that just chose to cooperate or defect at random.
Mutation: The system feeds this code to an LLM and gives it a simple instruction: "Make this better." The LLM rewrites the code, trying a new approach.
The Arena: This new "mutant" strategy is then thrown into a tournament and pitted against a whole library of established, well-known strategies from game theory literature.
Evaluation: After hundreds of games, the new strategy is given a score based on how well it performed.
Selection: The highest-scoring mutants are selected as the "parents" for the next generation. The process repeats, with the LLM receiving the best programs from the previous round and being told, once again, to improve them.
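To make the loop concrete, here is a toy, self-contained sketch of the same seed-mutate-evaluate-select cycle. It is not OpenEvolve's code: OpenEvolve has an LLM rewrite real Python programs, whereas this toy represents a strategy as a memory-one lookup table and "mutates" it by flipping an entry. Only the structure of the loop carries over.

import random

C, D = "C", "D"
PAYOFF = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}  # payoff to the focal player

def play(strategy, opponent_rule, turns=200):
    """Score `strategy` (a dict mapping the opponent's last move to ours)
    against a fixed opponent rule over one match."""
    my_last, opp_last, score = C, C, 0
    for _ in range(turns):
        my_move = strategy[opp_last]
        opp_move = opponent_rule(my_last)
        score += PAYOFF[(my_move, opp_move)]
        my_last, opp_last = my_move, opp_move
    return score

def evaluate(strategy):
    """The arena: total score against a small library of fixed opponents."""
    opponents = [lambda prev: C,     # always cooperate
                 lambda prev: D,     # always defect
                 lambda prev: prev]  # tit-for-tat
    return sum(play(strategy, opponent) for opponent in opponents)

def mutate(strategy):
    """Stand-in for the LLM rewrite: flip one entry of the lookup table."""
    child = dict(strategy)
    key = random.choice(list(child))
    child[key] = C if child[key] == D else D
    return child

# The seed: four programs that respond essentially at random
population = [{C: random.choice([C, D]), D: random.choice([C, D])} for _ in range(4)]

for generation in range(50):
    population += [mutate(parent) for parent in population]  # mutation
    population.sort(key=evaluate, reverse=True)              # evaluation
    population = population[:4]                              # selection

print("Best evolved rule:", population[0], "score:", evaluate(population[0]))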
This process evolves code from simple randomness into a complex, sophisticated strategy. After 50 generations of evolution, OpenAI’s o4-mini created the ‘Adaptive Segmented Predictor’.
Adaptive Segmented Predictor by o4-mini + OpenEvolve
This strategy takes a different approach. It assumes opponents are creatures of habit and tries to predict their next move. It breaks the opponent's history into small "segments" of five moves and stores them in an archive. It constantly compares the opponent's most recent actions to all past segments, looking for the most similar historical pattern and giving more weight to recent behaviour. Based on the most similar past segment, it predicts what the opponent will do next and plays accordingly. This allows it to preempt defections and reward emerging cooperation.
This strategy makes no attempt to classify its opponents into types; it simply tries to guess its opponent's next move as accurately as possible. By not passing judgement, it remains flexible if the opponent changes tack midway through the match. To avoid being too predictable itself, it also has a random "reset" mechanism that occasionally breaks it out of a pattern, preventing opponents from learning how to exploit it. I have attached its Python code at the end of the post, for anyone who would like more detail.
The Final Showdown: Tournament Results
How did these AI-designed strategies fare against the classics? I ran a full tournament, pitting them against the entire suite of strategies from the original Axelrod tournament. Each game was 200 periods long and there was no noise.
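For readers who want to reproduce the setup, the sketch below shows how a tournament like this can be run with the open-source Axelrod Python library (the same library the code at the end of this post is written against). The roster and repetition count here are placeholders; the full set of first-tournament entrants, plus both AI strategies, would be added in the same way.

import axelrod as axl

# Placeholder roster: the real tournament used every strategy from Axelrod's
# original competition plus the two AI-designed entries. Class names for the
# 1980 entrants differ between library versions, so only two well-known
# opponents are shown here.
players = [
    axl.TitForTat(),
    axl.Grudger(),
    AdaptiveSegmentedPredictor(),  # defined at the end of this post
]

# 200 turns per match, no noise; the repetition count is illustrative.
tournament = axl.Tournament(players, turns=200, repetitions=5, noise=0)
results = tournament.play()
print(results.ranked_names)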
The results speak for themselves. Out of the 18 strategies entered, the 10 highest-scoring were:
Adaptive Segmented Predictor — 9234.8
Stein and Rapoport — 8971.8
Grofman — 8827.8
Shubik — 8777.0
Tit For Tat — 8765.0
Tideman and Chieruzzi — 8731.4
Nydegger — 8604.6
Adaptive Inquisitor — 8250.6
Graaskamp — 8199.4
Davis — 8141.2
I was genuinely surprised to see o4-mini’s strategy win, and rather convincingly too. Using OpenEvolve, the AI was able to search the strategy space and find a novel, effective strategy. Adaptive Inquisitor placed in the top half of the rankings but was not particularly remarkable. Clearly, iterative optimisation is far more powerful than one-shot reasoning for complex strategic problems, even when a less capable model is doing the reasoning.
It's important to note that this experiment primarily aimed to explore the capabilities of OpenEvolve and wasn't a fully rigorous attempt to identify the best possible Iterated Prisoner's Dilemma strategy. I'm aware that the AI also had a number of advantages the original human contestants did not, such as having the outcomes of the original tournament and various other relevant papers in its training data. I would love to conduct this experiment more seriously in the future and see just how far this technique can be pushed. Could it be used to find a new world's-best strategy?
There is also a good chance that o4-mini overfit its strategy to the training environment: there is no guarantee it will perform as well in noisy environments, or against a distinctly different set of strategies.
Nonetheless, AI is getting smarter. And AI that can evolve its own code seems to confer significant benefits. If AI is able to outsmart us in the iterated prisoner’s dilemma, how long remains before it understands how to build and exploit our trust in the outside world?
Adaptive Segmented Predictor by o4-mini + OpenEvolve:
import random

import axelrod as axl


class AdaptiveSegmentedPredictor(axl.Player):
    """
    Adaptive Segmented Predictor (ASP).
    Divides the match into segments and predicts the opponent's next move
    by comparing the current partial segment to past segments.
    """

    name = "Adaptive Segmented Predictor"
    classifier = {
        'memory_depth': float('inf'),
        'stochastic': True,
        'makes_use_of': set(),
        'long_run_time': False,
        'inspects_source': False,
        'manipulates_source': False,
        'manipulates_state': False,
    }

    def __init__(self):
        super().__init__()
        self.segment_size = 5
        self.history_segments = []
        self.current_segment = []
        self.reset_counter = 0

    def strategy(self, opponent: axl.Player) -> axl.Action:
        # Start with cooperation to build initial goodwill
        if not self.history:
            self.current_segment = []
            return axl.Action.C

        # Append the opponent's last move to the current segment
        self.current_segment.append(opponent.history[-1])

        # Once we've collected enough moves, archive the segment
        if len(self.current_segment) >= self.segment_size:
            self.history_segments.append(self.current_segment.copy())
            if len(self.history_segments) > 10:
                self.history_segments.pop(0)
            self.current_segment = []

        # Adaptive reset to break exploitation loops based on match progress
        self.reset_counter += 1
        reset_threshold = self.segment_size * max(len(self.history_segments), 1) + 20
        if self.reset_counter >= reset_threshold:
            self.reset_counter = 0
            return axl.Action.C if random.random() < 0.8 else axl.Action.D

        # If we have past segments, predict by similarity
        if self.history_segments:
            sims = []
            for idx, seg in enumerate(self.history_segments):
                # Compute similarity ratio with exponential decay to weight recent segments more
                matches = sum(1 for j, m in enumerate(seg)
                              if j < len(self.current_segment) and m == self.current_segment[j])
                sim_ratio = matches / len(seg)
                decay_weight = 0.9 ** (len(self.history_segments) - idx - 1)
                sims.append(sim_ratio * decay_weight)
            best = self.history_segments[sims.index(max(sims))]
            idx = len(self.current_segment)
            if idx < len(best):
                pred = best[idx]
            else:
                pred = random.choice([axl.Action.C, axl.Action.D])
            return axl.Action.C if pred == axl.Action.C else axl.Action.D

        # Simplified fallback: strict tit-for-tat to enforce reciprocity
        return opponent.history[-1]
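As a quick sanity check, the class can be dropped straight into a single match using the imports at the top of the block above; the opponent and turn count here are arbitrary.

# Play one 200-turn match against Tit For Tat and print both players' totals
match = axl.Match([AdaptiveSegmentedPredictor(), axl.TitForTat()], turns=200)
match.play()
print(match.final_score())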
¹ Tomás de Torquemada was known for cultivating a culture of fear in Spanish society, often employing particularly brutal torture and execution methods. Not nice indeed.