Luis Schmitz

1. The Basic Prisoner's Dilemma (PD)

The stage game payoff matrix is usually written as:

$$
\begin{array}{c|cc}
 & C & D \\ \hline
C & (R,R) & (S,T) \\
D & (T,S) & (P,P)
\end{array}
$$

Where:

$C$ = Cooperate
$D$ = Defect
$T > R > P > S$ and $2R > T + S$

Typical values (Axelrod's tournaments):

$T = 5$ (Temptation)
$R = 3$ (Reward for mutual cooperation)
$P = 1$ (Punishment for mutual defection)
$S = 0$ (Sucker's payoff)
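
As a small illustration, the payoff table and the two defining inequalities can be written down directly. This is only a sketch: the names `PAYOFF` and `is_prisoners_dilemma` are made up for this example and do not come from any library.

```python
# Stage-game payoffs, using the typical values listed above.
T, R, P, S = 5, 3, 1, 0

# PAYOFF[(row_move, column_move)] -> (row_payoff, column_payoff)
PAYOFF = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}


def is_prisoners_dilemma(t, r, p, s):
    """Return True if the payoffs satisfy T > R > P > S and 2R > T + S."""
    return t > r > p > s and 2 * r > t + s


print(is_prisoners_dilemma(T, R, P, S))
```

The second inequality, $2R > T + S$, rules out payoffs where taking turns exploiting each other would beat steady mutual cooperation.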

3. Strategies

A strategy maps histories of play to actions. Classic ones:

Always Defect (ALLD): Always plays $D$
Always Cooperate (ALLC): Always plays $C$
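Tit-for-Tat (TFT): Start with $C$; then copy opponent's previous move
Grim Trigger (GT): Start with $C$; if opponent defects once, defect forever
Win-Stay, Lose-Shift (WSLS): Start with $C$; then repeat your own previous move if your payoff was $R$ or $T$, and switch moves if it was $P$ or $S$ (equivalently, cooperate exactly when both players chose the same move in the previous round)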
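
One way to make "a strategy maps histories to actions" concrete is to write each strategy as a function of the two move histories. The sketch below is illustrative only: the function names and the `play` helper are assumptions, and WSLS is coded in its standard form (keep your move after $R$ or $T$, switch after $P$ or $S$), which reduces to "cooperate if both moves matched last round".

```python
def all_d(my_hist, opp_hist):
    return "D"


def all_c(my_hist, opp_hist):
    return "C"


def tit_for_tat(my_hist, opp_hist):
    # Cooperate first, then copy the opponent's previous move.
    return opp_hist[-1] if opp_hist else "C"


def grim_trigger(my_hist, opp_hist):
    # Cooperate until the opponent has defected at least once.
    return "D" if "D" in opp_hist else "C"


def wsls(my_hist, opp_hist):
    if not my_hist:
        return "C"
    # Matching moves last round (payoff R or P) -> cooperate; otherwise defect.
    return "C" if my_hist[-1] == opp_hist[-1] else "D"


def play(strat_a, strat_b, rounds):
    """Play a noiseless match and return both move sequences as strings."""
    hist_a, hist_b = [], []
    for _ in range(rounds):
        a = strat_a(hist_a, hist_b)
        b = strat_b(hist_b, hist_a)
        hist_a.append(a)
        hist_b.append(b)
    return "".join(hist_a), "".join(hist_b)


print(play(tit_for_tat, all_d, 4))  # ('CDDD', 'DDDD')
```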

4. Noise (Error Rate)

With probability $\epsilon$ (e.g., 5%), a player's intended action is flipped:

If they choose $C$, they actually play $D$ with prob. $\epsilon$
If they choose $D$, they actually play $C$ with prob. $\epsilon$

This models mistakes, miscommunication, or implementation error.
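
A minimal sketch of this error model, assuming each flip is drawn independently (the helper name `apply_noise` and its `rng` parameter are hypothetical, chosen for this example):

```python
import random


def apply_noise(intended, epsilon, rng=random):
    """Return the realized move: the intended move, flipped with probability epsilon."""
    if rng.random() < epsilon:
        return "D" if intended == "C" else "C"
    return intended


# Estimate the empirical flip rate with a seeded generator for reproducibility.
rng = random.Random(0)
trials = 100_000
flips = sum(apply_noise("C", 0.05, rng) == "D" for _ in range(trials))
print(flips / trials)  # close to 0.05
```

Passing a seeded `random.Random` instance keeps runs reproducible, which helps when comparing strategies under the same error sequence.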

5. Analysis of Key Strategies under Noise

5.1 Tit-for-Tat (TFT)

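Without noise, TFT sustains cooperation if the discount factor $\delta$ is high enough. With noise, a single accidental defection starts an echo between two TFT players: they alternate retaliatory defections, $(D,C), (C,D), (D,C), \dots$, until a further error either restores mutual cooperation or locks both into mutual defection.

→ For TFT against TFT with independent errors, the long-run cooperation rate falls to 50% for any error rate $0 < \epsilon < 1$; a larger $\epsilon$ only gets there faster.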
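
The echo effect is easiest to see with exactly one injected error and no other noise. The sketch below (with a hypothetical helper `tft_trace`) flips player A's move once and prints what follows:

```python
def tft(opp_hist):
    # Cooperate first, then copy the opponent's previous realized move.
    return opp_hist[-1] if opp_hist else "C"


def tft_trace(rounds, error_round):
    """TFT vs. TFT, with player A's move flipped once at error_round (0-based)."""
    a_hist, b_hist = [], []
    for t in range(rounds):
        a, b = tft(b_hist), tft(a_hist)
        if t == error_round:
            a = "D" if a == "C" else "C"  # the single implementation error
        a_hist.append(a)
        b_hist.append(b)
    return "".join(a_hist), "".join(b_hist)


a_moves, b_moves = tft_trace(8, error_round=2)
print(a_moves)  # CCDCDCDC
print(b_moves)  # CCCDCDCD
```

After the error in round 2, the two players never cooperate in the same round again unless another error intervenes.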

5.2 Grim Trigger (GT)

Without noise: perfect deterrence, as deviation is punished forever. With noise: one mistaken defection leads to permanent punishment.

→ Intended cooperation ends for good at the first error; GT has no mechanism for returning to $C$.
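
For a rough sense of how quickly this happens, assume two GT players whose errors are independent across players and rounds (an assumption added here for illustration). Cooperation survives a round only if neither player errs, so:

$$
\Pr(\text{no error in } n \text{ rounds}) = (1-\epsilon)^{2n},
\qquad
\mathbb{E}[\text{round of first error}] = \frac{1}{1-(1-\epsilon)^{2}}
$$

With $\epsilon = 0.05$ this gives $1 / 0.0975 \approx 10.3$, so under these assumptions cooperation between two GT players is expected to break down after about ten rounds.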

5.3 Win-Stay, Lose-Shift (WSLS)

Noise-resistant: after a single error, two WSLS players pass through one round of mutual defection and then both "forgive" by shifting back to $C$. Often outperforms TFT and GT in noisy environments.
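
The recovery can be traced with a single injected error and no other noise. This sketch uses the standard WSLS rule (cooperate exactly when both moves matched in the previous round); the helper `wsls_trace` is a hypothetical name for this example.

```python
def wsls(my_hist, opp_hist):
    if not my_hist:
        return "C"
    # Win-stay, lose-shift reduces to: cooperate iff last round's moves matched.
    return "C" if my_hist[-1] == opp_hist[-1] else "D"


def wsls_trace(rounds, error_round):
    """WSLS vs. WSLS, with player A's move flipped once at error_round (0-based)."""
    a_hist, b_hist = [], []
    for t in range(rounds):
        a, b = wsls(a_hist, b_hist), wsls(b_hist, a_hist)
        if t == error_round:
            a = "D" if a == "C" else "C"  # the single implementation error
        a_hist.append(a)
        b_hist.append(b)
    return "".join(a_hist), "".join(b_hist)


a_moves, b_moves = wsls_trace(6, error_round=2)
print(a_moves)  # CCDDCC
print(b_moves)  # CCCDCC
```

Mutual cooperation is back two rounds after the error, in contrast to the open-ended echo that the same error causes between two TFT players.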

6. Mathematical Conditions for Cooperation

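With discount factor $\delta \in (0,1)$ and permanent (grim-trigger) punishment after a defection, cooperation can be sustained in the repeated game if:

$$
\frac{R}{1-\delta} \geq T + \delta \cdot \frac{P}{1-\delta}
$$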

This inequality says: the discounted value of cooperation must outweigh the temptation of defecting once plus the punishment thereafter.
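
Solving the inequality for $\delta$ gives an explicit threshold. Multiplying both sides by $1-\delta > 0$ and collecting terms:

$$
R \geq T(1-\delta) + \delta P
\;\Longleftrightarrow\;
\delta\,(T - P) \geq T - R
\;\Longleftrightarrow\;
\delta \geq \frac{T-R}{T-P}
$$

With the typical values $T = 5$, $R = 3$, $P = 1$, this is $\delta \geq \frac{5-3}{5-1} = \frac{1}{2}$: players must weight the next round at least half as much as the current one.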

With noise, this condition is modified because the stationary distribution of actions under a strategy pair matters. For example:

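TFT vs. TFT without noise: both always cooperate → payoff = $R$ per round
TFT vs. TFT with noise: Markov chain with states $(C,C), (C,D), (D,C), (D,D)$. If the two players err independently with rate $0 < \epsilon < 1$, the stationary distribution puts weight $1/4$ on each state, including $(D,D)$, so the long-run average payoff drops from $R$ to $(R+S+T+P)/4$, i.e. $2.25$ instead of $3$ with the typical values.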
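
The stationary distribution of the noisy TFT vs. TFT chain can be checked numerically. The sketch below is one possible construction, assuming independent errors for the two players and the typical payoffs from Section 1. The function names are illustrative, and simple power iteration is used in place of an eigenvector solver.

```python
from itertools import product

# States are (A's realized move, B's realized move): CC, CD, DC, DD.
STATES = list(product("CD", repeat=2))


def tft_transition(epsilon):
    """Row-stochastic transition matrix for noisy TFT vs. TFT over STATES."""
    def prob(intended, actual):
        return 1 - epsilon if intended == actual else epsilon

    matrix = []
    for a_prev, b_prev in STATES:
        # Each TFT player intends to copy the opponent's last realized move.
        a_int, b_int = b_prev, a_prev
        matrix.append([prob(a_int, a) * prob(b_int, b) for a, b in STATES])
    return matrix


def stationary(matrix, steps=5000):
    """Approximate the stationary distribution by repeated multiplication."""
    dist = [1.0, 0.0, 0.0, 0.0]  # start from mutual cooperation
    n = len(dist)
    for _ in range(steps):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return dist


# Player A's payoff in each state (T=5, R=3, P=1, S=0).
PAYOFF_A = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

dist = stationary(tft_transition(0.05))
avg_payoff = sum(p * PAYOFF_A[s] for p, s in zip(dist, STATES))
print([round(p, 4) for p in dist])  # [0.25, 0.25, 0.25, 0.25]
print(round(avg_payoff, 4))         # 2.25
```

Changing `epsilon` to any other value strictly between 0 and 1 leaves the limit unchanged; only the speed of convergence differs.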

Reference: Wikipedia - The Iterated Prisoner's Dilemma

