1. The Basic Prisoner's Dilemma (PD)
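The stage game payoff matrix is usually written as (row player's payoff first):

|               | Player 2: C | Player 2: D |
| Player 1: C   | (R, R)      | (S, T)      |
| Player 1: D   | (T, S)      | (P, P)      |

Where: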
- C = Cooperate
- D = Defect
- T > R > P > S and 2R > T + S (the second condition makes mutual cooperation pay more than taking turns exploiting each other)
Typical values (Axelrod's tournaments):
- T = 5 (Temptation)
- R = 3 (Reward for mutual cooperation)
- P = 1 (Punishment for mutual defection)
- S = 0 (Sucker's payoff)
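As a minimal sketch, the stage game with these values can be stored as a lookup table keyed by the two players' actions. The names `PAYOFFS` and `is_valid_pd` are illustrative and not part of any library.

```python
# Axelrod's tournament values for the stage game.
T, R, P, S = 5, 3, 1, 0

# (row action, column action) -> (row payoff, column payoff)
PAYOFFS = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}


def is_valid_pd(t, r, p, s):
    """Check the two conditions that define a Prisoner's Dilemma."""
    return t > r > p > s and 2 * r > t + s


print(is_valid_pd(T, R, P, S))  # True for 5, 3, 1, 0
print(PAYOFFS[("D", "C")])      # (5, 0): the defector gets T, the cooperator S
```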
3. Strategies
A strategy maps histories of play to actions. Classic ones:
- Always Defect (ALLD): Always plays D
- Always Cooperate (ALLC): Always plays C
- Tit-for-Tat (TFT): Start with C; then copy opponent's previous move
- Grim Trigger (GT): Start with C; if opponent defects once, defect forever
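- Win-Stay, Lose-Shift (WSLS): Start with C; repeat the previous move if it earned R or T, switch if it earned P or S. Equivalently: cooperate if both players made the same move last round, defect otherwise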
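The strategies listed here can be sketched as functions from the history of play to the next action. The representation is an assumption made for illustration: each function receives two lists of past moves (its own, then the opponent's), and `play` is a hypothetical noise-free match loop. WSLS is written in its usual form: keep the last move after R or T, switch after P or S.

```python
def all_d(my_moves, opp_moves):
    return "D"


def all_c(my_moves, opp_moves):
    return "C"


def tit_for_tat(my_moves, opp_moves):
    # Cooperate first, then copy the opponent's previous move.
    return opp_moves[-1] if opp_moves else "C"


def grim_trigger(my_moves, opp_moves):
    # Cooperate until the opponent has defected at least once.
    return "D" if "D" in opp_moves else "C"


def wsls(my_moves, opp_moves):
    if not my_moves:
        return "C"
    # Matching moves mean (C,C) or (D,D): stay on C after R, shift to C after P.
    # Mismatched moves mean T or S: stay on D after T, shift to D after S.
    return "C" if my_moves[-1] == opp_moves[-1] else "D"


def play(strategy_a, strategy_b, rounds):
    """Play a noise-free match and return both move sequences as strings."""
    a_moves, b_moves = [], []
    for _ in range(rounds):
        a = strategy_a(a_moves, b_moves)
        b = strategy_b(b_moves, a_moves)
        a_moves.append(a)
        b_moves.append(b)
    return "".join(a_moves), "".join(b_moves)


print(play(tit_for_tat, all_d, 4))  # ('CDDD', 'DDDD')
print(play(wsls, all_d, 4))         # ('CDCD', 'DDDD')
```

The second line shows a known weakness of WSLS: against ALLD it returns to C every other round and is exploited each time.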
4. Noise (Error Rate)
With probability epsilon (e.g., 5%), a player's intended action is flipped:
- If they choose C, they actually play D with prob. epsilon
- If they choose D, they actually play C with prob. epsilon
This models mistakes, miscommunication, or implementation error.
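One way to sketch this error model is a small wrapper that flips the intended action with probability epsilon. The function name `apply_noise` and the seeded generator are illustrative choices, not part of the source.

```python
import random


def apply_noise(intended, epsilon, rng):
    """Return the action actually played: the intended one, flipped with prob. epsilon."""
    if rng.random() < epsilon:
        return "D" if intended == "C" else "C"
    return intended


rng = random.Random(0)  # fixed seed so the run is reproducible
played = [apply_noise("C", 0.05, rng) for _ in range(10_000)]
print(played.count("D") / len(played))  # should be close to 0.05
```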
5. Analysis of Key Strategies under Noise
5.1 Tit-for-Tat (TFT)
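Without noise, TFT sustains cooperation if the discount factor delta is high enough. With noise, a single accidental defection triggers an echo: the two players alternate between (C,D) and (D,C), each retaliating for the other's last move, until a further error either restores mutual cooperation or tips both into mutual defection.
→ The long-run cooperation rate between two TFT players falls to 50% for any epsilon > 0; a larger epsilon makes the decline faster, so discounted payoffs drop sharply as epsilon grows.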
5.2 Grim Trigger (GT)
Without noise: perfect deterrence, as deviation is punished forever. With noise: one mistaken defection leads to permanent punishment.
→ Cooperation collapses completely once an error occurs.
5.3 Win-Stay, Lose-Shift (WSLS)
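Noise-resistant: after a single error, two WSLS players pass through one round of mutual defection and then both shift back to C, so cooperation is restored within two rounds. Often outperforms TFT and GT in noisy environments.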
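The qualitative claims in this section can be checked with a small self-play simulation. This is a sketch under stated assumptions: each strategy plays a copy of itself, both players err independently with the same epsilon, and strategies react to the moves actually played. The helper `cooperation_rate` and the chosen settings (epsilon = 0.05, 20,000 rounds, seed 1) are illustrative.

```python
import random


def tft(mine, theirs):
    return theirs[-1] if theirs else "C"


def grim(mine, theirs):
    return "D" if "D" in theirs else "C"


def wsls(mine, theirs):
    if not mine:
        return "C"
    # Same moves last round (R or P for this player) -> C, otherwise D.
    return "C" if mine[-1] == theirs[-1] else "D"


def flip(action):
    return "D" if action == "C" else "C"


def cooperation_rate(strategy, epsilon, rounds, seed):
    """Fraction of C moves when `strategy` plays itself under noise."""
    rng = random.Random(seed)
    a_hist, b_hist = [], []
    for _ in range(rounds):
        a = strategy(a_hist, b_hist)
        b = strategy(b_hist, a_hist)
        # Each intended action is flipped independently with prob. epsilon.
        if rng.random() < epsilon:
            a = flip(a)
        if rng.random() < epsilon:
            b = flip(b)
        a_hist.append(a)
        b_hist.append(b)
    return (a_hist.count("C") + b_hist.count("C")) / (2 * rounds)


rates = {
    name: cooperation_rate(strategy, epsilon=0.05, rounds=20_000, seed=1)
    for name, strategy in [("TFT", tft), ("GT", grim), ("WSLS", wsls)]
}
for name, rate in rates.items():
    print(f"{name}: {rate:.3f}")
```

With these settings one would expect GT to end up near epsilon (only accidental cooperation remains), TFT near one half, and WSLS well above both.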
6. Mathematical Conditions for Cooperation
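Cooperation can be sustained in the repeated game (with discount factor delta, written $\delta$, and permanent punishment as in Grim Trigger) if:
$$\frac{R}{1-\delta} \ge T + \frac{\delta P}{1-\delta}, \quad\text{equivalently}\quad \delta \ge \frac{T-R}{T-P}$$
This inequality says: the discounted value of cooperating forever must be at least the one-time temptation payoff plus the discounted value of the punishment thereafter.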
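As a worked example, assume the punishment is permanent mutual defection (Grim Trigger) and payoffs are discounted by delta per round. The condition then reads R/(1 - delta) >= T + delta*P/(1 - delta), which rearranges to delta >= (T - R)/(T - P). The function names below are illustrative.

```python
def critical_delta(t, r, p):
    """Smallest discount factor that sustains cooperation under Grim Trigger."""
    return (t - r) / (t - p)


def cooperation_sustainable(delta, t, r, p):
    """Compare cooperating forever with defecting once and being punished forever."""
    cooperate_forever = r / (1 - delta)
    defect_once = t + delta * p / (1 - delta)
    return cooperate_forever >= defect_once


print(critical_delta(5, 3, 1))                # 0.5
print(cooperation_sustainable(0.6, 5, 3, 1))  # True
print(cooperation_sustainable(0.4, 5, 3, 1))  # False
```

With the typical values T = 5, R = 3, P = 1, players must weight the next round at least half as much as the current one.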
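With noise, this condition no longer applies directly: errors occur even when both players follow their strategies, so long-run payoffs depend on the stationary distribution over outcomes that a strategy pair induces. For example:
- TFT vs. TFT without noise: both always cooperate → per-round payoff = R
- TFT vs. TFT with noise: play follows a Markov chain over the states (C,C), (C,D), (D,C), (D,D). For any epsilon strictly between 0 and 1 the stationary distribution is uniform, so (D,D) and the two mismatched states each carry as much weight as (C,C), and the average per-round payoff falls to (R + S + T + P)/4, which is 2.25 with the typical values.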
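The Markov chain for TFT vs. TFT can be built and iterated directly. The sketch below assumes independent errors with the same epsilon for both players and the typical payoff values; it finds the stationary distribution by repeatedly applying the transition probabilities (power iteration). All names are illustrative.

```python
from itertools import product

STATES = list(product("CD", repeat=2))  # (C,C), (C,D), (D,C), (D,D)
# Row player's payoff in each state, using T=5, R=3, P=1, S=0.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tft_transition(epsilon):
    """matrix[s][t]: probability that play moves from state s to state t."""
    matrix = {}
    for s in STATES:
        intended = (s[1], s[0])  # each TFT player copies the other's last move
        row = {}
        for t in STATES:
            prob = 1.0
            for want, got in zip(intended, t):
                prob *= (1 - epsilon) if want == got else epsilon
            row[t] = prob
        matrix[s] = row
    return matrix


def stationary(matrix, steps=2000):
    """Approximate the stationary distribution, starting from (C,C)."""
    dist = {s: 1.0 if s == ("C", "C") else 0.0 for s in STATES}
    for _ in range(steps):
        dist = {t: sum(dist[s] * matrix[s][t] for s in STATES) for t in STATES}
    return dist


dist = stationary(tft_transition(0.05))
avg_payoff = sum(dist[s] * PAYOFF[s] for s in STATES)
for state, weight in dist.items():
    print(state, round(weight, 4))
print("average payoff:", round(avg_payoff, 4))
```

Each state ends up with weight close to 0.25, giving an average payoff close to 2.25 instead of R = 3.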
Reference: Wikipedia - The Iterated Prisoner's Dilemma