🎮 Game Theory: Prisoner's Dilemma & Evolutionary Strategies

A grid of agents repeatedly plays the Prisoner's Dilemma. Each round, every agent plays against its 8 neighbours and earns a payoff. Then each agent adopts the strategy of its highest-scoring neighbour (with optional noise). Watch cooperation emerge, collapse, or cycle depending on the strategy mix and payoff values.
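The update rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the page's actual implementation. It makes several assumptions: only the two unconditional strategies (always "C" or always "D") are modelled, the grid wraps around at the edges, an agent keeps its own strategy unless a neighbour scored strictly higher, and noise replaces the adopted strategy with a random one. The names `PAYOFF`, `neighbours`, and `step` are hypothetical.

```python
import random

# Payoff to the first player, using the default values R=3, S=0, T=5, P=1.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def neighbours(r, c, rows, cols):
    """Yield the 8 surrounding cells, wrapping at the edges (an assumption)."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr or dc:
                yield (r + dr) % rows, (c + dc) % cols


def step(grid, noise=0.0, rng=random):
    """Play one round on the grid and return (new_grid, scores)."""
    rows, cols = len(grid), len(grid[0])

    # Phase 1: every agent plays each of its 8 neighbours once.
    scores = [
        [
            sum(PAYOFF[grid[r][c], grid[nr][nc]]
                for nr, nc in neighbours(r, c, rows, cols))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

    # Phase 2: every agent copies the best scorer in its neighbourhood.
    new_grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # The agent itself is listed first, so it wins ties (an assumption).
            candidates = [(r, c), *neighbours(r, c, rows, cols)]
            br, bc = max(candidates, key=lambda p: scores[p[0]][p[1]])
            strategy = grid[br][bc]
            if rng.random() < noise:
                strategy = rng.choice("CD")  # optional noise
            row.append(strategy)
        new_grid.append(row)
    return new_grid, scores


# A single defector in a 5x5 field of cooperators.
grid = [["C"] * 5 for _ in range(5)]
grid[2][2] = "D"
grid, scores = step(grid)
```

With these payoffs the lone defector earns 8 × 5 = 40, more than any cooperator, so all 8 of its neighbours switch to defection after the first round.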

Legend

Always Cooperate (AC)
Always Defect (AD)
Tit-for-Tat (TFT)
Pavlov (Win-Stay, Lose-Shift)
Random (50/50)

Payoff Matrix

        C        D
C     R = 3    S = 0
D     T = 5    P = 1

Grid & Speed

Presets

Stats

Cooperators: -
Defectors: -
Mean payoff: -
Round #0

The Prisoner's Dilemma

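Two agents can Cooperate (C) or Defect (D). The four payoffs are the Temptation to defect (T), the Reward for mutual cooperation (R), the Punishment for mutual defection (P), and the Sucker's payoff (S). The ordering T > R > P > S makes defection the better individual move whatever the other agent does, while 2R > T + S ensures that mutual cooperation pays more collectively than taking turns exploiting each other. This tension between individual incentive and collective benefit is the core of the dilemma.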
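As a quick check, the two conditions can be tested directly against a set of payoff values. The sketch below uses a hypothetical helper, `is_prisoners_dilemma`, and the default values from the payoff matrix (T=5, R=3, P=1, S=0).

```python
def is_prisoners_dilemma(T, R, P, S):
    """Return True if the payoffs satisfy both Prisoner's Dilemma conditions."""
    ordering = T > R > P > S          # defecting is individually tempting
    cooperation_pays = 2 * R > T + S  # alternating exploitation does not beat (C, C)
    return ordering and cooperation_pays


print(is_prisoners_dilemma(T=5, R=3, P=1, S=0))  # default matrix
print(is_prisoners_dilemma(T=7, R=3, P=1, S=0))  # 2R = 6 is not greater than T + S = 7
```

The first call returns `True`. The second returns `False`: with T=7, two agents who take turns exploiting each other would average 3.5 per round, more than the 3 they get from mutual cooperation.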

Nash Equilibrium vs. Pareto Optimum

In a one-shot game, mutual defection (P, P) is the Nash equilibrium: neither player can improve by switching unilaterally. Yet mutual cooperation (R, R) is Pareto optimal: you cannot make one agent better off without hurting the other. Rational self-interest leads to a suboptimal outcome. This is the tragedy of the dilemma.
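Both claims can be verified by brute force over the four possible outcomes. The following sketch assumes the default payoff values (T=5, R=3, P=1, S=0). The helper names `is_nash` and `is_pareto_optimal` are illustrative only.

```python
from itertools import product

MOVES = "CD"

# (payoff to player A, payoff to player B) for each pair of moves.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def is_nash(a, b):
    """Neither player gains by changing only their own move."""
    pa, pb = PAYOFF[a, b]
    a_cannot_gain = all(PAYOFF[alt, b][0] <= pa for alt in MOVES)
    b_cannot_gain = all(PAYOFF[a, alt][1] <= pb for alt in MOVES)
    return a_cannot_gain and b_cannot_gain


def is_pareto_optimal(a, b):
    """No other outcome helps one player without hurting the other."""
    pa, pb = PAYOFF[a, b]
    for other in product(MOVES, repeat=2):
        qa, qb = PAYOFF[other]
        if qa >= pa and qb >= pb and (qa > pa or qb > pb):
            return False
    return True


outcomes = list(product(MOVES, repeat=2))
nash = [o for o in outcomes if is_nash(*o)]
pareto = [o for o in outcomes if is_pareto_optimal(*o)]
print(nash)    # [('D', 'D')]
print(pareto)  # [('C', 'C'), ('C', 'D'), ('D', 'C')]
```

Mutual defection is the only Nash equilibrium and also the only outcome that is not Pareto optimal. The lopsided outcomes (C, D) and (D, C) are Pareto optimal too, since improving the exploited player's payoff always costs the exploiter.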

Why Tit-for-Tat wins

In Robert Axelrod's famous tournaments (1980), Tit-for-Tat (start cooperative, then mirror opponent's last move) won against 62 strategies. TFT is nice (never defects first), retaliatory (punishes defection immediately), forgiving (returns to cooperation after one retaliation), and clear (easy to read). In spatial settings, clusters of TFT agents protect one another from exploitation.
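To make the behaviour concrete, here is a small iterated match in Python. It is only a sketch under assumed conditions: a fixed 10-round match, the default payoffs (T=5, R=3, P=1, S=0), and hypothetical names (`tit_for_tat`, `always_defect`, `always_cooperate`, `play_match`). It is not a reconstruction of Axelrod's tournament code.

```python
# Payoff to the first player.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(own_history, opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(own_history, opponent_history):
    return "D"


def always_cooperate(own_history, opponent_history):
    return "C"


def play_match(strategy_a, strategy_b, rounds=10):
    """Play an iterated game and return the total scores (a, b)."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Both players choose simultaneously, seeing only past rounds.
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        score_a += PAYOFF[move_a, move_b]
        score_b += PAYOFF[move_b, move_a]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


print(play_match(tit_for_tat, always_defect))       # (9, 14)
print(play_match(tit_for_tat, tit_for_tat))         # (30, 30)
print(play_match(always_cooperate, always_defect))  # (0, 50)
print(play_match(always_defect, always_defect))     # (10, 10)
```

TFT loses only the first round to Always Defect and then retaliates, so it is exploited far less than Always Cooperate. It never outscores its opponent within a single match. Its strength comes from the high totals it collects whenever it meets another cooperative strategy.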

Real-world applications