
EP18: Game Theory in Operatic Duets

Nash Equilibrium, Bayesian Persuasion, Minimax Theorem, Folk Theorem


Further exploration

EP25 Why Music Gives You Goosebumps — Prediction Error and Frisson

Overview

Four operas. Four game-theoretic frameworks. This episode treats operatic duets as strategic interactions and asks: can the mathematical machinery of game theory — payoff matrices, equilibrium concepts, information structures — explain why characters make the choices they do?

We begin with Bizet’s Carmen (1875), where the final confrontation between Carmen and Don Jose is modeled as a two-player non-cooperative game solved by backward induction. Mozart’s Don Giovanni (1787) introduces incomplete information: Giovanni persuades Zerlina through repeated signaling, and her decision follows Bayesian updating. Xenakis’s Duel (1959) is the most literal case — a zero-sum matrix game written directly into the score, governed by the von Neumann minimax theorem. Finally, Zorn’s Cobra (1984) embodies the folk theorem: cooperation emerging from repeated interaction and reputation.

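Narration: “Carmen looks at Don Jose. The ring lies on the ground. She only has to bow her head and she lives. But she does not. Why? Why would a rational person walk toward death? This is not impulse. It looks like a calculation. What exactly did she calculate?”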

Prerequisites

No prior episodes are strictly required. Familiarity with basic probability (Bayes' theorem) is helpful for Section 5.

Definitions

Definition 18.1 (Normal-Form Game)

A normal-form (strategic-form) game is a tuple $\Gamma = (N, (S_i)_{i \in N}, (u_i)_{i \in N})$ where:

$N = \{1, 2, \ldots, n\}$ is a finite set of players,
$S_i$ is the finite set of strategies (pure actions) available to player $i$ ,
$u_i : S_1 \times S_2 \times \cdots \times S_n \to \mathbb{R}$ is the payoff (utility) function of player $i$ .

A strategy profile is a tuple $s = (s_1, \ldots, s_n) \in \prod_i S_i$ . We write $s_{-i}$ for the profile of all players except $i$ .

Definition 18.2 (Nash Equilibrium)

A strategy profile $s^* = (s_1^*, \ldots, s_n^*)$ is a Nash equilibrium if no player can unilaterally improve their payoff:

$\forall\, i \in N,\; \forall\, s_i \in S_i: \quad u_i(s_i^*, s_{-i}^*) \geq u_i(s_i, s_{-i}^*)$

Equivalently, each $s_i^*$ is a best response to $s_{-i}^*$.

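Narration: “What is a Nash equilibrium? If no player has an incentive to change their own strategy while the other player's strategy stays fixed, that combination is a Nash equilibrium.”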
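Definition 18.2 can be checked mechanically for small games by testing every profile for profitable unilateral deviations. The sketch below is one way to do that; the function name `pure_nash_equilibria` and the Prisoner's Dilemma payoffs are illustrative assumptions, not values from the episode.

```python
from itertools import product

def pure_nash_equilibria(payoffs, strategies):
    """Return all pure-strategy Nash equilibria of a finite n-player game.

    payoffs: dict mapping each strategy profile (tuple) to a tuple of payoffs.
    strategies: list with one list of pure strategies per player.
    """
    equilibria = []
    for profile in product(*strategies):
        stable = True
        for i, options in enumerate(strategies):
            current = payoffs[profile][i]
            for alt in options:
                # Player i switches to alt while everyone else stays put.
                deviation = profile[:i] + (alt,) + profile[i + 1:]
                if payoffs[deviation][i] > current:
                    stable = False
                    break
            if not stable:
                break
        if stable:
            equilibria.append(profile)
    return equilibria

# Hypothetical Prisoner's Dilemma payoffs (assumed for illustration only).
pd_strategies = [["C", "D"], ["C", "D"]]
pd_payoffs = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
print(pure_nash_equilibria(pd_payoffs, pd_strategies))
```

With these assumed payoffs the only stable profile is mutual defection, even though mutual cooperation pays both players more.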

Definition 18.3 (Sequential Game and Backward Induction)

A sequential (extensive-form) game specifies a game tree where players move in sequence, each observing prior moves. Backward induction solves such games by starting at terminal nodes and reasoning backwards: at each decision node, the player selects the action maximizing their payoff given the optimal continuation.

Definition 18.4 (Bayesian Game)

A Bayesian game extends the normal form with incomplete information. Each player $i$ has a type $\theta_i \in \Theta_i$ drawn from a prior distribution $p(\theta)$. Player $i$ knows their own type but not others'. After observing a signal $x$, a player updates their belief about the opponent’s type via Bayes' theorem:

$P(\theta \mid x) = \frac{P(x \mid \theta)\, P(\theta)}{P(x)}$

Definition 18.5 (Mixed Strategy)

A mixed strategy for player $i$ is a probability distribution $\sigma_i \in \Delta(S_i)$ over their pure strategies. The expected payoff under mixed strategies $(\sigma_1, \ldots, \sigma_n)$ is:

$U_i(\sigma_1, \ldots, \sigma_n) = \sum_{s \in \prod_j S_j} \left(\prod_{j=1}^n \sigma_j(s_j)\right) u_i(s)$
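The expected-payoff formula in Definition 18.5 is a sum over all pure profiles, each weighted by the product of the players' mixing probabilities. A minimal sketch of that sum follows; `expected_payoff` and the matching-pennies payoffs are assumptions made for illustration.

```python
from itertools import product

def expected_payoff(i, mixed, strategies, payoff):
    """Expected payoff U_i of player i under independent mixed strategies.

    mixed: list of dicts, mixed[j][s] = probability that player j plays s.
    strategies: list with one list of pure strategies per player.
    payoff: function mapping a pure profile (tuple) to a tuple of payoffs.
    """
    total = 0.0
    for profile in product(*strategies):
        prob = 1.0
        for j, s_j in enumerate(profile):
            prob *= mixed[j][s_j]          # product of sigma_j(s_j)
        total += prob * payoff(profile)[i]  # weight u_i(s) by that probability
    return total

# Hypothetical matching-pennies payoffs: player 0 wins on a match.
def matching_pennies(profile):
    return (1, -1) if profile[0] == profile[1] else (-1, 1)

coins = [["H", "T"], ["H", "T"]]
uniform = [{"H": 0.5, "T": 0.5}, {"H": 0.5, "T": 0.5}]
skewed = [{"H": 1.0, "T": 0.0}, {"H": 0.25, "T": 0.75}]

print(expected_payoff(0, uniform, coins, matching_pennies))
print(expected_payoff(0, skewed, coins, matching_pennies))
```

Under uniform mixing the expected payoff is zero for both players; in the skewed case player 0 always plays H and loses on average because the opponent mostly plays T.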

Definition 18.6 (Zero-Sum Game)

A two-player game is zero-sum if $u_1(s) + u_2(s) = 0$ for every strategy profile $s$. One player’s gain is exactly the other’s loss. The payoff matrix $A$ records player 1’s payoffs; player 2’s payoffs are $-A$.

Definition 18.7 (Repeated Game and Discount Factor)

A repeated game consists of the same stage game $\Gamma$ played over rounds $t = 0, 1, 2, \ldots$. Players observe the full history of past play. With discount factor $\delta \in (0,1)$, player $i$’s total payoff is:

$\sum_{t=0}^{\infty} \delta^t\, u_i(s^{(t)})$

When $\delta$ is close to 1, players care significantly about the future, enabling cooperation.
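To get a feel for the discounted sum in Definition 18.7, the sketch below evaluates it on a finite (truncated) stream of stage payoffs. The helper names and the constant payoff stream are assumptions for illustration; the normalized version multiplies by $(1-\delta)$ so that the result is on the same scale as a single stage payoff.

```python
def discounted_payoff(stage_payoffs, delta):
    """Sum of delta**t * u_t over a finite list of stage payoffs (t starts at 0)."""
    if not 0 < delta < 1:
        raise ValueError("delta must lie strictly between 0 and 1")
    return sum(delta ** t * u for t, u in enumerate(stage_payoffs))

def normalized_payoff(stage_payoffs, delta):
    """Discounted average: (1 - delta) times the discounted sum."""
    return (1 - delta) * discounted_payoff(stage_payoffs, delta)

# Hypothetical stream: stage payoff 1 in every round, truncated at 2000 rounds.
stream = [1.0] * 2000
print(discounted_payoff(stream, 0.9))   # close to 1 / (1 - 0.9) = 10
print(normalized_payoff(stream, 0.9))   # close to the stage payoff 1
```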

Main Theorems

Nash Existence Theorem

Theorem 18.1 (Nash Existence Theorem)

Every finite game (finitely many players, each with finitely many strategies) has at least one Nash equilibrium in mixed strategies.

Proof.

Define the best-response correspondence $BR_i(\sigma_{-i}) = \arg\max_{\sigma_i \in \Delta(S_i)} U_i(\sigma_i, \sigma_{-i})$ for each player $i$ . The joint best-response map $BR: \prod_i \Delta(S_i) \to \prod_i \Delta(S_i)$ sends $\sigma \mapsto \prod_i BR_i(\sigma_{-i})$ .

The domain $\prod_i \Delta(S_i)$ is compact and convex (a product of simplices). Each $U_i$ is linear (hence continuous) in $\sigma_i$ , and continuous in $\sigma_{-i}$ . By the maximum theorem, $BR_i$ is upper hemicontinuous with non-empty convex values (since the argmax of a linear function over a simplex is a convex set).

By Kakutani’s fixed-point theorem, any upper hemicontinuous correspondence from a compact convex set to itself with non-empty convex values has a fixed point $\sigma^*$ . A fixed point $\sigma^* \in BR(\sigma^*)$ means each player’s strategy is a best response to the others — precisely the definition of a Nash equilibrium. $\square$

Von Neumann Minimax Theorem

Theorem 18.2 (Von Neumann Minimax Theorem)

In a finite two-player zero-sum game with payoff matrix $A \in \mathbb{R}^{m \times n}$, there exists a unique game value $V$ and mixed strategies $p^* \in \Delta_m$, $q^* \in \Delta_n$ such that:

$\max_{p \in \Delta_m} \min_{q \in \Delta_n} p^T A q = V = \min_{q \in \Delta_n} \max_{p \in \Delta_m} p^T A q$

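Narration: “In a zero-sum game there exists a game value V. Player 1's optimal strategy is to find a probability distribution p that maximizes the expected payoff in the worst case.”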

Proof.

For any $p, q$ , the function $f(p, q) = p^T A q$ is bilinear: linear (hence convex) in $q$ for fixed $p$ , and linear (hence concave) in $p$ for fixed $q$ . The strategy sets $\Delta_m$ and $\Delta_n$ are compact and convex.

By Sion’s minimax theorem (a generalization of von Neumann’s original result): if $X, Y$ are compact convex sets and $f: X \times Y \to \mathbb{R}$ is continuous, concave in $x$ for each fixed $y$, and convex in $y$ for each fixed $x$, then $\min_{y \in Y} \max_{x \in X} f(x,y) = \max_{x \in X} \min_{y \in Y} f(x,y)$.

Applying this with $X = \Delta_m$ , $Y = \Delta_n$ , $f(p,q) = p^T A q$ yields the minimax equality. The common value $V$ is the game value. Optimal strategies $p^*, q^*$ exist by compactness. $\square$
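The minimax equality can be observed numerically on a small game. The sketch below uses a plain grid search over mixed strategies for a $2 \times 2$ matrix; it relies on the fact that, against a fixed mixed strategy, the opponent's best reply can be taken to be pure. The function names and the matching-pennies matrix are assumptions for illustration, and a grid search is a stand-in here, not the method of the proof.

```python
def lower_value(A, steps=1000):
    """Grid approximation of max over p of min over columns j of (p^T A)_j, for 2 rows."""
    best = float("-inf")
    for k in range(steps + 1):
        x = k / steps
        p = (x, 1 - x)
        worst = min(p[0] * A[0][j] + p[1] * A[1][j] for j in range(len(A[0])))
        best = max(best, worst)
    return best

def upper_value(A, steps=1000):
    """Grid approximation of min over q of max over rows i of (A q)_i, for 2 columns."""
    best = float("inf")
    for k in range(steps + 1):
        y = k / steps
        q = (y, 1 - y)
        worst = max(A[i][0] * q[0] + A[i][1] * q[1] for i in range(len(A)))
        best = min(best, worst)
    return best

# Hypothetical matching-pennies matrix (row player's payoffs).
A = [[1, -1], [-1, 1]]

pure_maximin = max(min(row) for row in A)                        # -1
pure_minimax = min(max(A[i][j] for i in range(2)) for j in range(2))  # +1
print(pure_maximin, pure_minimax)        # pure strategies leave a gap
print(lower_value(A), upper_value(A))    # mixed strategies close it (both near 0)
```

In pure strategies the two security levels differ, while over mixed strategies they coincide at the game value, which is the content of Theorem 18.2.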

Folk Theorem

Theorem 18.3 (Folk Theorem (Nash, Aumann-Shapley))

Consider an infinitely repeated game with discount factor $\delta$ and stage game $\Gamma$. Let $v_i^{\min}$ denote the minimax payoff for player $i$, the lowest payoff that the other players can force on $i$ regardless of $i$’s strategy:

$v_i^{\min} = \min_{s_{-i}} \max_{s_i} u_i(s_i, s_{-i})$

Let $F$ be the set of feasible payoff vectors (the convex hull of stage-game payoff profiles) and $F^* = \{v \in F : v_i > v_i^{\min} \;\forall i\}$ the individually rational feasible set. Then for any $v \in F^*$, there exists $\bar\delta < 1$ such that for all $\delta > \bar\delta$, $v$ is the payoff vector of some Nash equilibrium of the repeated game.

Proof.

Fix a target payoff $v \in F^*$ . Since $v$ is feasible, there exist stage-game strategy profiles and a mixing distribution (possibly a correlated sequence of pure profiles) yielding average payoff $v$ . Define the following grim trigger strategy for each player $i$ :

Cooperation phase: Play the prescribed action producing payoff vector $v$ .
Punishment phase: If any player deviates, all other players switch permanently to the minimax punishment against the deviator.

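Measure repeated-game payoffs as normalized discounted averages, $(1-\delta)\sum_{t \geq 0} \delta^t u_i(s^{(t)})$, so that they are on the same scale as stage-game payoffs. The deviator gains at most a one-round deviation payoff $u_i^{\text{dev}}$, then faces the minimax payoff $v_i^{\min}$ forever. The deviation is unprofitable when $(1-\delta) u_i^{\text{dev}} + \delta\, v_i^{\min} \leq v_i$. Rearranging gives $\delta\,(u_i^{\text{dev}} - v_i^{\min}) \geq u_i^{\text{dev}} - v_i$, and since $u_i^{\text{dev}} \geq v_i > v_i^{\min}$ this is equivalent to $\delta \geq \frac{u_i^{\text{dev}} - v_i}{u_i^{\text{dev}} - v_i^{\min}}$.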

Since $v_i > v_i^{\min}$ by individual rationality, the right-hand side is strictly less than 1. Setting $\bar\delta = \max_i \frac{u_i^{\text{dev}} - v_i}{u_i^{\text{dev}} - v_i^{\min}}$ gives the threshold. For $\delta > \bar\delta$ , no player has incentive to deviate, so the strategy profile is a Nash equilibrium with payoff $v$ . $\square$
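The threshold in the proof is easy to evaluate for concrete numbers. The sketch below computes the critical discount factor and checks the no-deviation inequality directly; the function names and the Prisoner's Dilemma style payoffs (deviation 5, cooperation 3, punishment 1) are assumptions for illustration.

```python
def critical_discount(u_dev, v_target, v_minmax):
    """Smallest delta at which a one-shot deviation stops paying under grim trigger."""
    if not (u_dev >= v_target > v_minmax):
        raise ValueError("expected u_dev >= v_target > v_minmax")
    return (u_dev - v_target) / (u_dev - v_minmax)

def deviation_is_unprofitable(u_dev, v_target, v_minmax, delta):
    """Check (1 - delta) * u_dev + delta * v_minmax <= v_target (normalized payoffs)."""
    return (1 - delta) * u_dev + delta * v_minmax <= v_target

# Hypothetical stage-game numbers, not taken from the episode.
u_dev, v_target, v_minmax = 5.0, 3.0, 1.0

print(critical_discount(u_dev, v_target, v_minmax))               # 0.5
print(deviation_is_unprofitable(u_dev, v_target, v_minmax, 0.6))  # True
print(deviation_is_unprofitable(u_dev, v_target, v_minmax, 0.4))  # False
```

With these numbers a patient player ($\delta = 0.6$) prefers to keep cooperating, while an impatient one ($\delta = 0.4$) would take the one-round gain.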

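Narration: “In game theory, when players care enough about the future, cooperation can sustain itself through repeated interaction. This is the so-called folk theorem.”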

Musical Example 1: Carmen — Tragedy as Nash Equilibrium

Narration: “1875, Bizet's Carmen. Act IV, outside the bullring. The final confrontation between Carmen and Don Jose.”

Model the final confrontation of Bizet’s Carmen (1875), Act IV, as a two-player non-cooperative game $\Gamma_{\text{Carmen}}$ .

Players: Carmen (row) and Jose (column). Strategies: Carmen chooses from $\{\text{Submit}, \text{Defy}\}$ ; Jose chooses from $\{\text{Release}, \text{Kill}\}$ .

Payoff matrix (Carmen, Jose):

	Release	Kill
Submit	$(-2, -2)$	$(-3, -4)$
Defy	$(+1, -5)$	$(-\infty, -3)$

Narration: “Note: this is not a zero-sum game; both players can lose at the same time. We model it as a two-player non-cooperative game.”

Prop 18.1 (Carmen's Nash Equilibrium)

Under the payoff matrix above, the simultaneous-move game has no Nash equilibrium in pure strategies. If Carmen’s utility is revised so that submission is strictly worse for her than death, the unique Nash equilibrium, and the subgame-perfect outcome of the sequential game in which Carmen moves first, is $(\text{Defy}, \text{Kill})$.

Proof.

Check best responses in the matrix. Given Carmen plays Submit: Jose compares $u_J(\text{Submit}, \text{Release}) = -2$ vs. $u_J(\text{Submit}, \text{Kill}) = -4$. Best response: Release. Given Carmen plays Defy: Jose compares $u_J(\text{Defy}, \text{Release}) = -5$ vs. $u_J(\text{Defy}, \text{Kill}) = -3$. Best response: Kill.

Given Jose plays Release: Carmen compares $u_C(\text{Submit}, \text{Release}) = -2$ vs. $u_C(\text{Defy}, \text{Release}) = +1$. Best response: Defy. Given Jose plays Kill: Carmen compares $u_C(\text{Submit}, \text{Kill}) = -3$ vs. $u_C(\text{Defy}, \text{Kill}) = -\infty$. Best response: Submit.

No profile consists of mutual best responses: the best responses cycle from Submit to Release, to Defy, to Kill, and back to Submit. The matrix as written therefore has no pure-strategy Nash equilibrium. In particular, $(\text{Defy}, \text{Kill})$ fails because Carmen’s best response to Kill is Submit.

As a sequential game (Carmen moves first, Jose observes), backward induction applies. Jose’s best responses: if Carmen submits, Jose releases ($-2 > -4$); if Carmen defies, Jose kills ($-3 > -5$). Carmen, anticipating Jose’s responses, compares: Submit then Release gives $-2$; Defy then Kill gives $-\infty$. Under these payoffs, Carmen should submit.

The narration’s key insight is that Carmen’s utility function assigns submission a cost equivalent to spiritual death, worse than physical death. Read the $-\infty$ in the matrix as a very large but finite loss and revise her payoffs so that $u_C(\text{Submit}, \cdot) < u_C(\text{Defy}, \text{Kill})$. Then Defy strictly dominates Submit, whatever Jose does. Given Defy, Jose kills ($-3 > -5$). Hence $(\text{Defy}, \text{Kill})$ is the unique Nash equilibrium of the revised game and its subgame-perfect equilibrium outcome. $\square$
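The backward-induction argument can be replayed in a few lines. The sketch below assumes a finite stand-in `DEATH = -100.0` for the $-\infty$ entry and, for the revised utilities, a submission payoff just below it; both numbers and the function name `backward_induction` are illustrative assumptions, chosen only to preserve the ordering of outcomes.

```python
CARMEN_MOVES = ("Submit", "Defy")
JOSE_MOVES = ("Release", "Kill")

def backward_induction(carmen_payoff, jose_payoff):
    """Carmen moves first, Jose observes her move and responds."""
    # Step 1: Jose's best reply at each of his decision nodes.
    plan = {}
    for c in CARMEN_MOVES:
        plan[c] = max(JOSE_MOVES, key=lambda j: jose_payoff[(c, j)])
    # Step 2: Carmen picks the move with the best anticipated continuation.
    choice = max(CARMEN_MOVES, key=lambda c: carmen_payoff[(c, plan[c])])
    return choice, plan[choice]

DEATH = -100.0  # assumed finite stand-in for the -infinity entry

jose = {("Submit", "Release"): -2, ("Submit", "Kill"): -4,
        ("Defy", "Release"): -5, ("Defy", "Kill"): -3}

carmen_original = {("Submit", "Release"): -2, ("Submit", "Kill"): -3,
                   ("Defy", "Release"): 1, ("Defy", "Kill"): DEATH}

# Revised utilities: submission is strictly worse for Carmen than death.
carmen_revised = dict(carmen_original)
carmen_revised[("Submit", "Release")] = DEATH - 1
carmen_revised[("Submit", "Kill")] = DEATH - 1

print(backward_induction(carmen_original, jose))  # ('Submit', 'Release')
print(backward_induction(carmen_revised, jose))   # ('Defy', 'Kill')
```

Only Carmen's ranking of submission changes between the two runs; Jose's payoffs and his replies are the same in both.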

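Narration: “Key insight: a Nash equilibrium is not necessarily the best outcome. Both players choose rationally, yet the result can be worse than cooperation. That is the lesson of the Prisoner's Dilemma. And tragedy in opera is often the inevitable result of a Nash equilibrium.”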

Musical Connection

The tragedy of Carmen is not irrational: it is the rational consequence of two incompatible utility functions. The structure echoes the Prisoner’s Dilemma. Measured by the original matrix, both players do better at (Submit, Release), with payoffs $(-2, -2)$, than at the equilibrium outcome, but once submission counts for Carmen as worse than death, neither player can unilaterally deviate from (Defy, Kill) without worsening their position. Bizet’s dramatic genius is to construct characters whose utility functions make tragedy the unique equilibrium.

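Narration: “This is a sequential game. Carmen chooses defiance first, and Jose then responds. Use backward induction to reason back from the last step.”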

Musical Example 2: Don Giovanni — Bayesian Persuasion

Narration: “1787, Mozart's Don Giovanni. ‘Give me your hand.’ We can read this duet as persuasion under incomplete information.”

The duet “Là ci darem la mano” from Mozart’s Don Giovanni (1787) can be modeled as persuasion under incomplete information.

Setup: Giovanni (sender) knows his true type $\theta \in \{\text{Sincere}, \text{Deceiver}\}$ . Zerlina (receiver) holds a prior belief $P(\theta = \text{Sincere}) = p_0$ . Giovanni sends musical signals — each phrase of the duet is a signal $x$ .

After each signal, Zerlina updates via Bayes' theorem (Definition 18.4): $P(\text{Sincere} \mid x) = \frac{P(x \mid \text{Sincere})\, p_0}{P(x \mid \text{Sincere})\, p_0 + P(x \mid \text{Deceiver})\,(1 - p_0)}$

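Narration: “Zerlina cannot observe his intentions directly; she can only update her judgment based on his behavior. She has a prior belief: before hearing any music, her intuitive estimate of the probability that he is sincere is p.”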

Prop 18.2 (Posterior Threshold Crossing)

Let $p_t$ denote Zerlina’s posterior belief after $t$ rounds of signaling, with prior $p_0 \in (0,1)$. If Giovanni consistently sends signals with likelihood ratio $\ell = P(x \mid \text{Sincere}) / P(x \mid \text{Deceiver}) > 1$, then:

$\frac{p_t}{1 - p_t} = \frac{p_0}{1 - p_0} \cdot \ell^t$

so $p_t \to 1$ as $t \to \infty$. For any acceptance threshold $\tau < 1$ there is a finite round $T$ such that $p_T > \tau$, at which point Zerlina accepts.

Proof.

The posterior odds after one signal with likelihood ratio $\ell$ are:

$\frac{p_1}{1 - p_1} = \frac{P(\text{Sincere} \mid x)}{P(\text{Deceiver} \mid x)} = \frac{P(x \mid \text{Sincere})}{P(x \mid \text{Deceiver})} \cdot \frac{p_0}{1-p_0} = \ell \cdot \frac{p_0}{1-p_0}$

After $t$ signals that are conditionally independent given the type, each with ratio $\ell$, the posterior odds become $\ell^t \cdot p_0/(1-p_0)$. Since $\ell > 1$ and $p_0 > 0$, the odds grow without bound, so $p_t \to 1$. For any threshold $\tau \in (0,1)$, the condition $p_T > \tau$ is equivalent to $\ell^T \cdot \frac{p_0}{1-p_0} > \frac{\tau}{1-\tau}$, which holds for every integer

$T > \log\!\bigl(\frac{\tau}{1-\tau} \cdot \frac{1-p_0}{p_0}\bigr) / \log \ell$

so a finite such $T$ exists. $\square$
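The odds recursion in the proof can be traced numerically. The sketch below computes the posterior after $t$ signals and the first round at which it strictly exceeds a threshold; the function names and the numbers $p_0 = 0.2$, $\ell = 2$, $\tau = 0.6$ are assumptions chosen for illustration, not values from the episode.

```python
import math

def posterior_after(p0, ell, t):
    """Posterior P(Sincere) after t signals, each with likelihood ratio ell."""
    odds = p0 / (1 - p0) * ell ** t
    return odds / (1 + odds)

def rounds_to_accept(p0, ell, tau):
    """Smallest t >= 0 whose posterior is strictly above the threshold tau."""
    if not (0 < p0 < 1 and ell > 1 and 0 < tau < 1):
        raise ValueError("need 0 < p0 < 1, ell > 1, 0 < tau < 1")
    t = 0
    while posterior_after(p0, ell, t) <= tau:
        t += 1
    return t

# Hypothetical numbers: sceptical prior, moderately informative phrases.
p0, ell, tau = 0.2, 2.0, 0.6

for t in range(4):
    print(t, round(posterior_after(p0, ell, t), 3))
print(rounds_to_accept(p0, ell, tau))  # 3

# Closed-form bound from the proof: any integer T above this value works.
bound = math.log(tau / (1 - tau) * (1 - p0) / p0) / math.log(ell)
print(bound)
```

Here the belief climbs from 0.2 through one third and one half to two thirds, so the threshold of 0.6 is crossed at the third repetition, in line with the closed-form bound of roughly 2.58.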

Musical Connection

Mozart’s duet enacts Bayesian updating in real time. Zerlina’s repeated “Vorrei e non vorrei” (I want to and I don’t want to) is the musical representation of a posterior hovering near the decision threshold. Each time Giovanni repeats the melody, he provides another observation with $\ell > 1$ . The final moment — when she gives him her hand — is the threshold crossing.

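Narration: “The music renders this process perfectly. At first Zerlina hesitates, singing over and over that she wants to and does not want to. Each time Giovanni repeats the melody, he supplies one more observation. In the end her belief crosses the threshold, and she gives him her hand.”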

Musical Example 3: Xenakis’s Duel — Zero-Sum Matrix Game

Narration: “In 1959 the Greek composer Xenakis did something extremely rare for the time. He wrote game theory directly into a score.”

Xenakis’s Duel (1959) is a genuine musical game. Two conductors, each leading an orchestra, choose simultaneously from six sound tactics:

Index	Tactic
I	String granular textures
II	Sustained strings
III	Glissando networks
IV	Stochastic percussion
V	Stochastic winds
VI	Silence

A referee scores each round using a $6 \times 6$ payoff matrix $A$ . The game is zero-sum: conductor 1 receives $a_{ij}$ , conductor 2 receives $-a_{ij}$ .

By the Minimax Theorem (Theorem 18.2), there exist optimal mixed strategies $p^* \in \Delta_6$ and $q^* \in \Delta_6$ and a game value $V$ such that: $\min_q (p^*)^T A q = V = \max_p p^T A q^*$

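Narration: “The mathematical foundation is von Neumann's minimax theorem. No matter what my opponent chooses, my mixed strategy guarantees me at least the game value V.”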
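For matrices larger than $2 \times 2$, optimal mixed strategies are usually computed by linear programming. As a dependency-free stand-in, the sketch below uses fictitious play, an iterative scheme in which each side best-responds to the opponent's empirical frequencies; for zero-sum games these frequencies are known to approach optimal strategies. The matrix is rock-paper-scissors, a hypothetical example, not the payoff matrix from Xenakis's score, and the function name is an assumption.

```python
def fictitious_play(A, rounds=20000):
    """Approximate optimal mixed strategies of a zero-sum matrix game.

    A[r][c] is the row player's payoff. Returns (p, q, lower, upper), where
    lower is what p guarantees the row player and upper is what q concedes
    at most, so lower <= V <= upper.
    """
    m, n = len(A), len(A[0])
    row_counts, col_counts = [0] * m, [0] * n
    row_gain = [0.0] * m  # cumulative payoff of each row vs. columns played so far
    col_loss = [0.0] * n  # cumulative payoff conceded by each column vs. rows so far
    i, j = 0, 0           # arbitrary opening moves
    for _ in range(rounds):
        row_counts[i] += 1
        col_counts[j] += 1
        for r in range(m):
            row_gain[r] += A[r][j]
        for c in range(n):
            col_loss[c] += A[i][c]
        # Each side best-responds to the opponent's empirical history.
        i = max(range(m), key=lambda r: row_gain[r])
        j = min(range(n), key=lambda c: col_loss[c])
    p = [k / rounds for k in row_counts]
    q = [k / rounds for k in col_counts]
    lower = min(sum(p[r] * A[r][c] for r in range(m)) for c in range(n))
    upper = max(sum(A[r][c] * q[c] for c in range(n)) for r in range(m))
    return p, q, lower, upper

# Hypothetical rock-paper-scissors matrix (game value 0, optimal play uniform).
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]

p, q, lower, upper = fictitious_play(RPS)
print([round(x, 2) for x in p])
print(lower, upper)
```

The two bounds bracket the game value and tighten as the number of rounds grows, which is the guarantee the narration describes: whatever the opponent does, the mixed strategy secures at least the lower bound.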

Musical Connection

Xenakis is the most literal case: game theory is not an interpretive lens imposed after the fact but is written directly into the score as a payoff matrix and performance rules. The 1962 sequel Strategie expands to 19 tactics per conductor (6 basic + 13 combinations), yielding a $19 \times 19$ matrix — larger, longer, and more complex, but governed by the same minimax principle.

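Narration: “Xenakis was among the first composers to write game theory directly into a score. Not metaphor, not analogy: a real payoff matrix and real mixed strategies.”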

Musical Example 4: Zorn’s Cobra — The Folk Theorem in Action

Narration: “1984, John Zorn's Cobra. This is not a traditional score but a set of game rules.”

John Zorn’s Cobra (1984) is an improvisation game. A prompter holds cue cards (change style, imitate someone, create chaos, sudden silence). Musicians may raise hands to request cues or ignore them. The same ensemble plays repeatedly across performances.

This is the structure of a repeated game. The stage game involves choices to cooperate (follow cues, listen to ensemble) or defect (free-ride, ignore others). The folk theorem (Theorem 18.3) guarantees that if players value future interactions sufficiently (high $\delta$ ), cooperation is sustainable as an equilibrium.

Prop 18.3 (Reputation Sustains Cooperation in Cobra)

In an infinitely repeated Cobra performance with discount factor $\delta$ sufficiently close to 1, the cooperative outcome (musicians follow cues and respond to each other) can be sustained as a Nash equilibrium via grim-trigger strategies.

Proof.

This is a direct application of Theorem 18.3. Let $v_{\text{coop}}$ denote the cooperative payoff per round and $v_{\text{defect}}$ the payoff from a one-shot deviation. The minimax payoff $v^{\min}$ corresponds to being ignored by the ensemble (the punishment). Cooperation is sustained when:

$\delta > \frac{v_{\text{defect}} - v_{\text{coop}}}{v_{\text{defect}} - v^{\min}}$

In practice, Cobra musicians who regularly perform together develop strong reputational incentives (high effective $\delta$), making the cooperative equilibrium self-enforcing. $\square$

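Narration: “The magic of Cobra is that, over many rounds of interaction, the musicians build a reputation mechanism. Who usually cooperates and who tends to disrupt: everyone remembers. Cooperation does not come from instructions in a score; it emerges spontaneously from repeated interaction and accumulated reputation.”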

Synthesis

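Narration: “Four game structures, four equilibrium concepts, four musical outcomes. From Bizet to Zorn, from the nineteenth century to the late twentieth, opera and improvised music have been acting out game theory in notes all along. It is just that nobody called it that.”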

Work	Year	Game type	Solution concept	Outcome
Carmen (Bizet)	1875	Non-cooperative, sequential	Backward induction	Tragedy
Don Giovanni (Mozart)	1787	Bayesian persuasion	Posterior updating	Seduction
Duel (Xenakis)	1959	Zero-sum matrix	Minimax / mixed strategy	Stochastic confrontation
Cobra (Zorn)	1984	Repeated game	Folk theorem / reputation	Emergent cooperation

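Narration: “These four examples are not all equally direct. Xenakis is the most direct: the game theory is written into the score. Carmen and Don Giovanni are formal models that we impose. Cobra is an improvisation system that naturally invites a game-theoretic reading.”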

Musical Connection

Cross-episode connections:

**Forward to EP25 (Emotion and Prediction)**: The songwriter-listener relationship is itself a strategic interaction. The Nash equilibrium between songwriter (choosing how much surprise to inject) and listener (calibrating expectations) connects Carmen’s utility-function argument to a theory of musical emotion.

The utility function $u_C$ that makes Carmen’s death “rational” is precisely the kind of subjective valuation that EP25’s prediction-error model of frisson attempts to quantify: surprise has measurable neural utility.

Limits and Open Questions

Utility function arbitrariness. The Carmen model depends critically on the numerical payoffs. As the narration observes: “you can explain any choice with a payoff matrix, as long as you adjust the numbers.” This is a fundamental limitation of revealed-preference game theory — the model is unfalsifiable unless utilities are independently measured.

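Narration: “You can explain any choice with a payoff matrix, as long as you adjust the numbers. The power of opera is that it makes us feel the weight behind that number.”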

Bounded rationality. Real operatic characters (and real musicians in Cobra) do not compute equilibria. Behavioral game theory (Camerer 2003) models bounded rationality via quantal response equilibrium and level-k reasoning. How do these departures from Nash equilibrium map onto dramatic choices?
Incomplete-information limits. The Bayesian model of Don Giovanni assumes Zerlina has a well-defined prior and updates rationally. In practice, persuasion exploits cognitive biases (framing effects, anchoring) that Bayesian models do not capture.
Computational complexity of large games. Xenakis’s Strategie has a $19 \times 19$ matrix, which is still tractable: a finite zero-sum game can be solved by linear programming. But real-time improvisation (Cobra) is general-sum, with continuous strategy spaces and partial observability, and computing a Nash equilibrium is PPAD-complete already for finite two-player general-sum games.

Conjecture (Equilibrium Deviation as Dramatic Peak)

In operatic scenes that can be modeled as strategic interactions, the moment of maximum dramatic intensity corresponds to a character’s deviation from the Nash equilibrium of the stage game. Formally: if $s^*$ is the Nash equilibrium and $s_i(t)$ is character $i$’s action at dramatic time $t$, the dramatic intensity $I(t)$ is maximized when $|u_i(s_i(t), s_{-i}^*) - u_i(s_i^*, s_{-i}^*)|$ is maximized.

This would connect game-theoretic structure to measurable audience response (galvanic skin response, fMRI activation), and could be tested against the prediction-error framework of EP25.

Mixed-strategy interpretation in performance. When Xenakis prescribes mixed strategies, conductors must randomize. But human randomization is notoriously biased (Rapoport & Budescu 1997). Does the aesthetic quality of a Duel performance correlate with how closely conductors approximate the minimax distribution?

References

Von Neumann, J. & Morgenstern, O. (1944). Theory of Games and Economic Behavior. Princeton University Press.
Nash, J. (1950). “Equilibrium Points in N-Person Games.” Proceedings of the National Academy of Sciences 36(1), 48–49.
Aumann, R. & Shapley, L. (1994). “Long-Term Competition — A Game-Theoretic Analysis.” In Essays in Game Theory, Springer, 1–15.
Xenakis, I. (1992). Formalized Music: Thought and Mathematics in Composition, revised ed. Pendragon Press. Ch. 5–6.
Kamenica, E. & Gentzkow, M. (2011). “Bayesian Persuasion.” American Economic Review 101(6), 2590–2615.
Camerer, C.F. (2003). Behavioral Game Theory: Experiments in Strategic Interaction. Princeton University Press.
Brackett, D. (2000). “Carmen.” In The New Grove Dictionary of Opera. Oxford University Press.
Zorn, J. (2004). “The Game Pieces.” In Audio Culture: Readings in Modern Music, ed. Cox & Warner, Continuum, 196–200.
Osborne, M.J. & Rubinstein, A. (1994). A Course in Game Theory. MIT Press.
Sion, M. (1958). “On General Minimax Theorems.” Pacific Journal of Mathematics 8(1), 171–176.