
EP50: Stochastic Composition — Xenakis and Probabilistic Music

Gaussian Pitch Clouds · Poisson Density · Exponential Duration · Rule 30 Cellular Automata
3:07 · Probability Theory · Stochastic Processes · Musicology · Computational Music

Overview

In 1957, Iannis Xenakis premiered Pithoprakta — a work for 46 string instruments in which every note placement was derived not from melodic invention, but from probability distributions. No melody, no motivic development: only statistical clouds of pitch and density evolving in time.

Narration (translated): “Write no notes, only probability distributions. Premiered in 1957: an architect used Gaussian distributions, Poisson processes, and exponential distributions to generate orchestral music, more than half a century before modern AI music. His name was Xenakis.”

This episode formalizes the three probability distributions at the core of Xenakis’s compositional system: a Gaussian distribution governing global pitch placement, a Poisson process governing local note density in time, and an exponential distribution governing note durations. Together, these three distributions form a complete probabilistic grammar for generating orchestral texture from mathematical first principles.

The episode also introduces cellular automata (specifically Rule 30) as a deterministic-but-chaotic alternative to purely random sampling, and compares Xenakis’s “cloud” model to the Markov chain model from EP21 (Markov Chains and AI Composition) — two radically different philosophies of stochastic music, producing radically different sonic results.

Narration (translated): “Pithoprakta means ‘actions determined by probabilities’. The entire work has no melodic line, only statistical distributions of pitch and density. Forty-six string instruments, each player moving along their own random trajectory.”


Prerequisites


Definitions

Definition 50.1 (Gaussian (Normal) Distribution)

A random variable $X$ follows a Gaussian distribution with mean $\mu$ and variance $\sigma^2$, written $X \sim \mathcal{N}(\mu, \sigma^2)$, if its probability density function is
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).$$

The distribution is symmetric about $\mu$, with 68% of the probability mass within one standard deviation of the mean and 95% within two.

Worked Example (Pitch Distribution). In Pithoprakta, Xenakis set the global pitch register to be approximately $\mathcal{N}(\mu, \sigma^2)$, where $\mu$ corresponds to the central register of the strings and $\sigma$ determines the spread. If $\mu = 60$ (MIDI note C4, middle of the string range) and $\sigma = 12$ semitones, then a pitch sampled as $X \sim \mathcal{N}(60, 12^2)$ falls between 36 and 84 (four octaves, C2 to C6) with probability 95%. Notes near C4 are most probable; notes near C2 or C6 are rare but possible.
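A minimal sketch of this sampling step, assuming MIDI note numbers and Python's standard library (the clipping to the MIDI range 0–127 is an added safeguard, not part of Xenakis's model):

```python
import random

def sample_gaussian_pitch(mu=60.0, sigma=12.0, rng=random):
    """Draw one MIDI pitch from the N(mu, sigma^2) cloud, clipped to 0-127."""
    pitch = round(rng.gauss(mu, sigma))
    return max(0, min(127, pitch))

# Roughly 95% of samples should land within two standard deviations
# of the centre: MIDI 36-84, i.e. C2-C6.
rng = random.Random(0)
pitches = [sample_gaussian_pitch(rng=rng) for _ in range(10_000)]
in_band = sum(36 <= p <= 84 for p in pitches) / len(pitches)
```

Each call is one note of the cloud; a histogram of `pitches` approximates the bell curve.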

Definition 50.2 (Poisson Process)

A Poisson process with rate parameter $\lambda > 0$ is a model for the arrival times of random events such that:

  1. The number of events $N(t)$ in a time interval of length $t$ has a Poisson distribution: $P(N(t) = k) = e^{-\lambda t} \dfrac{(\lambda t)^k}{k!}$ for $k = 0, 1, 2, \dots$

  2. Events in disjoint time intervals are independent.

  3. The expected number of events in an interval of length $t$ is $\lambda t$.

Worked Example (Note Density). If a string section is assigned rate $\lambda = 3$ notes per second, then the probability of exactly 3 notes occurring in a 1-second window is
$$P(N(1) = 3) = e^{-3} \frac{3^3}{3!} = \frac{27}{6} e^{-3} \approx 0.224.$$

A higher $\lambda$ produces denser textures; Xenakis could modulate $\lambda$ over time to create crescendos of density without specifying any individual note.
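The Poisson probability above can be checked numerically; a minimal sketch using only the standard library:

```python
import math

def poisson_pmf(k, lam):
    """P(N = k) for a Poisson(lam) count: e^{-lam} * lam^k / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Probability of exactly 3 notes in one second at rate lambda = 3.
p3 = poisson_pmf(3, 3.0)
```

`p3` evaluates to about 0.224, matching the worked example.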

Narration (translated): “Viewed locally: each voice’s note density follows a Poisson process.”

Definition 50.3 (Exponential Distribution)

A random variable $X$ follows an exponential distribution with rate $\lambda > 0$, written $X \sim \mathrm{Exp}(\lambda)$, if its density is $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$ (and $0$ otherwise).

The mean duration is $\mathbb{E}[X] = 1/\lambda$, and the distribution is memoryless: knowing that a note has already lasted $s$ seconds gives no information about how much longer it will last.

The exponential distribution arises naturally as the inter-arrival distribution of a Poisson process: if notes arrive at rate $\lambda$, then the waiting time between consecutive notes is $\mathrm{Exp}(\lambda)$-distributed.

Worked Example (Note Duration). If $\lambda = 2$ (on average, 2 note-endings per second), then:

  • Mean duration $1/\lambda = 0.5$ seconds
  • $P(X > 1) = e^{-2} \approx 0.135$: about 13.5% of notes last longer than 1 second

A small $\lambda$ (slow rate) produces long, sustained tones and a sparse texture. A large $\lambda$ produces rapid staccato events.
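A quick simulation, assuming Python's `random.expovariate`, confirms both the mean duration and the tail probability from the worked example:

```python
import random
import statistics

lam = 2.0                        # rate: mean duration is 1/lam = 0.5 s
rng = random.Random(1)
durations = [rng.expovariate(lam) for _ in range(100_000)]

mean_dur = statistics.mean(durations)                      # close to 0.5
p_long = sum(d > 1.0 for d in durations) / len(durations)  # close to e^{-2}
```

With 100,000 samples, `mean_dur` lands near 0.5 and `p_long` near 0.135.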

Narration (translated): “The exponential distribution controls note durations: the longer the average duration, the sparser the overall texture feels. Taken together, these three form Xenakis’s probabilistic grammar of music.”

Definition 50.4 (Elementary Cellular Automaton)

An elementary cellular automaton is a one-dimensional grid of cells, each in a state from $\{0, 1\}$, evolving in discrete time steps. At each step, every cell’s new state is determined solely by its current state and the states of its two immediate neighbors.

Formally, the update rule is a function $f: \{0,1\}^3 \to \{0,1\}$. There are $2^{2^3} = 256$ possible rules, indexed 0–255 by their binary encoding.

Rule 30 is defined by the lookup table:

  Neighborhood (L, C, R):  111  110  101  100  011  010  001  000
  New state:                0    0    0    1    1    1    1    0

The name “Rule 30” comes from reading the output row in binary: $00011110_2 = 30$.

Musical Mapping. Starting from a single active cell in the center and iterating Rule 30 produces a triangular spacetime diagram. Each row represents one time step; the pattern of 1s and 0s encodes which pitch classes are simultaneously active at that moment — a deterministic but visually chaotic harmony sequence.

Narration (translated): “Rule 30: look at three cells, the left neighbor, the cell itself, and the right neighbor, and read the new state from the table. Iterating produces patterns that look random but are deterministic.”

Definition 50.5 (Monte Carlo Sampling)

Monte Carlo sampling refers to the technique of approximating a complex distribution or integral by drawing a large number of independent random samples from a known distribution.

In the composition context: given a probability density $p$ over pitch, time, or duration, draw independent samples $X_1, \dots, X_N \sim p$. For large $N$, the empirical histogram of the samples approximates $p$ by the law of large numbers:
$$\frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\{X_i \in A\} \;\to\; \int_A p(x)\,dx \quad \text{for any measurable set } A.$$

Xenakis’s UPIC system (Unité Polyagogique Informatique du CEMAMu) allowed composers to draw probability density curves on a graphical tablet; the computer then performed Monte Carlo sampling to generate note sequences from those drawn distributions.
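A minimal sketch of the UPIC idea, with a hypothetical triangular curve standing in for the composer's hand-drawn density (rejection sampling is one standard way to sample from such a curve; the actual UPIC internals are not specified here):

```python
import random

# A "hand-drawn" density over MIDI pitches 36..84, given as relative
# heights (hypothetical triangular curve peaking at pitch 60).
curve = {p: max(0.0, 1.0 - abs(p - 60) / 24) for p in range(36, 85)}
peak = max(curve.values())

def sample_pitch(rng):
    """Rejection sampling: propose a uniform pitch, accept with
    probability proportional to the drawn curve height."""
    while True:
        p = rng.randrange(36, 85)
        if rng.random() * peak <= curve[p]:
            return p

rng = random.Random(2)
samples = [sample_pitch(rng) for _ in range(20_000)]
```

The histogram of `samples` reproduces the triangular shape; replacing `curve` with any other non-negative curve changes the generated texture without touching the sampler.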

Narration (translated): “Xenakis’s computer system UPIC allowed composers to draw probability density functions on the screen; the system automatically sampled from them to generate note sequences.”


Main Theorems

Theorem 50.1 (Law of Large Numbers — Convergence of Empirical Pitch Distribution)

Let $X_1, X_2, \dots$ be independent and identically distributed random variables with distribution $P$ (e.g., $\mathcal{N}(\mu, \sigma^2)$ for pitch). Define the empirical mean
$$\bar{X}_N = \frac{1}{N} \sum_{i=1}^{N} X_i.$$
Then $\bar{X}_N \to \mu$ almost surely as $N \to \infty$. More generally, for any measurable set $A$,
$$\frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\{X_i \in A\} \;\to\; P(A) \quad \text{almost surely.}$$

Proof.

This is the strong law of large numbers (Kolmogorov, 1933). The decisive step is the observation that for i.i.d. square-integrable random variables, Chebyshev’s inequality gives $P(|\bar{X}_N - \mu| \ge \varepsilon) \le \sigma^2 / (N \varepsilon^2)$, which immediately implies convergence in probability. Strengthening to almost-sure convergence uses Borel-Cantelli along the subsequence $N_k = k^2$: for any $\varepsilon > 0$, the sum $\sum_k \sigma^2 / (k^2 \varepsilon^2)$ converges, so almost surely only finitely many of the deviations $|\bar{X}_{k^2} - \mu| \ge \varepsilon$ occur; a standard interpolation argument between consecutive squares extends the convergence to all $N$.

Musical meaning: With 46 string instruments each independently sampling pitches from $\mathcal{N}(\mu, \sigma^2)$, the aggregate histogram of pitches converges to the Gaussian shape as the number of events grows. The composer specifies the shape of the texture, not the individual notes.
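The convergence can be illustrated by simulation; a sketch with 46 simulated "players", where the event count per player is an arbitrary choice:

```python
import random

rng = random.Random(3)
mu, sigma = 60.0, 12.0

# 46 independent "players", each sampling pitches from the same Gaussian.
events_per_player = 500
all_pitches = [rng.gauss(mu, sigma)
               for _ in range(46) for _ in range(events_per_player)]

# Law of large numbers: the empirical mean approaches mu, and the
# fraction of pitches above the centre approaches 1/2 (by symmetry).
empirical_mean = sum(all_pitches) / len(all_pitches)
frac_above_center = sum(p > mu for p in all_pitches) / len(all_pitches)
```

With 23,000 total events, both statistics sit very close to their theoretical values.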

Theorem 50.2 (Memoryless Property of the Exponential Distribution)

Let $X \sim \mathrm{Exp}(\lambda)$. Then for all $s, t \ge 0$,
$$P(X > s + t \mid X > s) = P(X > t).$$

The exponential distribution is the only continuous distribution with this memoryless property.

Proof.

Direct computation:
$$P(X > s + t \mid X > s) = \frac{P(X > s + t)}{P(X > s)} = \frac{e^{-\lambda(s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X > t).$$

For uniqueness: suppose $X$ is a non-negative continuous random variable satisfying the memoryless property for all $s, t \ge 0$, and let $G(t) = P(X > t)$. The functional equation $G(s + t) = G(s) G(t)$, with $G$ continuous, non-increasing, and $G(0) = 1$, forces $G(t) = e^{-\lambda t}$ for some $\lambda > 0$.

Musical meaning: A note currently being held has no “memory” of how long it has already lasted — the probability that it ends in the next instant of length $dt$ is always $\lambda\,dt$, regardless of its age. This produces textures with no rhythmic expectation or metric pulse.

Theorem 50.3 (Superposition of Poisson Processes)

If $n$ independent instruments produce notes via Poisson processes with rates $\lambda_1, \dots, \lambda_n$, then the combined note stream is also a Poisson process with rate
$$\lambda = \lambda_1 + \lambda_2 + \cdots + \lambda_n.$$

Proof.

For two independent Poisson processes $N_1(t)$ and $N_2(t)$ with rates $\lambda_1$ and $\lambda_2$, their sum has probability generating function
$$\mathbb{E}\big[s^{N_1(t) + N_2(t)}\big] = e^{\lambda_1 t (s - 1)} \cdot e^{\lambda_2 t (s - 1)} = e^{(\lambda_1 + \lambda_2) t (s - 1)},$$
which is the probability generating function of $\mathrm{Poisson}((\lambda_1 + \lambda_2) t)$. The independence and stationarity properties of the individual processes carry over to the sum by construction. Induction extends the result to $n$ processes.

Musical meaning: 46 string players, each with their own Poisson rate, collectively produce a single Poisson process. The conductor (or the score) controls global texture density by specifying the sum $\lambda = \lambda_1 + \cdots + \lambda_{46}$, which can be distributed among players in many ways.
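A simulation sketch of the superposition theorem, using the standard construction of a Poisson process from exponential inter-arrival gaps (the rates and time horizon below are illustrative, not taken from the score):

```python
import random

def poisson_arrivals(lam, horizon, rng):
    """Arrival times of a Poisson(lam) process on [0, horizon], built
    from exponential inter-arrival gaps (gaps ~ Exp(lam))."""
    t, times = 0.0, []
    while True:
        t += rng.expovariate(lam)
        if t > horizon:
            return times
        times.append(t)

rng = random.Random(4)
horizon = 200.0
rates = [0.5] * 46      # 46 players at 0.5 notes/second each
merged = sorted(t for lam in rates for t in poisson_arrivals(lam, horizon, rng))

# The merged stream behaves like one Poisson process at rate sum(rates) = 23.
observed_rate = len(merged) / horizon
```

`observed_rate` lands near 23 notes/second, the sum of the individual rates.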

Prop 50.1 (Markov vs. Xenakis: Sequential vs. Cloud Structure)

The Markov chain model (EP21) and the Xenakis cloud model are fundamentally distinct in their dependence structure:

  • Markov chain: $P(X_{n+1} \mid X_n, \dots, X_1) = P(X_{n+1} \mid X_n)$; each note depends on the immediately preceding note (first-order memory).
  • Xenakis cloud: all notes are drawn independently from a common distribution $p$. No note affects any other.

These models are statistically distinguishable: the auto-correlation function $\rho(k) = \mathrm{Corr}(X_n, X_{n+k})$ decays exponentially to zero for a Markov chain (if the chain is ergodic), whereas for independent sampling it is zero for all $k \ge 1$ exactly.

Proof.

For an ergodic Markov chain with transition matrix $P$ and stationary distribution $\pi$, the $k$-step auto-covariance satisfies $\mathrm{Cov}(X_n, X_{n+k}) = \sum_{i,j} x_i x_j\, \pi_i \big( (P^k)_{ij} - \pi_j \big)$. By the Perron-Frobenius theorem, $(P^k)_{ij} \to \pi_j$ geometrically, so the auto-covariance decays to zero at a rate governed by $|\lambda_2|$, the modulus of the second-largest eigenvalue of $P$.

For independent sampling, $\mathrm{Cov}(X_n, X_{n+k}) = 0$ exactly for all $k \ge 1$ by independence.
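The distinguishability claim can be checked empirically; the two-state chain below is a toy illustration (pitches and transition probability chosen for clarity), not a model of any actual piece:

```python
import random

def lag1_autocorr(xs):
    """Sample lag-1 auto-correlation of a sequence."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs) / n
    cov = sum((xs[i] - m) * (xs[i + 1] - m) for i in range(n - 1)) / (n - 1)
    return cov / var

rng = random.Random(5)

# Two-pitch Markov chain that tends to repeat its current note
# (stay probability 0.9, so the theoretical lag-1 correlation is 0.8).
stay = 0.9
markov = [60]
for _ in range(50_000):
    if rng.random() < stay:
        markov.append(markov[-1])
    else:
        markov.append(67 if markov[-1] == 60 else 60)

# Xenakis cloud: the same two pitches, sampled independently.
cloud = [rng.choice([60, 67]) for _ in range(50_000)]
```

The Markov sequence shows strong positive auto-correlation; the cloud's is statistically indistinguishable from zero.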


Numerical Examples

Example 1: Sampling three Gaussian pitches.

Let $\mu = 60$ (MIDI C4, middle register) and $\sigma = 12$ semitones. An illustrative standard normal sample $z_1 = -0.25$ gives pitch $60 + 12 z_1 = 57$ (A3). A sample $z_2 = 0.5$ gives $66$ (F♯4). A sample $z_3 = 1.42$ gives $\approx 77$ (F5). No pitch is more “correct” than another; each is a sample from the cloud.

Example 2: Poisson density comparison.

With $\lambda = 3$ notes/second, the probability of exactly $k$ notes in a one-second window is $P(N = k) = e^{-3}\, 3^k / k!$:

  k     P(N = k)
  0     0.050
  1     0.149
  2     0.224
  3     0.224
  4     0.168
  5+    0.185  (remaining probability)

With a larger rate, say $\lambda = 8$, the mode shifts to $k = 7$ or $k = 8$ (for integer $\lambda$, the Poisson mode is $\lambda - 1$ and $\lambda$), producing a much denser texture with near-zero probability of silence ($P(N = 0) = e^{-8} \approx 0.0003$).

Example 3: Rule 30 from a single seed.

Initial row (15 cells): 000000010000000

After 4 iterations under Rule 30:

Row 0:  000000010000000
Row 1:  000000111000000
Row 2:  000001100100000
Row 3:  000011011110000
Row 4:  000110010001000

Each row’s pattern of 1s encodes which of 15 pitch classes sound simultaneously. The visual complexity grows rapidly from a single seed — a deterministic system producing apparently random output.
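The iteration can be reproduced with a short sketch (cells outside the 15-cell row are assumed to be 0; Rule 30 is equivalently written as left XOR (center OR right)):

```python
def rule30_step(row):
    """One Rule 30 update with zero boundary: new = left XOR (center OR right)."""
    padded = [0] + row + [0]
    return [padded[i - 1] ^ (padded[i] | padded[i + 1])
            for i in range(1, len(row) + 1)]

row = [0] * 15
row[7] = 1                      # single seed in the centre
history = [row]
for _ in range(4):
    row = rule30_step(row)
    history.append(row)

lines = ["".join(map(str, r)) for r in history]
```

Printing `lines` reproduces the five-row spacetime diagram; mapping each 1 to an active pitch class turns every row into a chord.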

Example 4: Markov vs. Cloud auto-correlation.

A Markov chain on pitches whose transition matrix has second-largest eigenvalue modulus $|\lambda_2| = 0.8$ (an illustrative value) has auto-correlation $\rho(k) \approx 0.8^k$, decaying to zero with half-life $\ln 2 / \ln(1/0.8) \approx 3.1$ steps. The Xenakis cloud produces $\rho(k) = 0$ for all $k \ge 1$, exactly. A statistician listening to the two pieces could, in principle, measure this difference from the score.


Musical Connection


Xenakis’s Three-Distribution Grammar

In Pithoprakta (1955–56), Xenakis applied these three distributions in a layered hierarchy:

  • Global texture (minutes scale): the overall pitch center $\mu(t)$ and spread $\sigma(t)$ of the Gaussian evolve slowly across the piece, creating large-scale arching contours.
  • Mid-level density (seconds scale): the Poisson rate $\lambda(t)$ is modulated; quiet passages use a small rate, climactic passages a much larger one.
  • Note level (milliseconds scale): individual note durations are exponentially distributed, so any single note’s length is unpredictable, though the average duration $1/\lambda$ is prescribed.

This separation of structural levels — global shape, local density, individual events — anticipates modern multi-scale generative models by 60 years.
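The three-level grammar can be sketched as a single generative loop. This is an illustration of the layered idea, not a reconstruction of the Pithoprakta score; all parameter values are illustrative:

```python
import random

def stochastic_voice(mu, sigma, lam_onset, lam_dur, horizon, rng):
    """One voice of a Xenakis-style cloud: onsets from a Poisson process,
    pitches i.i.d. Gaussian, durations exponential."""
    notes, t = [], 0.0
    while True:
        t += rng.expovariate(lam_onset)          # Poisson inter-arrival gap
        if t > horizon:
            return notes
        pitch = max(0, min(127, round(rng.gauss(mu, sigma))))
        duration = rng.expovariate(lam_dur)      # memoryless note length
        notes.append((t, pitch, duration))       # (onset, MIDI pitch, seconds)

rng = random.Random(6)
voice = stochastic_voice(mu=60, sigma=12, lam_onset=3.0, lam_dur=2.0,
                         horizon=60.0, rng=rng)
```

Modulating `mu`, `sigma`, and `lam_onset` slowly over time would add the global, minutes-scale layer on top of this per-note sampling.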

Contrast with Markov Composition (EP21)

Markov chain composition is a sequential model: the next note depends on the current note. This produces music with recognizable short-range patterns — if C is common after G, the music feels like it “wants” to return to C. The music has memory.

Xenakis’s cloud model is a population model: all notes are drawn independently from the same distribution. There is no memory, no expectation, no goal-directed motion. The result is not a melody with occasional surprises — it is a statistical cloud in which the melody concept is simply absent.

Narration (translated): “The Markov chain is sequentially dependent: the next note depends on the current state, so it has memory. Xenakis is globally statistical: all notes are sampled independently from a global distribution, so there is no memory. The former is a sequential structure, the latter a cloud structure. The same notion of randomness yields utterly different musical logic.”

Connection to Diffusion Models

Xenakis’s insight — “do not write the result, write the distribution” — resonates with modern AI image and audio generation. Diffusion models (DDPM, 2020) also do not specify outputs directly; they specify a forward noise process and learn to reverse it, sampling from a learned distribution. The philosophical stance is identical: the creator designs the probability law, and the artwork emerges from sampling.

Narration (translated): “Xenakis’s approach bears a striking analogy to the diffusion models of sixty years later: both ‘write the distribution, not the result’. But the two developed independently; one is not the causal precursor of the other. The underlying logic of music generation had already appeared by the middle of the last century.”

The key difference: Xenakis designed his distributions by hand based on mathematical and aesthetic judgment. Modern diffusion models learn their distributions from data. The endpoint — sampling music from a probability law — is the same.

UPIC and Interactive Probability Design

The UPIC system (1977) realized Xenakis’s vision computationally: composers drew curves on a graphics tablet, directly shaping probability densities over pitch and time. The computer performed Monte Carlo sampling from these hand-drawn distributions. This was arguably the first digital audio workstation based on probabilistic music theory rather than note-by-note sequencing.


Limits and Open Questions

  1. Independence assumption: Xenakis’s model treats all 46 string players as fully independent samplers. Real ensemble playing involves micro-synchronization, intonation adjustment, and room acoustics coupling — all of which introduce statistical dependence. Whether this dependence is musically significant or merely a performance artifact remains unstudied.

  2. Parameter selection: The model specifies how to sample once the parameters $(\mu, \sigma, \lambda)$ are chosen, but says nothing about how to choose them. Xenakis used mathematical intuition and aesthetic judgment. A formal theory of parameter design — analogous to Schenkerian analysis for tonal music — does not exist for stochastic composition.

  3. Perception of statistical structure: Listeners cannot hear that a piece is “Gaussian” or “Poisson” in any direct sense. The perceptual correlates of changing $\sigma$ (spread) or $\lambda$ (density) are roughly “register width” and “busy-ness,” but the mapping from distribution parameters to perceptual attributes is poorly understood and likely highly context-dependent.

  4. Cellular automaton music: Rule 30 produces deterministic outputs — the same seed always gives the same music. This undermines the “randomness” ethos of stochastic composition. Moreover, the musical interest of Rule 30 harmonic sequences has never been formally evaluated against randomly sampled sequences or against human-composed music.

  5. Hybrid models: Modern neural sequence models (transformers) impose short-range dependencies (like Markov) and long-range dependencies (like global structure) simultaneously. Whether a Gaussian pitch cloud constrained by a neural language model is closer to Xenakis or to a Markov chain is an open conceptual question.

Conjecture (Perceptual Threshold for Poisson Texture Discrimination)

There exists a perceptual threshold $\Delta^* > 0$ such that listeners can reliably distinguish two Poisson textures with rates $\lambda$ and $(1 + \Delta)\lambda$ at better than chance whenever $\Delta > \Delta^*$, but cannot do so for $\Delta < \Delta^*$.

Falsification criterion: A psychoacoustic experiment presenting listeners with paired tone clusters generated by Poisson processes at rates differing by a relative amount $\Delta$, and measuring discrimination accuracy as a function of $\Delta$, would either find a well-defined threshold (confirming the conjecture) or find that discrimination accuracy varies continuously with $\Delta$ with no clear threshold (falsifying the conjecture and suggesting a signal-detection rather than categorical model).


Academic References

  1. Xenakis, I. (1971). Formalized Music: Thought and Mathematics in Composition. Indiana University Press. (Revised edition: Pendragon Press, 1992.) — The primary source; contains Xenakis’s own mathematical exposition of stochastic composition, including the derivations underlying Pithoprakta.

  2. Xenakis, I. (1956). Pithoprakta for orchestra. Score published by Boosey & Hawkes. — Original orchestral work demonstrating the three-distribution grammar.

  3. Kolmogorov, A. N. (1933). Grundbegriffe der Wahrscheinlichkeitsrechnung. Springer. (English: Foundations of the Theory of Probability, Chelsea, 1956.) — The axiomatization of probability theory underlying the law of large numbers (Theorem 50.1).

  4. Feller, W. (1968). An Introduction to Probability Theory and Its Applications, Vol. 1, 3rd ed. Wiley. — Standard reference for Poisson processes, exponential distributions, and the memoryless property (Theorem 50.2).

  5. Wolfram, S. (1983). Statistical mechanics of cellular automata. Reviews of Modern Physics, 55(3), 601–644. — Systematic classification of the 256 elementary cellular automaton rules, including Rule 30.

  6. Wolfram, S. (2002). A New Kind of Science. Wolfram Media. Ch. 2 — Rule 30 as a source of complex behavior from simple deterministic rules.

  7. Pressing, J. (1988). Nonlinear maps as generators of musical design. Computer Music Journal, 12(2), 35–46. — Survey of deterministic chaos and cellular automata in music composition.

  8. Serra, X. (1997). Musical sound modeling with sinusoids plus noise. In Musical Signal Processing, eds. Roads et al. Swets & Zeitlinger. — Probabilistic noise models in music synthesis, connecting Xenakis’s distributions to modern audio synthesis.

  9. Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33, 6840–6851. — The diffusion model paper; the “write the distribution, not the result” philosophy that parallels Xenakis.

  10. Solomonoff, R. (1964). A formal theory of inductive inference. Information and Control, 7(1), 1–22. — Formal basis for the relationship between algorithmic complexity and probability distributions, relevant to the question of what makes a “good” stochastic composition.

  11. Roads, C. (1996). The Computer Music Tutorial. MIT Press. Ch. 14 (Stochastic Composition). — Accessible survey of algorithmic and stochastic composition methods, situating Xenakis in the broader history.

  12. Hoffmann, P. (2009). The new GENDYN program. Computer Music Journal, 33(2), 31–32. — Update on Xenakis’s GENDYN stochastic synthesis system, the successor to UPIC.

  13. Solomos, M. (2021). From Music to Sound: The Emergence of Sound in 20th- and 21st-Century Music. Routledge. Ch. 3 (Xenakis). — Musicological context for Pithoprakta and Xenakis’s compositional philosophy.

  14. Norris, J. R. (1997). Markov Chains. Cambridge University Press. — Rigorous treatment of Markov chains, the contrasting model to Xenakis’s cloud; relevant to the comparison in Proposition 50.1.

  15. Moschovakis, Y. N. (1994). Notes on Set Theory. Springer. — For the measure-theoretic foundations underlying almost-sure convergence in Theorem 50.1.