EP48

EP48: Spectral Orchestration — Grisey and the Roughness Curve

频谱管弦乐化：泛音列、临界带宽与 Plomp-Levelt 粗糙度

▶ 3:20 AcousticsMusic CompositionSignal ProcessingPsychoacoustics

前置知识

EP02 String Vibration and the Wave Equation EP09 Combination Tones and Nonlinear Acoustics EP40 What MP3 Deletes — Psychoacoustics and MDCT

Overview / 概述

A single trombone note contains twenty overtones. A spectral composer can extract those overtones with a spectrum analyzer, then redistribute each one to a different orchestral instrument — transforming the physics of a single sound into the architecture of an entire ensemble. This is spectral orchestration, the defining technique of the French spectral school (école spectrale) that emerged in the 1970s.

The mathematics behind this practice is not new. It is the harmonic series — the decomposition of a periodic sound into integer multiples of a fundamental frequency — which is identical to the Fourier series expansion encountered in EP02. What is new in the spectral school is the compositional attitude: frequency spectra are not analyzed about music, they are the musical material. The boundary between signal processing and composition disappears.

This episode traces three mathematical tools that spectral composers use explicitly: (1) the harmonic series as a spectrum template, (2) the Plomp-Levelt roughness curve as a harmonic design tool, and (3) spectral interpolation as a continuous morphing technique between consonance and dissonance.

中文: “一个大提琴低音，藏着二十个泛音。作曲家能把每个泛音，分配给不同的管弦乐器——‘管弦乐化一个频谱’。这是二十世纪下半叶，最激进的作曲方法之一。”

Prerequisites / 前置知识

Wave Equation and Fourier Series (EP02) — the harmonic series is exactly the Fourier expansion of a periodic waveform; all spectral content in this episode is a direct application of that decomposition
Overtone Series and Just Intonation (EP09) — the integer-ratio structure of the harmonic series, which Grisey uses as raw compositional material
Psychoacoustic Frequency Scales (EP40) — the Bark scale and critical bandwidth are the perceptual basis for the Plomp-Levelt roughness curve

Definitions / 定义

Definition 48.1 (Harmonic Series)

Let $f_0 > 0$ be a fundamental frequency. The harmonic series generated by $f_0$ is the sequence of frequencies

$f_n = n \cdot f_0, \quad n = 1, 2, 3, \ldots$

The $n$ -th partial (overtone) has amplitude $A_n$ determined by the vibrating body. The resulting sound pressure waveform is

$p(t) = \sum_{n=1}^{N} A_n \cos(2\pi n f_0 t + \phi_n)$

which is a finite Fourier series with fundamental period $T_0 = 1/f_0$ .

Example (Trombone E2): The trombone open-slide position near E2 produces a fundamental of approximately $f_0 \approx 41$ Hz. The first fifteen partials are $41, 82, 123, 164, 205, 246, 287, 328, 369, 410, 451, 492, 533, 574, 615$ Hz. Grisey extracted precisely these fifteen frequencies as the pitch material for Partiels (1975/76), assigning each to a different orchestral section.

Definition 48.2 (Critical Bandwidth)

The critical bandwidth $CB(f)$ at center frequency $f$ is the frequency width of one auditory filter (one Bark band) in the basilar membrane. Two tones separated by less than one critical bandwidth interact within the same auditory channel and produce beating or roughness.

Empirical approximations (Zwicker 1961):

$CB(f) \approx \begin{cases} 100 \text{ Hz} & f \lesssim 500 \text{ Hz} \\ 0.20 \cdot f & f \gtrsim 500 \text{ Hz} \end{cases}$

At low frequencies (below 500 Hz), $CB \approx 100$ Hz is roughly constant; at higher frequencies it scales as approximately one-third of an octave. The Bark scale from EP40 partitions the audible spectrum into 24 critical bands, each of width $CB(f)$ .

Definition 48.3 (Plomp-Levelt Roughness Curve)

Let two pure tones have frequencies $f_1 < f_2$ with difference $\Delta f = f_2 - f_1$ . Define the normalized frequency separation

$\xi = \frac{\Delta f}{CB\!\left(\tfrac{f_1+f_2}{2}\right)}$

The Plomp-Levelt roughness $R(\xi)$ is the perceived dissonance between the two tones. Plomp and Levelt (1965) measured $R$ experimentally and found:

$R(0) = 0$ (unison — no roughness)
$R$ rises steeply, reaching a maximum near $\xi \approx 0.25$ (one-quarter critical bandwidth)
$R$ then falls, returning to near zero at $\xi \approx 1.0$ (one full critical bandwidth)
$R(\xi) \approx 0$ for $\xi > 1.2$ (tones perceptually separated)

A common analytic approximation is

$R(\xi) \propto \xi \, e^{1 - \xi} \quad (\xi \leq 1), \qquad R(\xi) \propto e^{-2(\xi-1)} \quad (\xi > 1)$

Usage: Spectral composers treat $R(\xi)$ as a design parameter. Placing two partials so that their $\xi$ falls near 0.25 creates deliberate tension (roughness); placing them so $\xi > 1$ creates smooth fusion.

Definition 48.4 (Spectral Interpolation)

Let $\mathbf{A} = (A_1^{(0)}, A_2^{(0)}, \ldots, A_N^{(0)})$ and $\mathbf{B} = (A_1^{(1)}, \ldots, A_N^{(1)})$ be amplitude vectors of two spectra sharing the same $N$ frequency bins. The linear spectral interpolation at parameter $\lambda \in [0, 1]$ is

$\mathbf{S}(\lambda) = (1-\lambda)\,\mathbf{A} + \lambda\,\mathbf{B}$

so $A_n(\lambda) = (1-\lambda)\,A_n^{(0)} + \lambda\,A_n^{(1)}$ for each partial $n$ .

As $\lambda$ increases from 0 to 1, the composite sound morphs continuously from spectrum $\mathbf{A}$ to spectrum $\mathbf{B}$ . Spectral composers apply this across time: successive musical sections realize $\mathbf{S}(\lambda_t)$ for a time sequence $0 = \lambda_0 < \lambda_1 < \cdots < \lambda_K = 1$ .

Main Theorems / 主要定理

Theorem 48.1 (Harmonic Series as Fourier Decomposition)

Any periodic sound waveform $p(t)$ with fundamental period $T_0 = 1/f_0$ and finite energy admits the Fourier series representation

$p(t) = A_0 + \sum_{n=1}^{\infty} A_n \cos(2\pi n f_0 t + \phi_n)$

where the coefficients $A_n$ and phases $\phi_n$ are uniquely determined by the waveform. The frequencies $\{n f_0\}_{n \geq 1}$ are precisely the harmonic series of $f_0$ .

Proof.

This is the standard Fourier theorem for periodic

L^2

functions. The key idea is that the functions

\{1, \cos(2\pi n f_0 t), \sin(2\pi n f_0 t)\}_{n \geq 1}

form a complete orthogonal basis for

L^2([0, T_0])

with respect to the inner product

\langle f, g \rangle = \frac{1}{T_0}\int_0^{T_0} f(t)\,g(t)\,dt

. Each basis function has a frequency that is an integer multiple of

f_0

— by definition, a member of the harmonic series. Completeness guarantees convergence in

L^2

; for piecewise-smooth waveforms the series converges pointwise at continuities. The uniqueness of the coefficients follows from orthogonality:

A_n

is extracted by projecting

p

onto the

n

-th basis element.

\square

Theorem 48.2 (Roughness Maximum at Quarter Critical Bandwidth)

For the Plomp-Levelt model with the approximation $R(\xi) = \xi\, e^{1-\xi}$ on $[0,1]$ , the roughness is maximized at

$\xi^* = 1 \quad \text{(for the approximation above)}$

and empirically at $\xi^* \approx 0.25$ for the measured Plomp-Levelt data. The latter corresponds to a frequency difference equal to one-quarter of the critical bandwidth at the mean frequency of the two tones.

Proof.

For the analytic approximation $R(\xi) = \xi\, e^{1-\xi}$ , differentiate and set to zero:

$R'(\xi) = e^{1-\xi} - \xi\,e^{1-\xi} = e^{1-\xi}(1 - \xi) = 0$

Since $e^{1-\xi} > 0$ for all finite $\xi$ , we require $1 - \xi = 0$ , giving $\xi^* = 1$ for this particular smooth approximation. The discrepancy with the experimentally measured peak at $\xi \approx 0.25$ reflects the asymmetry of the true roughness function: the rise to the peak is much steeper than the analytic model captures near $\xi = 0$ . Plomp and Levelt’s original 1965 data, obtained from forced-choice listening experiments with sinusoidal pairs, place the maximum roughness consistently near one-quarter of the critical bandwidth. The decisive step is that roughness is a psychoacoustic (not purely physical) quantity — it is determined by the width of the auditory filter, not by the arithmetic of the frequencies alone. $\square$

Theorem 48.3 (Spectral Interpolation Preserves Amplitude Bounds)

Let $\mathbf{A}, \mathbf{B} \in \mathbb{R}_{\geq 0}^N$ be non-negative amplitude vectors. Then for all $\lambda \in [0,1]$ , the interpolated spectrum $\mathbf{S}(\lambda) = (1-\lambda)\mathbf{A} + \lambda\mathbf{B}$ satisfies

$\min(A_n, B_n) \;\leq\; S_n(\lambda) \;\leq\; \max(A_n, B_n)$

for each partial index $n$ . In particular, $\mathbf{S}(\lambda) \in \mathbb{R}_{\geq 0}^N$ (no negative amplitudes arise).

Proof.

For each component, $S_n(\lambda) = (1-\lambda)A_n + \lambda B_n$ is a convex combination of $A_n$ and $B_n$ with weights $(1-\lambda)$ and $\lambda$ that sum to 1 and are both non-negative (since $\lambda \in [0,1]$ ). By the definition of convexity, any convex combination of two real numbers lies between them. Since $A_n, B_n \geq 0$ , the minimum $\min(A_n, B_n) \geq 0$ , so $S_n(\lambda) \geq 0$ . $\square$

Compositional implication: Spectral interpolation never introduces spurious partials with negative amplitude; the morphed spectrum is always a physically valid (non-negative) amplitude envelope.

Prop 48.1 (Inharmonic Spectrum Increases Aggregate Roughness)

For a harmonic spectrum $\{n f_0\}$ , adjacent partials are separated by $\Delta f = f_0$ . At low frequencies ( $f_0 \ll CB$ ), adjacent partials can fall within the roughness zone. For an inharmonic spectrum whose partial spacing is irregular, the aggregate roughness

$R_{\text{total}} = \sum_{i < j} R\!\left(\frac{|f_i - f_j|}{CB(\bar{f}_{ij})}\right)$

is generally higher than for a harmonic spectrum with the same number of partials and similar frequency range, because inharmonic partials are more likely to land near the roughness peak $\xi \approx 0.25$ .

Proof.

The harmonic spectrum places partials at fixed interval

\Delta f = f_0

. At moderate frequencies (say

f_0 \approx 100

Hz,

CB \approx 100

Hz), adjacent partials have

\xi = f_0 / CB \approx 1.0

, which lies beyond the roughness peak, in the smooth-fusion zone. Thus the harmonic series is self-organizing toward low roughness — a property that helps explain why harmonic timbres are perceived as fused and stable. An inharmonic spectrum, having no such fixed spacing, distributes partial pairs more uniformly across

\xi

, including the high-roughness region near

\xi \approx 0.25

. The aggregate roughness integral therefore tends to be larger.

\square

Numerical Examples / 数值示例

Example 1 — Trombone E2 partial list:

Taking $f_0 = 41$ Hz (trombone open slide near E2), the first fifteen partials and their approximate frequency values are:

n f_0: \quad 41,\; 82,\; 123,\; 164,\; 205,\; 246,\; 287,\; 328,\; 369,\; 410,\; 451,\; 492,\; 533,\; 574,\; 615 \text{ Hz}

Grisey assigned partials 1–4 (41–164 Hz) to the string section, partials 5–9 (205–369 Hz) to woodwinds, and partials 10–15 (410–615 Hz) to brass. When played together, the ensemble reconstitutes the original spectral envelope of a single amplified trombone tone.

Example 2 — Roughness between partials 2 and 3:

The second partial is at $82$ Hz, the third at $123$ Hz. Their difference is $\Delta f = 41$ Hz. The mean frequency is $\bar{f} = 102.5$ Hz, where $CB \approx 100$ Hz. The normalized separation is

\xi = \frac{41}{100} = 0.41

At $\xi = 0.41$ , the roughness curve is still well above zero (descending from the peak near 0.25), indicating moderate roughness. Grisey’s orchestration assigns these to bass strings and contrabass clarinet — instruments with rich individual timbres that blur the beating effect in performance.

Example 3 — Spectral interpolation at midpoint:

Suppose a harmonic spectrum has partial amplitudes $\mathbf{A} = (1.6, 1.1, 0.8, 0.6, 0.45, 0.35)$ and an inharmonic percussion spectrum has $\mathbf{B} = (0.5, 0.9, 0.4, 1.1, 0.3, 0.7)$ . At the midpoint $\lambda = 0.5$ :

\mathbf{S}(0.5) = (1.05,\; 1.0,\; 0.6,\; 0.85,\; 0.375,\; 0.525)

The resulting spectrum is neither clearly harmonic nor clearly inharmonic — it sits between the two acoustic identities, producing a perceptually ambiguous, transitional timbre.

Musical Connection / 音乐联系

音乐联系

Grisey’s Partiels (1975/76) — Signal Processing Enters the Concert Hall

Gérard Grisey composed Partiels for eighteen musicians. The compositional process began not at a piano but at a spectrum analyzer: Grisey recorded a trombone playing low E, printed the spectrogram, read off the frequencies and relative amplitudes of the first fifteen partials, and then notated those as pitches for the orchestra — adjusting to the nearest quarter-tone where necessary.

The opening sixty seconds of Partiels are a direct sonic replica of that trombone spectrum: the double bass plays the fundamental, the cello the second partial, the viola the third, and so on up to the high brass. The composite sound is not a chord in the traditional sense — it has no root-position tonal identity, no voice-leading function. It is a spectrum made audible at orchestral scale.

The progression from the opening to the climax of the piece traverses the Plomp-Levelt roughness curve: Grisey deliberately introduces partial pairs whose separation $\xi$ drifts toward the roughness peak, creating waves of tension and release through spectral means rather than harmonic motion in the tonal sense. This is the precise compositional use of Definition 48.3 and Theorem 48.2.

Spectral Interpolation in Murail’s Gondwana (1980)

Tristan Murail, Grisey’s colleague in the Ensemble L’Itinéraire, developed spectral interpolation as the structural backbone of Gondwana for orchestra. The piece moves from a bell spectrum (inharmonic, high roughness) to a brass-like harmonic spectrum (low roughness) over twenty minutes. Each formal section realizes a different $\lambda$ value in the interpolation $\mathbf{S}(\lambda) = (1-\lambda)\mathbf{A}_{\text{bell}} + \lambda\mathbf{A}_{\text{brass}}$ , giving the large-scale form a directional logic grounded entirely in acoustic physics.

The Critical Bandwidth as Harmonic Grammar

In tonal music, consonance and dissonance are defined by interval ratios (just intonation) or by stylistic convention (common practice). Spectral composers replace this with a psychoacoustic grammar: an interval is consonant if its partials fall outside each other’s critical bands ( $\xi > 1$ ), and dissonant if they overlap ( $\xi < 1$ ). This regrounds harmony in auditory physiology rather than cultural convention — a move that connects EP40’s Bark scale directly to compositional practice.

中文: “频谱作曲家用这条曲线作为和声设计的工具：两个泛音的频率差落在粗糙区间，就产生张力；落在平滑区间，就产生融合。”

Limits and Open Questions / 局限性与开放问题

Quarter-tone notation vs. continuous frequency: Grisey’s score approximates spectral pitches to the nearest quarter-tone (50 cents). The trombone’s 15th partial at 615 Hz lies 16 cents flat of a notated D5. This rounding introduces systematic spectral errors — the actual roughness relationships in performance differ from those Grisey designed on paper. A complete model of spectral composition must account for the discrete pitch lattice imposed by Western notation.
Amplitude decay during performance: Grisey specifies dynamic markings that approximate the $A_n \sim 1/n$ amplitude envelope of the trombone spectrum. But orchestral instruments have different dynamic characteristics: a pianissimo brass note does not scale linearly with a forte string note. The “reconstructed spectrum” in performance is thus a coarse approximation to the ideal Fourier decomposition.
Psychoacoustic individuality: The Plomp-Levelt curve is an average over listeners. Individual variation in critical bandwidth (particularly among trained vs. untrained listeners) means that a roughness peak Grisey targets at $\xi = 0.25$ is perceived differently by different audience members. A universal psychoacoustic grammar for spectral composition remains elusive.
Nonlinear roughness superposition: Definition 48.3 sums pairwise roughness values independently. In practice, simultaneous presentation of many rough pairs does not produce a roughness equal to the sum of pairwise values — there is auditory masking, attention, and nonlinear combination. The additive roughness model is a first-order approximation that breaks down for dense spectra (ten or more simultaneous partials).
Extension to inharmonic timbres: Spectral techniques were developed primarily for instruments with near-harmonic spectra (strings, brass). Extension to percussion (bells, tam-tams) requires handling inharmonic partials whose frequencies are not integer multiples of any single fundamental — the Fourier decomposition framework of Theorem 48.1 no longer applies directly, and a more general Wigner-Ville or wavelet representation may be needed.

Conjecture (Spectral Consonance Universality)

The Plomp-Levelt roughness function $R(\xi)$ is a universal psychoacoustic invariant: the position of its maximum at $\xi \approx 0.25$ is the same across cultures, regardless of musical training or familiarity with Western harmonic conventions. If confirmed, this would ground spectral composition’s claim to a culture-independent harmonic grammar entirely in basilar membrane mechanics.

Falsification criterion: A cross-cultural study presenting Plomp-Levelt tone pairs to musically naive listeners from non-Western traditions (e.g., listeners with no exposure to harmonic music) should find the same roughness peak location (within one-tenth of a critical bandwidth) as Western trained listeners. If the peak location shifts by more than 0.05 critical bandwidths across groups, the universality claim fails and the “roughness as grammar” approach is culturally conditioned.

Academic References / 参考文献

Plomp, R., & Levelt, W. J. M. (1965). Tonal consonance and critical bandwidth. Journal of the Acoustical Society of America, 38(4), 548–560.
Grisey, G. (1987). Tempus ex machina: A composer’s reflections on musical time. Contemporary Music Review, 2(1), 239–275.
Murail, T. (2005). The revolution of complex sounds. Contemporary Music Review, 24(2–3), 121–135.
Zwicker, E., & Fastl, H. (1999). Psychoacoustics: Facts and Models (2nd ed.). Springer.
Fineberg, J. (2000). Spectral music. Contemporary Music Review, 19(2), 1–5.
Rose, F. (1996). Introduction to the pitch organization of French spectral music. Perspectives of New Music, 34(2), 6–39.
Sethares, W. A. (2005). Tuning, Timbre, Spectrum, Scale (2nd ed.). Springer. Ch. 3 (Consonance and Dissonance).
Hasegawa, R. (2009). Gérard Grisey and the “nature” of harmony. Music Analysis, 28(2–3), 349–371.
Terhardt, E. (1974). Pitch, consonance, and harmony. Journal of the Acoustical Society of America, 55(5), 1061–1069.
Anderson, J. (2000). A provisional history of spectral music. Contemporary Music Review, 19(2), 7–22.
Murail, T. (1984). Spectra and pixies. Contemporary Music Review, 1(1), 157–170.
Bregman, A. S. (1990). Auditory Scene Analysis: The Perceptual Organization of Sound. MIT Press. Ch. 2 (Spectral fusion).
Huron, D. (2001). Tone and voice: A derivation of the rules of voice-leading from perceptual principles. Music Perception, 19(1), 1–64.