EP51: Why Twelve Notes? — Continued Fractions and Tuning Optimization
Overview / 概述
Why does a piano have twelve keys per octave? The answer is not aesthetic convention, not historical accident, and not cultural consensus. It is a continued fraction — a 2300-year-old piece of number theory that Pythagoras did not yet have the language to state, but whose question he was already asking.
The central problem: a perfect fifth has frequency ratio 3/2. Stack twelve perfect fifths and you should return to the starting pitch, seven octaves higher. But twelve fifths give $(3/2)^{12} \approx 129.746$, while seven octaves give $2^7 = 128$. The gap, known as the Pythagorean comma, means the circle of fifths never closes exactly. The underlying quantity is irrational: $\log_2(3/2) \notin \mathbb{Q}$.
This episode answers: which integer $n$ makes $(3/2)^n$ closest to a power of 2? The answer lies in the convergents of the continued fraction expansion of $\log_2(3/2)$, and the first convergent with relative error below 1% has denominator 12. Twelve-tone equal temperament (12-EDO) is mathematically distinguished, not merely conventional.
In short: “Why does a piano have twelve keys? Not tradition, not aesthetics: it is a continued fraction from 2,300 years ago.”
The broader lesson applies far beyond music: the convergents of a continued fraction are the best rational approximations to any irrational number, in a precise sense no other fractions can achieve. The path from Pythagoras’s comma to the keyboard is a direct application of Dirichlet’s approximation theorem.
Prerequisites / 前置知识
- All-Interval Rows and ℤ₁₂ (EP04) — the twelve pitch classes and cyclic group structure underlying equal temperament
- Comma Tuning and Intonation (EP11) — the Pythagorean comma as the root cause of tuning difficulty; this episode provides the mathematical explanation
Definitions
An $n$-tone equal temperament ($n$-EDO) divides the octave (frequency ratio 2:1) into $n$ equal steps. Each step has frequency ratio $2^{1/n}$.
The cent is 1/1200 of an octave: one semitone in 12-EDO equals 100 cents. The frequency ratio corresponding to $c$ cents is $2^{c/1200}$.
In $n$-EDO, a “fifth” is the closest approximation to the just perfect fifth (ratio 3/2): $m$ steps where $m = \operatorname{round}\!\left(n \log_2(3/2)\right)$. The fifth error in cents is
$$\varepsilon_n = 1200 \left| \frac{m}{n} - \log_2\frac{3}{2} \right|.$$
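The definitions above translate directly into a few lines of code. This is a minimal sketch; the function names `cents`, `fifth_steps`, and `fifth_error_cents` are illustrative choices, not part of the episode.

```python
import math

JUST_FIFTH_OCTAVES = math.log2(3 / 2)  # size of the just fifth, in octaves


def cents(ratio: float) -> float:
    """Convert a frequency ratio to cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)


def fifth_steps(n: int) -> int:
    """Number of n-EDO steps closest to the just fifth."""
    return round(n * JUST_FIFTH_OCTAVES)


def fifth_error_cents(n: int) -> float:
    """Absolute deviation of the n-EDO fifth from the just fifth, in cents."""
    return 1200 * abs(fifth_steps(n) / n - JUST_FIFTH_OCTAVES)


for n in (5, 7, 12, 19, 41, 53):
    print(n, fifth_steps(n), round(fifth_error_cents(n), 2))
```

For 12-EDO this picks 7 steps (700 cents) against a just fifth of about 701.955 cents.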
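Every real number $x$ has a continued fraction expansion written $x = [a_0; a_1, a_2, \ldots]$, where $a_0 \in \mathbb{Z}$ and each partial quotient $a_k \ge 1$ for $k \ge 1$.
The $k$-th convergent $p_k/q_k$ is obtained by truncating at depth $k$: $p_k/q_k = [a_0; a_1, \ldots, a_k]$. The convergents satisfy the three-term recurrence
$$p_k = a_k p_{k-1} + p_{k-2}, \qquad q_k = a_k q_{k-1} + q_{k-2},$$
with seeds $p_{-1} = 1$, $q_{-1} = 0$, $p_{-2} = 0$, $q_{-2} = 1$.
The expansion terminates if and only if $x \in \mathbb{Q}$. For irrational $x$, the convergents are the best rational approximations in the sense of Theorem 51.1.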
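As a hedged illustration of these definitions, the sketch below extracts partial quotients by repeated floor-and-reciprocal and builds convergents from the recurrence with the seeds stated above. The helper names are hypothetical, and double-precision floats only support the first dozen or so partial quotients reliably; an exact treatment would need arbitrary-precision arithmetic.

```python
import math


def partial_quotients(x: float, count: int) -> list[int]:
    """First `count` partial quotients of x via repeated floor-and-reciprocal."""
    quotients = []
    for _ in range(count):
        a = math.floor(x)
        quotients.append(a)
        frac = x - a
        if frac == 0:  # rational input: the expansion terminates
            break
        x = 1 / frac
    return quotients


def convergents(quotients: list[int]) -> list[tuple[int, int]]:
    """Convergents (p_k, q_k) from the three-term recurrence."""
    p_prev2, q_prev2 = 0, 1  # seeds p_{-2}, q_{-2}
    p_prev1, q_prev1 = 1, 0  # seeds p_{-1}, q_{-1}
    result = []
    for a in quotients:
        p = a * p_prev1 + p_prev2
        q = a * q_prev1 + q_prev2
        result.append((p, q))
        p_prev2, q_prev2, p_prev1, q_prev1 = p_prev1, q_prev1, p, q
    return result


quotients = partial_quotients(math.log2(3 / 2), 8)
print(quotients)
print(convergents(quotients))
```

Applied to $\log_2(3/2)$, the fifth entry of the convergent list (index $k = 4$) is the pair (7, 12).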
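The Pythagorean comma is the ratio by which twelve just perfect fifths exceed seven octaves:
$$\kappa = \frac{(3/2)^{12}}{2^{7}} = \frac{3^{12}}{2^{19}} = \frac{531441}{524288} \approx 1.013643.$$
In cents: $1200 \log_2 \kappa \approx 23.46$ cents.
Equivalently, the comma arises because $\log_2(3/2)$ is irrational: there is no exact solution in positive integers $m, n$ to $(3/2)^n = 2^m$, so the circle of fifths never closes exactly.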
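The comma can be checked with exact rational arithmetic. The sketch below uses Python's standard `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction
import math

# Twelve just fifths divided by seven octaves, computed exactly.
comma = Fraction(3, 2) ** 12 / Fraction(2) ** 7
comma_cents = 1200 * math.log2(float(comma))

print(comma)                  # exact ratio as a reduced fraction
print(round(comma_cents, 2))  # size of the comma in cents
```

Because the numerator is a power of 3 and the denominator a power of 2, the fraction can never reduce to 1, which is the arithmetic face of the irrationality statement.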
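A fraction $p/q$ (with $q > 0$) is a best rational approximation of the first kind to $x$ if for all fractions $p'/q' \neq p/q$ with $0 < q' \le q$:
$$\left| x - \frac{p}{q} \right| < \left| x - \frac{p'}{q'} \right|.$$
That is, no other fraction with denominator at most $q$ approximates $x$ as well as $p/q$.
A fraction $p/q$ is a best approximation of the second kind (stronger) if for all $p'/q' \neq p/q$ with $0 < q' \le q$:
$$|qx - p| < |q'x - p'|.$$
Every convergent $p_k/q_k$ of an irrational $x$ with $k \ge 1$ is a best approximation of both kinds. The zeroth convergent $a_0/1$ can fail: for $x = \log_2(3/2)$, the fraction $1/1$ is closer than $0/1$.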
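Both definitions can be tested by brute force for small denominators. The sketch below is a direct, unoptimized check against $x = \log_2(3/2)$; the function names are hypothetical. It only tries the two numerators nearest to $q' x$, since no other numerator can be closer.

```python
import math

X = math.log2(3 / 2)


def is_best_first_kind(p: int, q: int, x: float = X) -> bool:
    """True if no other p'/q' with 0 < q' <= q is at least as close to x."""
    target = abs(x - p / q)
    for q2 in range(1, q + 1):
        for p2 in {math.floor(q2 * x), math.ceil(q2 * x)}:
            same_fraction = p2 * q == p * q2
            if not same_fraction and abs(x - p2 / q2) <= target:
                return False
    return True


def is_best_second_kind(p: int, q: int, x: float = X) -> bool:
    """True if no other p'/q' with 0 < q' <= q has |q'x - p'| <= |qx - p|."""
    target = abs(q * x - p)
    for q2 in range(1, q + 1):
        for p2 in {math.floor(q2 * x), math.ceil(q2 * x)}:
            same_fraction = p2 * q == p * q2
            if not same_fraction and abs(q2 * x - p2) <= target:
                return False
    return True


for p, q in [(1, 2), (2, 3), (3, 5), (4, 7), (7, 12)]:
    print(f"{p}/{q}", is_best_first_kind(p, q), is_best_second_kind(p, q))
```

The fractions $2/3$ and $4/7$ pass the first-kind test but fail the second-kind test, which is why 3-EDO and 7-EDO do not appear among the convergents while 5-EDO and 12-EDO do.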
The Stern–Brocot tree is an infinite binary tree containing every positive rational number exactly once. It is built by the mediant operation: the mediant of $a/b$ and $c/d$ is $(a+c)/(b+d)$.
Starting from the interval $[0/1, 1/0]$ (the boundary fractions representing 0 and ∞):
- The root is $1/1$ (mediant of $0/1$ and $1/0$, restricted to fractions in $(0, \infty)$)
- Each node $p/q$ has a left child (the mediant of $p/q$ with its nearest left ancestor) and a right child (the mediant of $p/q$ with its nearest right ancestor)
Musical interpretation: Each node $m/n$ represents an $n$-EDO system where $m$ steps approximate the fifth. The path from the root to any node encodes the continued fraction of the corresponding ratio.
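The descent toward $\log_2(3/2)$ can be sketched as a binary search by mediants. The function name `stern_brocot_descent` and the `"L"`/`"R"` move labels are illustrative conventions, not taken from the episode.

```python
import math


def stern_brocot_descent(x: float, steps: int) -> tuple[list[tuple[int, int]], str]:
    """Walk the Stern-Brocot tree toward x; return visited nodes and L/R moves."""
    lo_p, lo_q = 0, 1  # left boundary 0/1
    hi_p, hi_q = 1, 0  # right boundary 1/0 (stands for infinity)
    nodes, moves = [], []
    for _ in range(steps):
        p, q = lo_p + hi_p, lo_q + hi_q  # mediant of the two boundaries
        nodes.append((p, q))
        if x * q < p:  # x lies to the left of p/q
            moves.append("L")
            hi_p, hi_q = p, q
        else:  # x lies to the right (x is irrational, so never equal)
            moves.append("R")
            lo_p, lo_q = p, q
    return nodes, "".join(moves)


nodes, moves = stern_brocot_descent(math.log2(3 / 2), 10)
print(nodes)
print(moves)
```

The run lengths of the move string (1, 1, 2, 2, 3, 1, ...) reproduce the partial quotients $a_1, a_2, \ldots$, and the node reached at the end of each run is a convergent; the other visited nodes, such as $2/3$ and $4/7$, are intermediate fractions.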
Main Theorems / 主要定理
Suppose $\log_2(3/2) = p/q$ for positive integers $p, q$. Then $(3/2)^q = 2^p$, giving $3^q = 2^{p+q}$. The left side is odd (3 is odd, any power of 3 is odd); the right side is even (it is a positive power of 2). This is a contradiction. Therefore $\log_2(3/2) \notin \mathbb{Q}$.
Geometrically: the circle of fifths is the sequence $\{k \log_2(3/2) \bmod 1\}_{k \ge 0}$. By Weyl’s equidistribution theorem (a consequence of irrationality), this sequence is dense in $[0, 1)$: for $k \ge 1$ it never returns exactly to 0, but comes arbitrarily close.
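The near-returns of this sequence can be observed numerically. The sketch below records each $k$ at which $k \log_2(3/2) \bmod 1$ comes closer to 0 than at any earlier step; `closest_returns` is a hypothetical helper name, and the search bound of 60 is an arbitrary choice for illustration.

```python
import math

X = math.log2(3 / 2)


def closest_returns(max_k: int) -> list[int]:
    """Values of k at which k*X mod 1 sets a new record for closeness to 0."""
    records, best = [], float("inf")
    for k in range(1, max_k + 1):
        t = (k * X) % 1.0
        dist = min(t, 1.0 - t)  # distance to 0 measured around the circle
        if dist < best:
            best = dist
            records.append(k)
    return records


print(closest_returns(60))
```

Up to 60 fifths, the record-setting values of $k$ are 1, 2, 5, 12, 41, and 53, which are exactly the denominators of the convergents.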
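The key identity is $p_k q_{k-1} - p_{k-1} q_k = (-1)^{k-1}$ (proved by induction on $k$ using the recurrence), which gives $\gcd(p_k, q_k) = 1$ and the gap between consecutive convergents, $\frac{p_{k+1}}{q_{k+1}} - \frac{p_k}{q_k} = \frac{(-1)^{k}}{q_k q_{k+1}}$. Because $x$ lies strictly between consecutive convergents, $x - \frac{p_k}{q_k} = \frac{(-1)^{k}\,\theta_k}{q_k q_{k+1}}$ for some $\theta_k \in (0, 1)$. Thus $|x - p_k/q_k| < \frac{1}{q_k q_{k+1}} \le \frac{1}{q_k^2}$.
For the best-approximation claim: any fraction $p/q$ with $0 < q < q_{k+1}$ can be written in terms of the basis $\{(p_k, q_k), (p_{k+1}, q_{k+1})\}$, that is, $(p, q) = u\,(p_k, q_k) + v\,(p_{k+1}, q_{k+1})$ with integers $u, v$, because the basis has determinant $\pm 1$. If $p/q \neq p_k/q_k$, then $u$ and $v$ are both nonzero with opposite signs, and so are $q_k x - p_k$ and $q_{k+1} x - p_{k+1}$. The two terms of $qx - p = u\,(q_k x - p_k) + v\,(q_{k+1} x - p_{k+1})$ therefore have the same sign and cannot cancel, which forces $|qx - p| > |q_k x - p_k|$. This is the second-kind best approximation property.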
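A quick numerical sanity check of the determinant identity and the error bound, as a sketch only. The list `QUOTIENTS` hard-codes the leading partial quotients of $\log_2(3/2)$, and all names are illustrative.

```python
import math

X = math.log2(3 / 2)
QUOTIENTS = [0, 1, 1, 2, 2, 3, 1, 5]  # leading partial quotients of log2(3/2)

# Build convergents from the seeds (p_{-2}, q_{-2}) = (0, 1), (p_{-1}, q_{-1}) = (1, 0).
ps, qs = [0, 1], [1, 0]
for a in QUOTIENTS:
    ps.append(a * ps[-1] + ps[-2])
    qs.append(a * qs[-1] + qs[-2])
ps, qs = ps[2:], qs[2:]  # drop the seeds so that index k matches p_k, q_k

# Determinant identity: p_k q_{k-1} - p_{k-1} q_k = (-1)^(k-1) for k >= 1.
determinants = [ps[k] * qs[k - 1] - ps[k - 1] * qs[k] for k in range(1, len(ps))]

# Error bound: |x - p_k/q_k| < 1 / (q_k q_{k+1}) <= 1 / q_k^2.
bound_holds = [
    abs(X - ps[k] / qs[k]) < 1 / (qs[k] * qs[k + 1]) <= 1 / qs[k] ** 2
    for k in range(len(ps) - 1)
]

print(determinants)
print(bound_holds)
```

The determinants alternate between $+1$ and $-1$, and the bound holds at every index checked.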
The continued fraction expansion of $\log_2(3/2)$ is
$$\log_2(3/2) = [0; 1, 1, 2, 2, 3, 1, 5, 2, 23, \ldots].$$
The convergents and their EDO interpretations are:
| $k$ | $p_k/q_k$ | EDO | Fifth error (cents) |
|---|---|---|---|
| 0 | 0/1 | 1-EDO (fifth mapped to the unison) | 701.96 |
| 1 | 1/1 | 1-EDO (fifth mapped to the octave) | 498.04 |
| 2 | 1/2 | 2-EDO | 101.96 |
| 3 | 3/5 | 5-EDO | 18.04 |
| 4 | 7/12 | 12-EDO | 1.96 |
| 5 | 24/41 | 41-EDO | 0.48 |
| 6 | 31/53 | 53-EDO | 0.07 |
Here the fifth error is $1200\,|p_k/q_k - \log_2(3/2)|$. The convergent $p_4/q_4 = 7/12$ is the first for which the fifth error falls below 2 cents (2% of a 12-EDO semitone), making 12-EDO the smallest EDO with near-just perfect fifths.
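Compute the CF numerically. Set $x_0 = \log_2(3/2) \approx 0.58496$ and iterate $a_k = \lfloor x_k \rfloor$, $x_{k+1} = 1/(x_k - a_k)$. Then $a_0 = 0$, $a_1 = 1$, $a_2 = 1$, $a_3 = 2$, $a_4 = 2$, and continuing gives $a_5 = 3$, $a_6 = 1$, $a_7 = 5$.
Applying the recurrence to $a_0, \ldots, a_4 = 0, 1, 1, 2, 2$ yields the convergents $0/1$, $1/1$, $1/2$, $3/5$, $7/12$; the last step is $p_4 = 2 \cdot 3 + 1 = 7$, $q_4 = 2 \cdot 5 + 2 = 12$. The fifth error at convergent $p_4/q_4 = 7/12$ is $|700 - 701.955| \approx 1.96$ cents, while the error at $p_3/q_3 = 3/5$ is $|720 - 701.955| \approx 18.04$ cents. This jump from about 18 cents to about 2 cents at $k = 4$ is the mathematical content of “12 is special.”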
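For readers who want every intermediate step, the recurrence $p_k = a_k p_{k-1} + p_{k-2}$, $q_k = a_k q_{k-1} + q_{k-2}$ with seeds $(p_{-2}, q_{-2}) = (0, 1)$ and $(p_{-1}, q_{-1}) = (1, 0)$ can be written out as follows. This is a worked restatement of the definitions, using the partial quotients $0, 1, 1, 2, 2$:

$$
\begin{aligned}
(p_0, q_0) &= (0 \cdot 1 + 0,\; 0 \cdot 0 + 1) = (0, 1) \\
(p_1, q_1) &= (1 \cdot 0 + 1,\; 1 \cdot 1 + 0) = (1, 1) \\
(p_2, q_2) &= (1 \cdot 1 + 0,\; 1 \cdot 1 + 1) = (1, 2) \\
(p_3, q_3) &= (2 \cdot 1 + 1,\; 2 \cdot 2 + 1) = (3, 5) \\
(p_4, q_4) &= (2 \cdot 3 + 1,\; 2 \cdot 5 + 2) = (7, 12)
\end{aligned}
$$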
Numerical Examples
The Pythagorean comma in detail:
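Starting from C4 (frequency $f_0$), stack twelve just perfect fifths (each ×3/2):
$$f_0 \;\to\; \tfrac{3}{2} f_0 \;\to\; \left(\tfrac{3}{2}\right)^{2} f_0 \;\to\; \cdots \;\to\; \left(\tfrac{3}{2}\right)^{12} f_0.$$
The final frequency is $(3/2)^{12} f_0 \approx 129.746\, f_0$. Seven pure octaves give $2^7 f_0 = 128\, f_0$. The comma is:
$$\frac{(3/2)^{12}}{2^{7}} = \frac{531441}{524288} \approx 1.01364 \quad (\approx 23.46 \text{ cents}).$$
In 12-EDO, the fifth is tempered to exactly $700$ cents ($7/12$ of an octave), versus the just fifth at $1200 \log_2(3/2) \approx 701.955$ cents. The tempering distributes the comma evenly: $23.46 / 12 \approx 1.955$ cents per fifth.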
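The stacking can be traced step by step. The sketch below folds each pitch back into a single octave so the drift is visible; `pythagorean_chain` is a hypothetical helper, and the comparison column assumes the 12-EDO fifth of exactly 700 cents.

```python
from fractions import Fraction
import math


def pythagorean_chain(fifths: int = 12) -> list[Fraction]:
    """Ratios reached by stacking just fifths, folded back into the octave [1, 2)."""
    ratios = []
    r = Fraction(1)
    for _ in range(fifths):
        r *= Fraction(3, 2)
        while r >= 2:  # drop an octave so the pitch stays within [1, 2)
            r /= 2
        ratios.append(r)
    return ratios


chain = pythagorean_chain()
for k, r in enumerate(chain, start=1):
    just_cents = 1200 * math.log2(float(r))
    tempered_cents = (700 * k) % 1200  # k tempered fifths in 12-EDO
    print(k, r, round(just_cents, 2), tempered_cents)

# After twelve fifths the folded ratio is the comma itself.
comma_cents = 1200 * math.log2(float(chain[-1]))
print(round(comma_cents / 12, 3))  # tempering applied to each fifth
```

The gap between the just and tempered columns grows by about 1.955 cents per fifth, reaching the full comma at the twelfth step.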
Convergent computation step by step:
CF algorithm (extract integer parts of successive reciprocals):
| Step $k$ | $x_k$ | $a_k = \lfloor x_k \rfloor$ | $p_k$ | $q_k$ |
|---|---|---|---|---|
| 0 | 0.58496 | 0 | 0 | 1 |
| 1 | 1.70951 | 1 | 1 | 1 |
| 2 | 1.40942 | 1 | 1 | 2 |
| 3 | 2.44247 | 2 | 3 | 5 |
| 4 | 2.26002 | 2 | 7 | 12 |
| 5 | 3.84… | 3 | 24 | 41 |
| 6 | … | 1 | 31 | 53 |
At step 4, $p_4/q_4 = 7/12$: this is 12-EDO. The fifth error drops from 18.0 cents (step 3, 5-EDO) to 1.96 cents (step 4, 12-EDO), roughly a factor of 9 improvement. The next improvement to below 0.5 cents requires 41-EDO, and below 0.1 cents requires 53-EDO.
Pareto frontier: fifth error vs third error:
A major third in just intonation has ratio 5/4, corresponding to $\log_2(5/4) \approx 0.32193$ octaves = 386.31 cents. In $n$-EDO, the major third uses $\operatorname{round}\!\left(n \log_2(5/4)\right)$ steps.
| EDO | Fifth error (cents) | Major third error (cents) |
|---|---|---|
| 5 | 18.0 | 93.7 |
| 7 | 16.2 | 43.5 |
| 12 | 1.96 | 13.7 |
| 19 | 7.2 | 7.4 |
| 31 | 5.2 | 0.8 |
| 41 | 0.48 | 5.8 |
| 53 | 0.07 | 1.4 |
12-EDO sits on the Pareto frontier: no EDO with fewer than 12 tones achieves a smaller fifth error. 19-EDO and 31-EDO have better major thirds but larger fifth errors. The choice of 12 is optimal for keyboard instruments where physical key count is the binding constraint.
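The table and the frontier claim can be reproduced with a short script. This is a sketch under one specific reading of “Pareto frontier”: an EDO counts as dominated if some smaller EDO is at least as good on both the fifth and the major third. The function names are illustrative.

```python
import math

FIFTH = math.log2(3 / 2)  # just fifth, in octaves
THIRD = math.log2(5 / 4)  # just major third, in octaves


def error_cents(n: int, target: float) -> float:
    """Deviation in cents of the nearest n-EDO step count from a target interval."""
    steps = round(n * target)
    return 1200 * abs(steps / n - target)


def edo_errors(n: int) -> tuple[float, float]:
    """(fifth error, major third error) of n-EDO, both in cents."""
    return error_cents(n, FIFTH), error_cents(n, THIRD)


def on_frontier(n: int) -> bool:
    """True if no smaller EDO is at least as good on both intervals."""
    fifth_err, third_err = edo_errors(n)
    return not any(
        f <= fifth_err and t <= third_err
        for f, t in (edo_errors(m) for m in range(1, n))
    )


for n in (5, 7, 12, 19, 24, 31, 41, 53):
    fifth_err, third_err = edo_errors(n)
    print(n, round(fifth_err, 2), round(third_err, 2), on_frontier(n))
```

Under this definition 12-EDO and 19-EDO are both on the frontier, while 24-EDO is dominated because it repeats the fifth and third of 12-EDO with twice as many notes.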
Musical Connection / 音乐联系
From Pythagoras to the Modern Keyboard
The Pythagorean tuning system (pure fifths, ratio $3/2$) was the standard in European music until the Renaissance. Instruments tuned in pure fifths cannot play in all twelve keys without audible “wolf intervals”, the gaps created by the undistributed Pythagorean comma. Keyboard instruments (organ, harpsichord, piano) are fixed-pitch instruments: unlike a violin, the player cannot adjust intonation in real time.
The mathematical problem is exactly the one this episode poses: find $n$ such that $n \log_2(3/2) \approx m$ (an integer). The CF answer $m/n = 7/12$ means 12-EDO is not arbitrary: it is the minimum $n$ for which the tempered fifth lies within 2 cents of the just fifth, so the circle of fifths closes with less than 2 cents of adjustment per fifth.
In short: “Twelve keys come from the best rational approximation of an irrational number. This is decided by mathematics, not by historical accident.”
Cultural context and alternatives: The CF result explains why multiple independent musical traditions converged on 12. East Asian court music (Chinese 十二律 shí’èr lǜ, recorded since at least the Zhou dynasty ~1046 BCE) and Western keyboards both use 12 divisions — the continued fraction does not respect cultural boundaries.
Yet 12 is not uniquely optimal for all musical criteria:
- Arabic maqam uses 24 tones per octave (quarter tones), approximating neutral thirds (ratios between 5/4 and 6/5) unavailable in 12-EDO
- Indian shruti systems use 22 microtonal divisions (the 22 shrutis), targeting pure harmonic series ratios in multiple just-intonation contexts
- Turkish makam uses 53-EDO theoretically (the 6th convergent, $31/53$, with fifth error < 0.1 cents), though performers bend pitches continuously
- Gamelan (Javanese/Balinese) uses 5-tone (sléndro) and 7-tone (pélog) systems, loosely comparable to 5-EDO and 7-EDO (pélog steps in particular are unequal), optimizing for different interval targets
The CF framework unifies these: each tradition selects a different convergent of a different target irrational, depending on which just interval (fifth, fourth, neutral third, harmonic seventh) is considered foundational.
12-EDO’s decisive advantage for harmony: The major third in 12-EDO is 400 cents, versus the just 5:4 ratio at 386.31 cents — an error of 13.7 cents, audibly significant in unaccompanied voices but masked by timbre in piano tone. Marin Mersenne (1636) and later equal-temperament advocates argued that the tradeoff — impure thirds in exchange for freedom to modulate — was worth it. J.S. Bach’s Wohltemperirtes Clavier (1722) demonstrated that a single keyboard could serve all 24 major and minor keys, which was impossible under meantone temperament.
Limits and Open Questions / 局限性与开放问题
- One-dimensional optimization: The CF analysis optimizes the fifth (3/2) approximation alone. A more complete treatment would simultaneously optimize fifths (3/2), major thirds (5/4), minor sevenths (7/4), and higher harmonics. This is a multi-objective problem, described by the Pareto frontier in the space of (fifth error, third error, seventh error, …), and has no single optimal solution. 12-EDO is a Pareto-optimal choice for the (fifth, third) pair, but not globally optimal for all harmonic series ratios.
- Timbre dependence of consonance: The CF result assumes that just intonation ratios (3/2, 5/4, etc.) are perceptually optimal. But consonance depends on timbre: Sethares (1993) showed that inharmonic timbres (bell tones, gamelan metal) generate different “natural” scales where 12-EDO is not optimal. The mathematical question becomes: given a timbre’s partial spectrum $\{f_1, f_2, \ldots, f_m\}$, which EDO minimizes the average beating rate?
- Is 12 the unique minimum?: The CF identifies 12 as the smallest $n$ with fifth error below 2 cents. But the threshold “2 cents” is a perceptual convention (just-noticeable difference for trained musicians). If a tolerance of 20 cents were acceptable, 5-EDO (fifth error about 18 cents) would suffice; any tolerance between roughly 2 and 16 cents still selects 12. The mathematical result is clean; the musical conclusion depends on human auditory perception, which varies by training and context.
- High-dimensional generalization: The Stern–Brocot tree gives the best approximations to a single irrational $x$. Simultaneously approximating multiple ratios (e.g., $\log_2(3/2)$ and $\log_2(5/4)$) requires multi-dimensional continued fractions (Jacobi–Perron algorithm, LLL basis reduction). The question of which EDO best approximates the full 5-limit just intonation lattice simultaneously is open in the sense that no single “best” answer exists independent of the weighting function.
- Dynamical systems view: The circle of fifths is the orbit of 0 under the rotation $t \mapsto t + \log_2(3/2) \pmod 1$. Its return time statistics (how quickly the orbit returns near 0) are governed by the CF partial quotients: a large $a_{k+1}$ means that $p_k/q_k$ is unusually accurate and that the next convergent lies far away. The partial quotient $a_5 = 3$ after the crucial convergent $7/12$ is relatively small (meaning 41-EDO is not drastically better than a naive estimate would suggest), while $a_9 = 23$ (later in the CF) signals a large gap: the preceding convergent $389/665$ is so accurate that no EDO of comparable size improves substantially on 665-EDO. Understanding the long-range statistics of the CF of $\log_2(3/2)$ connects to the Gauss–Kuzmin distribution of CF partial quotients, which holds for almost all real numbers; whether it holds for this particular number is an open question in metric number theory.
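The threshold dependence raised in the list above is easy to explore numerically. The sketch below finds the smallest EDO whose fifth error falls under a given tolerance; `smallest_edo_within` and the search cap `max_n` are illustrative assumptions, and the tolerances are sample values rather than established perceptual constants.

```python
import math

FIFTH = math.log2(3 / 2)  # just fifth, in octaves


def fifth_error_cents(n: int) -> float:
    """Absolute deviation of the n-EDO fifth from the just fifth, in cents."""
    steps = round(n * FIFTH)
    return 1200 * abs(steps / n - FIFTH)


def smallest_edo_within(threshold_cents: float, max_n: int = 1000):
    """Smallest n <= max_n with fifth error below the threshold, or None."""
    for n in range(1, max_n + 1):
        if fifth_error_cents(n) < threshold_cents:
            return n
    return None


for threshold in (20, 5, 2, 0.5, 0.1):
    print(threshold, smallest_edo_within(threshold))
```

The answers jump between convergent denominators (5, 12, 41, 53) as the tolerance tightens, and both the 5-cent and the 2-cent tolerance select 12.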
Among all -EDO systems with , 12-EDO maximizes the number of harmonic series ratios (with ) that are approximated to within the just-noticeable difference of 5 cents.
Falsification criterion: Exhibit an EDO with , , that approximates strictly more such ratios within 5 cents than 12-EDO does. (Preliminary computation suggests 19-EDO is a serious competitor for the 5-limit.)
Academic References / 参考文献
- Khinchin, A. Ya. (1964). Continued Fractions. University of Chicago Press. The standard reference for convergents, best approximations, and the Gauss–Kuzmin measure.
- Barbour, J. M. (1951). Tuning and Temperament: A Historical Survey. Michigan State College Press. Comprehensive history of tuning systems from Pythagorean through equal temperament, with detailed comma calculations.
- Sethares, W. A. (1993). Local consonance and the relationship between timbre and scale. Journal of the Acoustical Society of America, 94(3), 1218–1228. Demonstrates that optimal scales depend on timbre via the coincidence of partials.
- Sethares, W. A. (2005). Tuning, Timbre, Spectrum, Scale, 2nd ed. Springer. Full treatment of the interaction between inharmonic timbres and microtonal scales.
- Milne, A., Sethares, W. A., & Plamondon, J. (2007). Isomorphic controllers and dynamic tuning. Computer Music Journal, 31(4), 15–32. Multi-dimensional EDO optimization using LLL lattice reduction.
- Hardy, G. H. & Wright, E. M. (2008). An Introduction to the Theory of Numbers, 6th ed. Oxford University Press. Ch. X–XI. Best rational approximations and the theory of continued fractions.
- Niven, I., Zuckerman, H. S., & Montgomery, H. L. (1991). An Introduction to the Theory of Numbers, 5th ed. Wiley. Ch. 7. Dirichlet’s approximation theorem and applications.
- Deza, E. & Deza, M.-M. (2012). Encyclopedia of Distances, 2nd ed. Springer. Entry on “Pitch distance” and interval approximation.
- Balzano, G. J. (1980). The group-theoretic description of 12-fold and microtonal pitch systems. Computer Music Journal, 4(4), 66–84. Algebraic analysis of why 12 emerges from ℤ_n group structure.
- Erlich, P. (2006). A Middle Path Between Just Intonation and the Equal Temperaments. Xenharmonikôn 18, 159–199. Regular temperament theory and the Pareto frontier of EDO systems.
- Carey, N. & Clampitt, D. (1989). Aspects of well-formed scales. Music Theory Spectrum, 11(2), 187–206. Mathematical characterization of scales generated by a single interval.
- Stern, M. A. (1858). Über eine zahlentheoretische Funktion. Journal für die reine und angewandte Mathematik, 55, 193–220. Original paper on the Stern–Brocot sequence.
- Graham, R. L., Knuth, D. E., & Patashnik, O. (1994). Concrete Mathematics, 2nd ed. Addison-Wesley. Ch. 4 (Number Theory) and Ch. 6 (Special Numbers). Farey sequences and Stern–Brocot trees.
- Douthett, J. & Krantz, R. (2007). Maximally even sets and configurations: Common threads in mathematics, physics, and music. Journal of Combinatorial Optimization, 14(4), 385–410.
- Weisstein, E. W. “Pythagorean Comma.” MathWorld, a Wolfram Web Resource. https://mathworld.wolfram.com/PythagoreanComma.html