EP47: Counterpoint as Constraint Satisfaction
Overview / 概述
Renaissance composers navigated a labyrinth of roughly forty rules — covering intervals, voice crossings, leap resolutions, and cadential formulas — each time they wrote a single bar of counterpoint. Johann Joseph Fux codified these rules in his 1725 treatise Gradus ad Parnassum, creating the canonical five-species framework that has anchored counterpoint pedagogy for three centuries. What Fux described in prose, a modern computer scientist reads as a constraint satisfaction problem (CSP): variables are the note choices at each time step, domains are the sets of legal pitches, and constraints are the rule set.
This episode builds the CSP model precisely and analyzes its computational structure. The brute-force search space for forty time steps with thirty pitch candidates per step is — astronomically large. Arc consistency pruning, together with the especially tight constraints imposed by the cadential formula at the final positions, compresses the effective search space to roughly , a size that modern backtracking algorithms can navigate in seconds. Yet the episode also asks the harder question: the algorithm finds a legal solution, but Bach found a meaningful one. The gap between the two is the gap between constraint satisfaction and aesthetic understanding.
The musical thread runs through Palestrina’s characteristic soft constraints — smooth melodic motion, step-wise recovery after large leaps, avoidance of consecutive same-direction skips — none of which are hard prohibitions, but all of which define a stylistic signature. Encoding these as a penalty function transforms the CSP into a constraint optimization problem, where the objective quantifies how “Palestrina-like” a solution is. The inverse of that problem — learning the penalty weights from a corpus of genuine Palestrina counterpoint — is an open research question in computational musicology.
中文: “文艺复兴时期的作曲家,要手动满足四十条规则,才能写出一段对位法。每条旋律的音程、方向、节奏,都受到严格约束。计算机用约束满足,零点一秒就能找到合法方案——但那缺少了什么?”
Prerequisites / 前置知识
- Bach Voice-Leading Networks (EP03) — graph structure of voice-leading constraints
- Graph Theory and Forbidden Subgraphs (EP05) — forbidden interval pairs as forbidden subgraphs
Definitions / 定义
A constraint satisfaction problem is a triple where:
- is a finite set of variables,
- is a collection of domains, with being the set of candidate values for ,
- is a set of constraints, each being a relation on a subset of variables.
A solution is an assignment with for all , satisfying every constraint simultaneously.
Counterpoint instantiation (first species, 40 steps):
| CSP component | Counterpoint meaning |
|---|---|
| Variable , | Note choice at time step |
| Domain | Legal pitches (≈ 30 candidates per step) |
| Constraint set | Fux’s rule set (≈ 40 rules) |
A solution to the CSP is a fully legal first-species counterpoint line.
Johann Joseph Fux (1725) organized Palestrina-style counterpoint into five species, distinguished by the rhythmic ratio between the cantus firmus and the added voice:
| Species | Rhythmic ratio | Description |
|---|---|---|
| First | 1 : 1 | One note against one note (whole notes) |
| Second | 1 : 2 | Two notes per cantus note (half notes) |
| Third | 1 : 4 | Four notes per cantus note (quarter notes) |
| Fourth | Syncopated | Suspensions and resolutions |
| Fifth | Mixed | “Florid” counterpoint, all previous species |
First-species hard constraints (representative selection):
- Forbidden harmonic intervals: parallel perfect fifths, parallel octaves, parallel unisons
- Forbidden melodic intervals: augmented or diminished intervals
- Forbidden motion types: direct (hidden) fifths and octaves by similar motion
- Cadential constraint: the penultimate step must approach the final note by specific interval pairs
Each forbidden pattern corresponds to a constraint on a pair of consecutive or simultaneous variable assignments. This is structurally identical to a forbidden subgraph condition in graph coloring — see
.
A CSP is arc consistent if for every pair of variables linked by a constraint, and for every value , there exists at least one value such that the assignment satisfies all constraints between and .
The AC-3 algorithm enforces arc consistency by iteratively removing values from domains that have no supporting value in a neighboring domain. Each removal may trigger further propagation along the constraint graph.
In counterpoint: if the current note is tentatively assigned value , arc consistency checks whether there exists any legal value for compatible with . If not, is pruned from before any recursive search begins.
In a constraint satisfaction or optimization problem, constraints are classified by their role:
-
A hard constraint is a rule that every solution must satisfy. Violating it disqualifies the candidate. In counterpoint: “no parallel octaves” is a hard constraint.
-
A soft constraint (or preference) is a rule that good solutions should ideally satisfy, but violation is penalized rather than forbidden. Each soft constraint contributes a penalty term to an objective function.
The constraint optimization problem (COP) extends a CSP by adding a real-valued objective derived from soft constraint penalties:
where is the weight of the -th soft constraint and counts how many times violates it. The optimal solution minimizes subject to all hard constraints.
Palestrina soft constraints (representative):
| Soft constraint | Penalty increases when… |
|---|---|
| Smooth melodic motion | Step size of consecutive intervals exceeds a third |
| Post-leap recovery | A large leap is not followed by stepwise motion in the opposite direction |
| Avoid consecutive same-direction leaps | Two or more large leaps occur in the same direction |
Main Theorems / 主要定理
The set of legal first-species counterpoint solutions over time steps with domain size is exactly the solution set of a binary CSP with:
where each constraint in involves at most two consecutive variables or two simultaneous ones (cantus firmus note vs. counterpoint note). The constraint graph is a path graph augmented with cantus-firmus edges — sparse and structured.
For first-species counterpoint with time steps and domain size :
- Brute-force upper bound: total assignments.
- After arc consistency and cadential propagation: effective search space (order-of-magnitude estimate, not a tight bound).
- Legal solutions: many — the CSP is highly satisfiable, and the algorithm finds one in seconds.
The key insight is that many of these paths lead to distinct valid solutions, so the problem is not hard in the worst-case sense; it is the selection of a musically meaningful solution among the valid ones that is computationally uncharacterized.
The brute-force bound is immediate: .
The post-pruning estimate of follows from the narration-script’s order-of-magnitude calculation. After cadential back-propagation reduces the last several domains by a factor of each, and arc consistency prunes incompatible values at interior steps, the effective branching factor drops from 30 to roughly 10 per step. A conservative estimate gives reduced by about through constraint propagation, landing in the range. Modern backtracking solvers exploit additional look-ahead heuristics (MRV — minimum remaining values; LCV — least constraining value) to find a solution without exhausting this space.
The hard harmonic constraints of first-species counterpoint (forbidden parallel fifths, octaves, and augmented/diminished intervals) are in bijection with a set of forbidden induced subgraphs in the constraint graph of the CSP. This is the same structure analyzed in
EP05’s graph-coloring framework
.
Numerical Examples / 数值示例
Brute-force vs. pruned search:
This vastly exceeds the number of atoms in the observable universe () — though note it is smaller than , making brute force merely intractable rather than physically impossible. After arc-consistency propagation and cadential pruning, the script estimates the effective search space at:
At node evaluations per second (a conservative modern CPU estimate), exhaustive search of nodes would take roughly seconds ≈ 3 hours. But backtracking with forward-checking terminates as soon as one solution is found, making the typical runtime far shorter — empirically in the seconds-to-milliseconds range for well-constrained instances.
Domain reduction at the cadence:
Suppose the cantus firmus ends on C4. The cadential rule requires:
- (octave/unison of final) — domain size reduced from 30 to 3.
- must form a major sixth below or minor third above C, approaching by step — domain size reduced from 30 to approximately 4–6 pitches.
This 5-fold reduction at step 39 and 10-fold at step 40 alone prune the suffix search tree by a factor of , compounding with further interior propagation.
Soft constraint penalty example (Palestrina style):
Suppose a candidate counterpoint solution contains:
- 3 instances of a large leap not followed by contrary stepwise motion: penalty
- 2 instances of consecutive same-direction leaps: penalty
- 0 augmented melodic intervals (hard constraint satisfied): no penalty
Setting , the total soft penalty is . A solution with would satisfy all soft constraints — a fully idiomatic Palestrina line. The optimization problem seeks to minimize over all hard-constraint-satisfying assignments.
Musical Connection / 音乐联系
From Palestrina to Backtracking: The Algebra of Style
Palestrina’s counterpoint is famously described as “vocal” — flowing, singable, economical. The motion is predominantly by step (seconds), leaps are small (thirds or fourths), and when a large leap occurs the melody immediately reverses direction. These are not rules Palestrina consciously enumerated; they are the emergent signature of a stylistic tradition internalized over years of composition. Fux formalized this tradition into explicit rules, and those rules are exactly what a CSP encodes.
The gap between the CSP model and real Palestrina counterpoint is the gap between the rule set and the aesthetic intention behind the rule set. Fux’s rules are necessary conditions for Palestrina style; they are not sufficient. A CSP solver finds any legal solution — and there are vastly many. Most of them are monotonically ascending or descending lines that happen to avoid forbidden intervals but go nowhere musically. The missing ingredient is direction: a good counterpoint line has an arc, a climax, a sense of arrival.
This distinction maps precisely onto the hard/soft constraint dichotomy. Hard constraints eliminate illegal solutions; soft constraints, with appropriate weights, steer the solver toward solutions that resemble actual Palestrina. The weights encode aesthetic judgment — and learning those weights from a corpus of authentic Palestrina is an instance of inverse optimization (or inverse reinforcement learning in modern terminology).
The five species as complexity levels: From a CSP perspective, the five species form a complexity hierarchy. First species is purely harmonic (one constraint per time step pair). Second and third species add rhythmic constraints (subdivisions must also avoid forbidden intervals). Fourth species introduces anticipation constraints (the suspension must resolve down by step). Fifth species combines all constraint types simultaneously, yielding the densest constraint graph and the smallest effective domain at each step.
中文: “帕莱斯特里纳风格的特点:旋律流动平稳,大跳之后反向级进,避免同方向连续大跳。这些偏好不是硬约束,而是软约束——违反不是错误,但优质解应尽量遵守。”
Bach vs. the algorithm: Bach’s counterpoint, which follows a different and somewhat looser rule set than Fux’s Palestrina-species framework, exhibits a further layer of intentionality that no pure constraint model captures: motivic coherence. Bach’s inner voices often quote or invert the subject; rhythmic patterns repeat at transposed pitch levels; the texture as a whole expresses a single generative idea. None of this is a constraint in the Fux sense. It is the difference between solving a puzzle and writing a proof — both require satisfying conditions, but the proof must also be about something.
中文: “算法找到的是合法解,巴赫找到的是有灵魂的解——两者的距离,是约束满足与意义理解的距离。”
Limits and Open Questions / 局限性与开放问题
-
Fux’s rules are a stylistic approximation, not a specification. Fux codified Palestrina style as he understood it in 1725, introducing some rules that Palestrina himself did not consistently follow. The CSP model of “Palestrina counterpoint” is therefore a model of Fux’s model, with its own systematic biases. Any computational study must account for the gap between the historical corpus and the pedagogical codification.
-
Multi-voice complexity. First-species CSPs with a single counterpoint voice are polynomial in practice (the constraint graph is a path). Adding a second or third counterpoint voice introduces cross-voice constraints, densifying the constraint graph and potentially pushing the problem toward NP-hard territory. Whether -voice free counterpoint (without a cantus firmus) is polynomial-time solvable is an open question stated explicitly in the narration.
-
Soft constraint weights are not uniquely determined. Different weight vectors can produce different “optimal” solutions, all of which satisfy the hard constraints. The learned weights depend on the corpus, the feature engineering (which soft constraints are included), and the optimization method. There is no canonical set of weights for “Palestrina style.”
-
Aesthetic quality is not decomposable. The penalty function assumes that aesthetic quality is a sum of independent local penalties. This is almost certainly false: a large leap followed by contrary stepwise motion is good; the same leap in a different melodic context may be awkward. The penalty for violation of one soft constraint depends on the surrounding context, violating the additive independence assumption.
-
The search for musical meaning is not a search problem. A CSP or COP can find solutions that minimize measurable deviations from stylistic norms. But the question “why does this counterpoint phrase feel inevitable, even beautiful?” is not reducible to constraint satisfaction. This is not a limitation of the particular algorithm; it may be a fundamental limitation of the rule-based paradigm.
For independent voices with no fixed cantus firmus, the problem of finding a joint assignment satisfying all pairwise Fux-style constraints — or determining that no such assignment exists — is NP-complete.
Reasoning: As the number of voices grows, the constraint graph acquires cross-voice constraint edges per time step, and the joint domain per time step scales as . The interaction structure between voices at different steps — where a choice at step in voice constrains not only voice at step but also voice at step via harmony — can in principle encode 3-SAT instances. The conjecture predicts that even deciding feasibility (existence of a legal solution) is NP-complete for sufficiently large .
Falsification criterion: Exhibit a polynomial-time algorithm for 3-voice free counterpoint with all Fux hard constraints, or prove the problem lies in P. If the constraint graph’s treewidth remains bounded as grows, tractable dynamic programming would be possible — this would disprove the conjecture.
Given a corpus of authentic Palestrina counterpoint examples, the optimal soft constraint weight vector that maximizes the average score of corpus examples relative to randomly generated legal counterpoints can be recovered by inverse optimization (or maximum margin methods) with sample complexity , independent of the length of the counterpoint.
Falsification criterion: Show that the learned on a training corpus generates counterpoints rated by expert musicians as significantly better than those from a random legal CSP solution, on a held-out evaluation corpus. If no such improvement is found across multiple weight-learning methods and corpus sizes, the conjecture is falsified.
Academic References / 参考文献
-
Fux, J. J. (1725). Gradus ad Parnassum. Vienna. English translation: Mann, A. (1943). The Study of Counterpoint. W. W. Norton.
-
Russell, S. J., & Norvig, P. (2020). Artificial Intelligence: A Modern Approach, 4th ed. Pearson. Chapter 6 (Constraint Satisfaction Problems).
-
Mackworth, A. K. (1977). Consistency in networks of relations. Artificial Intelligence, 8(1), 99–118. — Original arc-consistency paper.
-
Tsang, E. (1993). Foundations of Constraint Satisfaction. Academic Press. Free PDF: https://www.essex.ac.uk/people/tsang/books/fcs.htm
-
Lerdahl, F., & Jackendoff, R. (1983). A Generative Theory of Tonal Music. MIT Press. — Formalization of musical grammar as rule systems.
-
Ebcioglu, K. (1988). An expert system for harmonizing four-part chorales. Computer Music Journal, 12(3), 43–51. — Early rule-based counterpoint generation.
-
Pachet, F., & Roy, P. (2001). Musical harmonization with constraints: A survey. Constraints, 6(1), 7–19.
-
Anders, T., & Miranda, E. R. (2011). Constraint application with higher-order programming for modeling music theories. Computer Music Journal, 35(2), 25–41.
-
Farbood, M., & Schöner, B. (2001). Analysis and synthesis of Palestrina-style counterpoint using Markov chains. Proceedings of the International Computer Music Conference (ICMC).
-
Conklin, D. (2003). Music generation from statistical models. Proceedings of the AISB 2003 Symposium on Artificial Intelligence and Creativity in the Arts and Sciences, 30–35.
-
Herremans, D., & Chuan, C.-H. (2017). A functional taxonomy of music generation systems. ACM Computing Surveys, 50(5), Article 69.
-
Pearce, M. T., & Wiggins, G. A. (2007). Evaluating cognitive models of musical composition. Proceedings of the 4th International Joint Workshop on Computational Creativity, 73–80.
-
Briot, J.-P., Hadjeres, G., & Pachet, F. (2020). Deep Learning Techniques for Music Generation. Springer. Chapter 2 (Rule-based and Constraint-based Methods).
-
Collins, T., Laney, R., Willis, A., & Garthwaite, P. H. (2016). Developing and evaluating computational models of musical style. Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 30(1), 16–43.
-
Huang, C.-Z. A., Cooijmans, T., Roberts, A., Courville, A., & Eck, D. (2017). Counterpoint by convolution. Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR), 211–218.