Beethoven + Statistical Analysis =…

The realm of “data analytics” is growing larger every day. I suppose it shouldn’t be a surprise that “big data” has finally found classical music.

The topic of the study: decoding Beethoven’s greatness by a statistical analysis of (some of) his music.

The Study

This study was done by Fabian Moss, Markus Neuwirth, Daniel Harasim, and Martin Rohrmeier. It’s published by the Public Library of Science and is available to read in full online.

The study’s stated goal is to:

[C]haracterize the essential features (dimensions) of tonal harmony on quantitative grounds by applying statistical methods to a recently published dataset, the Annotated Beethoven Corpus (ABC).

Unfortunately, the composition (get it?) of the study is part of the problem. This corpus is a set of 28,000 chord labels added to scores of Beethoven’s 16 string quartets. While Beethoven string quartets are wonderful pieces of music, judging Beethoven solely on this genre – and further isolating it to only chords – is not a recipe for useful analysis.

And the results bear that out…

The Results

Hmm…Maybe I should have taken piano lessons…

For example, in terms of chord frequency, the tonic (I and i) and the dominant (including the dominant seventh in all inversions) are overwhelmingly the most common chords. This would (hopefully) be obvious to any music theory student, and it certainly isn’t a groundbreaking statistical insight into Beethoven. I can’t think of any composer from this period who wouldn’t produce a similar result.
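To make "chord frequency" concrete, here is a minimal sketch of the kind of count involved. The function name `chord_frequencies` and the toy list of Roman-numeral labels are my own inventions for illustration. They are not taken from the ABC, and this is not necessarily how the authors did their counting.

```python
from collections import Counter

def chord_frequencies(labels):
    """Return (chord, relative frequency) pairs, most common first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [(chord, n / total) for chord, n in counts.most_common()]

# Toy, made-up sequence of chord labels (NOT real data from the corpus).
toy_labels = ["I", "V7", "I", "IV", "V", "I", "ii6", "V7", "I", "vi"]

for chord, share in chord_frequencies(toy_labels):
    print(f"{chord:>4}  {share:.0%}")
```

Even in a toy sequence like this, the tonic and dominant float to the top, which is the point: a raw frequency table mostly confirms what a theory textbook already says.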

There is potential, though.

For example, comparing the ABC dataset with similar datasets for other composers of the same era (Mozart, Weber, Rossini, Schubert) may lead to a kind of “aural fingerprint” for each of these composers. Once established, these fingerprints could show some of what made each composer unique. Other interesting possibilities include finding misattributed pieces of music that don’t fit their (attributed) composer’s style.
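One hedged way to picture such a fingerprint comparison: treat each composer's chord usage as a probability distribution and measure how far apart two distributions are. The sketch below uses the Jensen-Shannon divergence, which is my choice for illustration and not something the study proposes. The two label lists are invented placeholders, not real data for any composer.

```python
import math
from collections import Counter

def distribution(labels):
    """Turn a list of chord labels into a {chord: probability} dict."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {chord: n / total for chord, n in counts.items()}

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 = identical, 1 = no overlap."""
    chords = set(p) | set(q)
    # Mixture distribution: the average of p and q over all chords seen.
    m = {c: 0.5 * (p.get(c, 0.0) + q.get(c, 0.0)) for c in chords}

    def kl(a):
        # Kullback-Leibler divergence of a from the mixture m.
        return sum(pa * math.log2(pa / m[c]) for c, pa in a.items() if pa > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)

# Hypothetical, made-up chord sequences for two unnamed composers.
composer_a = ["I", "V7", "I", "IV", "V", "I", "ii6", "V7"]
composer_b = ["I", "V", "I", "vi", "IV", "I", "V", "bVI"]

score = js_divergence(distribution(composer_a), distribution(composer_b))
print(f"Divergence between the two toy fingerprints: {score:.3f}")
```

A score near 0 would mean two nearly indistinguishable chord vocabularies, and a score near 1 would mean almost no overlap. A real fingerprint would need far more than single-chord counts (chord transitions, for instance), but the comparison step would look something like this.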

The ABC dataset, though, is not helpful in this regard, not only because of the problems mentioned above (a single genre, chords only) but also because Beethoven’s string quartets are quite different from some of his other compositions. They could be used to show the evolution of Beethoven’s style across this genre, but that doesn’t appear to be the goal of this study either.

Until that kind of comparison exists, this is a lot of data without a clear and useful purpose.

You can read the whole research paper here.