The Million Song Dataset is an NSF funded project. The following density plots by decade were recently posted on Reddit here
What kind of songs is this about?
“The Million Song Dataset is also a cluster of complementary datasets contributed by the community:
SecondHandSongs dataset -> cover songs
musiXmatch dataset -> lyrics
Last.fm dataset -> song-level tags and similarity
Taste Profile subset -> user data
thisismyjam-to-MSD mapping -> more user data
tagtraum genre annotations -> genre labels
Top MAGD dataset -> more genre labels”
The web page has active links to each of these datasets. My guess is that they are more popular song than Opera or lieder.
Edit: “The core of the dataset is the feature analysis and metadata for one million songs, provided by The Echo Nest”