When it comes specifically to correcting metadata errors, I think there are more efficient ways to use our resources than building a crowdsourcing system for it.
I think there are many classes of metadata corrections that could be handled by machines. In my opinion, we should exhaust our options in that department before spending human effort on things that machines could have done.
I think the inferences being made are a little bit more absolute than my actual thoughts on this.
We’re not anti-crowdsourcing. We have already used it for the translation system. We are planning to use it for an internet radio directory, and for some other stuff too.
The trouble is with using it specifically for metadata corrections, and as a mechanism for adding unknown albums to the system. This seems like a very technical endeavor to me. It is hard to imagine doing it inside of Roon itself: it would require a very complicated user interface, much more technical than our current editing tools. And it would require a fair amount of education for people to use it, which I am not sure is practical.
Another problem is that I think the number of people who would participate is fairly small: a few prolific contributors, plus a few dozen who participate more rarely. I don’t think we will make major progress on the problem with a crowd that size. There are 11 million release entries in our system, plus 1,000+ new additions per day. That is a pretty large problem to handle manually in a way that would truly produce transformative change.
A third problem is the lack of instant gratification. I think this is a really big one. If someone is going to put effort into grooming metadata, they expect to enjoy the results when they finish. Without that serotonin feedback loop, it is pretty unlikely that we will bootstrap a successful population of editors.
The problem there is that ingesting new or modified metadata into our system is time-consuming from a compute perspective. There are some large-scale data processing steps that need to reason globally. For example, one algorithm mines all of the release data in order to discover and materialize composition entities when they weren’t otherwise present. Our metadata system ingests and rebuilds data on a daily schedule, which is incompatible with instant gratification.
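To make the “global reasoning” point concrete, here is a rough sketch of why a step like composition discovery has to scan the whole catalog rather than just the release someone edited. This is not our actual pipeline; the data shapes, field names, and threshold below are simplified assumptions for illustration only.

```python
# Hypothetical sketch (not the real Roon pipeline): composition discovery
# as a whole-catalog batch step rather than a per-edit update.
from collections import defaultdict

def normalize(text):
    """Crude normalization so 'Symphony No. 5' and 'symphony no 5' match."""
    return " ".join(text.lower().replace(".", "").split())

def discover_compositions(releases, min_releases=3):
    """Group tracks across ALL releases by (composer, work title) and
    materialize a composition entity only when enough independent releases
    agree. A single corrected track can't be evaluated in isolation;
    the supporting evidence lives in every other release in the catalog."""
    evidence = defaultdict(set)  # (composer, title) -> set of release ids
    for release in releases:
        for track in release["tracks"]:
            key = (normalize(track["composer"]), normalize(track["title"]))
            evidence[key].add(release["id"])
    return [
        {"composer": composer, "title": title, "source_releases": sorted(ids)}
        for (composer, title), ids in evidence.items()
        if len(ids) >= min_releases
    ]

# Toy usage: with an 11M-release catalog, a full pass like this is why
# rebuilds run on a schedule instead of instantly after each edit.
catalog = [
    {"id": 1, "tracks": [{"composer": "Beethoven", "title": "Symphony No. 5"}]},
    {"id": 2, "tracks": [{"composer": "beethoven", "title": "symphony no 5"}]},
    {"id": 3, "tracks": [{"composer": "Beethoven", "title": "Symphony No. 5"}]},
]
print(discover_compositions(catalog))
```

The point of the sketch is the shape of the computation, not the details: one user’s edit only becomes a materialized composition after a pass over everything else, which is what pushes this kind of work into a scheduled rebuild.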
Crowdsourcing works better for things with less surface area and no “global-scale” reasoning like that. Fanart.tv is one of our data sources, and the idea is very simple: provide great high-res album/artist artwork. That is something anyone can do in a few minutes with a Google Images search.
Spotify crowdsources playlists, and that works out great too. People already know how to make playlists; no education or understanding of complex data models is required. Lots of people can participate, not just a narrow set of committed, technical people.