I want to explain something about what we do, and the change we made recently, and get your take on how this should work.
Obviously, recognizing the tag and stashing it somewhere in the backend databases is simple, and we will do it. Getting it to the UI (when, where, how) is a more complex set of decisions.
We don’t actually have the notion of “track artist” or “track composer” directly.
What we have is a list of track credits, which take the form of: (artist, role, category)
- artist is (hopefully) an artist id, otherwise it’s a textual artist name.
- role is a detailed role string like “Baritone Saxophone”.
- category is one of: Main Artist, Artist, Composer, Conductor, Ensemble, Production
We use the categories to drive a bunch of decisions internally, and in UI presentation to help decide what’s relevant and what should be displayed where. For example, in most UI that shows “artists” with the track, we show the main artist(s), conductor(s), and ensemble(s).
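To make the credit triple and the category-driven display rule concrete, here is a minimal Python sketch. The names `Credit`, `Category`, `DISPLAY_CATEGORIES`, and `display_artists` are hypothetical and the ids are made up; this illustrates the idea, not our actual code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Union


class Category(Enum):
    MAIN_ARTIST = "Main Artist"
    ARTIST = "Artist"
    COMPOSER = "Composer"
    CONDUCTOR = "Conductor"
    ENSEMBLE = "Ensemble"
    PRODUCTION = "Production"


@dataclass(frozen=True)
class Credit:
    # An int stands in for an artist id; a str is the fallback textual name.
    artist: Union[int, str]
    role: str
    category: Category


# Categories shown where the UI lists "artists" with a track (assumed from the text above).
DISPLAY_CATEGORIES = {Category.MAIN_ARTIST, Category.CONDUCTOR, Category.ENSEMBLE}


def display_artists(credits: List[Credit]) -> List[Credit]:
    """Keep only the credits whose category belongs in an "artists" line."""
    return [c for c in credits if c.category in DISPLAY_CATEGORIES]


credits = [
    Credit(101, "Piano", Category.MAIN_ARTIST),
    Credit(102, "Conductor", Category.CONDUCTOR),
    Credit("Some Orchestra", "Orchestra", Category.ENSEMBLE),
    Credit(103, "Composer", Category.COMPOSER),
    Credit(104, "Cover Art", Category.PRODUCTION),
]
shown = display_artists(credits)
```

Here `shown` keeps the main artist, conductor, and ensemble, and drops the composer and production credits.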
The other thing we do with the categories is drive the merging behavior between your local data and data from our providers.
Before I go further I should explain something else about how our metadata model works. It’s built in three layers: your tags at the bottom, our provider data above that, and your edits above that. These layers exist for every album, track, artist, work, and performance. Not every field exists in every layer (for example, we don’t have a way to get the “period” in which a work was composed from file tags, nor do we get your “import date” from our providers).
We don’t have editing UI yet, obviously, except for a few high-priority operations like album merging and album identification, but in the backend every field is editable.
In general, the behavior in Roon is to look at a field, and return the data from the highest layer that has that field populated. User edits trump provider data, provider data trumps local data.
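A minimal Python sketch of that lookup, assuming each layer is a plain dict and that "populated" means "present and not None". The layer names, `resolve_field`, and the sample values are all hypothetical.

```python
from typing import Any, Dict, Optional

# Layers ordered from highest priority to lowest.
LAYER_ORDER = ("user_edits", "provider", "local_tags")


def resolve_field(layers: Dict[str, Dict[str, Any]], field: str) -> Optional[Any]:
    """Return the value from the highest layer that has `field` populated."""
    for name in LAYER_ORDER:
        value = layers.get(name, {}).get(field)
        if value is not None:
            return value
    return None  # no layer has this field


album = {
    "local_tags": {"title": "Title From Tags", "import_date": "2015-06-01"},
    "provider": {"title": "Title From Provider", "label": "Provider Label"},
    "user_edits": {"label": "Edited Label"},
}

title = resolve_field(album, "title")              # provider beats local tags
label = resolve_field(album, "label")              # user edit beats provider
import_date = resolve_field(album, "import_date")  # only the local layer has it
```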
Part of what we are doing to make Roon more friendly to people with groomed collections is introducing a special kind of edit called “prefer local metadata”, which effectively flips the bottom two (local + provider) layers. This will be a field-level operation, but we’ll add a batch-edit mechanism so you can multi-select the whole library and check off the fields for which you would prefer to trust local data, if you want to do it across the board.
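Roughly, the flip and the batch operation could look like the Python sketch below. It assumes each item carries its own set of "prefer local" field names; `resolve_field`, `batch_prefer_local`, and the item layout are hypothetical, since the real edit is not built yet.

```python
from typing import Any, Dict, Iterable, List, Optional

DEFAULT_ORDER = ("user_edits", "provider", "local_tags")
# "Prefer local metadata" swaps the bottom two layers; user edits stay on top.
PREFER_LOCAL_ORDER = ("user_edits", "local_tags", "provider")


def resolve_field(item: Dict[str, Any], field: str) -> Optional[Any]:
    """Resolve one field, honoring the item's per-field "prefer local" set."""
    order = PREFER_LOCAL_ORDER if field in item["prefer_local"] else DEFAULT_ORDER
    for name in order:
        value = item["layers"].get(name, {}).get(field)
        if value is not None:
            return value
    return None


def batch_prefer_local(items: List[Dict[str, Any]], fields: Iterable[str]) -> None:
    """Mark the given fields as "prefer local" on every selected item."""
    wanted = set(fields)
    for item in items:
        item["prefer_local"] |= wanted


library = [
    {"layers": {"local_tags": {"title": "Local A", "genre": "Local Genre"},
                "provider": {"title": "Provider A", "genre": "Provider Genre"}},
     "prefer_local": set()},
    {"layers": {"local_tags": {"title": "Local B"},
                "provider": {"title": "Provider B"},
                "user_edits": {"title": "Edited B"}},
     "prefer_local": set()},
]

before = resolve_field(library[0], "title")
batch_prefer_local(library, ["title"])
after = resolve_field(library[0], "title")
```

Only the checked field flips: `genre` still resolves from the provider, and an explicit user edit still wins over both lower layers.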
Most of the following concerns the default behavior, that is, what people see when they don’t go clicking around on the “prefer local metadata” screens.
As you can imagine, there are a ton of little exceptions. For fields that contain release-specific information (UPC, ASIN, country, catalog number, label), we are planning to start trusting local tags over our own sources by default, since when that (statistically rare) data is present, it seems to be cleaner and more reliable than what our automatic systems can discern.
Track credits are also a special case. We don’t choose the whole local or provider list as a unit. That is almost what we were doing up until this week’s build, except that if a track had been identified by one of our providers, we counted negative information from that provider as “having data”. No good.
We also don’t merge the track credits list entry-by-entry because misspellings (or near misses) create noisy duplicates everywhere. We tried this kind of merging in an earlier version of the product, and it was detrimental to the overall experience.
Also, credits from tags are by nature “crippled” since they are strings, not identifiers, so we can’t unambiguously match them up in cases where two people with the same name exist in your library. When we can get data with proper identifiers, we prefer to use it.
So the way we merge track credits from local/provider data is on a category-by-category basis:
If your local data provides composers, and our provider doesn’t, we use local composers, else provider.
If your local data provides main performers, and our provider doesn’t, we use local main performers, else provider.
(We don’t currently extract other credit types from tags, but CONDUCTOR and ENSEMBLE are obvious candidates that we will add soon.)
Once we make a category-by-category decision, the list is stitched back together, and that’s what you see in the app.
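The category-by-category rule above can be sketched in Python as follows. The tuple layout, `by_category`, `merge_credits`, the stitching order, and the sample credits are assumptions for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple, Union

# (artist, role, category); artist is an id (int) or a textual name (str).
Credit = Tuple[Union[int, str], str, str]

# Stitching order is an assumption; the text only says the list is reassembled.
CATEGORIES = ("Main Artist", "Artist", "Composer", "Conductor", "Ensemble", "Production")


def by_category(credits: List[Credit]) -> Dict[str, List[Credit]]:
    """Group credits by their category string."""
    grouped: Dict[str, List[Credit]] = defaultdict(list)
    for credit in credits:
        grouped[credit[2]].append(credit)
    return grouped


def merge_credits(local: List[Credit], provider: List[Credit]) -> List[Credit]:
    """Per category: use provider credits if there are any, else local ones."""
    local_groups = by_category(local)
    provider_groups = by_category(provider)
    merged: List[Credit] = []
    for category in CATEGORIES:
        merged.extend(provider_groups.get(category) or local_groups.get(category, []))
    return merged


local = [
    ("Jane Doe", "Performer", "Main Artist"),
    ("John Smith", "Composer", "Composer"),
]
provider = [
    (501, "Piano", "Main Artist"),
    (777, "Engineer", "Production"),
]
merged = merge_credits(local, provider)
```

In this example the provider's main artist (with a proper id) replaces the tag-based one, the local composer survives because the provider has none, and the provider's production credit is carried through.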
Currently, we classify remixing-related roles (when data comes to us from our providers) in the production category. The problem is that, with our default merging behavior, you would only see the remixer in cases where there are no other production credits coming from our provider. It also seems strange to lump REMIXED BY in with “Script Supervisor” or “Cover Art”.
What category do you think REMIXED BY goes in?
I can see an argument for categorizing them as composer or performer. Production seems wrong to me. Maybe there’s a case for making a new category solely for remixers…but the bar is pretty high for doing that, and I’m not completely convinced (yet) that remixers are “special enough” to justify dedicated handling everywhere.