Mapping of remixers from own tags

There are still a lot of mix albums where you now take the performers and composers from our groomed tags if you don’t have the information from your providers. This is working pretty well, as I already posted.

Can you also do that for the remixers? That would be great. I have a lot of albums where the remixers are missing. All my tracks are completely tagged with the correct remixers under REMIXED BY.

I want to explain something about what we do, and the change we made recently, and get your take on how this should work.

Obviously, recognizing the tag and stashing it somewhere in the backend databases is simple, and we will do it. Getting it to the UI (when, where, how) is a more complex set of decisions.

We don’t actually have the notion of “track artist” or “track composer” directly.

What we have is a list of track credits, which take the form of: (artist, role, category)

  • artist is (hopefully) an artist id, otherwise it’s a textual artist name.
  • role is a detailed role string like “Baritone Saxophone”.
  • category is one of: Main Artist, Artist, Composer, Conductor, Ensemble, Production

We use the categories to drive a bunch of decisions internally, and in UI presentation to help decide what’s relevant and what should be displayed where. For example, in most UI that shows “artists” with the track, we show the main artist(s), conductor(s), and ensemble(s).
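To make the credit model concrete, here is a minimal Python sketch of the (artist, role, category) form and the "artists shown with the track" rule. The class and function names are illustrative assumptions, not Roon's actual code.

```python
from typing import NamedTuple, Union


class Credit(NamedTuple):
    artist: Union[int, str]  # an artist id when known, otherwise the raw name text
    role: str                # detailed role string, e.g. "Baritone Saxophone"
    category: str            # one of the six categories listed above


# Categories shown wherever the UI lists "artists" next to a track.
SHOWN_WITH_TRACK = {"Main Artist", "Conductor", "Ensemble"}


def artists_shown_with_track(credits):
    """Return the artists whose category makes them visible with the track."""
    return [c.artist for c in credits if c.category in SHOWN_WITH_TRACK]


# Made-up example data: ids and names are placeholders.
example_credits = [
    Credit(101, "Baritone Saxophone", "Artist"),
    Credit(202, "Primary Artist", "Main Artist"),
    Credit("Some Orchestra", "Orchestra", "Ensemble"),
    Credit(303, "Cover Art", "Production"),
]

print(artists_shown_with_track(example_credits))
```

Here only the main artist and the ensemble would be listed; the saxophone and cover art credits stay in the full credit list.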

The other thing we do with the categories is drive the merging behavior between your local data and data from our providers.

Before I go further I should explain something else about how our metadata model works. It’s built in three layers: your tags at the bottom, our provider data above that, and your edits above that. These layers exist for every album, track, artist, work, and performance. Not every field exists in every layer (for example, we don’t have a way to get the “period” in which a work was composed from file tags, nor do we get your “import date” from our providers).

We don’t have editing UI yet obviously, except for a few high priority edit operations like album merging, identify album, etc, but in the backend every field is editable.

In general, the behavior in Roon is to look at a field, and return the data from the highest layer that has that field populated. User edits trump provider data, provider data trumps local data.
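The lookup rule can be sketched in a few lines of Python. This is only an illustration of "highest populated layer wins"; the layer names, the dictionary layout, and the example values are assumptions made for the sketch.

```python
# Layers from highest to lowest priority, as described above.
LAYERS_HIGH_TO_LOW = ("user_edits", "provider", "local_tags")


def resolve_field(record, field):
    """Return the value from the highest layer that has the field populated."""
    for layer in LAYERS_HIGH_TO_LOW:
        value = record.get(layer, {}).get(field)
        if value is not None and value != "":
            return value
    return None


# Placeholder album: not every field exists in every layer.
album = {
    "local_tags": {"title": "kind of blue", "import_date": "2015-06-01"},
    "provider": {"title": "Kind of Blue", "label": "Columbia"},
    "user_edits": {},
}

print(resolve_field(album, "title"))        # provider beats local tags
print(resolve_field(album, "import_date"))  # only the local layer has it
```

Adding a title to `user_edits` would override both lower layers without touching them.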

Part of what we are doing to make Roon more friendly to people with groomed collections is introducing a special kind of edit called “prefer local metadata”, which effectively flips the bottom two (local + provider) layers. This will be a field-level operation, but we’ll make a batch edit mechanism so you can multi-select the whole library and check off a list of fields for which you would prefer to trust local data if you want to do it across the board.
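As a rough sketch of how such a field-level flip could behave, the Python below stores a per-record set of "prefer local" fields and swaps the bottom two layers only for those fields. The feature was still being built at the time of this post, so every name here (`prefer_local_fields`, `batch_prefer_local`, the layer keys) is a hypothetical stand-in.

```python
DEFAULT_ORDER = ("user_edits", "provider", "local_tags")
PREFER_LOCAL_ORDER = ("user_edits", "local_tags", "provider")


def resolve_field(record, field):
    """Pick the layer order per field, then take the first populated value."""
    prefer_local = field in record.get("prefer_local_fields", set())
    order = PREFER_LOCAL_ORDER if prefer_local else DEFAULT_ORDER
    for layer in order:
        value = record.get(layer, {}).get(field)
        if value is not None and value != "":
            return value
    return None


def batch_prefer_local(records, fields):
    """Hypothetical batch edit: mark the given fields as local-first on every record."""
    for record in records:
        record.setdefault("prefer_local_fields", set()).update(fields)


# Placeholder library with two albums.
library = [
    {"local_tags": {"title": "My Title A"}, "provider": {"title": "Provider Title A"}},
    {"local_tags": {"title": "My Title B"}, "provider": {"title": "Provider Title B"}},
]

before = [resolve_field(r, "title") for r in library]
batch_prefer_local(library, {"title"})
after = [resolve_field(r, "title") for r in library]
print(before, after)
```

User edits still sit on top in both orders, so a manual edit would win regardless of the flip.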

Most of the following concerns the default behavior, that is, what people see when they don’t go clicking around on the “prefer local metadata” screens–

As you can imagine there are a ton of little exceptions. For fields that contain release specific information (upc, asin, country, catalog#, label), we are planning to start trusting local tags over our own sources by default, since when that (statistically rare) data is present, it seems to be cleaner and more reliable than what our automatic systems can discern.

Track credits are also a special case. We don’t choose the whole local/remote list as a unit (this is almost what we were doing up until this week’s build, except we were considering negative information from our providers to still count as “having data” if the track was identified by one of them–no good).

We also don’t merge the track credits list entry-by-entry because misspellings (or near misses) create noisy duplicates everywhere. We tried this kind of merging in an earlier version of the product, and it was detrimental to the overall experience.

Also, credits from tags are by nature “crippled” because they are strings, not identifiers. We can’t unambiguously match them up when two people with the same name exist in your library, so when we can get data with proper identifiers, we prefer to use it.

So the way we merge track credits from local/provider data is on a category-by-category basis:

If your local data provides composers, and our provider doesn’t, we use local composers, else provider.
If your local data provides main performers, and our provider doesn’t, we use local main performers, else provider.

(We don’t currently extract other credit types from tags, but CONDUCTOR and ENSEMBLE are obvious candidates that we will add soon.)

Once we make a category-by-category decision, the list is stitched back together, and that’s what you see in the app.
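The category-by-category rule can be sketched as follows. This is a minimal Python illustration, assuming credits are plain (artist, role, category) tuples; it applies the same rule to every category, even though only composers and main performers are currently read from tags, and the function names are made up for the example.

```python
from collections import defaultdict

CATEGORY_ORDER = ("Main Artist", "Artist", "Composer",
                  "Conductor", "Ensemble", "Production")


def group_by_category(credits):
    """Bucket (artist, role, category) tuples by their category."""
    grouped = defaultdict(list)
    for credit in credits:
        grouped[credit[2]].append(credit)
    return grouped


def merge_track_credits(local, provider):
    """Per category: use provider credits if there are any, else local ones."""
    local_by_cat = group_by_category(local)
    provider_by_cat = group_by_category(provider)
    merged = []
    for category in CATEGORY_ORDER:
        chosen = provider_by_cat.get(category) or local_by_cat.get(category, [])
        merged.extend(chosen)  # stitch the per-category choices back together
    return merged


# Placeholder data: the provider knows the performer but has no composer.
local = [("Glenn Gould", "Performer", "Main Artist"),
         ("J. S. Bach", "Composer", "Composer")]
provider = [(77, "Piano", "Main Artist")]

print(merge_track_credits(local, provider))
```

The decision is made once per category, so a single provider credit in a category hides every local credit in that same category.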

Currently, we classify remixing-related roles (when data comes to us from our providers) in the production category. The problem is–with our default merging behavior, you would only see the remix guy in cases where there are no other production credits coming from our provider. It seems strange to lump REMIXED BY in with “Script Supervisor” or “Cover Art”.

What category do you think REMIXED BY goes in?

I can see an argument for categorizing them as composer or performer. Production seems wrong to me. Maybe there’s a case for making a new category solely for remixers…but the bar is pretty high for doing that, and I’m not completely convinced (yet) that remixers are “special enough” to justify dedicated handling everywhere.


Is there an ETA for this? Will the Album-title, track-title, disc#, track#, Album-Artist, Artist, Composer, and Genre fields all be available via the batch-edit mechanism? What about album art, specifically as folder.jpg or embedded within tags?

@trtlock : regarding album art, you are already able to use the album art that’s in your tags.

@brian : as I am a software developer myself, let me think a little bit about this and get back to you. First, many thanks for your detailed post; it gives some real insight into how you handle things…

Yes, but only if you edit one album at a time – I’m asking if this will be part of the automated batch-edit procedure.

Is there an ETA for this?

It’s being worked on right now, and it’s our #2 priority (behind iOS support).

I generally don’t name dates until features go into alpha testing–before then, there’s always the possibility that priorities will shift around or delays will occur.

There’s a few weeks of effort here–unlike some of the other changes we’ve released in the audio area, pretty much everyone on the team ends up touching part of this one, there’s a substantial amount of UI to be designed and built, and a lot of testing/QA is required since anytime we muck around in the data modeling stuff, there’s a risk of regressions, corruption, data loss.

Album-title, track-title, Album-Artist, Artist, Composer, Album art

Yes

disc#, track#,

These are already solely determined by local data (munged from file names, directory names for certain multi-disc set organization schemes, and file tags). I don’t think anything is changing here other than perhaps making them editable manually in the UI.

Genre field

I think genre is going to be handled differently.

We will probably keep local genres and Roon’s genres separate in the data model (insofar as representing how they attach to albums and artists), but let them live in the same hierarchy with Roon’s genres, and make that hierarchy user editable.

We are planning to make genres of both types editable (add/remove from album/artist), so personally offensive cases can be fixed as one-offs. We may make settings for hiding/showing local genres or Roon’s genres for people who prefer to see only one or the other as a higher-level choice.

If you are not a groomer, your genre tags are probably a mess. I mined a bunch of un-groomed files and looked…it’s really gross what’s actually in that tag field in real life, and we don’t necessarily want to pollute the app with that mess by default.

This is still in flux, and we plan to solve the problem. It just requires a more complicated solution than the other fields.


Of course you’re right on that one…

Yes. We will support batch prefer local artwork.

Ha! Even if you are a groomer, the genre field is usually a mess. Currently, I have a bazillion albums under Pop-Rock, as the inherent subjectivity of genres, & the time-consuming process to come to a definitive decision while grooming eventually caused me to “deal w/all that later.” So Pop-Rock became the default over the years.

I would actually love to keep Roon’s (AMG’s, for all intents & purposes) pre-ID’d genres for all albums I have, and would not want to revert to my own tags in this area.

Hmm…except I have carefully changed genres within most of my live albums to include Stage Chatter, Introduction, Tuning, etc for tracks that aren’t songs. Mainly so when I invoke random-play of a genre, I don’t end up hearing 3 minutes of tuning, or Jim Morrison subjecting the audience to yet another shambling, drunken rap. Be a shame to overwrite all those tags, but not the end of the world if there’s a fairly easy way to re-instate afterwards.

Of course, at some time I would want to be able to fix obvious “mistakes,” and especially deal w/genre on a track-by-track basis within an album.

Just ruminating on all this…any thoughts?

This constitutes a really important insight, IMO. Can’t wait for the full-on editing UI.

Hmm…except I have carefully changed genres within most of my live albums to include Stage Chatter, Introduction, Tuning, etc for tracks that aren’t songs. Mainly so when I invoke random-play of a genre, I don’t end up hearing 3 minutes of tuning, or Jim Morrison subjecting the audience to yet another shambling, drunken rap. Be a shame to overwrite all those tags, but not the end of the world if there’s a fairly easy way to re-instate afterwards.

We solve this problem more directly using a feature called “banning” a track or album. You “ban” something by clicking the favorite heart twice.

If a track/album is banned, then it will never come up when playing a larger set of material.

For example, in this case, tracks 2 + 3 will never come up in radio play, artist play, genre play, etc.

Also, if you play the album using “PLAY ALBUM” they will be skipped.

They can still be played explicitly, or by selecting all of the tracks on the album and playing the selection.

This is what we use to skip skits, talking, applause, etc.
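A small Python sketch of the banning behavior described above, assuming a simple per-track `banned` flag. The `Track` class and both function names are invented for illustration and say nothing about how Roon stores this.

```python
from dataclasses import dataclass


@dataclass
class Track:
    number: int
    title: str
    banned: bool = False


def play_album(tracks):
    """'PLAY ALBUM' style playback (also radio, artist, genre play): skip banned tracks."""
    return [t.number for t in tracks if not t.banned]


def play_selection(tracks, selected_numbers):
    """Explicitly selected tracks play even when they are banned."""
    return [t.number for t in tracks if t.number in selected_numbers]


# Placeholder live album where tracks 2 and 3 are banned.
album = [
    Track(1, "Opening Song"),
    Track(2, "Tuning", banned=True),
    Track(3, "Stage Chatter", banned=True),
    Track(4, "Closing Song"),
]

print(play_album(album))
print(play_selection(album, {1, 2, 3, 4}))
```

The tags themselves are never touched, so genre tags can stay as provided while the non-song tracks are still kept out of random play.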

Cool – that’ll work.

@brian : First of all let me say that if there was no “Roon Experience” everything would be much easier so you might re-think…

…haha, ok forget that

Let’s get serious again. What I want to say with this is that your plans involve really demanding and tricky challenges.

So here are my questions, thoughts, answers on this

This sounds really great.
So will we be able to select the tag-names we used and map them to the corresponding fields that you open for using local metadata? I think this will be necessary as people are using custom-tags, if there were just certain tag names mapped hardcoded this might get problematic.

I agree with you. I don’t think anyone fills such tags with garbage.
Same question as for the previous section applies of course. I know of people using catalog#, catalogno, catalognumber etc. for instance.

Yes, this was really not the best way of handling it, but from what I have seen so far, the new behavior is much better for track artists and composers. Others have confirmed that too, from what I read.

I see the difficulty here. And you really have the misspelling problem from both sides, from the user and from the metadata providers, at least regarding track names. I haven’t checked that for composers or artists yet.

I really see a problem child here. It’s called artist name variation. Especially in electronic music a lot of artists, and so also remixers, use artist name variations. In order to match that up correctly, also from a more general point of view it really depends on what the metadata-provider delivers and if this is right. In the Discogs API, for instance, you have the possibility to resolve artist name variations.
That means if in the track credits the artist is CL2 you get back Chris Liebing instead. You can do that but this is not what is written as track artist on the album then. So if you want to mirror the credits of your album you have to use the artist name variations. Will these be handled as own artists then?

As far as I understood, this process of checking against the provider and deciding only applies if we don’t use the future “prefer local metadata” functionality, right? Will there ever be a match to your ID’d artists then, if I prefer to use my artist tags?

I completely agree with you here that it should not be in the “Production” category, especially given your examples which are real production tasks/jobs.
I also see your problem creating a new category, special code has to be added everywhere you work with the categories.
So this is really a good question. Regarding the composer and performer categories. Are there any other roles in there other than “Composer” or “Performer”? And you just wrote in another post when they’re displayed and when not

because I think a remixer should always be displayed if it is a remix and the information is there. Sure one can write this info also in the trackname, but then there will be no way at all to have a roon-like experience with that.

So will we be able to select the tag-names we used and map them to the corresponding fields that you open for using local metadata? I think this will be necessary as people are using custom-tags, if there were just certain tag names mapped hardcoded this might get problematic.

Making this into a dynamic/customizable system has architectural and performance implications. In many (not all) configurations, that critical initial import experience that happens on the first run and forms most users’ first impressions of the app is very tightly optimized. Re-building the tag mapping into a dynamic/programmable system while keeping the current performance characteristics would be a lot of work. Also, I think if we were to go down that road, it would also become important for us to support user-defined fields, and perhaps even custom object types in the data model in order to do a complete job. Big project, and not necessarily where we are planning to go.

For now, at least, we are hardcoding. There is not a lot of ambiguity here–sure there are four ways to say “catalog number”, but something called “catalognumber” never means something else. Adding new cases over time should not be a problem.

I really see a problem child here. It’s called artist name variation. Especially in electronic music a lot of artists, and so also remixers, use artist name variations. In order to match that up correctly, also from a more general point of view it really depends on what the metadata-provider delivers and if this is right.

We get pretty good data on alternate names from Rovi and MusicBrainz. It’s not currently exposed anywhere in the UI, but we use it heavily in the backend. I haven’t dug deeply into the Discogs data model yet, but if they have done a similarly good job at managing artist identity, we should have a lot of options here.

That means if in the track credits the artist is CL2 you get back Chris Liebing instead. You can do that but this is not what is written as track artist on the album then. So if you want to mirror the credits of your album you have to use the artist name variations. Will these be handled as own artists then?

Honestly, our thoughts on this topic aren’t firmly developed enough at this point for me to give you a good answer.

What I will say is this: in our system, artist links are standalone entities, not track-level metadata. If the person has a single ID, they will be displayed consistently throughout Roon. If they are split into multiple IDs, then each ID could display differently. We don’t have the machinery to support displaying links to the same artist ID differently in different contexts, and building that machinery is not on the roadmap.

The difficult editorial question is: when should one person have two artist ids? And how can we figure this out automatically and make the data look “right”? We have lots of information about alternate names, “also performed as” records, etc. in our backend databases.

This may be a philosophical conflict. The release is not the center of the universe in Roon–most of our power comes from making other things (artists, composers, works, tracks, performances, …) equally as first-class. One of the consequences of making artists first class is that we have a responsibility to make them behave consistently wherever they are displayed. Otherwise the product begins to become a dishonest representation of the underlying data model, and it becomes difficult to clearly understand what’s going on, form intuitions, etc.

One of the things I’ve learned repeatedly in the few weeks since we launched is: the more well-meaning “magic” we do, the more people are either confused or dissatisfied.

As far as I understood, this process of checking against the provider and deciding only applies if we don’t use the future “prefer local metadata” functionality, right? Will there ever be a match to your ID’d artists then, if I prefer to use my artist tags?

We have a hack for this that I don’t love, but that works well enough in practice that we can’t really live without it. When resolving a textual artist name, we look elsewhere in your library, and choose a first-class artist with an identical name. Conflicts are solved arbitrarily. Any fallout from this is (will be) fixable by editing.

If we did this globally for all artists in the universe (or, I suppose, in a sufficiently large collection), this would be a disaster of ambiguous name resolution, but doing this within the scope of a collection does seem to do more good than harm.
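The name-resolution hack can be illustrated with a short Python sketch. It assumes the library's first-class artists are available as an id-to-name mapping, and it breaks ties by taking the lowest id, which is just one arbitrary choice picked for the example; the function name and data are hypothetical.

```python
def resolve_artist(name, library_artists):
    """Map a textual artist name to a first-class artist id from the same library.

    library_artists: dict of artist id -> name, limited to this collection.
    Returns the id on an exact name match, otherwise the original text.
    """
    matches = sorted(aid for aid, known in library_artists.items() if known == name)
    if matches:
        return matches[0]  # conflicts are resolved arbitrarily (lowest id here)
    return name            # no match: the credit stays a plain string


# Placeholder library with two different people who share a name.
library_artists = {10: "Snoop Dogg", 42: "John Williams", 57: "John Williams"}

print(resolve_artist("Snoop Dogg", library_artists))
print(resolve_artist("John Williams", library_artists))
print(resolve_artist("CL2", library_artists))
```

Restricting the lookup to one collection keeps the number of same-name collisions small, which is why the arbitrary tie-break is tolerable there.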

So if you are willing to add new cases or spellings, then hardcoding won’t be a problem at all, I think. You’re right, there won’t be 100s of variations at least.

At least they handle a lot of ANVs for the artists which then always are linked from the track artists to the artist id / profile.

For the second part, I think I didn’t fully understand what you mean. Maybe my message was also somehow misinterpreted. CL2 is just an artist name variation. I didn’t mean that it should get a second id. From my point of view it would be great if this was not the case. It’s not a separate artist, it’s just a name variation.

Ok, I can see that this is problematic too. I was just interested in whether, when choosing your own artist tags, it is possible at all to still get the information you have for first-class artists.

As always, thanks for your detailed answers.

So if you are willing to add new cases or spellings, then hardcoding won’t be a problem at all, I think. You’re right, there won’t be 100s of variations at least.

Yup. So long as the meaning of the tag isn’t likely to cause confusion, we will add freely. Stuff like CATALOG, CATALOGUE, CATALOGNUMBER, CATALOG# is all unambiguous and primarily just a munging problem. Even handling tags from different file formats as we do today requires doing tons of this sort of normalization.
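A hardcoded alias table of this kind might look like the Python sketch below. The spellings come from this thread; the canonical field names (`catalog_number`, `remixer`) and the function itself are assumptions for illustration, and the REMIXED BY entry stands for the mapping requested at the top of the thread rather than something that exists today.

```python
# Hardcoded spellings -> canonical field. Adding a case is a one-line change.
TAG_ALIASES = {
    "CATALOG": "catalog_number",
    "CATALOGUE": "catalog_number",
    "CATALOGNUMBER": "catalog_number",
    "CATALOGNO": "catalog_number",
    "CATALOG#": "catalog_number",
    "REMIXED BY": "remixer",  # hypothetical: the mapping requested in this thread
}


def normalize_tags(raw_tags):
    """Fold known tag spellings into canonical fields; ignore unknown tags."""
    normalized = {}
    for key, value in raw_tags.items():
        canonical = TAG_ALIASES.get(key.strip().upper())
        if canonical is not None and canonical not in normalized:
            normalized[canonical] = value  # first spelling seen wins
    return normalized


# Placeholder tags as they might come out of a file.
raw = {"CatalogNumber": "ABC-123", "REMIXED BY": "Chris Liebing", "MOOD": "dark"}
print(normalize_tags(raw))
```

Because each spelling maps to exactly one meaning, the table stays a static lookup and adds no dynamic mapping step to the import path.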

For the second part, I think I didn’t fully understand what you mean. Maybe my message was also somehow misinterpreted. CL2 is just an artist name variation. I didn’t mean that it should get a second id. From my point of view it would be great if this was not the case. It’s not a separate artist, it’s just a name variation.

Maybe I got confused. What I’m talking about is this:

See those “aka” strings? We have those as first class entities in our data model, and (IME) they seem to be pretty comprehensive. In any case, they’re in the data model so we are capable of representing the concept and improving the data over time.

Every place where Snoop appears in my collection says “Snoop Dogg”, regardless of how he was credited on a particular release–we don’t keep textual data as part of credits in “Roon” data, just a link based on the artist id.

We plan to have an artist-level edit operation that lets you pick how to display an artist (either custom text, or choosing among our alternates). This is critical for classical users who often have a preferred way to spell composer names. Fryderyk Franciszek Chopin vs Frédéric François Chopin and so forth, but may also apply in some of your situations.
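Here is a minimal Python sketch of the idea that credits carry only an artist id, while the display text lives on the artist and can be changed in one place. The `Artist` class, its field names, and the id value are made up for the example; the planned edit operation did not exist yet when this was written.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Artist:
    artist_id: int
    name: str                          # default display name
    alternates: Tuple[str, ...] = ()   # "aka" names kept in the data model
    display_as: Optional[str] = None   # hypothetical artist-level edit


def display_name(artist):
    """An artist-level choice, if set, replaces the default name everywhere."""
    return artist.display_as or artist.name


def render_credits(credit_artist_ids, artists_by_id):
    """Credits hold only ids, so every credit renders through the same artist."""
    return [display_name(artists_by_id[i]) for i in credit_artist_ids]


chopin = Artist(1, "Frédéric François Chopin",
                alternates=("Fryderyk Franciszek Chopin",))
artists_by_id = {1: chopin}

before = render_credits([1, 1], artists_by_id)
chopin.display_as = chopin.alternates[0]  # pick one of the known alternates
after = render_credits([1, 1], artists_by_id)
print(before, after)
```

Since no text is stored on the credit itself, one artist id cannot show up under different names on different releases in this model.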

Ah ok, now I understand. That was not what I wanted to hear, but I think we’ll have to live with this then. Won’t stop the show. What I hoped was that I have CL2 in the track (which mirrors the alias under which he created the track and which appears on the album cover), which is actually the case in an identified album here

and if I click CL2 this will link me to Chris Liebing where “CL2” is in the aka list.

But currently CL2 is not changed to Chris Liebing either, because it seems you don’t have any aka information on him at all.

So CL2 remains a single artist with a single id which is not first-class…