Sorting using international characters?

ChristianK · March 29, 2017, 9:00pm

Hi - really sorry if this is a question asked and answered many times before, but I couldn’t find anything…

Since I live in Sweden, there are artists with names containing Å, Ä and Ö, and these should end up last according to the sort order in Sweden, but it looks like Roon is sorting Å and Ä after A but before B, and Ö after O. Is there a way to manage sorting according to local/international standards?

Thanks

Greg · March 29, 2017, 10:12pm

Hi Christian,

Let’s ask @support about this one.

Cheers, Greg

AndersVinberg · March 30, 2017, 12:49am

It’s a difficult problem. Each language has its own well-documented sort order, but we have multi-language content. The German punk band Ärgerlich Igel is sorted at the beginning of the alphabet under German rules, while the Swedish Ärggrön Igelkott is sorted at the end. The Aachen Big Band is at the front, but the Aarhus Big Band is sorted at the back because Danish treats AA as if it were Å. The Provençal folk band Dépêche du Midi is sorted according to French rules for those accents, but Depeche Mode is a British band and doesn’t have any.

(Yes, I made those up.)
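
To illustrate the conflict, here is a rough Python sketch that sorts the same made-up band names under two simplified rule sets. The functions `german_key` and `swedish_key` are hypothetical stand-ins, not real collation; a proper implementation would use the Unicode Collation Algorithm with locale tailorings.

```python
import unicodedata

# Assumption: a simplified Swedish alphabet in which å, ä, ö come after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"


def german_key(name):
    """Simplified German-style key: accented letters sort as their base letter."""
    decomposed = unicodedata.normalize("NFD", name.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def swedish_key(name):
    """Simplified Swedish-style key: rank each letter by its alphabet position."""
    ranks = []
    for ch in unicodedata.normalize("NFC", name.casefold()):
        if ch not in SWEDISH_ALPHABET:
            # Fold any other accented letter to its base letter.
            ch = unicodedata.normalize("NFD", ch)[0]
        # Characters outside the alphabet (spaces, digits) get -1 and sort first.
        ranks.append(SWEDISH_ALPHABET.find(ch))
    return ranks


artists = ["Ärggrön Igelkott", "Depeche Mode", "Aachen Big Band", "Ärgerlich Igel"]

german_order = sorted(artists, key=german_key)
swedish_order = sorted(artists, key=swedish_key)
```

Under the German-style key both Ä bands land between Aachen Big Band and Depeche Mode; under the Swedish-style key they move to the end of the list. Neither order is right for both bands at once, which is the point above.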

ChristianK · March 30, 2017, 6:15am

I understand that it’s not super easy to do, but I guess the way it should be done is to just look at the international settings in the client, meaning that if I have my macOS or iOS set to Swedish and Swedish list sort order, that is what Roon should follow. Even if you have Swedish and German artists/albums, your client setting will give you what you expect as a user.

If that’s currently not possible/supported or more of a long term solution, a workaround could be to support sorting tags like ARTISTSORT and ALBUMSORT. There are disadvantages using sorting tags so it’s not an ideal solution, but it would be an improvement allowing users to find albums, artists etc where expected.

Of course this is part of a bigger question regarding localization/internationalization of the entire Roon interface (sorting, date formats, languages, …). Are there any thoughts/plans for that?

ChristianK · April 3, 2017, 9:08am

Is it only me having this problem or am I missing some other workaround?

joel · April 3, 2017, 9:11am

Sorting currently uses Unicode sorting rules. TBH, I’m not sure if country-specific Unicode sorting rules even exist. One for @brian.

ChristianK · April 3, 2017, 9:34am

Ideally, it should behave as the phone book in your apple devices meaning that what entries you have and how they are sorted depends on the language settings in your client. This means that even though they share the same database (i.e. your phone book entries), it will look different in the client if you have English or Swedish set as your preferred language in your client.

Peter Åberg and John Österlund will be sorted under Å and Ö using Swedish language client settings, but under A and O using English language settings. If I could wish, Roon should work the same way.

ChristianK · April 4, 2017, 7:14am

It sure does: http://unicode.org/reports/tr10/ (link to the Unicode Collation Algorithm).

ChristianK · April 10, 2017, 5:34pm

Am I the only one that cares about international support?
It would be really great to know if better local support is on the roadmap or not!

Thanks,
Christian

Fredrik_Andersson · January 27, 2018, 2:25pm

No @ChristianK you are not alone, there is a discussion at the Swedish Euphonia forum right now regarding the way Roon treats artists with Å, Ä, Ö in the name.

ChristianK · January 27, 2018, 2:52pm

Thanks @Fredrik_Andersson Fredrik - glad to hear that people do care!

@support and @danny , could you please give us some thoughts? Is this on the roadmap?

tripleCrotchet · January 27, 2018, 5:24pm

Yes, my wife is Danish and she cares for it.

I am mother tongue English, UK born but my wife is Danish and we currently live between Denmark and Ireland (long story). Anyway I am not the only one using roon in the household so it matters to her and often to visitors.

brian · January 27, 2018, 8:02pm

We have made huge progress with internationalization since this thread was originally posted. The user interface has been translated into quite a few languages since last March.

The #1 performance consideration when working with the huge browser views is the cost of sorting. The sort system is rather heavily optimized. Localized collation routines are 5-10x more expensive than our current ones, and our current stuff is not as fast as I’d like us to be.

There’s a big difference between sorting a few hundred contacts on a phone and 350,000 tracks in a track browser. Even with a more typical 50,000 track collection, it takes some time to flip the sorts in the tracks view. Making that 5-10x slower is not palatable.

I do see the value in what you’re asking for, but I’m not quite convinced this benefit is worth such a performance tax on the whole user base. Maybe that will change if there is more demand for the feature. We’ll definitely keep it in mind while evolving the sorting infrastructure in case we run into a cheaper way to accomplish this kind of thing.

tripleCrotchet · January 27, 2018, 9:26pm

I can understand why there might be no business case but I am curious about the technical argument.

Is the difference in machine sorting that noticeable in terms of the use case? The sense I get from the OP is that the limiting factor is the human sorting you have to do in your head, not the machine sorting. The problem is that the international characters mostly occur in the middle of strings, not the beginning, so it is very disorienting and requires a lot of concentration. The time involved is orders of magnitude greater than the machine sorting.

I don’t have this problem with roon (my wife and her friends do) but I am mother tongue English temporarily in Denmark. I don’t speak or read very much Danish so I often have Chrome auto translation on all day or I will routinely configure the TV, phone, device etc to English. This leads to problems I hadn’t anticipated though as half the interface is in English and half in Danish and the simplest navigation usually requires some kind of sort in my head. I certainly don’t notice how long the machine is taking in the process. This is the sense I have of the OP’s feature request.

brian · January 27, 2018, 10:00pm

Yes, absolutely.

Currently we do a rather expensive normalization up front + cache it, then do only ordinal comparisons while the user is waiting. Also relevant: this up-front work happens in the core, in a language-agnostic environment, not in the user interface–which knows about localization.
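
As a rough illustration of that "normalize once, then compare ordinally" pattern, here is a minimal Python sketch. It is not Roon's actual code: `normalize_for_sort`, `SortKeyCache`, and the accent-stripping rule are assumptions made for the example.

```python
import unicodedata


def normalize_for_sort(text):
    """The expensive up-front step: fold case and strip accents (assumed rule)."""
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


class SortKeyCache:
    """Computes each normalized key once; later sorts only compare cached keys."""

    def __init__(self):
        self._keys = {}
        self.computed = 0  # counts how many keys were actually normalized

    def key(self, text):
        cached = self._keys.get(text)
        if cached is None:
            cached = normalize_for_sort(text)
            self._keys[text] = cached
            self.computed += 1
        return cached

    def sort(self, items, reverse=False):
        # Plain string comparison of cached keys is ordinal (code point) order.
        return sorted(items, key=self.key, reverse=reverse)


cache = SortKeyCache()
titles = ["Öster", "apple", "Banana", "Ärlig"]
ascending = cache.sort(titles)
descending = cache.sort(titles, reverse=True)  # reuses the cached keys
```

Flipping the sort direction reuses the cached keys, so only the cheap comparisons run while the user waits. Note that this locale-blind normalization is also what places Ä next to A, as described at the top of the thread.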

Handing off the whole job to a generic Unicode collate routine, and then federating the pre-computation/caching based on client locale is a way more expensive approach. It has some benefits–as discussed in this thread–but it’s not cheap from a performance standpoint at all.
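
A small Python sketch of what splitting the cache by client locale could look like, under stated assumptions: the two key functions are simplified stand-ins for real locale-aware collation, and `PerLocaleKeyCache` and `KEY_FUNCS` are hypothetical names. The surnames other than Åberg and Österlund are invented for the example.

```python
import unicodedata

# Assumption: simplified Swedish alphabet with å, ä, ö after z.
_SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"


def english_key(text):
    """Simplified English-style key: strip accents, compare base letters."""
    decomposed = unicodedata.normalize("NFD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def swedish_key(text):
    """Simplified Swedish-style key: rank letters by alphabet position."""
    composed = unicodedata.normalize("NFC", text.casefold())
    return tuple(_SWEDISH_ALPHABET.find(ch) for ch in composed)


KEY_FUNCS = {"en": english_key, "sv": swedish_key}


class PerLocaleKeyCache:
    """Keeps a separate precomputed key table for every client locale."""

    def __init__(self):
        self._caches = {}  # locale -> {text: sort key}

    def sort(self, items, locale):
        key_func = KEY_FUNCS[locale]
        cache = self._caches.setdefault(locale, {})
        for item in items:
            if item not in cache:
                cache[item] = key_func(item)
        return sorted(items, key=cache.__getitem__)

    def cached_key_count(self):
        return sum(len(table) for table in self._caches.values())


names = ["Åberg", "Österlund", "Nilsson", "Andersson", "Zetterberg"]
cache = PerLocaleKeyCache()
english_order = cache.sort(names, "en")
swedish_order = cache.sort(names, "sv")
```

Each additional client locale adds a full set of keys to compute and store for the same library, which is one way to see where the extra cost comes from.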

Your middle ground proposal–considering that the special cases come in the middle of words, maybe the cost wouldn’t be so bad if we focused on those, etc, is a pitfall. That is implying that we keep the current approach, but start adding special cases as people complain to try to address pain points. Human language is too complicated for non-experts to own those details directly. That’s why the Unicode people exist and do what they do.

The proposal to use Unicode routines was the right one, performance considerations aside. I expect that at some point we will overhaul the way sorting works and it will make sense to figure out how to fit it in to the performance budget.

Just because you’re not exercising the worst case scenarios doesn’t mean they don’t exist.

ChristianK · January 27, 2018, 10:17pm

Thanks for your reply Brian!

I understand that the track-list of the entire database is a nightmare, but is that really what users use?
I would be happy if at least the most common views like the album, artist, composer views would be sorted correctly.

You say that sorting is done server-side. Doesn’t that mean that better hardware provides faster sorting? If so, would it be possible to do localized sorting only if you have good enough hardware, like what you do for the DSP engine?

Or could you provide a sorting toggle allowing users to decide whether they would like slow/correct or fast/incorrect sorting?

brian · January 27, 2018, 10:42pm

The compromises you are suggesting are more damaging than the compromise we are currently living with. They complicate the product and expose implementation details in the user interface that no-one should be compelled to think about.

I think this is something we should look at next time we are working on sorting/browsing infrastructure. I agree that it could be better–I just don’t think there is a quick fix, and I’m not that interested in adding complexity/confusion to the product as an alternative to doing it right.

tripleCrotchet · January 28, 2018, 12:22am

@brian, that’s clear enough. But just so it isn’t lost next time you come to look at this.

Maybe I have misunderstood you but I wasn’t trying to suggest a mixture of unicode sorting the pinch-points in libraries and non-unicode sorting the rest as seems to have come across. That would be a very bad idea. No consistency across a library. No one would have any idea what they were looking at.

If anything I was trying to suggest (badly) some kind of auto-detection of unicode libraries with auto-unicode sorting of the whole library (not parts). The OP’s suggestion of a user configurable toggle also makes sense to me although you’ve made your position clear. It hadn’t occurred to me that this was an either/or choice. I just assumed there were some libraries that were unicode sorted and others that were not (for performance reasons).

I was also trying to say that trading a few milliseconds of performance is quite acceptable to some of us in exchange for a better overall experience. I understand you cannot build a business case on the back of a few outliers but having said that roon is a hard sell in my household largely because of these localisation issues. I’ve ripped and tagged a ton of 80’s Danish pop music but my wife won’t use it and I hate it. Whatever.