Export function problems with diacritics

Disclaimer: I am not arguing that the export function has no good use cases. It has, and I have used it. But there are real issues, and it is good to be aware of them and careful.

@Geoff_Coupe probably has a much more detailed understanding and is right, I believe, in saying:

Nevertheless, what exactly is written into this field can be problematic in my opinion. I agree with @AndyR when he says:

Let’s, for example, export track 1 of this album, whose track title includes a diacritical character, a small ‘letter e with grave’ (è), and see what happens:

The album is identified by Roon and metadata is set to prefer Roon, so the title likely originates from MusicBrainz or similar. The file metadata contains exactly the same track title:

Screenshot 2020-08-05 18.10.29

(Note: the filename contains no diacritical characters, which is my preference to ensure compatibility when moving files between file systems):

Screenshot 2020-08-05 18.09.42

On export of this track Roon creates - exactly as expected - a new file (and folder), this time with a filename that includes the grave accent. So far so good.

Screenshot 2020-08-05 18.21.16

Without any further checks on the file, I have a look at what Roon itself makes of the export. At face value it looks OK: a new album is created with just the one file:

But when inspecting the file metadata something is off:

Screenshot 2020-08-05 18.18.46

Inspecting the file metadata from the file manager - or using Yate, kit3, xAXT etc. - confirms that they all render what is written to the file in exactly the same way.

Screenshot 2020-08-05 20.36.29

(When switching Roon to prefer file metadata for this field, this is also immediately visible in Roon.)

Poking around in the original file with a hex editor shows that the two-byte code for the è in the file is (hex) C3A8. This makes sense, as 0xC3 0xA8 is the correct UTF-8 encoding of the character è.

In the file exported by Roon this has expanded into C383 C2A8, with C383 (correctly) decoding in UTF-8 to Ã and C2A8 (correctly) to ¨. On re-import this is (again correctly) read by Roon and, if one chooses to use the file metadata, rendered as Ã¨.
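The byte values quoted above can be checked independently of Roon. A minimal sketch using only the Python standard library; the helper name `describe` is made up for illustration, and the two hex strings are the ones read from the hex editor:

```python
import unicodedata


def describe(hex_bytes: str) -> str:
    """Decode a hex string as UTF-8 and name each resulting character."""
    text = bytes.fromhex(hex_bytes).decode("utf-8")
    return ", ".join(f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text)


original_title_bytes = "c3a8"      # as found in the source file
exported_title_bytes = "c383c2a8"  # as found in the file exported by Roon

print(describe(original_title_bytes))
print(describe(exported_title_bytes))
```

The first line names a single character (the è), the second names two: a capital A with tilde followed by a diaeresis. Both byte sequences are valid UTF-8, which is why every tag editor renders the exported title the same way.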

Repeating this process keeps expanding the byte sequence exported into the title field. And when file metadata is selected for this field before export, the expanded sequence also becomes part of the filename - as shown below after another two iterations:
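The growth on repeated export can be simulated. This is only a sketch: `simulated_export` is a hypothetical stand-in that assumes the export behaves as if UTF-8 bytes were read as Latin-1 and written back as UTF-8. That assumption reproduces the first expansion seen in the hex editor, but it is not Roon's actual code.

```python
def simulated_export(title_bytes: bytes) -> bytes:
    # ASSUMPTION: models the observed behaviour only, not Roon's implementation.
    # Each byte is treated as one Latin-1 character, then re-encoded as UTF-8.
    return title_bytes.decode("latin-1").encode("utf-8")


data = "è".encode("utf-8")  # the two bytes C3 A8
for round_number in range(4):
    # Show how many bytes the single character occupies before each export.
    print(round_number, len(data), data.hex())
    data = simulated_export(data)
```

Under this assumption every non-ASCII byte turns into two bytes per round, so the one character occupies 2, 4, 8, then 16 bytes, while plain ASCII text passes through unchanged.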

Screenshot 2020-08-05 22.44.03

It seems to me that Roon reads at least some metadata as if it were in one encoding (perhaps UTF-8?) and can render it correctly internally, but on export writes it out again as if it had been encoded in a different way.
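One way to narrow this guess down is to test which single-byte encodings, if wrongly applied to the original UTF-8 bytes, would produce the bytes actually found in the exported file. A rough sketch; the candidate list is my own choice and `reproduces_export` is a hypothetical helper:

```python
ORIGINAL = bytes.fromhex("c3a8")      # è in the source file
OBSERVED = bytes.fromhex("c383c2a8")  # same field in the exported file
CANDIDATES = ["latin-1", "cp1252", "mac_roman"]  # assumed candidates


def reproduces_export(codec: str) -> bool:
    """True if misreading ORIGINAL as `codec` and re-encoding gives OBSERVED."""
    try:
        misread = ORIGINAL.decode(codec)
    except UnicodeDecodeError:
        return False
    return misread.encode("utf-8") == OBSERVED


for codec in CANDIDATES:
    print(codec, reproduces_export(codec))
```

For this one character, Latin-1 and Windows-1252 both fit the observed bytes and Mac OS Roman does not. That is consistent with the guess above but proves nothing about what Roon does internally; more characters would be needed to tell Latin-1 and Windows-1252 apart.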

Whether this is specific to macOS or to a specific ‘locale’ setting I don’t know. My system has pretty much default settings in this respect. I also have no good understanding of how different audio formats store (or can store) their metadata, so it could perhaps play out differently for different audio formats.

Bottom line: it is confusing and doesn’t seem quite right. Currently, on my system, I cannot be confident whether (and how) an export can be achieved that correctly embeds the ‘basic’ metadata shown in Roon. Even staying entirely within one OS and file system, Roon itself does not reconstruct what it exported, it seems.

Possibly it might work by setting everything to ‘Prefer file’ and making sure that the file metadata contains nothing but characters that encode to a single byte (plain ASCII). But that does not seem realistic to me and would mean deliberately corrupting a lot of information :unamused:.

If there is a solution or a good workaround, I would love to know! Or even better, if someone can explain what I could be doing wrong. But in the meantime I use export very, very carefully, and without assuming that the resulting files have all the correct ‘basic’ metadata.
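For already-exported files, one possible repair is to reverse the suspected misreading on the tag text before writing it back with a tag editor. This is a sketch under the same assumption as the analysis above (UTF-8 bytes treated as Latin-1); `undo_double_encoding` is a hypothetical helper, and it only handles strings, not the reading or writing of tags:

```python
def undo_double_encoding(text: str, max_rounds: int = 5) -> str:
    """Peel off repeated 'UTF-8 read as Latin-1' layers, if any are present."""
    for _ in range(max_rounds):
        try:
            # ASSUMPTION: the damage came from Latin-1 misreading of UTF-8.
            candidate = text.encode("latin-1").decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            break  # not (or no longer) double-encoded: keep what we have
        if candidate == text:
            break  # pure ASCII, nothing to repair
        text = candidate
    return text


print(undo_double_encoding("Ã¨"))  # title as rendered after one export
print(undo_double_encoding("è"))   # intact text is left alone
```

The heuristic can misfire on titles that legitimately contain sequences such as Ã followed by ¨, so it is worth reviewing the result before overwriting any tags.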


@ToneDeaf - many thanks for your post. I would say this is quite possibly a bug, and should definitely be looked at by @support

And perhaps @Carl or another moderator can move your post into a new thread in the Metadata Issues category so that it can be dealt with there?


I agree. Moved.


Thanks guys, that makes sense. It was late yesterday.

A few more tests this morning with different file types suggest that the issue is limited to Apple Lossless (ALAC) and lossy MPEG-4 AAC. It is the same for all non-ASCII characters I tried to put in tags, and for all tags with one exception: the lyrics tag seems to behave fine.

There is no such problem on export when the file is converted to AIFF, FLAC, WAV or MP3.

And the other way round is true too: well-behaved AIFF files show the same issue on export when converted to ALAC or lossy AAC.

Hope that is useful.

Hi @ToneDeaf,

I’ve passed this along to the team for further investigation. Thanks for the detailed report, it is much appreciated!


Hello @ToneDeaf,

I wanted to touch base with some good news, which is that our technical team has been able to reproduce this behavior and we’ve opened up a bug report with our developers.

While I can’t say for certain when this bug will be fixed, getting things reproduced in-house is a critical first step, and I will keep this thread up to date as the team passes along feedback and work begins to get this resolved. Thanks again for the report!
