DAC issues with XMOS USB chipset and Roon on Linux

Is there a model / version of this XMOS chipset that is known to be an issue specifically, then?

The NuPrime uDSD has had this issue from the day I first used it in Roon, which is a good few years back now.

Not sure. We probably won’t actually be able to know because we don’t have visibility into how XMOS packages their stuff or what modifications are being made to that on a product-by-product basis by DAC vendors.

Every indication here is that the bug is on the DAC side–that is fairly easy to see from the failure mode. The DAC plays invalid sounds, ends up hung, and the entire relationship between the DAC and OS is broken from that point until the DAC is power cycled.

It’s impossible for a piece of software like Roon to cause behavior like this on its own without an underlying bug in the device, kernel, or OS. In our QA experiments, the bad state survives restarting everything but the DAC, which almost 100% isolates the problem to the DAC and almost 100% exonerates Linux/Driver/Roon.

I made an assumption that it was a problem with newer firmware because this issue has not taken shape into a clear pattern until fairly recently, and the products that we have reproduced with are more recent, so the explanation made sense. It’s not something we are 100% sure of, just a theory. We will probably never be totally sure.

We’ve also noticed that the appearance of this bug is wildly inconsistent from system to system. Some systems can accomplish hundreds of format transitions without trouble, and others fail within a few transitions.

Maybe it’s a timing bug–this is a common class of mis-implementation on USB devices that is consistent with all of the above. Some DACs assume that software->driver will exercise state transitions at a slow pace, and run into race conditions/failures when ending a stream + starting a new one in rapid succession (which is the best thing for user experience, and an area that we have put work into optimizing).

If that’s the case, the workaround is straightforward, but distasteful–artificially waste some time at these transition points so that the DAC isn’t overwhelmed. Yuck. But it wouldn’t be the first time we have found a bug like this in a USB DAC. Then there’s a question of whether everyone gets slowed down, or if we can somehow detect devices with this bug. None of this is very pretty…
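To make the "waste some time at transition points" idea concrete, here is a minimal sketch of what such a guard could look like. This is not Roon's or RAAT's code; the class name, method names, and the 0.1 s default are all hypothetical placeholders, and the right gap (if any) would have to be found by testing against affected DACs.

```python
import time
from typing import Callable, Optional


class TransitionGuard:
    """Enforces a minimum quiet period between closing one stream
    and opening the next one on the same device (hypothetical helper)."""

    def __init__(
        self,
        min_gap_s: float = 0.1,  # placeholder value, not a measured threshold
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_gap_s = min_gap_s
        self._clock = clock
        self._sleep = sleep
        self._closed_at: Optional[float] = None

    def stream_closed(self) -> None:
        """Record the moment the previous stream was torn down."""
        self._closed_at = self._clock()

    def wait_before_open(self) -> float:
        """Sleep off whatever remains of the gap; return the seconds slept."""
        if self._closed_at is None:
            return 0.0  # nothing was closed yet, so there is nothing to wait for
        elapsed = self._clock() - self._closed_at
        remaining = max(0.0, self.min_gap_s - elapsed)
        if remaining > 0:
            self._sleep(remaining)
        return remaining
```

A player would call stream_closed() right after ending a stream and wait_before_open() right before starting the next one. Whether the guard is applied to every device or only to ones known to misbehave is exactly the open question above; the sketch does not answer it.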


@brian, I’ll happily ship the uDSD to you if it’ll help.

Thanks for replying, Brian. So even though I could not reproduce the behaviour via MPD, it’s not Roon-specific? That’s good, but I’m wondering why I could not make it happen in MPD when playing the same content in the same order that made Roon crash the DAC every time. I noticed that the buffer size of the ALSA stream is a lot smaller in Roon than in MPD; could this be a factor?

Small update for the technically inclined–this seems to be the root cause of the problem: https://www.xcore.com/viewtopic.php?t=6417

So I think we can now say more definitively that this issue is in the XMOS firmware on the DACs and not a problem with Linux or Roon/RAAT. We will be reaching out to manufacturers shortly and sharing these details.


A step in the right direction. Thanks Brian for the update.

It seems that Roon could mitigate this issue by not upsampling MP3s to 44.1/24, but perhaps to 48/24? Do I understand that correctly? Until DAC firmware upgrades arrive for the various XMOS DACs.

You’re playing something with 44.1K on 24 bits, which works. After that you’re switching to 44.1K on 16 bits, and then stuff breaks down.

Hi Bill, I’m not Brian, and obviously not answering for him.

But this XMOS bug isn’t limited to MP3s. If someone really wanted to do custom sample rate conversion, you can already do this yourself manually now, here:

@Bill_Janssen, no not quite–the trigger is transitions between 16/24/32bit, not 44 vs 48. And this is not MP3 specific either.

Remember that this affects some DACs, not all, so any mitigation would have to either detect those products or add a user-facing setting to let people opt in. The former is somewhat difficult and the latter is generally distasteful as a product design choice.

It’s much cleaner for the world, and better for everyone long term, if the DACs have their bugs fixed. If we patch around every DAC bug, it disincentivizes manufacturers from doing the right thing.

We now know that this issue is timing sensitive–so a player that is just 1/10th of a second slower than Roon at switching sample rates is unlikely to encounter the issue. This is likely why reproducibility varies so much from system to system (at least, in our testing, this bug is variably repeatable. Some systems can repeat it as much as one out of every 2-3 transitions, and some need 100 transitions before encountering a failure).

By the way, in offline discussions with partners, we have been relayed user reports of this issue affecting MPD and other players…which is to be expected given the technical understanding linked above.

For personal mitigation, you could set up a procedural EQ with 2 gain blocks that cancel each other out. That’s the closest thing to “nothing” that will also force the 16->24/32 conversion, I think.
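As a quick check of why two opposite gain blocks amount to "nothing", here is the arithmetic as a small sketch. It only illustrates the dB math; the function names are made up for the example, and it says nothing about how Roon's DSP engine is actually implemented.

```python
import math
from typing import Iterable


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def net_gain(gains_db: Iterable[float]) -> float:
    """Multiply the linear factors of a chain of gain blocks."""
    product = 1.0
    for g in gains_db:
        product *= db_to_linear(g)
    return product


# Two blocks of +3 dB and -3 dB multiply out to unity gain
# (up to floating-point rounding).
print(math.isclose(net_gain([3.0, -3.0]), 1.0))
```

The signal level is unchanged, but because DSP is now active in the chain, the output is no longer passed through at its original 16-bit depth, which is the point of the suggestion above.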

We’re going to see how it goes with the manufacturers before deciding how to proceed on any possible mitigation efforts here. They all entail work, long-term compromises, or both–whereas DAC manufacturers should be happy to apply the straightforward fix outlined in that thread + spin new firmware without any of those downsides once the issue is brought to their attention.

While I understand the appeal of this approach, it doesn’t account for vendors of abandonware. That’s all too common in the tech world. Take the Project S2, for example, where differences between the designer and the manufacturer resulted in the designer leaving and now the S2 will basically never have another firmware update (other than the one the designer said they MIGHT consider making and selling directly themselves). I know the S2 is XMOS-based, I don’t know if it suffers from this bug, though.

As far as I can tell - it does. Mine has locked up quite a few times. It is only seeing this thread that made me realize it does indeed seem to happen when switching formats. I have had it happen when fiddling with Roon’s DSP as well (enabling, tweaking, disable - which I guess would switch it through 16, 32, 16, or 24, 32, 24 etc).

The best we can do is to start being somewhat more forceful with Pro-ject support. I have seen John Westlake become quite vocal and public in defending issues with this DAC, so I guess that places him in the firing line for support as well.

As it stands, this issue notwithstanding, it also needs an update anyway to put a final end to MQA dropouts on Linux devices. We should not be the ones to suffer for this just because they cannot get their business dealings in order through inadequate project and/or risk management or whatever.

So we are likely in the hands of the manufacturers. I won’t hold my breath for a quick update, considering the last one for mine was 18 months ago, so this issue must have been around then too. I was using S/PDIF until June this year, though, so I would not have noticed it.

I didn’t say we definitely wouldn’t do anything here–but I do want us to contact manufacturers first and see how it plays out before making that decision.

The age of hardware products as unchanging artifacts is over. We are living in a world where no-one should be tolerant of companies who release connected products without a plan for long term updates/support. Whether it’s about compatibility, security, bug fixes, or simply delivering more value over time, the availability/commitment to delivering software updates should be taken into account when making purchases.

Apple, Nest, Tesla, Sonos, and countless other brands have cemented this as an expectation in consumer electronics today. Clearly, this is baked into our philosophy too, with both hardware and software products. To the extent that we can push manufacturers to “get” this, we will always try to before doing the “bad” thing and working around their problems.

To be clear, I am not criticizing any particular brands here. We haven’t reached out to anyone and gotten a “no” yet or anything like that. This is all just general commentary right now. We will be reaching out to people soon.


I was referring to this line in the link you posted:

So for example switching from a 44.1k/16b to a 48k/24b is no problem, but if the sample rate stays equal; (for example 44.1/16b to 44.1/24b) you will get this bug.

So, it’s when you switch bit depth, but keep sample rate the same. Apparently Roon presents low-res content as 44.1/24 based on this bit from Simon:

It tends to happen more on low Res content that Roon itself automatically upsamples to 44.1/24

I don’t know enough about MP3 to know if it’s actually possible to present it at a different rate, but I’m assuming the majority of users’ tracks are either MP3 (or other low-res) or CD, and making sure they don’t use the same sample rate would alleviate the problem some during the couple of years manufacturers will surely take to update their DACs, without having to waste time sleeping.
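The trigger condition quoted from the xcore thread above (sample rate stays equal, bit depth changes) can be written down as a tiny predicate. This is only a sketch of that one sentence; the Format type and function name are invented for illustration, and real devices may have additional conditions (timing, for one) that this does not capture.

```python
from typing import NamedTuple


class Format(NamedTuple):
    rate_hz: int  # sample rate, e.g. 44100 or 48000
    bits: int     # bit depth, e.g. 16, 24, or 32


def is_risky_transition(prev: Format, nxt: Format) -> bool:
    """True when the bit depth changes while the sample rate stays the same,
    which is the case the quoted post says triggers the bug."""
    return prev.rate_hz == nxt.rate_hz and prev.bits != nxt.bits


# The two examples from the quoted post:
print(is_risky_transition(Format(44100, 16), Format(48000, 24)))  # rate changes too
print(is_risky_transition(Format(44100, 16), Format(44100, 24)))  # only depth changes
```

Under this reading, the first transition is harmless and the second is the problematic one, which is why avoiding same-rate, different-depth pairs would sidestep the trigger.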

Definitely a timing issue as well. I can use a Mac Mini 4,1 running Ubuntu to drive the USB connection, and the problem doesn’t happen. I can replace that with the Pi/RoPieee box, using the Ethernet connection, and it happens. I can switch the network connection of the RoPieee box to WiFi instead of Ethernet, and it stops happening.

Looking at the Mytek Liberty firmware updates changelog, I see this in the latest version:

Fixed issue with fast changes of sample rate and signal source

Wonder if that’s the change they’re referring to.

Try it and see.

I’m already running 1.31, so if that’s the change, it doesn’t work so well.