Roon 1.8 sound quality change?

The Jitter will be in your DAC, not on a network…


Christian, I like how you explain this. But

My Core runs as a virtual machine on some servers in the basement. Roon sends 5-second data chunks to the streamer over wireless. The streamer buffers these chunks and then sends the data stream, precisely clocked, to the DAC over AES/EBU. How can there be jitter coming from the CPU?


I’m sorry, but what does this mean? If you have separated your core and endpoint, and use asynchronous transport between the two, how can jitter even be a thing?


This is true for S/PDIF, but with a Roon Ready endpoint the data is transferred over Ethernet to a computer at the other end, so the timing part is taken out of the equation. Normal Roon Ready endpoints have a buffer of 2-20 seconds. If this concept did not work, the entire Internet as we know it would not work. Roon Core is not relevant for the audio; it transfers data packets (with checksums, and re-requests for corrupted packets). Any computer from the last 40 years has been capable of handling this with no errors. The entire world is built around this concept, in fact :slight_smile:

This can easily be tested by starting playback from Roon to one of your Ethernet-connected endpoints. Unplug the Ethernet cable while playing… the music will continue for a while (2-20 seconds). If you are worried about noise coming in via the Ethernet cable, use UTP instead of STP cables, or use WiFi for the Roon Ready endpoint.
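The unplug test can be pictured in a few lines of Python (a toy model; the 10-second figure is just an illustrative value inside the 2-20 s range):

```python
def playback_after_unplug(buffer_seconds: float, step: float = 1.0) -> float:
    """Simulate draining the endpoint buffer after the network drops.

    Returns how many seconds of audio still play before silence.
    """
    played = 0.0
    while buffer_seconds > 0:
        consumed = min(step, buffer_seconds)
        buffer_seconds -= consumed   # the DAC keeps consuming in real time...
        played += consumed           # ...so music keeps coming out
    return played

# A 10 s buffer yields 10 s of music after the cable is pulled.
print(playback_after_unplug(10.0))  # 10.0
```

While the cable is connected, the network simply keeps topping the buffer up faster than the DAC drains it, which is why playback is uninterrupted.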


I noticed one interesting detail. When I turn on phase inversion in the DSP settings, the thinness decreases and the sound becomes similar to that of version 1.7.
I still don’t understand why files from the hard drive sound thinner than those from streaming sources. Before the update, this was not the case.

It’s the same for me with a Zenith Mk3. The regular core option is now very close to the experimental Squeezelite player option. Prior to the 1.8 update the sound quality was significantly better with the experimental option, but now they are the same or very close. I’d need additional listening time, with all DSP settings held the same, to determine whether they are identical or one slightly edges out the other.


Of course the comment about sample-by-sample was figurative. I have long understood the difference between the requirements for asynchronous data transmission and synchronous data transmission. This difference explains why a cheap plastic CD-ROM drive works flawlessly while we have all heard differences between cheap and sophisticated CD audio players and drives.
More to the point, in modern digital audio we see little S/PDIF and mostly USB or Ethernet transfer. Ethernet is jitter-free, but USB is not.

I have USB only, thus I cannot do this test, though I have no doubt it works well.
Your intervention leads, however, to an interesting point: who has an Ethernet link from Roon Core to DAC and found a direct improvement, and who has USB or S/PDIF?

Agreed. Something definitely changed on that end. Exp/Squeezebox mode to me sounds slightly worse now – I’m hearing a bit of glare and thinness vs the regular core mode. I’ve switched back and forth a few times, and I keep on preferring regular mode. I don’t use any DSP so I’m hearing each mode directly.

Although that is entirely possible, my setup did not change when I used it with 1.7. 1.7 was definitely fine sound-wise to me.

Some background.
Here is the document that cites this 2-microsecond interaural detection limit. It comes from one of the fathers of MQA (Hugh Robjohns in 2016), who dug into the work of Georg von Békésy (Nobel Prize in Physiology) and later of Nordmark in 1976.
It advocates for high-res digital audio: clever resampling, High-Res formats, MQA.
Then all of the software and hardware setup must follow…
While phase shifts and group delays are often considered when defining audio quality, temporal precision is rather less well understood, and has been largely ignored until relatively recently. However, it has been known about in some sense since the work of Von Békésy back in 1929. He was investigating the acuity of human hearing in identifying sound source directions, and his work suggested that the ear can resolve timing differences of 10µs or less. This was extended in 1976 when Nordmark’s research indicated an inter-aural resolution of better than 2µs! More recent international research confirms that the human auditory system can resolve transient timing information down to at least 8µs, too.

To put this in context, simple maths indicates that a sinusoidal period of 8µs represents a frequency of 125kHz — but no one is suggesting the perceptual bandwidth of human hearing gets anywhere near that! Entirely different ear and brain functions are involved in detecting a signal’s arrival timing and its frequency content, possibly in much the same way that the ‘rod’ cells in the eye detect brightness or luminance, while only the ‘cone’ cells detect colour. So the established auditory bandwidth of around 20kHz remains broadly accepted as fact, and a secondary system is presumed to be involved in detecting signal transients with remarkable precision. Perhaps one of the reasons we have seen a trend towards higher and higher sample rates in ‘hi-res’ audio equipment over recent years is because higher sample rates inherently improve the ‘temporal precision’ to some degree.
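For reference, the period-to-frequency conversion behind the 125 kHz figure in the quote is:

```latex
f = \frac{1}{T} = \frac{1}{8\,\mu\mathrm{s}} = \frac{1}{8 \times 10^{-6}\,\mathrm{s}} = 125\,\mathrm{kHz}
```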

I also just compared regular core to experimental mode with dsp off for both but preferred experimental. To my ears the regular sounded cleaner but less full and with a less holographic sound stage. I think it would come down to the type of music being played and personal preference as to which is better.

Either way it’s interesting that the two are now so close, as it opens up different options, especially with experimental DSP sample-rate conversion limited to 192/24 on PCM. I don’t think experimental is worse; rather, for some reason 1.8 makes the regular core option so much better, at least on an Innuos.

Some clarification seems to be in order here:

In isochronous USB mode, the client DAC has to recover clock information from the host sending the data packets; since it is a one-way pipe, the scheme is jitter-prone. Very early implementations suffered from it.

Current USB DAC implementations use asynchronous USB mode, in which the client DAC triggers the host to send data packets to keep its buffer filled, and is thus able to use its own clock. Measurements nicely show the effectiveness of such an arrangement.
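The asynchronous idea can be sketched as a toy Python model (the class and function names are made up for illustration, not any real USB stack): the DAC asks for packets whenever its buffer drops below a target, and consumption is paced only by the DAC's own clock.

```python
from collections import deque

class AsyncUsbDac:
    """Toy model of an asynchronous-mode USB DAC (illustrative only)."""

    def __init__(self, target_fill=8):
        self.buffer = deque()
        self.target_fill = target_fill

    def needs_data(self):
        # The DAC, not the host, decides when packets should be sent.
        return len(self.buffer) < self.target_fill

    def receive_packet(self, samples):
        self.buffer.append(samples)

    def tick(self):
        """One period of the DAC's own local clock: consume one packet."""
        return self.buffer.popleft() if self.buffer else None

def host_service(dac, source):
    # The host sends packets only when the DAC requests them (feedback).
    while dac.needs_data() and source:
        dac.receive_packet(source.pop(0))

dac = AsyncUsbDac()
source = [f"packet-{i}" for i in range(20)]
host_service(dac, source)
print(len(dac.buffer))  # 8 — filled to the DAC's target, no more
```

The host's timing never touches the audio clock here; any host-side jitter only changes when the buffer is topped up, not when samples are converted.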

Please research your comments first to prevent misinformation of other members.


I know lots of people talk of jitter-reducing devices for USB, but I don’t buy that. USB is asynchronous too. And the results with modern equipment are outstanding. For example, this is ASR’s test of the SMSL Sanskrit 10th MK II DAC, $130 at Amazon ($139 in blue), through USB.

Compare with S/PDIF, which is synchronous:

I didn’t cherry-pick luxury devices, this was the first I came across.

Why am I so adamant about this?
Because USB, like Ethernet, sends packets. It works like this:

  1. The sender (e.g. Roon Core) collects a bunch of samples and puts them in a packet, like writing them on a piece of paper.
  2. The piece of paper is sent over the USB cable. This transmission is not timed to the music because there is other traffic, like UI info for the DAC screen, but it is plenty fast enough.
  3. The network stack in the kernel of the receiving DAC taps the higher level code on the shoulder and says, I’ve got some data for you, it is this big, give me a buffer.
  4. The code hands over the buffer, the data is filled in, the high level code reads those samples from the buffer, clocks them carefully to its clock and they get decoded.

Nothing in that chain was clocked to the music sample rate, until the last step.

Ethernet works the same way.
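The four steps above can be sketched like this (a toy Python model, not real driver code; `zlib.crc32` stands in for the link-level checksum):

```python
import zlib

def make_packet(samples):
    # Step 1: the sender bundles samples and a checksum into a packet.
    payload = bytes(samples)
    return {"size": len(payload), "crc": zlib.crc32(payload), "payload": payload}

def transmit(packet):
    # Step 2: the packet crosses the cable; transfer timing is unrelated
    # to the music's sample rate — it just has to be fast enough.
    return dict(packet)

def receive(packet):
    # Step 3: the receiving stack announces the size and asks for a buffer.
    buffer = bytearray(packet["size"])
    # Step 4: data is copied in and verified; only after this does the
    # DAC re-clock the samples against its own clock.
    buffer[:] = packet["payload"]
    assert zlib.crc32(bytes(buffer)) == packet["crc"], "request retransmission"
    return bytes(buffer)

samples = [10, 20, 30, 40]
out = receive(transmit(make_packet(samples)))
print(list(out))  # [10, 20, 30, 40] — bit-perfect; clocking happens only now
```

Nothing before the final step depends on the audio clock, which is the whole point: transport timing and conversion timing are decoupled.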



In the RAAT protocol there is a feedback loop that ensures that it is the DAC that pilots the exact timing of the flow of the incoming data to honor the fixed clock rate. This is an essential feature as in the end what matters is that at the exact point that the conversion to analog is done, the data must have the exact right timing (for both Left and Right channels), no matter what happens upstream of that point.
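RAAT's internals aren't shown here, but the control idea described above can be sketched as a simple rate-feedback rule (hypothetical names and gain, not actual RAAT code): the endpoint compares its buffer level against a target and nudges the sender's packet rate, so the DAC's fixed clock remains the timing reference.

```python
def sender_rate_feedback(buffer_level, target_level, base_rate, gain=0.5):
    """Toy control loop: the endpoint reports its buffer fill and the
    sender adjusts its packet rate so the buffer hovers at the target,
    leaving the DAC's own clock as the only timing reference."""
    error = target_level - buffer_level
    return base_rate + gain * error

# Buffer running low -> the endpoint asks the sender to speed up slightly.
print(sender_rate_feedback(buffer_level=6, target_level=8, base_rate=100.0))   # 101.0
# Buffer overfull -> slow the sender down.
print(sender_rate_feedback(buffer_level=10, target_level=8, base_rate=100.0))  # 99.0
```

A small proportional gain like this keeps the correction gentle, so the sender's rate drifts toward the DAC's consumption rate instead of oscillating.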

So with this type of control loop, a statement like « the jitter will be in your DAC, not in the network » is hard to figure out. Besides, the A/B comparisons I made were between direct feeding of the input AIFF file into HQplayer, the same with Roon activity, and finally Roon itself passing the file data to HQplayer.

If it is not jitter, it has to be some form of digital noise polluting the analog part; what else? But that becomes harder to explain given a setup where the DAC does the same thing in all cases and just receives digital data from the same code running on the same computer, passing through the same galvanically decoupled USB link. Although I am not sure up to which frequency the decoupling is effective.

I read a difference between indicated and confirmed!


Yea, but these cost US$650

There are arguments that the issue is noise: not hiss through the speakers, but noise that compromises the decoding. I don’t know enough to argue for or against that, although like you I wonder about galvanically isolated USB. But this is Sonore’s argument, and I’m not going to dispute Sonore’s claim that their devices are beneficial for SQ. They may be. But I’m convinced it’s got nothing to do with jitter.

I have a Sonore MicroRendu, btw. I have tested DACs close enough to the Core to run direct USB, through the MicroRendu, and direct Ethernet where possible. I recommend you try that too, lots of good devices offer trials.

There are actually cheap consumer-grade USB receiver chips available that galvanically decouple the USB bus and power.
No specific high-end audio design is needed for that.


Oooooh! Are these tuned to 1.8? :rofl:
