Nowadays dithering is generally recommended at every step where you reduce the bit depth of an audio stream. It is easy to show that dithering gets rid of all the “evil” digital artifacts, and the only thing you have to sacrifice is a tiny, supposedly unimportant rise in the noise floor. But who says dithering is always necessary? Could dither even have negative side effects? In some situations I believe it can!
In the Roon software, dither is used in the DSP volume control, in the general DSP and in the track/album gain feature, which means that in some situations you add dither several times in a row. The noise floor is still extremely low and everything is OK - or is it? Without dither and without any audio signal there is no activity at all, just total silence, but with dither activated there is processing activity as well as the dither noise itself all the time - even without any audio signal present…
In a studio environment it is always good practice to use dither because you don’t know for sure what will happen to the signal later on - massive gain, for instance. But at the final stage, very close to the endpoint of the stream in your listening room, the situation suddenly becomes very different, because the artifacts of an undithered reduction from say 32 to 24 bit will be completely inaudible. But what if the artifacts from the active dithering process itself have an impact on the character of the perceived sound?
I feel convinced that the sound from my bridge/DAC combination with the undithered digital “Device volume” activated has a clearer and more open character than the Roon “DSP volume” situation with the same attenuation.
I would therefore suggest that Roon add a choice of several flavors of dither, and obviously also the possibility to disable dither completely, at least in the “DSP volume” section.
I have a feeling that you will need more tangible evidence than “I feel convinced” to get traction for this.
Ok, but it is a fact that a lot of audio editing software gives the user the possibility to enable or disable dither, and even to choose among different kinds of dither. The reason for this is obviously that the kind of dither you use leads to audible differences, where the “optimal” choice has to be based on your subjective impressions alone.
It is my impression that many Roon users are willing to experiment with all the possible combinations of settings, so why shouldn’t dither settings be included, just like all the general DSP possibilities, for instance?
Not saying it shouldn’t, but there are a thousand things Roon should do, and many of them are necessary to make it work as advertised. I’m just saying that there is more to do than they can handle, so a more convincing case helps. People who want to tinker endlessly can run HQPlayer anyway.
Maybe this should be a #feedback:feature-suggestions topic where people can vote
Dithering at 24 bits doesn’t result in any audible difference, considering we’re talking about -144 dB of random noise in the worst possible case. You can do it hundreds of times and still stay orders of magnitude below audibility.
I think you’d be wasting your time experimenting on this. If you really want to try, play a flat signal (all samples zero) and a 24-bit dithered silence (with no noise shaping, to make it the worst case) and see if you can hear any difference between the two in your system at max volume. I can share such tracks if you’re interested.
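If anyone wants to generate such test signals themselves, here is a minimal sketch (my own illustration, not the tracks offered above) of what 24-bit TPDF-dithered silence looks like numerically, assuming full scale is ±1.0:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
lsb = 2.0 ** -23                      # one 24-bit step, with full scale = +/-1.0

silence = np.zeros(n)                 # "flat" signal: all samples zero

# Non-shaped TPDF dither: sum of two uniform randoms, spanning +/-1 LSB
tpdf = (rng.uniform(-0.5, 0.5, n) + rng.uniform(-0.5, 0.5, n)) * lsb
dithered = np.round((silence + tpdf) / lsb) * lsb   # quantize to the 24-bit grid

def dbfs(x):
    rms = np.sqrt(np.mean(x ** 2))
    return 20 * np.log10(rms) if rms > 0 else float("-inf")

print(dbfs(silence))    # -inf: true digital silence
print(dbfs(dithered))   # roughly -144 dBFS of noise
```

The dithered track measures around -144 dBFS, which is far below the analog noise floor of any realistic playback chain - which is the point being made above.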
I agree that it is really hard to understand why the implementation of dither should cause audible differences, even in a 24-bit stream. I’m only trying to isolate the possible technical reasons behind the often reported tendency of the Roon software to degrade the sound quality to some degree when you enable any of the DSP features. I fully agree that from a purely technical standpoint like yours there shouldn’t be any plausible reason to change anything here, but when a simple change from “DSP volume” to “Device volume” leads to inexplicable audible improvements, I just have to investigate the phenomenon a little further.
Just to make it clear: my “Device volume” takes place in a USB-to-S/PDIF interface fed by a Raspberry Pi 4 (with an ultra low noise 5 V supply) running RoPieee as a Roon Bridge with USB output. The S/PDIF interface is based on an XMOS 208 chip with an integrated digital volume control. Because the output is S/PDIF, the resolution is limited to 24 bit.
My measurements and FFT analysis confirm that the only objective difference between the “DSP volume” and the “Device volume” situation is the absence of dither in the “Device volume” case.
If I compare the two outputs in my analysis software, the difference in the FFT spectra is obvious, but the level is actually about 140 dB below 0 dBFS, as expected. In other words: there should be no objective difference between the two scenarios other than the absence of dither with my “Device volume”.
I might be wasting my time doing things like this, but I’m retired now and I have more time for my hobby than ever before!
Interesting. How do you know the difference is dither? Is it possible to share the FFT graphs?
You can try to test your hypothesis by playing a non-dithered and a pre-dithered signal with no DSP in Roon and 100% device volume in both cases. It should be a blind comparison though, since you seem to have some expectation about the “often reported” degradation. DSP volume in Roon was measured before and shown to be practically perfect.
I have read DrCWO’s whitepaper several times, and I know the Roon DSP volume control is as good as a digital volume control with precise dB-calibrated steps can possibly be. My recent measurements (all based on free Windows software of different kinds) have now made it clear to me that the volume control in the latest version of the XMOS 208 USB chip comes very, very close to the same theoretical optimum in this field. They should both be able to deliver the same “close to perfect” performance when it comes to perceived sound quality…
AB or ABX tests are very objective tools, but they can be difficult to use in practice. The method can give reliable results, but the tools must unfortunately be considered a rather insensitive measuring instrument.
I might be biased here, but I know that even the smallest differences in things like the implementation of dither can lead to audible differences in the perceived performance.
How do you know that? And what is “insensitive” about ABX testing tools?
In pro audio applications various types of dither are used for specific music genres, obviously because each has its own audible consequences.
In AB comparisons the difference between A and B needs to be above a certain threshold to be detected, but that doesn’t mean that differences below this threshold are unimportant or inaudible in other situations.
I haven’t heard of that. The only time dither should be applied is when reducing bit depth, and the only reason to do it is to randomize (i.e. de-correlate) quantization noise from the signal. A triangular distribution dither plus noise shaping seems to work best for all signals. Noise added for creative purposes is something else.
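The decorrelation point can be demonstrated numerically. Below is a small sketch of my own (assuming a 16-bit grid so the effect is easy to see) that quantizes a sine smaller than one LSB with and without TPDF dither, then checks how correlated the quantization error is with the signal:

```python
import numpy as np

rng = np.random.default_rng(1)
fs, n = 48_000, 1 << 16
lsb = 2.0 ** -15                                  # 16-bit grid for visibility
t = np.arange(n) / fs
x = 0.5 * lsb * np.sin(2 * np.pi * 1000 * t)      # 1 kHz sine below one LSB

def quantize(sig, dither=False):
    d = (rng.uniform(-0.5, 0.5, n) + rng.uniform(-0.5, 0.5, n)) * lsb if dither else 0.0
    return np.round((sig + d) / lsb) * lsb

plain = quantize(x)              # truncates to silence: the signal is lost
dith = quantize(x, dither=True)  # the signal survives, buried in flat noise

corr_plain = np.corrcoef(x, plain - x)[0, 1]   # error vs. signal correlation
corr_dith = np.corrcoef(x, dith - x)[0, 1]
print(corr_plain)   # close to -1: the error IS the (inverted) signal
print(corr_dith)    # close to 0: the error is de-correlated noise
```

Without dither the error is pure signal-correlated distortion (here the tone vanishes entirely); with dither the tone is still there under the noise. That is the one real job of dither.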
I think that’s exactly what it means. You can do ABX for each of your particular situations. If it’s audible, comparisons should reveal it.
I can recommend this very informative article where the use of dither in pro audio is explained. Some types of dither are recommended for speech, others for rock, and so on. There are a lot of references to the subjective evaluation of the sonic results in general. You also find purely objective recommendations like: “If your audio is going from a 32-bit floating point to 24-bit, then don’t Dither, as the depth cannot go any higher.”
Thanks for the link. I went through the article. It’s not very technical and has a few inaccuracies. I don’t disagree that dithering when reducing to 24 bits is not really necessary - especially when further processing will be done - but doing it doesn’t hurt either, and it will definitely not be audible in any way, even if played in isolation and even if done more than once. I can show you what the noise floor looks like for 24 bits with no dither, with one dither and with 100 dithers, to see how small the differences are.
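The cumulative effect is easy to estimate numerically. Here is a sketch of my own (assuming plain non-shaped TPDF dither on a 24-bit grid, which is the worst case) that re-dithers the same silent stream 100 times:

```python
import numpy as np

rng = np.random.default_rng(2)
n, lsb = 500_000, 2.0 ** -23          # 24-bit grid, full scale = +/-1.0

def redither(sig):
    # one TPDF dither + requantize pass, as a volume or DSP stage might do
    d = (rng.uniform(-0.5, 0.5, n) + rng.uniform(-0.5, 0.5, n)) * lsb
    return np.round((sig + d) / lsb) * lsb

x = np.zeros(n)
levels = {}
for i in range(1, 101):
    x = redither(x)
    if i in (1, 100):
        levels[i] = 20 * np.log10(np.sqrt(np.mean(x ** 2)))

print(levels[1])     # roughly -144.5 dBFS after one pass
print(levels[100])   # roughly -124.5 dBFS after 100 passes
```

Each pass adds an independent error with an RMS of about half an LSB, so 100 passes raise the floor by only 10·log10(100) = 20 dB - still vastly below audibility.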
Regarding the types of dither, they say “type 1”, i.e. the non-shaped dither, is “primarily used” with low dynamic range recordings. I don’t think that’s because it’s better in any way, but because the loudness of the material would hide any further reduction in noise audibility anyway. It doesn’t mean that applying shaped dither like types 2 and 3 to rock music would be inappropriate.
Also, I don’t see the need to have two types (1 and 2) of shaped dither for production. Ideally, shaping should follow the equal loudness curves, so that the quantization noise is made as small as possible for a given bit depth, even if reduced shaping wouldn’t make an audible difference. Besides being marketing differentiators, having different options for dither is useful only for experimentation or demonstration.
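For anyone curious what “shaping” means concretely, here is a minimal first-order error-feedback noise shaper (my own sketch, not Roon’s or anyone else’s implementation; commercial shapers use the same structure with higher-order filters fitted to equal-loudness curves):

```python
import numpy as np

rng = np.random.default_rng(3)
n, lsb = 1 << 15, 2.0 ** -15          # 16-bit grid to keep the numbers simple

def shaped_quantize(x):
    # First-order error feedback: the previous sample's quantization error is
    # subtracted from the next input sample, so the total noise is filtered
    # by (1 - z^-1), i.e. pushed towards high frequencies.
    y = np.empty_like(x)
    e = 0.0
    for i, s in enumerate(x):
        d = (rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)) * lsb  # TPDF
        w = s - e
        y[i] = np.round((w + d) / lsb) * lsb
        e = y[i] - w
    return y

noise = shaped_quantize(np.zeros(n))
spec = np.abs(np.fft.rfft(noise)) ** 2
q = len(spec) // 4
print(spec[:q].sum() < spec[-q:].sum())   # True: the noise energy sits up high
```

The total noise power goes up slightly, but it is moved to frequencies where the ear is least sensitive, which is exactly the trade-off described above.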