Parametric EQ and DSD playback

If I’ve enabled an EQ filter and play back a DSD file, what does Roon do?
I’d assume it converts it to PCM in order to apply the DSP.
Yet in my “enhanced” signal path, I see:

Source: DSF DSD 64>Parametric EQ>Sigma Delta Modulator>Sonore microRendu>USB Output.
That seems to indicate the EQ is being applied to the DSD without conversion?

Is that possible? Or is Roon actually not performing the EQ in this instance, because it is a DSD file, and just not changing what is displayed as the signal path?

No answer after almost 3 months. Really?

Flagging @support.

We do indeed process DSD without performing a DSD->PCM conversion first. The signal path is reflecting that accurately.

I’m going to explain how it works–keep in mind that there are some subtle technical details here, and some background knowledge is required to understand them fully. Processing DSD isn’t nearly as straightforward as processing PCM. With the exception of a few simple operations, you can’t process it directly in the 1-bit representation. There are more steps involved, but it’s possible to perform those steps in a way that keeps all of the important properties of DSD intact.

First, I’ll explain DSD->PCM conversion, because it helps to understand the other technique in a relative sense.

DSD->PCM conversion starts with a DSD signal and produces a signal with two characteristics:

  • PCM representation (lower sample rate, wider samples)
  • A noise floor that is low, and as flat as possible, across the whole frequency range of the PCM format.

The first one is obvious–we need a PCM-like representation at the end. The second goal is more subtle–it is saying that the content of the signal must look like a PCM signal. It must be accepted and played properly by PCM equipment. It must be processable by downstream DSP processes that expect to work with PCM data, and so on. It must not cause damage to equipment that’s expecting PCM.

This is accomplished in three steps:

  1. Start with a DSD stream, and widen from 1 bit-per-sample to 64 bits-per-sample
  2. Downsample it by 8x (so DSD64 -> 352.8kHz, DSD128 -> 705.6kHz, etc).
  3. Apply a low pass “reconstruction filter”. This filter also exists in a DSD DAC, but since we are effectively simulating the DAC, we must simulate that aspect here too, since PCM DACs do not have this filter.
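
A rough sketch of these three steps in Python, for illustration only. This is not Roon's code. The block-average decimator, the Hann-windowed sinc filter, the 30 kHz cutoff, the 63-tap length, and all function names are assumptions made for the example. Python floats are 64-bit, which loosely mirrors the 64-bit wide format.

```python
import math

DSD64_RATE = 64 * 44_100  # 2,822,400 one-bit samples per second


def widen(bits):
    """Step 1: map 1-bit samples (0/1) to wide float samples (-1.0/+1.0)."""
    return [1.0 if b else -1.0 for b in bits]


def downsample_by_8(samples):
    """Step 2: average each block of 8 samples (a crude, assumed decimator)."""
    usable = len(samples) - len(samples) % 8
    return [sum(samples[i:i + 8]) / 8 for i in range(0, usable, 8)]


def lowpass_taps(num_taps, cutoff_hz, rate_hz):
    """Hann-windowed sinc low-pass FIR, scaled to unity gain at DC."""
    fc = cutoff_hz / rate_hz
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        sinc = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * hann)
    total = sum(taps)
    return [t / total for t in taps]


def fir(samples, taps):
    """Step 3: apply the FIR, keeping only fully overlapped outputs."""
    span = len(taps)
    return [
        sum(t * samples[i + k] for k, t in enumerate(taps))
        for i in range(len(samples) - span + 1)
    ]


def dsd_to_pcm(bits, dsd_rate=DSD64_RATE, cutoff_hz=30_000, num_taps=63):
    """Run the three steps and return (pcm_samples, pcm_rate_hz)."""
    pcm_rate = dsd_rate // 8
    decimated = downsample_by_8(widen(bits))
    pcm = fir(decimated, lowpass_taps(num_taps, cutoff_hz, pcm_rate))
    return pcm, pcm_rate
```

Feeding it a repeating 1,1,1,0 bit pattern (a DC level of 0.5 in DSD terms) gives a constant 0.5 at a 352.8kHz sample rate, which is the "looks like PCM" result described above.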

The reconstruction filter removes the noise inherent to the DSD signal before it can reach equipment that might not be prepared to handle it. Most of the energy in a DSD signal lives in this noise (well over 95%), so even though the noise is all at inaudible high frequencies, it’s important to filter it out so that your gear is not asked to turn that energy into loud, high frequency sound.

If you look at a spectrogram of DSD->PCM converted data, it looks like a PCM signal. Depending on the source material, and the sensitivity of your spectrogram, you might see a bit of a very quiet noise floor in the area where the transition band of the noise shaping filter used during mastering crosses over with the transition band of the DSD->PCM low pass filter (30-60kHz for DSD64).

OK, so now that DSD->PCM is explained, let’s talk about the case you’re actually interested in: the one where we process and output DSD without converting it to PCM.

It works like this:

  1. Start with a DSD stream, and widen from 1 bit-per-sample to 64 bits-per-sample
  2. Apply a low pass filter to remove the bulk of the inherent noise energy from the widened signal.
  3. Apply processing steps to the wide intermediate format.
  4. Send the signal through a sigma-delta-modulator to re-render the “wide” 64-bit stream into a 1-bit DSD stream.
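
The same four steps as a minimal Python sketch, again for illustration and not Roon's implementation. Three things here are assumptions: an 8-sample moving average stands in for the gentle low pass, a plain gain stands in for the EQ, and the modulator is first order (chosen because it is short and stays stable for inputs within ±1; real modulators are typically higher order). The function names are made up for the example.

```python
def widen(bits):
    """Step 1: map 1-bit samples (0/1) to wide float samples (-1.0/+1.0)."""
    return [1.0 if b else -1.0 for b in bits]


def gentle_lowpass(samples, length=8):
    """Step 2: short moving average at the DSD rate (assumed stand-in).

    With length 8 at the DSD64 rate its first null sits at 352.8 kHz,
    so it is far from steep.
    """
    return [
        sum(samples[i:i + length]) / length
        for i in range(len(samples) - length + 1)
    ]


def apply_gain(samples, gain):
    """Step 3: placeholder processing; a real EQ would filter here."""
    return [gain * s for s in samples]


def sigma_delta(samples):
    """Step 4: first-order sigma-delta modulator, wide samples to 1-bit."""
    integrator = 0.0
    feedback = 0.0  # previous output as -1.0/+1.0 (0.0 before the first bit)
    bits = []
    for x in samples:
        integrator += x - feedback
        bit = 1 if integrator >= 0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def process_dsd(bits, gain):
    """Widen, filter, process, and re-modulate without changing the rate."""
    wide = widen(bits)
    filtered = gentle_lowpass(wide)
    processed = apply_gain(filtered, gain)
    return sigma_delta(processed)
```

Note that nothing in the chain changes the sample rate: one input bit corresponds to one output bit, apart from the few samples lost at the edge of the moving average.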

The low pass filter (2) in this process might sound like the reconstruction filter we discussed above, but it is very different. It is much more lenient, less steep, and it only attenuates frequencies over 100kHz–and these already have a very poor SNR because of the inherent noise shaping in DSD, so we can be sure that no meaningful information existed there in the first place.

Without the filter, sound quality suffers significantly or the sigma-delta modulator risks becoming unstable (i.e. it starts outputting horrible sounds that ruin your ears and, if you’re unlucky, your gear too).

At step (3) the signal is structurally similar to a PCM signal–in that it is comprised of a series of multi-bit samples. However, it does not have content typical of PCM signals and it maintains the DSD sample rate. If you looked at a spectrogram of the intermediate format in (3), it would look just like DSD, except with the bulk of the noise above 100kHz severely attenuated by the low pass filter.

By maintaining the original sample rate through processing, the time-domain characteristics of DSD are maintained. By designing the filter to stay far away from musical content, the frequency-domain characteristics are maintained too.

Sometimes this form of processing, or this intermediate format is referred to as “DSD-Wide”. We didn’t use that term because some people have defined DSD-Wide as an 8 bit intermediate format (whereas we use 64 bits…a luxury of precision afforded to us by running on modern desktop-class CPUs) and I didn’t want to create confusion.

Converting DSD to PCM is a ‘lossy’ process and requires very steep filters around 30kHz to cut off the ultrasonic noise. This type of conversion is best avoided. On the other hand, PCM to DSD conversion is inherently low noise; once noise shaping is applied, a gentle filter at 50kHz is sufficient. Most DSD DACs have this type of filter built in, but not for PCM.

When you convert to 44.1 kHz, the filter is very steep, as it has to go from 0 dB at 20 kHz to -120 dB or lower at 22 kHz. Yet, the result is indistinguishable from the original hi res signal in a blind test. There’s no real reason to avoid it.
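
To put a rough number on "very steep": a common rule of thumb (usually credited to fred harris) estimates FIR length as (sample rate / transition width) × (stopband attenuation in dB / 22). The sketch below applies it to the figures quoted above. It assumes, purely for illustration, a single filter stage running at the DSD64 rate; the function name is made up.

```python
import math


def estimate_fir_taps(rate_hz, transition_hz, attenuation_db):
    """Rule-of-thumb FIR length: (rate / transition) * (attenuation_dB / 22)."""
    return math.ceil(rate_hz / transition_hz * attenuation_db / 22)


# 0 dB at 20 kHz down to -120 dB at 22 kHz, filtering at 2.8224 MHz (assumed).
steep_taps = estimate_fir_taps(2_822_400, 22_000 - 20_000, 120)
print(steep_taps)
```

The estimate comes out in the thousands of taps, and it scales inversely with the transition width, so a gentler filter needs proportionally fewer.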

Very interesting, may I ask a question after all these years? When modulating from 32 bit back to 1 bit, does this also include noise shaping? Or is this left out to keep the signal characteristic as close as possible to the original DSD stream?

Modulating to 1 bit always includes noise shaping. Without it, in-band quantization noise would be too high.
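
A small Python experiment that illustrates the point, under stated assumptions: a first-order modulator stands in for a real noise shaper, the "audio band" is approximated by averaging the error over 128-sample blocks, and the test tone and function names are made up for the example.

```python
import math


def plain_one_bit(samples):
    """Quantize to 1 bit with no feedback, i.e. no noise shaping."""
    return [1.0 if x >= 0 else -1.0 for x in samples]


def noise_shaped_one_bit(samples):
    """First-order sigma-delta: the fed-back output pushes error to high frequencies."""
    integrator = 0.0
    feedback = 0.0
    out = []
    for x in samples:
        integrator += x - feedback
        feedback = 1.0 if integrator >= 0 else -1.0
        out.append(feedback)
    return out


def in_band_error_rms(original, quantized, window=128):
    """RMS of the quantization error after averaging over blocks (crude low pass)."""
    errors = [q - x for q, x in zip(quantized, original)]
    averaged = [
        sum(errors[i:i + window]) / window
        for i in range(0, len(errors) - window + 1, window)
    ]
    return math.sqrt(sum(e * e for e in averaged) / len(averaged))


def compare(num_samples=16_384, period=4_096, amplitude=0.5):
    """Return (plain_rms, shaped_rms) for a low-frequency sine test tone."""
    tone = [amplitude * math.sin(2 * math.pi * n / period) for n in range(num_samples)]
    plain = in_band_error_rms(tone, plain_one_bit(tone))
    shaped = in_band_error_rms(tone, noise_shaped_one_bit(tone))
    return plain, shaped
```

Calling `compare()` shows the low-frequency error of the noise-shaped stream coming out far smaller than that of the plain 1-bit quantizer, even though both streams use exactly one bit per sample.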