I used to think the DAC was 90% of the sound. This sub-$300 DIY project proved me wrong

There are many small things that help significantly: a clean and stable electrical supply, and excellent grounding throughout the whole configuration. Do you have zero hum in your phono stage at maximum level? EMI? DC isolation in the supply? Galvanic isolation from the router and network? Is all equipment acoustically decoupled from the floor? Is the polarity (positive and neutral) kept consistent through the whole configuration?

That is not a philosophy or a mere theory, it’s an established fact. If the “real or imagined” subjective experiences contradict it, you are effectively debating it, whether you admit it or not. Challenging those experiences seems appropriate to me.

I don’t understand this part. The question you’re asking in the very first post mentions the impact of transport (to use your own highlight) on sound quality. Your “tinkering” is exploring exactly the relevance of the transport:

Has anyone else here felt that a streaming transport made a bigger difference than a DAC upgrade?


Where to begin…starting out with no system you have to plant your flag somewhere. And given that the most “colored” member of a music system is almost always the speakers, that’s often what we need to select first, and it’s a matter of taste. Everything else “drives” that choice. Ideally it would be great to find that perfect space for a system first, and work to ensure the power coming into our homes is clean and well grounded.

Today’s amplifiers, be they tube or solid state, have over the years converged on sonic attributes, though there’s still much to guard against in terms of speaker demands and room size. Upstream devices matter to a large degree, the better to expose the best the speakers can offer. The vinyl side of things has a long history of development, hence its maturity. Digital is the baby that keeps us surprised year after year.

Then there is ridding our system of noise, which can come through RFI/EMI, wiring (interconnects, etc.), and all forms of timing (clocks). There are just so many elements in the chain to consider.

I often feel that this is a neverending hobby we’re in. But then I sit back and just enjoy the music!


I began with Roon on a generic PC, then moved to a bespoke version from Roon, and finally to my current Grimm MU1; each has bettered the last. And what’s interesting is that each of these upgrades has made an impact on what my DAC can accomplish. And I won’t even get into all the DACs I’ve had over the past decades.

I’ve been reading this thread – on moderator duties – but these are my thoughts as a community member. Whilst I am sceptical of many claims, I do tinker and experiment, and enjoy both analogue and digital sources. However, I am a little confused by what appears to be contradictory information.

For instance, why is a real-time Linux kernel necessary when Diretta is supposedly drip feeding the stream, which typically would traverse the network in a matter of seconds? Or is this simply a consequence of finding an OS that runs Diretta on a Raspberry Pi?

And, if the bits aren’t altered (I take this is a given) then are we simply addressing noise on the analogue signal? This, I agree, can cause unwanted side effects in the analogue domain (more on this soon), but there’s nothing controversial here.

However, RF noise, EMI, ground loops, and DC in the AC supply etc. affect all of our equipment, not just the streamer. Since conventional wisdom suggests that eliminating noise at the input of the DAC is already achievable using galvanic isolation, for example, magnetics, RF, or optics, I’m uncertain what this experiment is doing subjectively or otherwise.

For example, when I was tinkering with a power supply for my streamer (an Allo DigiONE) using an integrated amplifier’s DC out and an LDO, I thought (subjectively) that there was an improvement. However, a subsequent experiment revealed that this particular modification actually introduced noise rather than eliminating it. I was disappointed. My efforts at tinkering were all in vain. Yet, I was certain of the improvement at the time. This is why an entirely subjective approach is fundamentally flawed.

If I could try this without the wholly unnecessary cost of AudioLinux*, I would, since I have everything else at hand. After all, what is an experiment if we don’t learn something or discover if there is any truth in the claims?


*Did I read somewhere that GentooPlayer is a suitable alternative to AudioLinux?


Since you are talking about digital, the biggest problem I see is the upsampling for conversion by a Delta-Sigma chip. Basically, unless you use HQPlayer, the upsampling is typically awful on most chip-based DACs or streamers. I recommend R2R or discrete DACs - nothing with a chip.

Hi @mjw,

Thank you for this incredibly thoughtful and detailed post. This is exactly the kind of constructive, skeptical tinkering discussion I was hoping to have. You’ve asked some of the best technical questions in the entire thread, and I’ll do my best to answer them from my perspective.

First, I want to say I completely relate to your story about the Allo DigiONE power supply experiment. That feeling of “did I just fool myself?” is why so many of us (myself included) are skeptical. Your point about the “subjective approach” being “fundamentally flawed” is a fear every honest tinkerer has to grapple with. It’s why I’ve tried to be so open about my own journey from skepticism.

You’ve raised two central technical points that I’d love to clarify.


1. The “Drip Feed” vs. Real-Time Kernel Paradox

You asked:

…why is a real-time Linux kernel necessary when Diretta is supposedly drip feeding the stream…?

This is the key to the whole architecture, and it’s a common point of confusion. The “drip feed” isn’t what the Roon Core sends; it’s what the Diretta Host creates.

The architecture is in three tiers:

  1. Roon Core (NUC/Mac Mini): This sends the standard, “bursty” RAAT stream over the network.
  2. Diretta Host (Pi #1): This is the “processing engine.” It receives that “bursty” stream from the Core. This is where the Real-Time Kernel is critical. The RT kernel allows the Host’s CPU to manage its processes with high precision, re-packaging that “bursty” data into Diretta’s “calm, evenly-spaced” stream without its own timing getting messed up.
  3. Diretta Target (Pi #2): This receives the “drip feed” from the Host over the dedicated, physically and logically isolated point-to-point link. Because the data is now calm and perfectly timed, the Target’s CPU is barely working (slow and low), generating minimal electrical noise right before the DAC.

So, the RT kernel isn’t needed for the “drip feed”; it’s what creates the “drip feed” by processing Roon’s chaotic, bursty data with precise timing.
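To make the “bursty in, evenly-spaced out” idea concrete, here is a minimal simulation sketch. It is not Diretta’s actual protocol or Roon’s RAAT: the `pace` function, the packet size, and the tick interval are all hypothetical, chosen only to show how a host-side FIFO can turn two large bursts into a steady train of small packets.

```python
from collections import deque

def pace(bursts, packet_bytes, interval_ms):
    """Re-packetise bursty arrivals into small packets sent on a fixed tick.

    bursts: list of (arrival_ms, n_bytes) tuples, sorted by arrival time.
    Returns a list of (send_ms, n_bytes), one entry per packet sent.
    All names and numbers here are illustrative assumptions.
    """
    pending = deque(bursts)
    buffered = 0   # bytes waiting in the host-side FIFO
    sent = []
    t = 0
    while pending or buffered > 0:
        # Move everything that has arrived by time t into the FIFO.
        while pending and pending[0][0] <= t:
            buffered += pending.popleft()[1]
        # Send at most one small packet per tick.
        if buffered > 0:
            n = min(packet_bytes, buffered)
            sent.append((t, n))
            buffered -= n
        t += interval_ms
    return sent

# Two 4000-byte bursts, 100 ms apart, paced out as 400-byte packets every 10 ms.
sent = pace([(0, 4000), (100, 4000)], packet_bytes=400, interval_ms=10)
print(len(sent), sent[0], sent[-1])  # 20 (0, 400) (190, 400)
```

In a real host the tick would come from a timer, and how precisely that timer fires is where scheduling latency (and therefore the kernel choice) would come into play. The simulation sidesteps that by treating time as a simple counter.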


2. The “DAC Isolation” vs. “Processing Noise” Problem

You also said:

…eliminating noise at the input of the DAC is already achievable using galvanic isolation… I’m uncertain what this experiment is doing subjectively or otherwise.

You are 100% correct. Good galvanic isolation (in a DAC, or via SFP, etc.) is fantastic at blocking common-mode noise coming from the network cable.

This Diretta architecture is designed to solve a different (or additional) source of noise: the internal electrical noise (RFI/EMI) generated by the endpoint’s own processor.

Even with perfect galvanic isolation from the network, a “bursty” RAAT stream still forces the endpoint’s CPU to spike its activity (work hard, then sleep, work hard, then sleep).

My hypothesis—and the core theory of this project—is that these spikes in current draw create low-frequency electrical noise inside the endpoint itself. This noise is particularly insidious because it’s theorized to fall into a frequency range where a DAC’s Power Supply Rejection Ratio (PSRR) is least effective. The noise, therefore, is not filtered out, and it pollutes the clean ground plane and power rails that the DAC’s sensitive analog circuits rely on.

By using the Host-to-Target “drip feed,” the Target’s CPU has a constant, low, stable workload. No spikes, no low-frequency processing noise. We’re not just isolating from the network; we’re calming the processor at the most critical moment.
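For a sense of scale, the PSRR part of this hypothesis can be put into numbers. The sketch below treats PSRR as the simple ratio of supply ripple to output ripple, and every figure in it (10 mV of ripple, 80 dB versus 40 dB of rejection, a 2 V full-scale output) is a made-up assumption for illustration, not a measurement of any DAC.

```python
import math

def residual_ripple_uv(ripple_mv, psrr_db):
    """Ripple (in microvolts) left after a stage with the given PSRR.

    Uses PSRR(dB) = 20 * log10(supply_ripple / output_ripple).
    """
    return ripple_mv * 1000.0 / (10 ** (psrr_db / 20.0))

def level_db(residual_uv, full_scale_v=2.0):
    """Residual ripple relative to a (hypothetical) full-scale output level."""
    return 20.0 * math.log10(residual_uv * 1e-6 / full_scale_v)

# Hypothetical: the same 10 mV of supply ripple, at two different PSRR values.
for psrr in (80, 40):
    uv = residual_ripple_uv(10, psrr)
    print(psrr, "dB PSRR ->", uv, "uV,", round(level_db(uv), 1), "dB re full scale")
```

With these assumed numbers, the same ripple ends up roughly 40 dB higher at the output when rejection drops from 80 dB to 40 dB. Whether a real endpoint produces ripple at frequencies where a real DAC’s rejection is that weak is exactly the open question.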


On AudioLinux vs. GentooPlayer

Finally, regarding the “wholly unnecessary cost of AudioLinux”: You are absolutely right that GentooPlayer is a suitable alternative! (Though it’s worth noting that GentooPlayer is not free, either.) This project isn’t an ad for one OS; it’s about the architecture. As I’ve mentioned previously, the SOtM sMS-200 family of products provides a more-or-less turnkey alternative also.

I chose AudioLinux for my guide simply because its pre-built RT kernel and Diretta support are excellent and well-documented. Well, that, and I already had an AudioLinux license for the single-RPi Roon endpoint I was using at the time. If you (or anyone) were to try this with GentooPlayer, I would be genuinely thrilled to hear the results. Honestly, in other Diretta forums, GentooPlayer is a more popular solution. That’s what this ‘Tinkering’ is all about.

Again, thank you for the fantastic, probing questions. I hope this clarifies the why behind the architecture.

This is probably off-topic and could be forked, but if you follow the history of digital, you can consider Shannon’s articles about the sampling theorem, published in 1948-49, as the start of the digital revolution. Shortly after that, PCM, which is by far the most common coding scheme for digital audio, was patented in 1952. That’s about three quarters of a century ago, even before stereo vinyl records were introduced. And of course, the Red Book standard, which dates from 1980, is already almost half a century old. It seems to me the digital baby has long grown up.

QED (highlight is mine) All modern DACs are using the same decades-old principles, while their performance has exceeded the abilities of our hearing for quite some time now. It’s probably not coincidental that the “tinkering” done here is focused on something entirely non-audio-specific: a network protocol.


So, as I understand the claims:
(1) They are not disputing that a bit is a bit: the digital contents of a file or a stream are either bit-accurate or not.
(2) It is possible to have digital distortion, where the signal levels between on and off are so smoothed out that there are bad guesses, or where interference results in a clearly flipped bit. But checksums and such should catch this and require retransmission (hence the importance of having a buffer). In any case, such obvious digital corruption is not what is being discussed here.
(3) Instead, it is argued that high-frequency noise in the digital transport signal, while not impacting the digital signal (assuming bit-perfect transmission), somehow affects the DAC so that the analogue output is distorted from what it would be without that electrically noisy environment.
(4) It is posited that the impact on the analogue output is currently beyond the capability of current analogue analyzer technology, yet is something that our human ears and mind can feel.

Is that basically the proposed conjecture?
I am not an electrical engineer, yet I can understand the possibility of everything up to #4. But, I don’t quite understand why technological measurements are unable to detect what our very limited human auditory process can hear.
Assuming such a technology shortfall exists, what measurement is needed and how difficult would it be to realize such a non-subjective test comparing the generated analogue signal in a clean transport environment vs a normal transport environment?
Also, I assume the internal circuitry in the component holding the DAC is also extremely quiet in this domain.

Then there is the need for equivalent transduction to sound on both sides of the A/B comparison (for a subjective human listening test).
For that matter, how does the proposed analogue distortion (from transport issues) compare to the analogue transduction-to-sound factors, assuming high-quality headphones?
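As a small illustration of point (2) above, this is roughly how a checksum catches a flipped bit. It is only a sketch using Python’s standard `zlib.crc32`. The `send_with_crc` and `receive` helpers are hypothetical names, and real links do this in hardware and in the protocol stack rather than in application code.

```python
import zlib

def send_with_crc(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload (4 bytes, big-endian) to form a frame."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def receive(frame: bytes):
    """Return the payload if the CRC matches, else None (caller would re-request)."""
    payload, crc = frame[:-4], int.from_bytes(frame[-4:], "big")
    return payload if zlib.crc32(payload) == crc else None

samples = bytes(range(64))            # stand-in for a chunk of PCM data
frame = send_with_crc(samples)

corrupted = bytearray(frame)
corrupted[3] ^= 0x01                  # flip a single bit "in transit"

print(receive(frame) == samples)      # True
print(receive(bytes(corrupted)))      # None
```

A receiver that gets `None` would ask for the chunk again, which is why a buffer matters: it gives the retransmission time to arrive before playback runs dry.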


Hi @Michael_Arones,

Thank you for this post. This is the kind of thoughtful, rational engagement I was hoping for in this thread.

You have perfectly summarized the proposed conjecture. You are 100% correct: my claims are not about “bits are bits” or “data corruption,” but are built exactly on your points (3) and (4). BTW, I have successfully verified that Diretta makes no changes to the data; what arrives at the USB input of the DAC is exactly what Roon sent to the endpoint. What’s changed is only the manner of delivery.

Your questions about why measurements can’t detect what our ears can are the right ones to ask. I just posted a very detailed reply to @mjw (post #91) that tackles this topic.

It explains my hypothesis for why this happens: the theory about “bursty” data causing low-frequency CPU processing noise, which in turn pollutes the DAC’s analog stage by getting past the DAC’s PSRR (Power Supply Rejection Ratio).

I’d invite you to read that reply, as it’s my best attempt to answer the very logical questions you’ve just raised.

Unlike some, I don’t argue that what we’re hearing in these experiments can’t be measured. Only that we’ve not worked out the right protocol to measure it yet. It’s more involved than capturing 1 kHz sine waves and calculating a static transfer function.

A few points regarding measurements:

  • There is nothing “static” about transfer functions, since they completely describe the response of a system to any and all input. (Also, for the sake of accuracy, the response of a system to a 1kHz sine wave cannot be used to determine the system’s transfer function.)
  • Whatever gets past the DAC’s PSRR should manifest in the DAC’s analog output. I would think that should appear as low-frequency components far enough below 1kHz and loud enough to be resolved from the said 1kHz sine by a good analyzer.
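To illustrate the second point, here is a minimal sketch of why a low-frequency component well below a 1kHz test tone is separable in principle. It uses a synthetic signal, not a real capture: the 50 Hz frequency and the -120 dB level of the “leakage” component are invented for the example, and a real analyzer would also have to contend with its own noise floor, windowing, and ADC limits.

```python
import math

def tone_level_db(samples, freq_hz, sample_rate):
    """Amplitude of one frequency component, in dB relative to amplitude 1.0.

    Single-bin DFT: correlate the signal against a cosine and sine at freq_hz.
    """
    n = len(samples)
    w = 2.0 * math.pi * freq_hz / sample_rate
    re = sum(s * math.cos(w * i) for i, s in enumerate(samples))
    im = sum(s * math.sin(w * i) for i, s in enumerate(samples))
    amplitude = 2.0 * math.hypot(re, im) / n
    return 20.0 * math.log10(max(amplitude, 1e-30))

fs = 48000
n = fs  # one second of samples, so each tone completes a whole number of cycles
# Hypothetical capture: a full-scale 1 kHz tone plus a 50 Hz component at -120 dB.
signal = [
    math.sin(2 * math.pi * 1000 * i / fs) + 1e-6 * math.sin(2 * math.pi * 50 * i / fs)
    for i in range(n)
]

for f in (1000, 50, 70):  # test tone, assumed leakage, and an empty bin for contrast
    print(f, "Hz:", round(tone_level_db(signal, f, fs), 1), "dB")
```

The 1kHz bin reads close to 0 dB, the 50 Hz bin close to -120 dB, and the empty 70 Hz bin sits far below both. That is only the arithmetic; it says nothing about whether such a component is actually present in a given DAC’s output.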

@David_Snyder, you expressed interest in seeing some measurements, but you also seem ready to dismiss them as irrelevant or incomplete. I am seriously tempted to build this setup and measure it, but that would incur some expenses on my part: a couple of RPi 4’s (I only have RPi 3’s), an OS license, and a Roon subscription. (I’m currently using LMS, but the discussion is about the “bursty” RAAT vs. the “smooth” Diretta, so you’d probably want Roon to be in the picture.) So, before I start down this path, let me know if there are any specific measurements you’d like to see that you think would be relevant and might make you question your subjective impressions so far.


Off topic, but curiosity always gets the best of me. Are you saying that, before you used HQP, your music sounded awful?


Awful upsampling is accurate - there are serious issues there. It sounded like digital. Glare and harshness and lacking in soundstage. Digital doesn’t sound awful but there is a colouration and it’s largely from upsampling filters.

Fully agreed

Do NOT try to take this through airport security


Good point!

Off topic I know, but heading to an MTB holiday in Taiwan years ago, my friend was hoiked out of the queue on the ramp to the ‘plane: he had a small tyre inflation CO2 canister in his luggage. Once that was explained, he rejoined us on the ‘plane.

Yes, and thank you for your civility in this thread.


Wow! Interesting but complex thread so far. It seems to all come down to this “calming of the stream” of bits to the target. Nice intuitive language, but it leaves me wanting more. Perhaps you could explain why the same effect would not be equally well achieved with a buffer and wireless transmission of bits to the endpoint? The wireless transmission solves the galvanic isolation aspect of the network problem, and the buffer, which I believe is already implemented by many products, if not all, should be able to feed the bits at a “just right” rate. I anticipate that you will say that wireless can be a source of noise, and I agree that for many users in high-population areas it probably is more problematic. However, I am fortunate enough to live in a relatively quiet rural area where the level of RFI is probably considerably less. So for purposes of discussion, let’s consider that to be the case.
Just as a sidenote, I checked the Diretta website and they appear to be charging €200 for the license, not €100 as you stated earlier in the thread.


@David_Snyder Kudos to you for your curiosity and your diplomacy in dealing with the hi-fi peeps who insist bits are bits.
My experience is very similar to yours. My upgrade path from an RPi3 with a DigiOne Signature to a Pi2AES HAT, and now to a series of parts from Ian Canada that include his FifoPi Q7 reclocker and LiFePO4 power supplies, has been extraordinary each step of the way.

The recent addition of the reclocker to the RPi device and the further refinement from the battery-powered PSU both shook the ground of my media room. In all cases I was feeding a Chord DAVE DAC, and I can testify (subjectively) that at each step (DigiOne > Pi2AES > Ian Canada flagship solution) I had significant improvements in SQ.
So I echo your findings: the endpoint performance of the transport contributes hugely to sound quality.


I know this is SOtM/Diretta specific, but I edited SOtM’s instructions on the direct Ethernet connection and sent them to May; she said she will publish them (having checked them!).

So there should be a new version out soon
