RAAT and clock ownership

AMP · February 15, 2018, 1:53am

In theory, as long as the DAC has a sufficient amount of data on hand to perform decoding at the proper rate, the way that data buffer is maintained shouldn’t make a difference. The problem is that keeping a buffer full at the proper level is actually very difficult when you don’t control all of the variables.

Here’s a thought exercise… I’m going somewhere with this, I promise…

You have a tub that can hold a certain volume of water (that’s your buffer). Unfortunately, you don’t have a lot of room so your tub has to be relatively small.
The tub has a drain that removes water at an average rate (that’s your DAC).
The tub has a spigot that can be used to add water and you can use that spigot to adjust the rate at which water is added.
A certain volume of water has to go down the drain, but the tub can’t hold enough water to satisfy that overall need.

Now, at what rate do you need to add water to the tub to make sure that it never overflows but never dries out?

If the fill rate is higher than the drain rate then the tub overflows.
If the fill rate is lower than the drain rate then the tub dries out.

(both of these are bad with water and audio)

If the rate is exactly the same then everything is happy. Assuming you pre-fill the tub with some water to get the process started and then maintain the water at that level everything is fine.

Easy. Problem solved… nope.

Unfortunately (in this analogy) “exactly” means down to the molecular level (or every last bit). The problem is that while the drain may average 44,100 units of water per second, the actual rate at any given moment may be a little faster or a little slower (that’s jitter).
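To put a rough number on “exactly”, here’s a back-of-the-envelope sketch. The 50 ppm clock offset, the 100 ms tub and the function name `time_to_failure` are illustrative assumptions of mine, not figures from any particular DAC or from RAAT itself. It only shows how long a fixed, uncorrected fill rate lasts against a drain that runs slightly fast.

```python
def time_to_failure(capacity, start_level, fill_rate, drain_rate):
    """Seconds until a fixed-rate fill over- or underruns the buffer.

    Returns (seconds, "overflow" | "underrun"), or (None, "stable") when
    the two rates match exactly.
    """
    drift = fill_rate - drain_rate          # frames gained per second
    if drift == 0:
        return None, "stable"
    if drift > 0:
        return (capacity - start_level) / drift, "overflow"
    return start_level / -drift, "underrun"


NOMINAL = 44_100                            # frames per second
dac_rate = NOMINAL * (1 + 50e-6)            # assumption: DAC clock 50 ppm fast

seconds, event = time_to_failure(
    capacity=4_410,                         # assumption: a 100 ms tub
    start_level=2_205,                      # pre-filled to half
    fill_rate=NOMINAL,
    drain_rate=dac_rate,
)
print(f"{event} after about {seconds / 60:.0f} minutes")
```

With those made-up numbers the drain gets ahead by only about 2.2 frames per second, yet the half-full tub is empty in under 17 minutes. That’s with a perfect network; nobody has flushed a toilet yet.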

To make matters worse, the spigot is on a shared plumbing system (the network) and other water users can have an impact on the volume of water that can be delivered to the spigot at any given time. You can set the valve on the spigot perfectly, but the minute someone flushes a toilet you’re screwed.

To keep the system running you need to actively monitor and adjust the valve on the spigot so that the water level never drops below your established minimum and never rises above a safe maximum. You need to be predictive in managing the high and low water marks, and also reactive to changes in the volume of water available at the spigot and to minute changes in the actual drain rate.

That is what RAAT is doing. It’s modeling the clock rate of the DAC (the drain) to understand how the tub is being emptied. It’s also responding to network performance in order to ensure that the fill rate doesn’t allow a buffer overrun or underrun.
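To make the “monitor and adjust the valve” idea concrete, here’s a toy simulation. To be clear, this is not RAAT’s actual algorithm: it’s a plain proportional controller I’m using as a stand-in, and every number in it (the 50 ppm offset, the wobble, the 5% chance of a 1% hiccup, the `gain` parameter) is an assumption picked for illustration.

```python
import random

NOMINAL = 44_100          # frames per second the sender aims for
CAPACITY = 4_410          # tub size: 100 ms of audio (assumption)
TARGET = CAPACITY // 2    # water level we try to hold


def run(gain, seconds=600, seed=1):
    """Simulate one-second steps; return (survived, lowest, highest level)."""
    rng = random.Random(seed)
    level = float(TARGET)
    lowest = highest = level
    for _ in range(seconds):
        # Drain: DAC clock 50 ppm fast, plus a little wobble (assumptions).
        drain = NOMINAL * (1 + 50e-6) + rng.uniform(-1.0, 1.0)
        # Spigot: nominal rate, nudged in proportion to the level error.
        fill = NOMINAL + gain * (TARGET - level)
        # Someone flushes a toilet: 5% of the time we lose 1% of the flow.
        if rng.random() < 0.05:
            fill *= 0.99
        level += fill - drain
        lowest, highest = min(lowest, level), max(highest, level)
        if not 0 <= level <= CAPACITY:
            return False, lowest, highest
    return True, lowest, highest


for gain in (0.0, 0.5):
    ok, lo, hi = run(gain)
    print(f"gain={gain}: {'ok' if ok else 'FAILED'}, level range {lo:.0f}..{hi:.0f}")
```

With `gain=0.0` (valve set once and left alone) the tub runs dry within the ten simulated minutes. With `gain=0.5` the level sags after each hiccup and then recovers, staying well inside the tub. A real implementation also has to estimate the drain rate rather than being handed it, which is the hard part.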

It’s even more complicated than that, as there’s also a tub on the core side that “drains” through the network to the one on the DAC side, and the size of the tubs differs depending on the sample rate involved.

Now imagine how complicated it is to group zones (especially if there are different sample rate requirements for each zone). The zone tubs need to drain in absolute lock step in order for this to work but you have no control over the pipes connecting the tubs!

Now, back to your specific question.

In reality, whether you are moving data over the network or within a device, you need some way to manage the buffers. There are generic ways to do this and they do work, but RAAT was developed specifically for the needs of audio (everything that’s needed and nothing that’s not). It provides a standard interface to both sides of the equation and is easy to implement. Given that RAAT is very good at what it does, and absent a better way of doing it within a single device, why not just use RAAT?

That is, in fact, what Roon does. Whether the data is traversing a network or playback is local to the core, RAAT is still employed to manage the send and receive sides of the chain. It may be internal to one piece of hardware or separated by some distance, but the protocol is the same.

Bingo, and the key really is a functioning network environment. That doesn’t mean one that is datacenter quality or with infinite bandwidth, but good enough to satisfy the needs of the audio data being transmitted along with any other uses at any given time. It also doesn’t mean that you need a bunch of tweaky devices and dongles along the way in order to make it sound better. It just needs to meet the standards and that’s not hard to accomplish (although a lot of audio-related network stuff [cables, filters, clockers, etc] pretty much ignores the standards).

Keep in mind that audio bitrates are nothing in comparison to the bandwidth available on a typical gigabit link (even a really crappy one). DXD is about 20Mbit/sec. DSD512 is around 45Mbit/sec. Gigabit ethernet is 1000Mbit/sec!
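If you want to check those figures, the arithmetic is simple. This sketch assumes stereo and counts raw audio payload only (no protocol overhead); the helper names `pcm_bitrate` and `dsd_bitrate` are just mine. The 32-bit row is an assumption about how 24-bit samples might be padded in transit, included because it brackets the “about 20” figure.

```python
def pcm_bitrate(sample_rate_hz, bits_per_sample, channels=2):
    """Raw PCM payload in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def dsd_bitrate(multiple, channels=2):
    """Raw DSD payload: 1-bit samples at `multiple` x 44.1 kHz per channel."""
    return multiple * 44_100 * channels


GIGABIT = 1_000_000_000

streams = {
    "CD (44.1 kHz / 16-bit)": pcm_bitrate(44_100, 16),
    "DXD (352.8 kHz / 24-bit)": pcm_bitrate(352_800, 24),
    "DXD (352.8 kHz / 32-bit container)": pcm_bitrate(352_800, 32),
    "DSD512": dsd_bitrate(512),
}
for name, bps in streams.items():
    print(f"{name}: {bps / 1e6:.1f} Mbit/s, {100 * bps / GIGABIT:.2f}% of gigabit")
```

Even DSD512 comes out at under 5% of a gigabit link, so raw bandwidth is rarely the issue. Reliability is.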

As long as the network is reliable and can handle the data rate then RAAT just works (and it works really well). Take away the reliability and it starts getting ugly.

As an aside, although WiFi has data rates far in excess of the audio stream requirements, the way WiFi works means that those speeds are only possible when you push a lot of data through at one time in big chunks. The way that the bitstream is metered out for audio (in near real-time) is the worst-case scenario and results in horrible efficiency… but that’s a story for another day.