What's the difference? RAAT vs AirPlay

I wrote up a white-paper for manufacturers a few months ago, but it was a little bit sparse in some areas. So here, I’m expanding it and filling in the gaps with more explanation. There’s some light compare/contrast with AirPlay in here, but I didn’t make that the focus of the writeup.

Architecture

Well-architected systems give better user experiences. They work better in the short and long term, and they surface fewer unexpected limitations down the road as the world changes.

AirPlay started its life as a feature of the AirPort Express. It evolved into an audio distribution system a few years later. It is hobbled by trying to fit into the performance envelope of the embedded devices of yesteryear, and it’s cobbled together: a mishmash of hacked-up versions of several well-known protocols, with very little coherence to the overall system design. It looks like what it is: an overgrown feature masquerading as infrastructure. It’s stretched pretty thin at this point, and it hasn’t evolved in a long time.

RAAT has an advantage: it is 10-30 years younger than the bits+pieces that make up AirPlay. We can see not only where the world has gone, but, one level up, what types of change have taken place. We are in a great position to do a better job.

Also worth mentioning: designing network protocols for transporting audio is a core competency of ours. I did the protocol design for RAAT, but this isn’t my first time, it’s my third. The last protocol I built handles all of the networked audio distribution for Meridian’s products, and the one before that handled audio distribution for Sooloos products up until about 2010. There is no substitute for the experience of actually putting something like this into production in the real world.

RAAT is plumbing. It gets the audio from point A to point B without screwing it up, and without bringing limitations to the table that might compel the software/hardware on either side of it to screw things up. It’s an enabling technology for “doing things right” everywhere else in the system. Otherwise, it shouldn’t get in the way.

Design Goals

  • Support all relevant audio formats today and for the foreseeable future. We don’t publish a list of formats that RAAT supports because it is not the limiting factor. RAAT is already built to handle multi-channel and 32-bit content. Once Roon supports them, RAAT will too.

  • Stable streaming over Ethernet and WiFi networks. We take this for granted in 2016, but it’s easier said than done, and a huge set of implementation choices are driven by this requirement.

  • Modest endpoint hardware requirements. This means endpoints don’t have to handle expensive DSP or content decoding–that will happen on the server. This means that many existing devices can add support for RoonReady without changing the hardware.

  • Audio devices must own the audio clock. Many other protocols get this wrong, including AirPlay. It’s not possible for two clocks to agree perfectly. Letting the DAC control the pace of streaming removes the need for a clock-drift-compensation mechanism that is bound to increase cost, decrease quality or both.

  • Tight playback synchronization suitable for multi-room listening. There’s a careful line to walk here. If we demand ultra-tight (1-10µs) sync, it becomes impossible to implement the system on existing/unspecialized/heterogeneous hardware platforms. We shoot to be within 1ms (and under ideal circumstances often much better), which is more than adequate for multi-room listening.

  • New streaming services, file formats, DRM schemes, etc. can be supported without a firmware upgrade. In fact, the only reason an upgrade should be required is to fix a low-level bug, or to access more hardware functionality. This is really important. Not all partners/hardware have easy firmware update paths that can be done at home. Our acceptance of this reality has deeply influenced RAAT’s design. Just as with Google’s Cast devices, the majority of the business logic is delivered to the device at run-time as a script. This means that we are capable of completely re-designing the audio streaming and buffering logic without updating device firmware. This is absolutely critical, since most of the bugs + evolution in a system like this relate to networking, not audio. Other than Cast, we are unaware of another system that is this flexible.

  • Cheap to implement, and easy to distribute. No patented technologies involved. No requirement that manufacturers use technologies that are subject to export restrictions. And Roon provides a high quality, portable reference implementation as a base for customization instead of a pile of documents describing a network protocol.

  • Provide a great user experience. This means no stupid 2s delays when touching transport controls (looking at you, AirPlay). It means no too-simple-to-be-good approaches to zone synchronization (looking at you, Squeezebox). It means no artificial stream format limitations. It means that the system is flexible enough to allow processing in the server or the endpoint. It means that volume control and source selection work right whenever possible.

  • Promote honesty regarding what is happening to the audio. RAAT is tied to Roon’s signal chain feature. We work with manufacturers to make sure that potentially destructive processing stages like software volume controls are exposed to interested users, and that processing isn’t being concealed or hidden.

  • Enforce high quality user experiences via a certification program. User experience is another core competency for us. We are actively pushing hardware companies to make better user experiences by iterating with them on their products before allowing those products to be released. We require parity between RoonReady integrations and other audio protocols offered by the devices, ensuring that Roon support does not become a second class citizen. Another requirement of the certification program is that hardware manufacturers leave devices with us long-term for support and QA purposes.

  • Two-way control integration. Artwork and now-playing information can be displayed on hardware devices. Front-panel controls and IR remotes can control Roon via the device. Volume controls on device front panels can be kept in sync with Roon. If you’re talking to a device that has multiple inputs, and start music in Roon, the input automatically switches to Roon’s input. Anyone who’s used Roon’s Meridian integration knows the value of this set of capabilities.

  • Deeply extensible protocol. We’ve placed many extension points in the hardware protocol, and in the interfaces between the RAAT implementation and the hardware-specific code. This allows us to easily support more functionality in the future. We fully expect to learn of more use cases as the breadth of hardware that we are supporting grows, and the protocols are designed to get out of the way and scale gracefully.

  • No support for under-specced platforms or unproven network stacks. RAAT is built to evolve over time. We continue to improve the network protocol. We might decide to change the buffer size requirements on the device to increase stability. We might decide to build a second network protocol optimized for streaming over WAN, or something else like that. We give the same advice to users of Roon as we do to manufacturers building RAAT-based products: under-specced systems lead to bad user experiences; hardware is cheaper than ever and getting cheaper all the time; don’t over-economize if you want the best result.
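
To put some rough numbers on the clock-ownership and sync goals above, here’s a quick back-of-the-envelope sketch. This is not RAAT code. The function names are made up for illustration, and the 50 ppm mismatch is just an assumed, plausible-looking figure for two free-running crystal oscillators, not a measurement of any real device.

```python
# Illustrative arithmetic only; names and the 50 ppm figure are assumptions.

def drift_samples(sample_rate_hz: float, ppm_offset: float, seconds: float) -> float:
    """Samples of surplus (or deficit) that build up when the sender's clock
    runs ppm_offset parts-per-million off from the DAC's clock."""
    return sample_rate_hz * (ppm_offset / 1_000_000) * seconds


def seconds_to_drift(budget_ms: float, ppm_offset: float) -> float:
    """How long two free-running clocks take to drift apart by budget_ms."""
    return (budget_ms / 1000) / (abs(ppm_offset) / 1_000_000)


if __name__ == "__main__":
    # One hour of 44.1 kHz audio with an assumed 50 ppm mismatch.
    print(drift_samples(44_100, 50, 3600))
    # Time for the same mismatch to use up a 1 ms sync budget.
    print(seconds_to_drift(1.0, 50))
```

Under those assumptions, a sender pushing audio on its own clock ends up roughly 7,938 samples ahead of (or behind) the DAC after an hour at 44.1kHz, so something has to resample, drop, or pad. The same mismatch would eat a 1ms sync budget in about 20 seconds if nothing corrected for it. When the DAC sets the pace, the first problem never comes up.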
