Roon, Chromecast Audio, Google Home devices and Voice Control

Hi there,

When using Roon to stream to a Chromecast Audio zone, I'm currently controlling play, pause, skip, repeat, volume, and mute/unmute with my voice through a Google Home Mini… and, at least in my case, it works great.

In parallel, Roon can currently assign regular Chromecasts as "Display devices" in order to show the info for every type of zone (RAAT included) in Roon… that's really nice.

What about adding, as a new feature, the ability to designate Google Home devices, which Roon already detects as Google Cast-enabled devices, as “Voice Control Devices” that could be assigned to control whatever zone we want?

I would love to control the playback and volume of my RAAT zones that way, with my voice.

Just an idea…

+1 for this feature request please.

Simple voice control over RAAT endpoints would be a significant improvement which would greatly enhance the existing Google Home voice control capability.

Wider voice control over Roon from Google Home / Hub devices would be a powerful enhancement to the user experience, for example to request specific artists, songs, or genres from the Roon library, or to select a seed to launch Roon Radio.

I’d hope that the APIs are mature enough to make this viable. I’m confident that there would be mass uptake, given the low cost and market saturation of these devices.


I’ve been meaning to try this as an exercise in figuring out how to use the Google Cloud Platform. It’s not too hard to set up. You put a little service into the free tier of GCP that can accept commands from Google Voice. You run another service on your local network (a Roon extension) that opens a socket connection to the GCP service. The GCP service then forwards commands from the Google Voice server to the local service over the TCP link, and the local server does something with them.
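To make the plumbing concrete, here is a minimal sketch of that relay idea in Python, assuming newline-delimited JSON over a plain TCP socket. Everything in it is hypothetical: the function names, the command fields (`action`, `zone`, `value`), and the hard-coded command list that stands in for whatever the voice platform would actually deliver to the cloud service. The handler is just a callback; wiring it to a real Roon extension is left out entirely.

```python
import json
import socket
import threading


def serve_relay(server_sock, commands):
    """Cloud side (stand-in): accept one local agent, forward each command as a JSON line."""
    conn, _ = server_sock.accept()
    with conn:
        for cmd in commands:
            conn.sendall((json.dumps(cmd) + "\n").encode("utf-8"))
    # Leaving the 'with' block closes the link, which ends the agent's read loop.


def run_local_agent(host, port, handle):
    """Local side: dial out to the relay and pass each received command to handle()."""
    with socket.create_connection((host, port)) as sock:
        with sock.makefile("r", encoding="utf-8") as lines:
            for line in lines:
                handle(json.loads(line))


def demo():
    """Run both halves on localhost and return the commands the local agent received."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    # Hypothetical commands; a real service would get these from the voice platform.
    commands = [
        {"action": "pause", "zone": "Living Room"},
        {"action": "volume", "zone": "Living Room", "value": 35},
    ]
    relay = threading.Thread(target=serve_relay, args=(server, commands))
    relay.start()

    received = []
    run_local_agent("127.0.0.1", port, received.append)
    relay.join()
    server.close()
    return received


if __name__ == "__main__":
    for command in demo():
        print(command)
```

The point of having the local service open the connection is that it dials out from the home network, so no inbound port needs to be opened on the router. A real version would also want TLS, authentication, and reconnect logic, none of which is shown here.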

Ah, but that’s the hard part. Roon Labs tried doing this with Alexa, and reported that they couldn’t themselves achieve acceptable performance. Working through the somewhat vestigial Roon API provided for extensions makes it even harder. Lots of ambiguity in voice commands. Not sure it’s doable, which is one reason I haven’t gotten around to trying it yet. Doing it right would probably involve imaging (“audioing”?) the track names, album names, artist names, etc. as accent-adjusted phones, and doing phone segmentation on the actual utterance, then matching them. Lot harder than doing text lookups.
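To give a feel for the matching problem, here is a deliberately crude sketch in Python. It is not real phone segmentation of audio: it works on the recognizer's text output and reduces each name to a rough "sound key" before fuzzy-comparing keys with the standard library's `difflib`. The folding rules, the `phonetic_key` and `best_match` names, and the 0.6 threshold are all assumptions made up for illustration.

```python
import difflib
import re


def phonetic_key(text):
    """Reduce text to a rough sound key (ASCII letters only; everything else is dropped)."""
    keys = []
    for word in re.findall(r"[a-z]+", text.lower()):
        # Fold a few spellings that usually sound alike.
        for src, dst in (("ph", "f"), ("ck", "k"), ("c", "k"), ("z", "s")):
            word = word.replace(src, dst)
        # Keep the first letter, drop later vowels and the weak consonants h, w, y.
        key = word[0] + re.sub(r"[aeiouhwy]", "", word[1:])
        # Collapse runs of the same letter.
        keys.append(re.sub(r"(.)\1+", r"\1", key))
    return " ".join(keys)


def best_match(utterance, candidates, threshold=0.6):
    """Return the candidate whose sound key is closest to the utterance's, or None."""
    if not candidates:
        return None
    target = phonetic_key(utterance)
    scored = [
        (difflib.SequenceMatcher(None, target, phonetic_key(name)).ratio(), name)
        for name in candidates
    ]
    score, name = max(scored)
    return name if score >= threshold else None


if __name__ == "__main__":
    artists = ["Norah Jones", "Tom Jones", "Nina Simone"]
    print(best_match("noora jones", artists))
```

With these rules a misheard "noora jones" and the library entry "Norah Jones" reduce to the same key, so they match exactly, while something unrelated falls below the threshold and returns nothing. It also shows why this is hard: the same folding that forgives misspellings makes short names collide, which is the kind of ambiguity a proper phone-level approach would have to deal with.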