RoonAppliance taking over Windows 10 server

Roon Core Machine

Windows 10
Intel NUC NUC8i3BE
Intel Dual-Core i3-8109U, up to 3.6GHz
32GB DDR4
C Drive: 128GB SSD
D Drive: 16TB DroboPro
P Drive: 500GB Thunderbolt 3 SSD (Plex Media Database and Roon Database stored here)

Networking Gear & Setup Details

Routers/WAPs: Eero 5 Pro (5GHz WiFi, 2.4GHz WiFi, 1G wired Ethernet)
Switch: Netgear GS116Ev2

Connected Audio Devices

6 Chromecast Audios (all CAs centrally placed, connected to amps that go to each room)
1 Google Home Hub
4 Google Homes (all output goes to one of the CA Audios)
1 Google Nest Stereo Pair

Number of Tracks in Library

40,000 tracks, 99% of which are 320k MP3s

Description of Issue

There are only two applications on this Windows server: Plex and Roon. Both read from the same music library directory, which lives on the Drobo. I have been using Plex for years, and started using PlexAmp last year.

A few months ago I moved over to using Roon in-house, mostly for its interface and music discovery features, and purchased the product to install on the Windows 10 NUC. I noticed fairly quickly that the NUC was having a hard time responding to both Roon and Plex. Logging into the NUC, I saw that the service “RoonAppliance” was consuming (and not freeing) memory, and was consistently using 30%-40% of the CPU even when no load was placed on the system (i.e. no music playing on Roon, no music or video playing on Plex). Occasionally, RoonAppliance would jump to 90% of the CPU and/or max out the RAM, thereby slowing or stopping the system. Killing the RoonAppliance service and letting it restart would let the NUC return to normal, but the cycle would repeat within a week.

I made a post on reddit looking for someone with a similar issue, and was contacted by someone from Roon support. After giving them permission to look at the behavior on my box, he reported back that “I see something going kind of nuts with your chromecast devices…like we keep discover them, then they are disconnecting, then we re-discover, then disconnect, and so on. It was going on pretty fast around the time you posted, and I see it kind of happening all the time. That’s definitely not normal. I’ll make sure the right people take a look at it.”

I know it takes time to diagnose this sort of thing, but my situation has gotten pretty annoying, so I’ve disabled Roon on my NUC until I can get this straightened out. I will revert to PlexAmp until then.

Question for the crowd: has anyone seen this before with Chromecast Audio devices? I am close to pulling the trigger on a VSSL 6, which has its own CA implementation, but I’d like to get to the bottom of this before I move forward.

Sounds like a bad idea to me, unless they use a wired network connection. WiFi devices placed in close vicinity to each other are prone to produce connection issues.

There has been no issue with this configuration on Plex, PlexAmp, Spotify, YT Music, Apple Music, Pocketcasts, or a half a dozen other services that I use. Network monitoring shows that these devices are strong, stable and connected 100% of the time.

While the sentiment always drifts towards “blame the existing equipment,” it’s not terribly helpful advice. The one new entrant into my ecosystem is Roon. Everything else has been running fine for a year or longer.

This has to do with the RoonAppliance app and its relationship to Chromecast Audio.

So you tested this, say by turning all but one off and seeing if the same problem still occurs?

They don’t work the same way as Roon: they don’t track announcements, and the usually much larger caches they use may mask the short outages while ChromeCast devices reconnect to the network.
The close vicinity might be one reason why they drop off and re-connect constantly to your network, but other reasons for that behavior may exist.

This is just a step for your troubleshooting to get to the bottom of this. It’s up to you if you want to pursue this possibility or ignore it, but questioning the crowd and then ignoring the answers is probably not terribly helpful either.

NOTE: I use ChromeCasts and there are absolutely no issues with them.

So it seems to be obvious to me that this statement is not true in a stable network environment.

Technical background information:

“I see something going kind of nuts with your chromecast devices…like we keep discover them, then they are disconnecting, then we re-discover, then disconnect, and so on. It was going on pretty fast around the time you posted, and I see it kind of happening all the time. That’s definitely not normal."

ChromeCast devices announce themselves (the availability of the service) on the network they are connected to. Roon just tracks those announcements. New announcements usually get issued when there is a change in the network connectivity of the ChromeCast devices or when they boot up/reboot.
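
To make the "track announcements" idea concrete, here is a minimal sketch of how a listener could flag a device that keeps dropping off and coming back. It is not Roon's implementation. It only assumes the standard mDNS convention that a record with TTL=0 is a "goodbye" for that service; the class name, window and cycle limit are hypothetical.

```python
from collections import defaultdict, deque


class FlapTracker:
    """Counts goodbye-then-reannounce cycles per device in a sliding window.

    Hypothetical helper for illustration only; not taken from Roon.
    """

    def __init__(self, window_seconds=60.0, max_cycles=3):
        self.window_seconds = window_seconds
        self.max_cycles = max_cycles
        self._present = set()                 # devices currently announced
        self._lost = set()                    # devices that said goodbye
        self._cycles = defaultdict(deque)     # device -> rediscovery timestamps

    def on_record(self, device, ttl, now):
        """Feed one mDNS record sighting; ttl == 0 means goodbye."""
        if ttl == 0:
            if device in self._present:
                self._present.discard(device)
                self._lost.add(device)
            return
        if device in self._lost:
            # The device came back after a goodbye: one full flap cycle.
            self._lost.discard(device)
            self._cycles[device].append(now)
        self._present.add(device)

    def is_flapping(self, device, now):
        """True if the device completed max_cycles cycles within the window."""
        cycles = self._cycles[device]
        while cycles and now - cycles[0] > self.window_seconds:
            cycles.popleft()
        return len(cycles) >= self.max_cycles
```

A device that boots once and stays up never enters the lost set, so it is never counted. Only repeated goodbye/announce pairs inside the window trip the check.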

So everything points to your network (how it’s set up, firewalls in use, …) or directly at the ChromeCast devices. It might take you quite some time to troubleshoot all the possibilities.

So you tested this, say by turning all but one off and seeing if the same problem still occurs?

Would it surprise you to learn that, yes I did? I actually did a lot more than that - network topology is part of my work, so I’m no stranger to any of this. However, to be clear - I caged each one of the CAs, and then did them in pairs and triplets. Not every combo, but enough. The problem persisted.

However, your reply made me a little more curious, so I’ve ordered a few Ethernet-to-micro-USB adapters to put the CAs on a wired connection. That should rule out WiFi interference and take it out of this conversation.

They don’t work the same way as Roon: they don’t track announcements, and the usually much larger caches they use may mask the short outages while ChromeCast devices reconnect to the network.

I have read the vague backstory of the magical, mystical stuff going on in Roon that they use to justify their Roon Core architecture - and I understand that they like to use it to tout that they are “different from other services”… but it’s a hard story to buy. I think they picked one architecture and now they are stuck with it - it doesn’t feel particularly better at device discovery or rendering than any other service. I’d argue it’s a little worse, honestly.

The close vicinity might be one reason why they drop off and re-connect constantly to your network, but other reasons for that behavior may exist.

Again, the assumption here is that the CAs are disconnecting from the network. They are not. I have logs of their activity, and once they are on they are on. (With the exception being when my Eero system updates and reboots itself in the middle of the night.)

This is just a step for your troubleshooting to get to the bottom of this. It’s up to you if you want to pursue this possibility or ignore it, but questioning the crowd and then ignoring the answers is probably not terribly helpful either.

A fair point that I accept - and I wouldn’t expect everyone on this board to know the steps I went through before posting the question/situation here… but you’re correct.

From my point of view, I have had a stable ecosystem for literal years. When a new element gets introduced and the ecosystem destabilizes, it’s a relatively safe bet it’s the new element. Totally willing to admit that it might be how I have Roon configured, but all the signs still point to Roon being the issue. (I’ve had Roon offline now for a few days, and the CAs are perfectly fine with all the other services, and my NUC is chugging along at minimal memory and CPU consumption.)

NOTE: I use ChromeCasts and there are absolutely no issues with them.

So this is helpful. How many? Do you have them in a mixed environment with other audio renderers?
Do you use them in audio groups? Do you have a Roon OEMed device? If not, did you set up a Roon Rock? Do you have it just as an application running under Windows or Linux?

So it seems to be obvious to me that this statement is not true in a stable network environment.

It should not be, but that is what I am observing. Once again:

  • Network logs show no CA disconnect/reconnects
  • all other services use them just fine
  • RoonAppliance application bloats over time
  • NUC system becomes unresponsive when RoonAppliance gets too large

One thing that did occur to me while writing all of this and reading over the Roon recommendations for installation is that I have the Roon database stored on the RAID drive that the content is stored on. I see now that the recommendation from Roon is for it to be on an SSD, which makes sense.

I’m going to try that this week and see if that changes the equation for some reason.

Cores on Linux, 1 CCA on Ethernet (via WiFi bridge), 1 CCA on WiFi, 1 CCU4k on Ethernet (via WiFi bridge). Mostly I use Roon with them (because it works for me) occasionally also used from my phones/tablets, seldom used for music from Plex (Plex is used for movies/series/TV). No group, just PIA and creates more problems than it solves. If I really had a need for groups I would switch to RPi Roon endpoints and use their RAAT protocol instead.

Roon (or RoonAppliance) is just software running on an OS. If Roon appliance is to blame for doing something basically wrong, then any user should experience the same problem. This seems not to be the case though.

  • good
  • again, other services use ChromeCast in a different way, so not helpful
  • many users currently report similar issues, possibly unrelated to your ChromeCast issue (other reason)
    There is only the clue from the tech you cited that your issue is related to ChromeCast. There might as well be more than one reason at play in your setup currently.
  • expected, there is only a limited amount of resources available

This for sure makes sense, but is unlikely to change anything regarding the ChromeCast issue.

"I’ll make sure the right people take a look at it.”
Might be best to wait for them to return to you then?

Note: One user once reported issues that he could resolve by disabling an unspecified mDNS reflector software that he had still running in his network. Maybe something you want to check.

Thanks for the responses…

Note: One user once reported issues that he could resolve by disabling an unspecified mDNS reflector software that he had still running in his network. Maybe something you want to check.

:thinking: Interesting angle. I don’t have an mDNS reflector running… and this person’s issues were slightly different from mine, but it wouldn’t hurt to check a few things.

Hello @Rob_DeMillo ,

Apologies for the slow response here, our support team has recently undergone some changes to keep up with needed volume, and we’re working hard to catch up with all of our users.

We have been discussing this issue in-house since your Reddit post, and while we are seeing some of the cast disconnects you are seeing in our logs, they are nowhere near as frequent as on your end and are not leading to significant resource usage.

Please note that Roon Build 970 addressed a few memory leaks, and this may have had something to do with the RAM usage you were seeing. I’d be curious to know if you are still seeing this even after updating to build 970 or above.

After some internal discussion, it looks like mDNS is involved in the issue. When we see the cast device lost on our end, there is also a TTL=0 mDNS packet and TLS errors, like so:

Info: [cast] lost device Chromecast Ultra because we got a TTL=0 mDNS packet
Info: [cast/client] [Chromecast.local] Unable to authenticate TLS connection

Are you seeing the same on your end when the issues occur?

Can you provide any more details on how these devices are grouped? Do you have a significant number of grouped zones in the Google Home app?

It would also be good to see if the current database has any impact on this issue. If you can confirm whether you see this on your end with a fresh database, that would be a very useful data point to have. I have provided instructions below on how to achieve this:

  • Create a Backup of your current database
  • Exit out of Roon
  • Navigate to your Roon Database Location
  • Find the folder that says “Roon”
  • Rename the “Roon” folder to “Roon_old”
  • Reinstall the Roon App from our Downloads Page to generate a new Roon folder
  • Verify if the issue persists on a fresh database before restoring the backup
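
For anyone who prefers to script the rename step in the list above, here is a small sketch. It only covers setting the “Roon” folder aside as “Roon_old”. Backing up, exiting Roon and reinstalling are still done by hand. The function name is hypothetical, and the database location is whatever path applies on your machine, not something this sketch knows.

```python
from pathlib import Path


def set_aside_roon_folder(database_location, old_name="Roon", new_name="Roon_old"):
    """Rename <database_location>/Roon to <database_location>/Roon_old.

    Run this only after a backup has been made and Roon has been closed.
    Returns the path of the renamed folder.
    """
    base = Path(database_location)
    src = base / old_name
    dst = base / new_name
    if not src.is_dir():
        raise FileNotFoundError(f"No '{old_name}' folder found in {base}")
    if dst.exists():
        # Refuse to overwrite a folder that was set aside on an earlier attempt.
        raise FileExistsError(f"'{new_name}' already exists in {base}")
    src.rename(dst)
    return dst
```

Refusing to overwrite an existing “Roon_old” is a deliberate choice here, so that a second run cannot clobber the only copy of the old database.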

Please, do let us know your thoughts on the above when you have a chance, thanks!

Hi @noris - thanks for responding.

As you know, I’ve been chatting with Brian over reddit DMs about this. I have some good news, and weird news:

  • I installed 970 on my Windows server, and it 100% solved the memory leak problem! So, congrats there.
  • Unfortunately, the sustained CPU usage remained on the Windows server, and was basically shutting down my server
  • As the CPU would routinely hit 90%, I created a service daemon to kill RoonAppliance every few hours. That kept my machine alive.
  • While all this was going on, I took possession of a Synology DS1821+ 8-bay RAID enclosure, running DSM 7.1.
  • I moved all of my content from the DAS hanging off the Windows server to the Synology.
  • Once that was done, I installed 970 on the Synology (as well as the latest Plex Media Server), and transferred control of both Roon and Plex to the Synology.
  • I shut down my Roon and Plex on the Windows server.
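
The post does not say how the "kill RoonAppliance every few hours" daemon was built, so the following is only one possible sketch of that workaround on Windows. It assumes the process image is named RoonAppliance.exe and that Roon restarts it on its own, as described earlier in the thread. The function names and the interval are hypothetical.

```python
import subprocess
import time


def build_kill_command(image_name="RoonAppliance.exe"):
    """Windows taskkill command: /F forces termination, /IM selects by image name."""
    # Assumption: the process image is called RoonAppliance.exe.
    return ["taskkill", "/F", "/IM", image_name]


def run_watchdog(interval_seconds, cycles, run=subprocess.run, sleep=time.sleep):
    """Sleep, then kill the process, repeated `cycles` times.

    `run` and `sleep` are injectable so the loop can be exercised without
    touching a real system. Returns the exit code of each kill attempt.
    """
    results = []
    for _ in range(cycles):
        sleep(interval_seconds)
        completed = run(build_kill_command(), capture_output=True, text=True)
        results.append(completed.returncode)
    return results
```

For example, `run_watchdog(3 * 60 * 60, cycles=8)` would attempt a kill every three hours over a day. This is a stopgap that hides the symptom and does nothing about the cause of the CPU load.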

All of the above happened two weeks ago. Since that time, there have been no memory issues on Synology. I do see Synology CPU spike throughout the day, but it quickly settles back down to very low CPU usage. My guess is that whatever is causing the spiking is not acting as a high water mark on the Synology, but under Windows it acts as a high water mark and never recovers.
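
The difference between "spikes that settle" and "a high water mark that never recovers" can be checked against logged CPU samples. A rough sketch, with a hypothetical threshold and run length, assuming the samples are CPU percentages taken at a fixed interval:

```python
def longest_run_above(samples, threshold):
    """Length of the longest consecutive run of samples at or above threshold."""
    longest = current = 0
    for value in samples:
        current = current + 1 if value >= threshold else 0
        longest = max(longest, current)
    return longest


def classify_cpu(samples, threshold=80.0, sustained_samples=5):
    """Label a series of CPU percentages.

    'quiet'     - never reaches the threshold
    'spiky'     - reaches it, but always drops back within sustained_samples
    'sustained' - stays at or above it for sustained_samples in a row
    """
    run = longest_run_above(samples, threshold)
    if run == 0:
        return "quiet"
    if run >= sustained_samples:
        return "sustained"
    return "spiky"
```

Under these assumptions, the Synology pattern described above would come out as "spiky" and the Windows pattern as "sustained".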

As to the Chromecast Audios: I purchased hardwired ethernet dongles for the 6 CAAs that I have running, under the idea that maybe the WiFi signal to the 6 CAAs was causing rapid connects/disconnects. (All 6 CAAs are within a few inches of each other.) However, since none of this registers on any other service I have, and my LAN logs do not show rapid connects/disconnects, I considered this a long-shot.

However, since the Synology is humming along fine with 970, I’ve abandoned that plan (although it is doable, it’s still a PitA to move CAAs to hardwired ethernet.)

I hope all of this info helps.

Rob

Sorry @noris - just saw all this below the fold. To answer your questions:

WRT the TTL=0 messages you are seeing: I do not. All of my logs show that the CCAs connect at boot up, and stay connected until either I turn them off, or Eero (my router) does its nightly update. No disconnections, no issues with mDNS as far as I can see.

WRT how I have the Chromecast grouping established, here’s a shot of my groups:

WRT the fresh database:

  • yes on the Windows server I’ve blown away my DB and recreated Roon multiple times. (It’s the first thing I did prior to my reddit post.) The problem remained, of course.
  • see my previous message about moving to Synology. I did that with a fresh DB as well

Hope all of this helps

Hi @Rob_DeMillo ,

Thanks for that additional information!

Glad to hear that the memory leak is fixed on your end.

Are you able to correlate the CPU spikes to any specific activity in Roon? Do you have Audio Analysis completed for your library?

Can you please upload a full set of your Synology Roon Logs (by using these instructions) to our Roon File Uploader and let us know once uploaded?

https://workdrive.zohoexternal.com/collection/8i5239cc05950ac07456889838d9319545a82/external

Then I can take a look to see if there are any clues, thanks!

Hi @noris - thanks for the follow up.

I will send the logs when I get a chance.

As to the CPU spikes - no, I don’t correlate them to any activity at all. When my Roon usage is idle, and I am not playing anything on any output device, the spikes still occur.

Either you or Brian asked me to turn off audio analysis, which I did, but the spikes still occurred both on the Windows and Synology RoonAppliance… (I’ve since re-enabled audio analysis on the Synology, since the CPU returns to a normal usage pattern after the spike.)

I’ll dig up both the Synology logs and the last known Windows logs that I have, so you can compare both. I’ll have to do this after the holiday weekend here in the States because I don’t have access to the Windows server right now.

Hi @noris

Hey I still owe you the logs, I’ll get those early this week. In the meantime, I put 6 of my CCAs on hardwired Ethernet this weekend.

Can you see if you are still getting the connect/disconnect messages? I don’t notice a change on my end, but… :man_shrugging:

Hey @Rob_DeMillo ,

It looks like an automatic report came in from diagnostics. I checked and it looks like you are still getting quite a lot of cast errors even on the NAS Core, including some interesting Exception writing message to stream: traces, please see below:

07/12 06:11:35 Info: [cast] lost device CastDevice[DeviceId=SHIELD-Android-TV-b0d8454caf181af280656a0249e3d9db._googlecast._tcp.local, Name=SHIELD Android TV, Address=192.168.7.180] because it disconnected
07/12 06:11:35 Info: [cast] lost device CastDevice[DeviceId=Google-Cast-Group-84e0a02c3d7c4c949aa6d4a6c9af39d9-1._googlecast._tcp.local, Name=Google Cast Group, Address=192.168.7.79] because it disconnected
07/12 06:11:35 Error: [cast/client] [Chromecast-Audio-72a644b5093866851f78b3c478c6447f._googlecast._tcp.local] Exception writing message to stream: 
07/12 06:11:35 Error: [cast/client] [Google-Nest-Mini-7cfa71611db0a843c8752d44004a991f._googlecast._tcp.local] Exception writing message to stream: 
07/12 06:11:35 Error: [cast/client] [Google-Cast-Group-4f20f237ed1c4dcf9476090538a67f5d-1._googlecast._tcp.local] Exception writing message to stream: 

&

07/13 06:56:00 Info: [cast] lost device CastDevice[DeviceId=Google-Nest-Mini-7cfa71611db0a843c8752d44004a991f._googlecast._tcp.local, Name=Google Nest Mini, Address=192.168.7.127] because it disconnected
07/13 06:56:00 Trace: [zone Guest Room display] Loaded Queue=0 Tracks Swim=Inactive AutoSwim=True Loop=Disabled Shuffle=False
07/13 06:56:00 Trace: [zone Dining Room Speakers] Loaded Queue=10 Tracks Swim=Inactive AutoSwim=True Loop=Disabled Shuffle=False
07/13 06:56:00 Trace: [Dining Room Speakers] [Inactive] [PAUSED @ 0:13/7:45] Flying Dutchman - Jethro Tull
07/13 06:56:00 Info: [cast] lost device CastDevice[DeviceId=Google-Cast-Group-84e0a02c3d7c4c949aa6d4a6c9af39d9-1._googlecast._tcp.local, Name=Google Cast Group, Address=192.168.7.79] because it disconnected
07/13 06:56:00 Trace: [zone Kitchen display] Loaded Queue=0 Tracks Swim=Inactive AutoSwim=True Loop=Disabled Shuffle=False
07/13 06:56:00 Info: [cast] lost device CastDevice[DeviceId=Chromecast-Audio-299fa92bd7a9082b1439170500b24a36._googlecast._tcp.local, Name=Chromecast Audio, Address=192.168.7.79] because it disconnected
07/13 06:56:00 Info: [cast] lost device CastDevice[DeviceId=Google-Cast-Group-9bcdc980fff5489186dc9b5ecef681e3-1._googlecast._tcp.local, Name=Google Cast Group, Address=192.168.7.79] because it disconnected
07/13 06:56:00 Trace: [zone Laundry Room Assistant] Loaded Queue=0 Tracks Swim=Inactive AutoSwim=True Loop=Disabled Shuffle=False
07/13 06:56:01 Info: [transport/zonedisplay] Zone display unregistered: CastDevice[DeviceId=Google-Nest-Hub-04c30982058e1bdd98e00dde0afaa9b6._googlecast._tcp.local, Name=Google Nest Hub, Address=192.168.7.134]
07/13 06:56:01 Info: [cast] lost device CastDevice[DeviceId=Google-Nest-Hub-04c30982058e1bdd98e00dde0afaa9b6._googlecast._tcp.local, Name=Google Nest Hub, Address=192.168.7.134] because it disconnected
07/13 06:56:01 Info: [cast] lost device CastDevice[DeviceId=Nest-Audio-5c58543e7acbe004522623c1cf3163d2._googlecast._tcp.local, Name=Nest Audio, Address=192.168.7.105] because it disconnected
07/13 06:56:01 Info: [cast] lost device CastDevice[DeviceId=Google-Cast-Group-1e3efaaf4fee434483b0756ba299f09c._googlecast._tcp.local, Name=Google Cast Group, Address=192.168.7.105] because it disconnected
07/13 06:56:01 Info: [transport/zonedisplay] Zone display unregistered: CastDevice[DeviceId=google-nest-hub-eb99e0d2dafbbf789d44f56d8136354d._googlecast._tcp.local, Name=Google Nest Hub, Address=192.168.7.103]
07/13 06:56:01 Info: [cast] lost device CastDevice[DeviceId=google-nest-hub-eb99e0d2dafbbf789d44f56d8136354d._googlecast._tcp.local, Name=Google Nest Hub, Address=192.168.7.103] because it disconnected
07/13 06:56:01 Info: [cast] lost device CastDevice[DeviceId=Chromecast-Audio-887af7cc9c30b2a63eb80c55daab439d._googlecast._tcp.local, Name=Chromecast Audio, Address=192.168.7.83] because it disconnected
07/13 06:56:01 Info: [cast] lost device CastDevice[DeviceId=Google-Cast-Group-2c78d28be45041a8a8e21e1a1c81d7cf-1._googlecast._tcp.local, Name=Google Cast Group, Address=192.168.7.83] because it disconnected
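
Excerpts like the ones above can be tallied with a short script. This is a sketch that assumes only the `lost device CastDevice[...]` line format shown in these excerpts. It does not handle the shorter `lost device Chromecast Ultra because we got a TTL=0 mDNS packet` form quoted earlier in the thread. The function name is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
# 07/13 06:56:01 Info: [cast] lost device CastDevice[DeviceId=..., Name=Nest Audio,
#   Address=192.168.7.105] because it disconnected
LOST_RE = re.compile(
    r"\[cast\] lost device CastDevice\["
    r"DeviceId=(?P<device_id>[^,]+), "
    r"Name=(?P<name>[^,]+), "
    r"Address=(?P<address>[^\]]+)\] "
    r"because (?P<reason>.+)$"
)


def count_lost_devices(lines):
    """Return (losses per device name, losses per IP address) as Counters."""
    by_name = Counter()
    by_address = Counter()
    for line in lines:
        match = LOST_RE.search(line.rstrip())
        if match:
            by_name[match.group("name")] += 1
            by_address[match.group("address")] += 1
    return by_name, by_address
```

Grouping by address is useful here because, in the excerpt above, 192.168.7.79 shows up for one Chromecast Audio and for several Google Cast Group entries, so one physical host accounts for multiple "lost device" lines.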

Since the issue impacts both your Windows 10 Core and the NAS Core, it suggests that the issue could be in the network configuration somewhere.

How do you have the Eeros configured? Are they on the latest firmware? If you remove the satellite and leave just the main router plugged in, do the cast issues remain? I’ve seen improperly configured Eero networks cause stability issues in the past.

Hi @noris

Thanks for the response. There are some interesting things in there - it’s not just the Chromecast Audios that are throwing those for you, it’s the Android Shield as well, and the minis… basically everything Chromecast.

The networking here is solid (I run the networking group for a large, worldwide company as part of my day job, so I’m pretty comfortable with all of this.) The eero is up-to-date, and the CCAs are now on ethernet and plugged directly into a switch which is plugged directly into the eero gateway - and it is really rock solid at this point. The Shield is also plugged into the gateway, and the other chromecast devices are wifi’ed around the house.

I do not see any of these errors on my end, and none of the other software I have that references the chromecast devices show the slightest hint of this problem.

It’s bizarre in the extreme that you are seeing this happening on your end. All I can think is that the cast device handler on your end is erroneously throwing exceptions. Is that possible? (Do you see this with any other of your CCA users?)

Now that the memory leak is gone, RoonAppliance is no longer crashing – and Roon is successfully playing on all of my cast devices – so I don’t notice any performance issues at all despite those errors you are seeing on your end. What this implies to me is that it may be happening with your other CCA users, but no one is reporting it because they don’t see any symptoms. (The only reason this was surfaced with Roon is because you guys discovered it when you were looking into the RoonAppliance crashes… I never saw anything off with my CCA/Roon connection.)

If I can figure out if there’s anything in my eero setup that’s causing issues, I am happy to fix it - but can you check in with a few of your other CCA users to see if you see the same thing on their end?

Puzzling.

Adding this - I picked one of the CCAs on my network that was throwing an error to Roon (192.168.7.79) - this is monitoring for the last hour, and it shows 100% uptime, and no disconnects/connects.

This topic was automatically closed 45 days after the last reply. New replies are no longer allowed.