Excessive Memory Usage and Performance Slowdown in Roon Server (ref#0BVMRS)

We reviewed recent logs from your AMD server and noticed crashes that seemed to happen around activity with your Boss DACs. Could you try disabling them and see if there is any improvement?
Also can you use the directions found here and send over a set of logs from your NUC server to our File Uploader?

Funny you should mention that. That is why I moved one to wired and the other much closer to the AP.

Look at the event list below.
This is before I switched back to the AMD server. This is the i5 NUC.
You are probably wondering what could possibly have caused all those RoonServer stack overflow crashes? It appears pretty much continuous.

This was simply when I was updating firmware on the two endpoints, trying to update to various versions to see if I could work around the Linux Realtek driver issue. They restart their OS and reinit their hardware several times during an update or reconfig so they would have alternated between responsive and not. That will cause network comms errors. Whilst that is not great, none of it should cause a single lost stack frame in the host, let alone the hundreds or thousands that it would take to accumulate to a 4MB stack overflow (and loss of allocated memory belonging to any lost objects left behind on those stack frames well before that). That is why I asked whether you test with packet loss.

As soon as you decide to accept WiFi you have to allow for packet loss. Even in a strong WiFi situation with a single AP and single client, you have to be able to deal with packet loss caused by, for example, nearby cellular comms, Bluetooth, Zigbee, microwave ovens, and just RF noise, all of which will mess up the odd packet here and there. But if you rely on the fairly error-free nature of a modern switched LAN those problems could take months to accumulate… until you see poorer WiFi.

Network error handling in a streaming setup has to be 100% recoverable, even if such events are rare. Otherwise it is just a matter of how soon the big problem will happen.
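To illustrate the point about lost stack frames, here is a minimal Python sketch. It is not Roon's code (which is not visible from here) and it does not claim to be the cause of these crashes. `FlakyEndpoint`, `send_recursive`, `send_iterative` and `survives` are all hypothetical names. It only shows one common way that repeated network errors can grow a stack: a retry made from inside the error handler adds a frame per failure, while a retry loop does not.

```python
import sys


class FlakyEndpoint:
    """Hypothetical endpoint that fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures

    def send(self, payload):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("endpoint unreachable")
        return len(payload)


def send_recursive(endpoint, payload):
    # Anti-pattern: the retry is a nested call made from the error handler,
    # so every failed attempt leaves one more frame on the stack.
    try:
        return endpoint.send(payload)
    except ConnectionError:
        return send_recursive(endpoint, payload)


def send_iterative(endpoint, payload, max_attempts=100_000):
    # Retry in a loop: stack depth stays constant however many attempts fail.
    for _ in range(max_attempts):
        try:
            return endpoint.send(payload)
        except ConnectionError:
            continue
    raise ConnectionError("gave up after %d attempts" % max_attempts)


def survives(sender, failures):
    """Return True if the sender copes with `failures` consecutive errors."""
    try:
        sender(FlakyEndpoint(failures), b"audio")
        return True
    except RecursionError:
        return False


# Enough consecutive failures to exceed Python's recursion limit.
many_failures = 2 * sys.getrecursionlimit()
print("recursive retry survives:", survives(send_recursive, many_failures))
print("iterative retry survives:", survives(send_iterative, many_failures))
```

With a short outage both versions behave the same, which is why this kind of fault can hide on a clean wired LAN and only show up once an endpoint is unreachable for long enough.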

I wonder how many of the folks reporting the need to reboot regularly have WiFi endpoints?

The NUC server is not running at the moment but if I switch back to it and still see problems I will grab a set of logs and upload.

Thanks for sticking with this. I hope we can make some progress.

OK, I did not see any crashes on the AMD PC as server.
And after the other revisions memory use has seemed fairly stable too.
I still had IPv6 disabled, and library analysis options turned off for what that is worth.
Tidal remained enabled but Qobuz disabled.
I ran that for 48 hours and it seemed steady-ish at around 800MB.

Then I enabled Qobuz. Memory use immediately shot up to 1.3GB but then stayed there, fairly stable.


Half a day later I decided to try re-installing it on my NUC server.
I uninstalled Roon from the NUC completely (inc settings and database), then rebooted the NUC, checked the SSD (all OK) and then reinstalled a fresh download of Roon earlyaccess for Windows.

I did basic setup with just Tidal enabled (no library storage at all) and it seemed steady at around 470MB. That is with analysis turned off and IPv6 still disabled.
I then added one storage location. That was a NAS volume containing FLACs in folders by album, with 7186 tracks.
Roon started adding those tracks to the library and then it crashed again with a stack overflow…

Then RoonAppliance restarted and finished adding tracks OK.
A little while later memory use is 760MB.
Note the time in the event log above. 10:52am today 08/09/2024.
I exited RoonServer (right click on tooltray icon) and will now upload the server logs snapshotted just after the storage finished adding and Roon Server had safely closed.

I also wanted to rule out faulty memory on the NUC so it is now running Memtest86 v11. It is almost finished… no problems found so far.

This message was updated because on the first attempt I uploaded the Roon client log rather than the RoonServer log. Sorry about that. Please ignore the first upload today; I will do another upload now that has the SERVER logs you need.

Still running with just one library folder in storage locations (about 7200 tracks in 790 folders).

I still have two library folders I have not yet added.
One is multiple folders of podcasts from BBC Sounds (about 750 tracks in 60 folders). The other is MP3 (about another 1500 tracks in 128 folders).
IPv6 still disabled and analysis options still turned off.


08/09/2024
10:34 Downloaded Roon for Windows earlyaccess.
Immediately installed and Roon started.

10:37 Added Tidal. All OK.
10:38 Added other audio devices. All OK.

Tidal seems to play OK.
Added SMB folder from NAS to library storage.
Adding just started, did not get very far before…

10:52:08 CLR stack overflow in RoonAppliance

RoonAppliance then relaunches but no more added to log file.
Roon finishes adding rest of SMB folder to library without further crashes.


18:19 note: I did enable Qobuz just an hour or so ago. THEN…

17:51:59 CLR stack overflow in RoonAppliance

I’m not sure if that is related to Qobuz or not. For paranoia's sake I have since disabled Qobuz again anyway. Maybe you’ll see something useful there in the later logs.

What I have NOT seen today is any sign of the memory problems I was seeing before. Memory use is now staying steadyish at around 800-900MB.

And I now see another CLR stack overflow behind the scenes at 18:52:02.
Nothing was playing so it went unnoticed but the server quietly stack faulted and re-started.

If anything had been playing it would have been interrupted. Just 3 crashes in 24 hours… I might not notice if I were not looking for it.

I wonder if people are seeing this occurring occasionally and just dismissing it?

Last 3 server logs zipped and uploading now.

Note: Qobuz still disabled, analysis options still OFF, IPv6 disabled.

And checking the server Windows logs again this morning I see that RoonAppliance crashed with a stack overflow again at 03:07 this morning 09/09/2024

Well, no UIs were actively open (though maybe one was open but suspended on an Android device) and nothing had been playing for several hours, so maybe this has nothing to do with WiFi after all? (Though all those crashes the other day were definitely during firmware upgrades to two or three ropieee WiFi endpoints.)

I was fast asleep. NOTHING was going on at 03:07 except things the server decided to do for itself.

All looks normal here now. Roon was working fine at 07:30 this morning and its memory use is just a ‘normal’ 750MB.

IPv6 still disabled. Qobuz still disabled.
Only one NAS folder (7200 FLACs) connected as storage, no local folders, no MP3, no podcasts.

I saw far fewer of these crashes (just one or two) when the server was running on the much faster AMD PC with 32GB. Suggests some sort of race condition maybe?
Fingers crossed that you find something useful in the logs I just uploaded.

And another crash just now at about 08:07
I think this may have been just after I closed Roon Server (by right-click menu on tooltray icon) in order to grab the log files, and then restarted it. I was surprised to see that crash event… nothing had been playing and I did not notice anything happen.

I just uploaded the logs from around this, together with the Windows event log display.

I did just try quitting and relaunching again to see if it happens every time I start/stop the server… but it does not.

I should add that RoonAppliance stack overflows are the only Application Error events that I see in the logs.

And another crash just now at 12:51:58

Relevant RoonServer\logs and a snapshot of the event have just been uploaded.

Everything same as before.
Nothing was playing. Silent crash and restart would have gone unnoticed.

And I quit using Roon at around 7pm this evening but on checking I see there were then two more crashes after that.

I had thought all the problems had just stopped… but apparently not.

Logs (and Windows events timestamps) zipped and will be uploaded right now.

Before these errors I had added two more NAS locations back into the library and re-enabled Qobuz, but I do not think that is related… analysis had already completed 100% without errors and the tracks appeared to play just fine.

I still have not seen a return of the huge memory use.
IPv6 remains disabled.

Two more new crashes for you last night and this morning.

10/09/2024 23:36:58
11/09/2024 07:21:57

There is an event screenshot included in the log upload that I will send now, but I have also put a longer one in this post below.

Is there something odd about the timestamps on these events?
It seems very unlikely that they would recur so closely at these minutes and seconds.

Is that timing something to do with Windows? Or something to do with Roon?
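One way to check whether the timing really is odd is to fold each crash time onto a repeating window and see where it lands. The sketch below is Python and uses only the six crash times quoted with seconds in the posts above. The 15-minute window (`period_s=900`) is an assumption chosen because it fits these times, not something taken from Windows or Roon, and `offset_in_period` is a hypothetical helper name.

```python
# Crash times (local time of day) copied from the posts above.
CRASH_TIMES = [
    "10:52:08",
    "17:51:59",
    "18:52:02",
    "12:51:58",
    "23:36:58",
    "07:21:57",
]


def offset_in_period(hhmmss, period_s=900):
    """Seconds elapsed since the last multiple of `period_s` after midnight."""
    h, m, s = (int(part) for part in hhmmss.split(":"))
    return (h * 3600 + m * 60 + s) % period_s


offsets = [offset_in_period(t) for t in CRASH_TIMES]
spread = max(offsets) - min(offsets)

for t, off in zip(CRASH_TIMES, offsets):
    print(f"{t} -> {off // 60}m {off % 60:02d}s past a quarter-hour mark")
print("offsets:", offsets)  # [428, 419, 422, 418, 418, 417]
print("spread (s):", spread)  # 11
```

All six land within an 11-second band, roughly 7 minutes after a quarter-hour mark, which looks more like something running on a schedule than like random network trouble. It does not say whether that schedule belongs to Windows, Roon, or something else on the network.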

Hi @Andrew_Beveridge,

We appreciate your continued push for additional information - I wanted to follow up and let you know we weren’t letting your thread slip through the cracks, our team currently has all the information they need to continue investigating :+1:

And with that, we may have a potential fix related to your issue over in our early access build - you can read more about that here: EarlyAccess: Roon 2.0 Build 1458 and ARC Build 297 is Live!

We should be pushing these fixes to production relatively soon, so if you’d rather wait than move over to Early Access, you should be able to have access to these fixes soon.

Awesome. Thanks for that.

I was already using earlyaccess and don’t have much to lose so I have just picked up 1458 and installed it on the NUC.

I don’t know how much I’ll be around to check on things over this next week but if I do see anything important I’ll report back.

Fingers crossed.

Hi Benjamin

It is still crashing out with the CLR stack overflows.

I installed Build 1458 at around 22:18 yesterday.
It crashed early this morning at 00:52:03
Same stack overflow exception as usual.

log files now uploaded.

Thanks for giving that a try @Andrew_Beveridge - we’ll share the info with the team. I hope to have more information to share after additional investigation. :+1:

So, I have been away the last seven days.
I left Roon Server and my NAS (ie. Roon storage) running just to see what would happen.

It seems that during the last 7 days, with nothing playing, no UI’s open anywhere, RoonAppliance.exe crashed and restarted itself 10 more times.

I am uploading the logs and a Windows Events Log screenshot now.

Hey @Andrew_Beveridge,

We have a ticket in with development, but the process will likely be slow. Reproducing the issue has proven difficult, and it doesn’t seem to be widespread. We appreciate your patience while we continue to work through things.

While I don’t think any of the fixes in the latest update will help your case specifically, certainly let us know how things run on the latest update. :+1:

The latest update (build 1460 earlyaccess) still crashes.

See screenshot below. It crashed at 02:07 this morning… Roon was not in-use at the time.

It is pretty clear from other community posts that several other users are seeing similar problems.

And the timing of these events is still looking “odd”.
20240920

And it looks like it crashed again 53 minutes ago at 21:22 (now 22:15)

Nothing obvious to provoke it. Once again, nothing playing, no UIs active. Nothing else of note running on that i5 NUC server (Windows 10 Pro x64 22H2, OS Build 19045.4894).

Qobuz still is (and was) disabled, analysis switched off, nothing new added to library storage.

Tidal WAS enabled FWIW.

Hi @Andrew_Beveridge,

Great, thanks for confirming. :+1:

I hope to have more information to share soon.