Memory leak in Roon?! [Solved: Update to Build 970]

Same problem. Running Roon Server Build 952 on dedicated Ubuntu 20.04.4 LTS. Running fine for more than a year until recent update. Problem is that once a day the RoonAppliance process likes to gobble up all the CPU and memory and becomes unresponsive.

Is it the kernel OOM killer?

I’ve enabled systemd-oomd which kills the process more gently than the kernel.

Hi @spockfish,

We recognize how corrosive these symptoms are for affected users. You’re absolutely correct that we can more precisely and frequently communicate what’s happening behind the curtain as we push to roll out a fix.

Here’s where we currently stand: the team has identified the issue, and we’re working to roll out a permanent solution in a high-priority ticket.

While I don’t have a firm timeline to resolution at this moment, what I can do is provide daily updates in this topic to keep you informed with whatever information is available to me until we have next steps to offer you.

Roon is using about 85% of memory here with 4-10 threads pegged at 100% usage, and constant skips and dropouts when playing music (which get worse and worse till it completely stops playback). The system has a Ryzen 5900X and 32 GB of RAM. The OS is Fedora 36. Please fix this because this is unacceptable.
This is on a fresh boot after a couple of hours. Restarting the service has it shoot up to 40% memory after 20 seconds, pegging 2-4 threads. I have to restart the service and hope I can listen to an album or two before the problems begin.
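
For anyone on Linux who wants to put numbers on growth like this, here is a minimal sketch that samples a process's resident memory from `/proc/<pid>/status` (the `VmRSS` field). The function names are hypothetical, and you have to supply the PID of the Roon process yourself. This is only one way to log it, not anything Roon provides.

```python
import re
import time
from pathlib import Path
from typing import Optional

# VmRSS is the resident set size line in /proc/<pid>/status on Linux.
_VMRSS = re.compile(r"^VmRSS:\s+(\d+)\s+kB", re.MULTILINE)


def parse_vm_rss_kb(status_text: str) -> Optional[int]:
    """Return the resident set size in kB from status text, or None if absent."""
    match = _VMRSS.search(status_text)
    return int(match.group(1)) if match else None


def read_rss_kb(pid: int) -> Optional[int]:
    """Read the current resident set size of a running process (Linux only)."""
    return parse_vm_rss_kb(Path(f"/proc/{pid}/status").read_text())


def log_rss(pid: int, interval_s: float = 60.0, samples: int = 10) -> None:
    """Print one timestamped RSS reading per interval, for a fixed number of samples."""
    for _ in range(samples):
        print(f"{time.strftime('%H:%M:%S')} pid={pid} rss_kb={read_rss_kb(pid)}")
        time.sleep(interval_s)
```

Calling `log_rss(pid)` with the PID reported by `pgrep RoonAppliance` would give a series of readings that could be posted alongside a report.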

Hello everyone,

Thank you all for your continued patience here. As of last night, a potential fix has moved to the pipeline for internal review. The timeline to resolution remains uncertain at this time, but we don’t anticipate this issue should drag out much longer now that we believe we’ve pinned it down.

We will continue to provide new details in this thread as they become available. Otherwise, I’ll provide another update in ~24 hours.

Thanks for the update Connor, highly appreciated.

If you want volunteers to do some quick testing… just let me know.

Regards,
Harry

Hi everyone,

Daily update here. We’re testing out the fix for this issue and while I still don’t have a precise timeline for the rollout, we’re confident the permanent solution will resolve the underlying conditions of the problem.

Thank you to those of you who have helped redirect affected users here, as I understand reports of this issue have been both lingering and widespread on the forums. As always, we’re very grateful for your patience. We’re working hard to disseminate the most accurate and current information possible as we close in on resolution.

@connor, I’ve been suffering this problem for what seems like months now (took me quite a while to narrow down what was happening). Finding this thread, I’m comforted to know the problem has been identified and a fix is in the pipeline. I’d be very glad to help test if it’s helpful. My Roon core (i3 NUC) will eat up more and more RAM becoming more and more sluggish until the system becomes unresponsive after less than 24 hours (which also causes heavy SSD use due to no available RAM, etc.). Thanks!

I am running Roon Core on a Mac Pro (Late 2013) with 64 GB of RAM and no DSP, convolution, etc. I stream via HQPlayer NAA to a NUC11 and do all upsampling on the NUC11 running HQPlayer Desktop. When I rebooted 2 days ago and started Roon, it was using 1.3 GB of RAM. After 2 days open it is now using 4 GB of RAM, and music has only played for roughly 6 hours total; the rest of the time it sits idle. It's a short test, but RAM usage appears to be growing by 1-1.25 GB per day. Looking forward to a fix for this memory leak.
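
The arithmetic behind an estimate like this can be sketched as follows. It is a straight-line calculation from the two readings quoted above (1.3 GB at start, 4 GB after 2 days, 64 GB installed); the function names are hypothetical and steady linear growth is an assumption, not something the thread confirms. Taken at face value, those two readings work out to about 1.35 GB per day, a little above the quoted range, which may just reflect rounding in the reported figures.

```python
def growth_rate_gb_per_day(start_gb: float, end_gb: float, days: float) -> float:
    """Average memory growth per day between two readings."""
    return (end_gb - start_gb) / days


def days_until_full(current_gb: float, total_gb: float, rate_gb_per_day: float) -> float:
    """Days until usage reaches total RAM, assuming the growth stays linear."""
    return (total_gb - current_gb) / rate_gb_per_day


# Readings taken from the post above; linear growth is an assumption.
rate = growth_rate_gb_per_day(start_gb=1.3, end_gb=4.0, days=2.0)
remaining = days_until_full(current_gb=4.0, total_gb=64.0, rate_gb_per_day=rate)
print(f"growth: {rate:.2f} GB/day, full in about {remaining:.0f} days")
```

On a machine with less RAM the same rate would exhaust memory much sooner, which is consistent with the faster failures others report here.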

At this point, within just a few hours, my Roon Core is completely unusable and is taxing the RAM and SSD resources to the max, making the computer it’s running on kind of freak out. (I’ll reluctantly continue to use alternative apps and services while waiting for a fix). :zipper_mouth_face:

Hi everyone,

The fix continues to move through the pipeline. I appreciate those of you who have offered to help test it out. While I recognize you’re eager for any solution after such a frustrating wait, we’ve already started running this fix through testing so we can roll it out as efficiently as possible.

While it’s disheartening to head into a weekend without a fix in place, rest assured this is a top priority to ship out and the team is finalizing as I type this. I’m happy to answer any questions you might have.

Thanks for the updates. Much appreciated.

Are you confident that the fix addresses all the different memory leak behaviours reported? The worst examples reported seem to double memory usage in a matter of minutes, but others have reported a much slower rate. Also the worst situation seems to exhibit the problem all the time, but others (like me) experienced an isolated issue (just once in the two weeks since I updated to 952). Given the range of issues, is there more than one failure mechanism, or can the same fault manifest differently?

There are so many variables that whatever arrives next week will almost certainly be a compromise that solves a lot but not everything. Whether and how quickly a problem arises depends on the operating system, the hardware, or simply the amount of RAM. More main memory, with everything else the same, makes a difference.

Just a confirmation: I have Roon Core running on a newer Dell server (SAS, hardware RAID 5, 48 GB of RAM, MS Server 2019, fully patched), and the latest Roon stalls approximately 2-3 hours after reboot:

This would not be possible even with 2.7 million titles under Manjaro. What was the machine doing at the time? Do the logs show anything usable?

@Uwe_Albrecht
Setting expectations is always key when working with customers/consumers. I get that this is what is being done here. Essentially, what I read from the latest statements is: expect nothing, it will be a compromise that might resolve something, in a fix that will come at some point in the future. Those are typical support replies.
Why should there be a compromise? Are there several severe issues that need fixing? Do you have additional information? In situations where you expect compromises, customers should be proactively engaged in testing that the issues are resolved, i.e. by giving private releases to customers on different platforms exhibiting problems, to see how the fix behaves prior to the official release and avoid more potential dissatisfaction.

@connor
What I struggle to understand is:

  1. What problem did you actually identify and are working on fixing, and what can we expect from the release?
  2. Which platforms is the fix targeted at?

T

Tbh my biggest question is why 952 wasn’t rolled back two weeks ago while the fix was investigated. While I appreciate this forum represents the noisy users, I have a mate with a Nucleus who is having exactly the same problems I was. This release seems to have consistently busted things for a lot of people for whom 943 was working perfectly.

I now have the hotfix and it seems the big issue is resolved for me (although it’s still not as stable as 943 was for me, based on some other behaviour I’m experiencing), but I have cancelled my annual renewal, which was due next week, because I have real concerns about how updates are being managed. Roon is the best music experience I’ve found, but only when it’s working. I’ll likely switch to monthly and see how the next month goes, but I’m genuinely questioning the Roon support processes :frowning:

Database updates are a possible cause, which means that some of the features of 948 might not work.
This is speculation on my part, but a number of recent updates have spent time making database changes.
