Memory under pressure: your experiences with Roon garbage collection

Background

I’ve been a Roon customer for less than a year and I enjoy using it daily. It’s become clear to me that Roon is a sophisticated and complicated system that many users are shielded from by design. When issues have come up, the volunteer moderators and support team have been very responsive and helpful.

I’m sharing some technical observations with the Roon community to learn from your experiences with the following and how you deal with it. To be clear, I am not describing a memory leak; I am describing poorly performing garbage collection.

Situation

Interacting with Roon feels increasingly sluggish over time, and when it lags, music playback and streaming to audio devices stop intermittently. The following symptoms usually occur because network communication fails while the Roon server is frozen during garbage collection.

  • Music playback stops

  • Audio devices disappear from the Roon UI

    • After several seconds, devices are visible again and music can be resumed at the last played position
  • Syncing metadata to Roonlabs and Qobuz (e.g., adding an album to the library) fails intermittently or takes many seconds before the Roon UI shows that syncing succeeded

The workaround has been to restart Roon when I notice things slowing down.

Observations

Over the past couple of months, I’ve noticed that Roon feels snappier after updating to the latest version. I started paying close attention to the log files to determine if it was the result of software improvements (e.g., .NET v10) or the magical reboot/restart that fixes things 99% of the time.

Sifting through logs and trying to correlate events with what I experienced is difficult, so I now use Grafana to visualize events over time, and patterns are emerging.
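For anyone wanting to try the same approach, getting log events into a Grafana-friendly shape can be as simple as a regex pass over the server log. A minimal sketch follows; the [stats] line shape here is invented for illustration, and Roon’s actual log format differs, so adapt the regex to what you see in your own RoonServer log:

```python
import re

# Hypothetical [stats] line shape -- NOT Roon's real format; adjust the
# regex to match the lines in your own server log file.
STATS = re.compile(
    r"(?P<ts>\d{2}/\d{2} \d{2}:\d{2}:\d{2}).*\[stats\].*?"
    r"(?P<physical>\d+)mb Physical.*?(?P<managed>\d+)mb Managed"
)

def parse_stats(lines):
    """Yield (timestamp, physical_mb, managed_mb) rows for CSV/Grafana."""
    for line in lines:
        m = STATS.search(line)
        if m:
            yield m.group("ts"), int(m.group("physical")), int(m.group("managed"))

sample = [
    "04/10 05:00:01 Info: [stats] 5012mb Physical, 2048mb Managed",
    "04/10 05:00:16 Info: [stats] 5020mb Physical, 2050mb Managed",
    "04/10 05:00:17 Info: unrelated log line",
]
rows = list(parse_stats(sample))
```

From there a CSV export or a small exporter feeding Grafana is straightforward.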

  1. Physical allocated memory expands and contracts within an acceptable range, with no observed out-of-memory conditions.

    Snapshot of my Roon server memory and heap stats

    Top panel shows normal physical memory in green for 24 hour period

    Top panel shows normal unmanaged space in blue for 24 hour period

  2. Specific Roon objects move between managed and unmanaged memory in mirror-like fashion. For example, database flushes grow the heap by hundreds of kilobytes to megabytes while the unmanaged space shrinks by a similar amount.

    Effects of Leveldb flush on memory and GC in 4 hour period

    Correlated effects of Leveldb flush on memory and GC in 4 hour period

    Zoomed in view shows size of flushed db objects in 20 minute period

    An interesting event (though probably unrelated to the memory pressure) is LevelDB compaction at the file level, which happens throughout the day. These events are CPU intensive because compaction uses all the resources available at the time.

    Example of how long compaction can take

    Example of the size of data being created, deleted, organized, and compacted

  3. There are 2 periods during the day (one of them is scheduled library maintenance) with long running database flushes. After those events, the heap continues to grow, while unmanaged memory and overall physical memory can shrink.

    Top panel shows heap in yellow for 24 hour period

  4. This leads to more frequent garbage collection, and GC pauses take longer.

    Log summary grouped by day shows increasing GC pauses for 8 day period - note the max pause duration

    • Roon restarted on Mar 24
    • Roon remote will “lag” if it interacts with the server during a half second GC pause
  5. The percentage of time spent garbage collecting becomes noticeable when it approaches 5% for my setup.

    % of Roon’s runtime spent garbage collecting continues to increase

    Processor: 3.3 GHz Quad-Core Intel Core i5
    Memory: 32 GB 1600 MHz DDR3
    Disk: Samsung SSD 860 EVO 2TB
    OS: macOS Sequoia 15.7.5
    Roon: v2.63 (build 1644) early access
    Database size: 1.061 GB

Related discussions

Jun 2022
“The RAM usage on Roon OS grows relational to the library size being managed by Roon.”

Jun 2022
“Resolved a possible memory leak appearing during audio analysis. Resolved a possible memory leak in the connection management mechanism”

Mar 2023
“Memory management on modern operating systems is very complex, and the stats about memory consumption at these levels are often misunderstood because they are incomplete and not intuitive.”

Feb 2026
“Based on your library size the amount of physical RAM being used appears correct with no suggestion of a memory leak”

Mar 2026
“I generally keep a close eye on memory stats in the server logs to try and see where and when the nonsense starts (zones disappear, app hangs etc).“

Apr 2026
”The app takes a long time to respond to commands… Yes, rebooting helps, but the issue returns after some time.”

  • I’m linking the post from @Tom_Harskamp because he describes Roon slowness that disappears after reboots.
  • However, he also includes possibly unrelated issues that are discussed elsewhere as known issues, or issues that are specific to a person and their hardware/software.
    • “It take a long time to go to Tidal end load the home page of it.”
    • “When pick a play list hit play it can take on to 1 minute that the play list or album starts to play.”
3 Likes

With EA’s latest build of Roon, it looks like they’re starting to address this issue.
If you’re interested, you might want to check it out.

I think it’s already with EA, the .NET 10 switch was mentioned, and:

Thanks, I missed it. I’m feeling tired :sweat_smile:

1 Like

Saved me some work troubleshooting this! Nice one. Just wanted to chime in and say I have the same issue; my Roon server has run out of memory three times in the past two weeks, which never happened before. Between this, the dates being wrong with Qobuz content, and Roon randomly stopping playback, frustration is really building. Not to mention how slow Roon has become lately (I’ve tried restoring from backup and all that; it still turns sluggish after a day).

A number of us are also monitoring Roon using Grafana, and the leak discussed in the linked topics was resolved for Linux (I can’t comment on other OS) some months ago.

Indeed, Roon seems to be managing memory as expected … unless matters have changed in the past week.

However, I have observed a CPU core held at 100% utilization for no apparent reason a couple of times.

1 Like

@mjw you may be right. I’ve not been patient enough to leave Roon alone.

I could just let the .NET CLR do its thing and manage memory as Roon configured it to. But I just restart Roon before it becomes intolerable :slight_smile:

I see a pattern of growing managed memory space and I “feel” the effect of memory pressure on Roon. The dashed blue vertical lines below are when I’ve restarted Roon - the timeframe shown is from Mar 16 - Apr 1.

I notice that too. On my system, the timestamps in the Leveldb LOG file show compaction events, and they can align with high CPU utilization. It’s something to keep an eye on if high resource utilization is long lived.

1 Like

This isn’t what I am observing. As soon as Roon starts, one core is pegged at 100% continually, whether 24 hours or a week, until a restart.

However, this isn’t easily reproducible. I’ll share my Grafana charts when I return home.

1 Like

This sounds promising: “On April 20, 2026, we’ll release a new version of Roon that brings performance improvements and significantly reduced memory usage.”

Looking forward to learning what’s changing under the hood.

1 Like

The Early Access release notes for B1644 also mentioned that 1644 would be the start of a longer drive for memory usage improvement, and the B1643 release notes mentioned better performance partly through moving to .NET 10.

The EA release notes should keep you reasonably apprised.

By the way, I linked this thread to an EA feedback thread regarding memory usage: EarlyAccess: Roon 2.65 Build 1645 : RAM usage feedback
It’s best to post EA info/feedback/issues in Early Access to ensure that Roon Labs learns from it.

2 Likes

All of this looks really cool. I’ve started looking through it to see what we can learn, and I’m also setting this up in my own dev/test environment.

Thanks!

1 Like

Memory behavior has changed with Roon v2.65 build 1645 (early access). Has the dev team decided to manage objects in memory themselves rather than letting the .NET GC manage them, trading higher overall memory for better perceived performance? Or is it a defect?

Physical and unmanaged memory are larger compared to build 1644, while managed memory, % of runtime, and gc pause duration remain lower.

Scheduled maintenance at 5:00 with the regular/expected Leveldb flush and syncs

Physical memory increases as with previous build, but remains high with build 1645

.NET’s garbage collection doing its thing, but heap size is smaller than with build 1644

Unmanaged memory size remains higher than with build 1644 - in that build unmanaged space became smaller while managed remained larger

3 Likes

After 5 days of not using Roon, sitting idle Apr 6 - Apr 10, garbage collection pauses are still greater than 100 ms :frowning:

Memory Summary April 4 - April 14

  • Virtual: 47 - 49 GB

  • Physical: 3.3 - 5.5 GB

  • Managed: 1.99 - 2.4 GB

  • Unmanaged: decreased from 3.1 GB to 1.3 GB

Garbage Collection Analysis

GC pause percentage grew, but physical memory usage decreased from 5.5 GB to 3.3 GB over 10 days. The garbage collector is becoming less efficient, but it is working as expected by reclaiming large amounts of memory and preventing a memory leak.

  • Frequency: Roon logs [stats] every 15 seconds, so there is at least one garbage collection about every 15 seconds.

  • Pause duration: the last GC pause duration ranges from 15 ms to 72 ms.

    • Relatively short duration but they’re consistent.
    • Pauses > 100 ms are when people typically notice hiccups with Roon.
  • Percentage of runtime in pauses: ruh roh - percentage grows from 1.44% at the start of the logs to 2.17% by the end.
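The overhead figure reduces to a simple ratio. A minimal sketch, assuming the log exposes cumulative GC pause time and process uptime (both hypothetical inputs here, not fields Roon is known to report directly):

```python
def gc_pause_percent(total_pause_ms: float, uptime_ms: float) -> float:
    """Fraction of process runtime spent in stop-the-world GC pauses."""
    return 100.0 * total_pause_ms / uptime_ms

# ~2.17% means roughly 1.3 seconds frozen out of every minute of runtime.
pct = gc_pause_percent(total_pause_ms=1_302, uptime_ms=60_000)
```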

Cumulative overhead of garbage collection

Date Min (%) Max (%) Average (%) Median (%) Std Dev (%)
04/04 1.44 2.02 1.78 1.73 0.13
04/05 2.02 2.53 2.25 2.22 0.10
04/06 2.53 2.82 2.75 2.80 0.10
04/07 2.82 2.94 2.91 2.93 0.04
04/08 2.93 2.99 2.97 2.98 0.03
04/09 2.97 3.05 3.02 3.03 0.02
04/10 0.16 23.72 3.11 3.02 1.01
04/11 0.88 2.62 1.84 1.87 0.27
04/12 2.11 2.74 2.43 2.39 0.18
04/13 0.07 23.88 2.47 2.75 1.35
04/14 1.66 2.17 2.01 2.03 0.08
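The summary columns in the table can be reproduced from raw per-sample percentages with the standard library alone. The sample values below are made up for illustration, not the actual 04/04 data:

```python
import statistics

def daily_summary(samples):
    """Reduce one day's GC-overhead samples (%) to the table's summary columns."""
    return {
        "min": min(samples),
        "max": max(samples),
        "avg": round(statistics.mean(samples), 2),
        "median": round(statistics.median(samples), 2),
        "stdev": round(statistics.stdev(samples), 2),
    }

day = [1.44, 1.60, 1.73, 1.85, 2.02]  # invented samples for one day
summary = daily_summary(day)
```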

Notes

  • During steady-state operation (e.g., 4/7 to 4/9), the pause percentage increased at a low rate of ~0.1% per day.

  • On 4/10 and 4/13 the pause percentage jumps to ~23.8% - Roon spent nearly a quarter of its time frozen for memory management.

  • Drops in Min values (0.16% and 0.07%) were due to manual restarts.

Size Matters

In my case…

  • Tracks: 43.8k

  • Albums: 7.84k

  • Artists: 3.95k

  • Works: 17k

  • Performances: 24.6k

Yours is probably larger (see what I did there?).

Large Data Moves across the Heap

The rising GC pause percentage and high virtual memory usage are due to how Roon bridges LevelDB and SymSpell with the .NET runtime.

Elephant in Unmanaged Memory

LevelDB is a C++ key-value store; in Roon, it’s the database for our music library.

The memory pressure comes from moving data between unmanaged and managed memory.

  • MemTable flushes: LevelDB stores recent writes in an in-memory MemTable. When this fills up, it flushes the data to disk. During this process, Roon maps these native buffers into the .NET managed heap to process library logic. This creates the mirroring we see in screenshots below: unmanaged memory shrinks as the MemTable clears, but managed memory spikes as .NET objects are created to represent that data.

  • Allocation pressure: constant conversion of byte arrays from LevelDB into Track or Album objects in .NET generates high memory pressure. The Garbage Collector runs frequently to clean up these short-lived bridge objects.
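The effect of that short-lived bridge-object churn can be illustrated with CPython’s own generational collector. This is an analogy only (.NET’s GC differs in detail), and the record shape is invented:

```python
import gc

gc.collect()  # settle the heap before measuring
gen0_before = gc.get_stats()[0]["collections"]

# Simulate "bridge" churn: batches of short-lived record objects are built
# (as if deserialized from a native byte store), used once, and dropped.
for _ in range(100):
    batch = [{"track_id": i, "title": "t" * 16} for i in range(2_000)]
    total = sum(rec["track_id"] for rec in batch)  # touch, then let the batch die

# Young-generation collections triggered purely by allocation churn:
gen0_churn = gc.get_stats()[0]["collections"] - gen0_before
```

The allocations alone force the young generation to be scanned repeatedly, even though almost nothing survives, which is the same pattern the bullet above describes.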

Gorilla in Managed Memory

Roon uses the SymSpell library for very fast type-ahead searching and correcting typos. Unlike LevelDB, SymSpell is entirely in-memory and managed by .NET.

SymSpell pre-calculates every possible delete variation of a word. For a decent-sized music library, this dictionary becomes a huge graph of millions of small Strings, Lists, and hash table entries in the managed heap.

  • Mark-and-Sweep GC bottleneck: during a garbage collection, the Mark phase walks the entire tree of live objects to see what can be deleted. Because SymSpell keeps a huge, permanent dictionary of strings in memory, the GC has to examine millions of objects every time it runs. This is why the GC pause duration stays high even when Roon isn’t doing much – it must walk the entire graph to see what’s still in use.

  • Heap fragmentation: large search dictionaries and database buffers land in the Large Object Heap and create gaps in the small object heap. Over time, this fragmentation makes the GC work harder to find contiguous space, which is why the % of runtime in GC pauses increases over several days.
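SymSpell’s delete-only precomputation is easy to sketch from scratch (this is an illustration of the technique, not Roon’s or SymSpell’s actual code):

```python
def delete_variants(word: str, max_distance: int = 2) -> set[str]:
    """All strings reachable from `word` by deleting up to max_distance chars.

    This is the precomputation at the heart of SymSpell: every variant becomes
    a dictionary key, which is why the index balloons for large vocabularies.
    """
    variants: set[str] = set()
    frontier = {word}
    for _ in range(max_distance):
        # One more deletion applied to every string in the current frontier.
        frontier = {w[:i] + w[i + 1:] for w in frontier for i in range(len(w))}
        variants |= frontier
    return variants

# A 7-letter artist name spawns dozens of index entries at edit distance 2,
# and every dictionary word contributes its own set of variants.
n = len(delete_variants("nirvana", 2))
```

Multiply that by every artist, album, and track name in a library and the size of the managed string graph the GC must walk becomes clear.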

AI for Validation?

I’ve spent a couple weeks now learning what Roon’s doing with memory and I’ve shared my ignorance and questions here :wink:

A lot of what I’ve written has been hypothesis driven and assumptions made from reading Roon logs and memory profiling, but I’m still uncertain of what I’m seeing without knowing Roon’s internals.

So I fed some screenshots and CSV files to AI to see what it “thinks”. Take the following with a pinch of salt - I kinda hate its smug assuredness.

[Edit: 2026-04-17] I removed the AI generated content after reading about the Roon community’s stance on AI in the forums ( FAQ - Roon Labs Community ).

Timeline Analysis of creeping managed heap

Starburst diagram showing largest “rooted” objects in memory

My Takeaways

  • Roon’s memory pressures are real.
  • Roon gets slower due to increasingly longer garbage collection pauses.
  • Full garbage collection is expensive and pauses code execution in all threads for a relatively long time.
  • Threads wait to acquire a lock or for a request to finish, which occasionally causes network connections to fail, and Roon does not always recover gracefully (depending on the scenario).
  • Short term solution is to restart Roon.
  • Long term solution is in the hands of the Roon dev team and they’re making incremental improvements with each early access build.

I think this brings my tinkering to an end :slight_smile: I have more respect for what Roon is and how it does what it does compared to when I started this exercise. Don’t let AI’s negative tone detract from Roon’s many merits!

If you’ve made it this far, you deserve a cookie!

2 Likes

Still not sure if you get the right audience in Tinkering. Did you see the new EA update with the fix?

1 Like

I was scratching an itch - learning how Roon operates to distinguish between primary issues and ancillary issues (e.g., network, hardware, OS). I’ve got a better understanding of when to create Support posts and how to provide useful information in other categories like Early Access. I’m amazed that you and @mjw are seemingly everywhere, and by the amount of helpful information you share.

I did and am now running version 2.65 build 1648, but the fix is for a memory leak.

There’s still the need to improve Roon performance by reducing the frequency of garbage collection and length of pauses when full GC happens. Memory use will be high by design, but the efficiency and speed of data storage and retrieval in memory is not yet a solved problem.

The dev team is making incremental improvements with design choices made early on…

I believe they will soon get to a point where these memory discussions become irrelevant for most users.

I would like to see Roon’s data design evolve to better work with .NET’s CLR. At its core, Roon is a database engine.

Pros of current design:

  • Entity centric relationships - like a graph database to match how we think about music and musicians (a song has components, a band contains members).
  • Data layout is large but cache friendly - it’s contiguous, aligned, has predictable access patterns.
  • Single Instruction Multiple Data friendly - can leverage .NET’s SIMD accelerated types to operate on multiple data elements simultaneously across an array of data in a single step, effectively processing data in parallel.

Cons with current design:

  • Roon allocates data on hot paths and they become GC events.
  • Memory safety is ensured by .NET but GC pauses negate the benefits of fast data lookups.
  • Library search and spell checking builds an enormous data dictionary.
  • Lots of String and object references - needs marshaling/conversion when data is passed between managed and unmanaged code.

Separate from this topic, there is another design choice I’d like to see evolve.

  • Roon lacks ACID transactions (DB susceptible to corruption) with per-component multi version concurrency control (profiles/users changing library metadata simultaneously can leave data in an invalid state).

This technical stuff is interesting, but at the end of the day, I want to stop thinking about managing Roon. When it gets to the point of it “just works” for me and my household, I’ll be a happy camper. Currently I’m tech support for them when things go wrong.

4 Likes

See here for my feedback on memory usage - my charts aren’t as involved, but as I’m running ROCK with only logs to pass into ChatGPT, this is what I have.

and on the B1647 build, where the issue was fixed in B1648

This is what I am seeing since I rebuilt my server. Roon still doubles its memory usage over time. I also observed a single core pegged at 100% a couple of times and had to restart Roon Server. There is no apparent reason for this, and I wonder whether others experience it when performance drops off radically on a dedicated machine such as a NUC.

1 Like

I run on Windows and don’t see the accumulated memory consumption.
My server is a dedicated machine running Win 11 IoT LTSC presenting Roon Server and Minim 2 as well as SMB-shares to my network. At this time it has been running for almost six days constantly, and memory seems to be managed by the running services:


As you can deduce, my Roon Server is set to “scheduled maintenance,” where it goes a bit ballistic after 1 AM every night. But it also seems to release a bit of the accumulated memory.

Anyway, if I go back a few days, the average memory commit stays between 31 and 34%. I have been rebooting due to other experiments, so this data point might not be entirely representative.

1 Like

Thanks for sharing your updates @simon_pepper

You’re probably aware but just a friendly reminder that if you’re using an LLM provider that makes your data available to the public, consider sanitizing personal information in the logs before sharing. For example, search for ‘token’ and you’ll see:

  • Roon authentication token
  • Roon user id
  • Email address associated with your Roon account

There’s other information that you might want to keep private:

  • Your external IP address and internal device IPs
  • Information about the Roon host

The memory leak appears to be fixed, but slow performance due to memory pressure is still an ongoing issue.

If you’re a glutton for punishment, these flags turn on additional logging (beware that you’re opening a firehose of data) for further monitoring and investigation.

-perftimers            Enable performance timers
-instrumentationtrace  Retrieve and print trace from instrumentation
-dbtrace               Retrieve and print trace from music library database
-logfiles=<number>     Number of log files to preserve (default 20)

See https://help.roonlabs.com/portal/en/kb/articles/flags for instructions on how to enable them and the warnings. Personally, I would only enable perftimers temporarily because there’s too much data to wade through as an end user when the other flags are used.

1 Like

@mjw It’s great that you can see the CPU and memory trend over a long time period. Raises several questions in my mind, but feel free to ignore them…

CPU

Do you start the Docker image with interactive mode flags? If so, does switching to detached mode reduce CPU spikes?

Is Roon the only image running on the host? If not, could other images affect CPU utilization?

Would adding more CPU cores or constraining the Docker container’s CPU cycles help?

When there’s a CPU spike, do your Roon logs show thread count - concurrent processes that could be overwhelming the CPU?

Is there a correlation between high CPU and high IO (e.g., disk reads/writes)?

Memory

Can you see what Roon’s managed and unmanaged memory looks like and how they affect overall physical memory usage?