RoonServer High CPU and RAM Usage on Ubuntu 22.04 / Roon Remote Slow / No Response

Does the other environment variable stay or go?

Anyhow, I've added it and rebooted.

cat /etc/environment
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
DOTNET_GCRetainVM=0
DOTNET_gcServer=0
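
As a sanity check (a sketch, assuming the server process is named RoonServer), I can confirm the running process actually sees the variables after the reboot by reading its environment from /proc:

# Show any DOTNET_* variables the running RoonServer process inherited (sudo in case the server runs as root)
sudo cat /proc/"$(pidof -s RoonServer)"/environ | tr '\0' '\n' | grep DOTNET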

I'm running the latest supported kernel for Ubuntu 22.04.3, shown below.

System:
  Host: librenms Kernel: 5.15.0-83-generic x86_64 bits: 64 Console: pty pts/0
    Distro: Ubuntu 22.04.3 LTS (Jammy Jellyfish)
Machine:
  Type: Desktop System: ASUS product: All Series v: N/A serial: <superuser required>
  Mobo: ASUSTeK model: RAMPAGE V EXTREME v: Rev 1.xx serial: <superuser required>
    UEFI: American Megatrends v: 3701 date: 03/31/2017
CPU:
  Info: 6-core Intel Core i7-5930K [MT MCP] speed (MHz): avg: 2679 min/max: 1200/3700
Graphics:
  Device-1: NVIDIA GM204 [GeForce GTX 980] driver: nvidia v: 535.86.05
  Device-2: NVIDIA GM204 [GeForce GTX 980] driver: nvidia v: 535.86.05
  Display: server: X.org v: 1.21.1.4 with: Xwayland v: 22.1.1 driver: X: loaded: nvidia
    gpu: nvidia,nvidia tty: 189x34
  Message: GL data unavailable in console. Try -G --display
Network:
  Device-1: Intel Ethernet I218-V driver: e1000e
  Device-2: Broadcom BCM4360 802.11ac Wireless Network Adapter driver: wl
Drives:
  Local Storage: total: 2.16 TiB used: 1.37 TiB (63.7%)
Info:
  Processes: 548 Uptime: 11m Memory: 31.25 GiB used: 5.84 GiB (18.7%) Init: systemd runlevel: 5
  Shell: fish inxi: 3.3.13

After a few hours of playing music with Roon from a local FLAC album, I can advise that the memory leak issue is worse with the variables in place.

I have backed out the change and rebooted.

PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
#DOTNET_GCRetainVM=0
#DOTNET_gcServer=0

can you try with:

DOTNET_GCRetainVM=1
DOTNET_gcServer=0

?

I think you got this right already, but I want to try to answer anyway.

When making changes using /etc/environment, you'll need to reboot, then attempt to use Roon "normally", and see if physical memory usage seems to grow to consume all available memory on the machine.
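
If it helps, here is a minimal sketch for logging the server's resident memory once a minute from a shell, assuming the main process is named RoonServer (adjust the pidof target if your install differs):

# Append a timestamped RSS reading (in MiB) for the RoonServer process every 60 seconds
while true; do
  printf '%s %s MiB\n' "$(date '+%F %T')" "$(( $(ps -o rss= -p "$(pidof -s RoonServer)") / 1024 ))" >> ~/roon_rss.log
  sleep 60
done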

Hi @ben, during the last three days I was unfortunately cut off from the Internet and suffered intermittent blackouts. These problems seem to have been addressed, and since yesterday afternoon all is back to normal again…

So I unset the previous environment variable DOTNET_GCRetainVM and set the new suggestion DOTNET_gcServer=0, rebooted, and started to 'stress' Roon by playing back music, adding new Qobuz favorites to my library, and removing Tidal favorites.

The server has now been running for 18 hours, and these are my observations so far:

  1. After startup, the system initially stabilizes at a RES memory size of 7200M.

  2. After playing music for some minutes and doing nothing else, memory usage drops to about 5800M.

  3. After adding/deleting a few albums some time later, the Tidal/Qobuz storage library update process is triggered and RES memory size goes up.

During this process, two things stood out: first, the reported load grew to slightly over 2.0, whereas before it stayed around or slightly above 1.0. Second, and quite unexpectedly, user interaction on my two remotes (Android phone and iMac) remained seemingly unaffected. I noticed a little resistance when scrolling, but page loads and interaction with buttons (start/stop play etc.) felt snappy.

  4. I listened to music until late at night; then the system stayed idle and performed the daily database backup at 3 AM. Here is the screenshot from first thing this morning; RES memory size has dropped again:

  5. As it is release Friday, I took the opportunity and added some more new albums (Qobuz). The Qobuz storage library update process was triggered again and you can see how the load is around 2.0. RES memory goes up during this process:

  6. The storage update process ran very fast and at the end the reported RES memory size was about 8400M. Now, about three hours later and after having added some more albums, the memory size stays the same.

So, my first impression with this recent change is positive. I don't observe any short-term memory usage increases other than during the storage library updates. Even so, the user interface interaction this time stayed snappy. If the system keeps performing like this even after several days of runtime, I'd say all is well!

I hope my Internet access and power stay up over the next few days so I can keep testing the performance and memory usage. I'll report back after the weekend.

I will try with /etc/environment set like that.

This looks good to me too. Thanks for the help, I really appreciate the detailed reports and being able to do something to test this without waiting for a full release cycle.

I'd really appreciate it, thank you :slight_smile:

Hopefully this will let us resolve this for all RoonServer users soon.

Hi @ben, here's my update after the weekend…

First… regrettably, last Friday afternoon I had another environmental event which forced my Roon server to shut off. It has been running since then, though, and I am nearing the 72-hour mark:

  1. After playing music Friday night, on Saturday morning the reported RES memory size had dropped, as it does after running the database backup at 3 AM…

  2. Again, after adding a couple of albums to the library, the Qobuz storage library update process was triggered, and the RES memory size immediately went up:

  3. During the rest of Saturday, after the storage library update process had finished, metadata updates kicked in, and that process is also very demanding on memory. It ran for several hours. This is from 3:10 pm:

  4. Sunday morning, the reported physical memory size had again dropped. See the log excerpts from shortly before and after the 3 AM database backup. The metadata update process of the previous day had made the reported file handle count go up sharply, and it stayed up until the database backup ran:

  5. Yesterday afternoon, once again a metadata update process was triggered, at approx. 16:12. See the immediate effect on physical memory usage and file handles:

The process terminated in less than 14 minutes, and afterwards the physical memory size and reported file handles stayed up for the rest of the afternoon and the night:

  6. This is the log shortly before and after the database backup at 3 AM today, Monday:

You'll notice that the physical memory size doesn't go back to the approx. 5900M of the day before… At this moment, after playing some albums this morning and without any metadata updates or storage library updates today, it sits at about 6200M, with a tendency to decrease little by little.

Observations:

  1. At no point was music playback affected by the running processes. The user interface remained reasonably snappy.

  2. The garbage collector seems to behave better with DOTNET_gcServer=0. I could observe small increases in memory usage during music playback and while browsing the user interface, but memory would come back down after some minutes.

  3. I have never experienced sudden increases in physical memory usage other than from the storage library update and metadata update processes. The problem is that, in my experience, this memory is never released again by anything other than a database backup, and even then only during the first few days after server startup. After that, the reported physical memory usage stays high and the server's performance worsens until a restart is necessary.

  4. During these 72 hours, the maximum reported physical memory usage was about 10,200M, during a metadata update.

  5. I don't think that tuning the GC behavior alone will resolve the widely reported performance problems for users with larger databases. It doesn't seem right that the storage library update and metadata update processes make physical memory usage go up sharply without ever releasing that memory after the process terminates. For a long time I have observed that a database backup brings physical memory and file handle numbers down, but only for the first two or three days after server startup; after that, it no longer does. This effect will not be noticed with smaller databases, say below 70,000 tracks or so, but as a database grows beyond that size, the performance impact of the storage library update and metadata update processes is felt more and more.

I'll keep observing the server behavior and report back with any observations in a couple of days.

I think I generally agree with what you're saying here. I'm hoping that this setting change will reduce the tendency for physical memory to not decrease when managed memory decreases, so that there will be less of a "ratchet" effect where physical memory usage only ever increases.

I agree, so I'll keep working on other fixes in addition to this sort of change. I'm glad this seems like an improvement, regardless of how much work we may still have to do :slight_smile:


It seems to do so; it looks quite positive to me:

andreas@symphony:~$ tail -f /var/roon/RoonServer/Logs/RoonServer_log.txt | grep Physical
09/18 15:22:11 Info: [stats] 25720mb Virtual, 6164mb Physical, 4079mb Managed, 351 Handles, 91 Threads
09/18 15:22:26 Info: [stats] 25720mb Virtual, 6164mb Physical, 4085mb Managed, 351 Handles, 87 Threads
09/18 15:22:41 Info: [stats] 25720mb Virtual, 6164mb Physical, 4076mb Managed, 351 Handles, 91 Threads

Roon has been playing music for several hours; compare this recent log with the one from 03:05 am, after the last database backup!
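
Since those [stats] lines have a fixed layout, a one-liner along these lines (a sketch, assuming the log format shown above) pulls out just the timestamp and the physical memory figure for comparison over time:

# Print date, time and physical memory (MB) from every [stats] line
grep Physical /var/roon/RoonServer/Logs/RoonServer_log.txt | awk '{ gsub(/mb/, "", $7); print $1, $2, $7 " MB Physical" }'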

Do you have an understanding of why the storage library update processes for Qobuz and Tidal run ever more slowly after several days of server uptime?

I can estimate the rate at which these processes run by the rate of new entries written to the log file, and I'd say that on a newly started server the process runs at least two orders of magnitude faster than the same process after some days of server uptime…
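
One rough way to quantify that rate (a sketch, again assuming the timestamp format in the log excerpts above) is to count log lines per minute:

# Count log entries per minute (date + HH:MM) and show the 20 most recent minutes
awk '{ print $1, substr($2, 1, 5) }' /var/roon/RoonServer/Logs/RoonServer_log.txt | sort | uniq -c | tail -n 20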

Having worked on memory managers and GC for a significant (past) part of my career, I can say there are many ways for subtle issues (such as certain kinds of fragmentation due to the interaction between an application's memory manager and a library chunk allocator) to cause allocation time to grow with the number of allocations.

Well, whatever the subtle underpinnings of these allocation processes, of which I know nothing, the observed performance impact is massive. It is this that eventually forces the user to restart the Roon server, as every process and user interaction feels like swimming in molasses…

That's why @ben wrote "I'll keep working on other fixes" :slight_smile:

Ok, reporting back.

Unfortunately this does not solve the memory leak. I was using Roon normally this afternoon while writing a document; that's the memory spike at the end of the graph. If I had kept going, it would have consumed all 32 GB of RAM and then the swap space until that was consumed as well.

I have since moved my RoonServer to a much smaller-capacity, SSD-modded Mac Mini (2017 model) running Fedora Workstation 38. Roon runs without an issue.

I guess RoonServer just doesn't like Ubuntu.

This topic was automatically closed 36 hours after the last reply. New replies are no longer allowed.