Roon on Debian Linux Problem

When I was troubleshooting this while the issue was happening, I also didn’t see anything in those logs, which is why I stopped checking them. How do you propose we move forward? To summarize, when the issue starts:

  1. The roon service is still running, but with a state of “not responding”, as shown in some of the log snippets in my previous posts.
  2. While the service is in this state, it consumes a lot of CPU and RAM until the OOM killer kills it.
  3. Reinstalling roon without deleting /var/roon does not solve the issue.
  4. Reinstalling roon after deleting /var/roon and starting fresh (no database restore) works temporarily until it starts happening again after around 1.5 days of uptime.
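
For reference, the “not responding” state and the OOM kill in step 2 can both be confirmed from the command line. A minimal sketch (the unit name `roonserver` is an assumption here; match it to the service file actually installed):

```shell
# Check the unit's state and look for OOM-killer activity (run as root):
#   systemctl status roonserver.service
#   journalctl -k | grep -iE 'out of memory|oom-kill'
# The kernel line to look for resembles this sample:
printf '%s\n' 'Out of memory: Killed process 1234 (RoonServer) total-vm:30000000kB' \
  | grep -iE 'out of memory|oom-kill'
```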

I need the newest logs from when the problem occurs. The logs you sent show me a functioning RoonServer.

Ok. You posted a little bit too late since the issue happened 30 minutes before your first post here. Now that I applied the workaround (reinstall plus delete /var/roon and restore from backup), I will need to wait for approximately 1.5 days for the issue to happen again. I’ll wait and make sure it comes to a point where OOM killer kills the service and I will get you the updated Logs folder again. While I’m at it, do you need any other set of logs for when the issue happens again?

/var/log/messages or journalctl is good.
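
Both can be captured to files in one go for upload; a sketch assuming the Debian default paths:

```shell
# Run as root; -u limits output to the roonserver unit, -b to the current boot:
#   cp /var/log/messages ~/messages-$(date +%F).txt
#   journalctl -u roonserver.service -b > ~/journalctl-$(date +%F).txt
# The date stamp used in the filenames expands to YYYY-MM-DD:
date +%F
```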

Ok. Right now, one endpoint is streaming off of the Core and it’s taking a lot of RAM. Is this normal?
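
One way to answer that is to watch the server process’s resident memory directly over time. A sketch (the process name `RoonServer` is an assumption; verify it first with `ps aux | grep -i roon`):

```shell
# One-off snapshot of the process's memory use:
#   ps -C RoonServer -o pid,rss,vsz,%mem,cmd
# Log total RSS once a minute to catch a slow leak over days:
#   while sleep 60; do
#     echo "$(date '+%F %T') $(ps -C RoonServer -o rss= | awk '{s+=$1} END {print s " kB"}')"
#   done >> /tmp/roon-rss.log
# The awk summation, demonstrated on sample ps output (RSS values are in kB):
printf '1024\n2048\n' | awk '{s+=$1} END {print s " kB"}'   # -> 3072 kB
```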

This might be related to this: Memory leak in Roon?! [See Staff Post] - #70 by Bill_Janssen

The person in the last post is saying his was using 48GB of RAM! That is insane.

@danny

Here are my roon logs so far:

The service hasn’t stopped yet, but the RAM usage is messed up. It got to a point where it was at 90% (of 32GB) even without any endpoint streaming. I see tons of Tidal errors in there, but I’m not sure if that’s what’s causing it.

@danny ok, the OOM killer just kicked in and killed roon. And as expected, the service won’t start anymore (“not responding” message). Here is the fresh set of logs:

/var/log/messages: Dropbox - messages
journalctl -u roonserver.service -b: Dropbox - journalctl.txt

These logs should show all events from the time I reported the high RAM usage a few hours ago up to the point where the OOM killer kicked in. The physical RAM usage is very evident in the Roon logs and hovers at around 28GB.

Hi @Kevin_Mychal_Ong ,

Your var-log is showing quite a few hard disk errors; perhaps this is contributing to the memory leak:

Apr 10 03:14:35 nuc kernel: [738485.103974] EXT4-fs (sda1): error count since last fsck: 138
Apr 10 03:14:35 nuc kernel: [738485.103982] EXT4-fs (sda1): initial error at time 1629819004: htree_dirblock_to_tree:1003: inode 70516840
Apr 10 03:14:35 nuc kernel: [738485.103988] EXT4-fs (sda1): last error at time 1647358208: ext4_empty_dir:3005: inode 95815045

You may want to run a disk check, reinstall the OS and/or use a different hard drive.
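
For the disk check, a minimal triage sketch (assumes `smartmontools` and `e2fsprogs` are installed, and the filesystem must be unmounted before fsck):

```shell
# SMART health summary and the filesystem check itself:
#   smartctl -H /dev/sda
#   umount /mnt/storage
#   fsck -f -y /dev/sda1
# The EXT4 error summaries can be pulled out of the syslog with grep;
# demonstrated on a sample line so the pipeline is self-contained:
printf '%s\n' 'Apr 10 03:14:35 nuc kernel: EXT4-fs (sda1): error count since last fsck: 138' \
  | grep -E 'EXT4-fs \(sda1\): (error count|initial error|last error)'
```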

Is it pretty common for hard disk errors to cause memory leaks? Shouldn’t they be affecting other things on my server too, if that is the case?

@noris

Also, sda1 is the partition on my storage SSD where my local music files are. It’s not the OS disk (nvme drive) so I don’t think reinstalling the OS will do anything to the errors.

root@nuc:~# df -h
Filesystem                        Size  Used Avail Use% Mounted on
udev                               16G     0   16G   0% /dev
tmpfs                             3.2G  4.4M  3.2G   1% /run
/dev/nvme0n1p2                     23G  5.3G   17G  25% /
tmpfs                              16G  4.0K   16G   1% /dev/shm
tmpfs                             5.0M     0  5.0M   0% /run/lock
/dev/sda1                         1.8T  999G  742G  58% /mnt/storage
/dev/nvme0n1p5                    1.9G  6.4M  1.7G   1% /tmp
/dev/nvme0n1p3                    9.2G  3.3G  5.4G  38% /var
/dev/nvme0n1p6                    193G   26G  158G  14% /home
/dev/nvme0n1p1                    511M   18M  494M   4% /boot/efi
overlay                           193G   26G  158G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/7e4b7bc557d6dbebe65848a8e647c81480fa8607e8f81e7e127051b37004096a/merged
overlay                           193G   26G  158G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/b99eec5fc2214896f6f23a788607149f579d217c15c6c11e996677c031313c2f/merged
overlay                           193G   26G  158G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/505205bf611016752431128729b71ffbc2f13139f53f7ddcb7ed6b77befd7f3a/merged
overlay                           193G   26G  158G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/25f374918223be019b25f35f21a053d8b4b455ddaada0259437c215bd85c78cc/merged
overlay                           193G   26G  158G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/6b47ff466d400750ad8cd9dffb86e415426a3f6b2cfc81e43befdf4fbb744304/merged
overlay                           193G   26G  158G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/dfd78587070db6673a0b6c32e6cb38cc2b3a00ffd8cb272252f473a7cb96730e/merged
overlay                           193G   26G  158G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/bc93535b62129f20b981d88af9990350a47ba40e9fc6a6f5702f4e209f1b2871/merged
overlay                           193G   26G  158G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/8c75355d40afdac24635870bc92fae92927b02e020718c1ec63887ea85fba062/merged
overlay                           193G   26G  158G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/71c1393dd3d2b40e72bc5c6081d1123c9635d2fcd012d58b39e7727f8f38d80b/merged
overlay                           193G   26G  158G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/8eb664d4fcce3c4afe32e302a13763884dbc170369c3bb8d14b0a21dfe26c55c/merged
overlay                           193G   26G  158G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/c9ce4eb7c674ebc8fcfae7f629c8c8097d3600603668d7655d0b2e6834343cbd/merged
overlay                           193G   26G  158G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/ca999edab1cd3beb0bbd53795f02e6c768ee17108f5831ce79c978f20df0f178/merged
overlay                           193G   26G  158G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/73034525e50a1794c6ea849b0a1e595cca9ceaf592243c2f969025eb39a0653a/merged
overlay                           193G   26G  158G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/97a5e015299c33a497f3d4649333ea0981abb49f7f60b2646b879c894f15273b/merged
overlay                           193G   26G  158G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/a0b81d42d80ede5627b0f29f9352c9664ae8e3a33791269903188060a9ff2a85/merged
tmpfs                             3.2G     0  3.2G   0% /run/user/1000
synology.home.arpa:/volume1/data  104T   65T   40T  63% /mnt/data
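
The device-to-mount mapping in the df output above can be double-checked with lsblk, which also shows which physical disk each partition lives on:

```shell
# Show every block device, its size, and where it is mounted:
#   lsblk -o NAME,SIZE,TYPE,MOUNTPOINT
# Extracting the device -> mountpoint pairs from df-style output:
printf '%s\n' '/dev/sda1  1.8T  999G  742G  58% /mnt/storage' \
  | awk '{print $1 " -> " $NF}'   # -> /dev/sda1 -> /mnt/storage
```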

In my experience, once you start seeing errors somewhere, they snowball.

If the base OS isn’t stable, nothing that depends on it can be expected to behave predictably.

I understand that. But we’re not seeing OS errors here, are we?

I’ll go back to reading along.

If the sda drive only holds music files, I’d simply disconnect it and see if Roon starts up and runs without memory problems. That Linux is reporting problems with sda seems clear; what effect that has on the workings of Roon remains to be tested. In any case, sda should be examined and probably replaced.

I wasn’t trying to be offensive. I just want to base my next course of action on facts. If any of the logs point me to the OS being corrupted or anything, then I’m all for fixing that. But if the logs are pointing to the storage disk (which only holds my local music files) having errors, then reinstalling a whole OS won’t really do anything to solve those errors.

Yes, that makes sense and this will be my next course of action.

EDIT: I already ran fsck against /dev/sda1 and it fixed a couple of corrupted directories. I’m guessing this is because there are lots of albums in there with Chinese titles. But anyway, I have it clean now:

root@nuc:~# fsck -y /dev/sda1
fsck from util-linux 2.36.1
e2fsck 1.46.2 (28-Feb-2021)
/dev/sda1: clean, 1113099/122101760 files, 270035392/488378385 blocks
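
Worth noting: when e2fsck reports “clean” it has skipped the full structure pass, so a forced check may be more conclusive. A sketch (the filesystem must be unmounted first; `tune2fs` is part of e2fsprogs):

```shell
# Force a full check even if the filesystem is marked clean:
#   umount /mnt/storage
#   e2fsck -f -y /dev/sda1
# The superblock's recorded error counters can be inspected with:
#   tune2fs -l /dev/sda1 | grep -i error
# Sample of the lines of interest in that output (field name assumed):
printf 'Filesystem state:         clean\nFS Error count:           138\n' \
  | grep -i 'error count'
```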

I’m keeping it unmounted for now to see how the Roon Core behaves without the drive.

I’m not seeing any changes to the RAM usage after I unmounted the /dev/sda1 partition from the system. I’ll post another set of logs when the OOM killer kicks in to kill the roonserver service.

root@nuc:~# df -h
Filesystem                        Size  Used Avail Use% Mounted on
udev                               16G     0   16G   0% /dev
tmpfs                             3.2G  4.1M  3.2G   1% /run
/dev/nvme0n1p2                     23G  7.0G   15G  33% /
tmpfs                              16G  4.0K   16G   1% /dev/shm
tmpfs                             5.0M     0  5.0M   0% /run/lock
/dev/nvme0n1p3                    9.2G  4.1G  4.6G  48% /var
/dev/nvme0n1p5                    1.9G  6.3M  1.7G   1% /tmp
/dev/nvme0n1p6                    193G   25G  159G  14% /home
/dev/nvme0n1p1                    511M   18M  494M   4% /boot/efi
synology.home.arpa:/volume1/data  104T   65T   40T  63% /mnt/data
overlay                           193G   25G  159G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/25f374918223be019b25f35f21a053d8b4b455ddaada0259437c215bd85c78cc/merged
overlay                           193G   25G  159G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/c9ce4eb7c674ebc8fcfae7f629c8c8097d3600603668d7655d0b2e6834343cbd/merged
overlay                           193G   25G  159G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/8eb664d4fcce3c4afe32e302a13763884dbc170369c3bb8d14b0a21dfe26c55c/merged
overlay                           193G   25G  159G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/6b47ff466d400750ad8cd9dffb86e415426a3f6b2cfc81e43befdf4fbb744304/merged
overlay                           193G   25G  159G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/505205bf611016752431128729b71ffbc2f13139f53f7ddcb7ed6b77befd7f3a/merged
overlay                           193G   25G  159G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/b99eec5fc2214896f6f23a788607149f579d217c15c6c11e996677c031313c2f/merged
overlay                           193G   25G  159G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/dfd78587070db6673a0b6c32e6cb38cc2b3a00ffd8cb272252f473a7cb96730e/merged
overlay                           193G   25G  159G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/ca999edab1cd3beb0bbd53795f02e6c768ee17108f5831ce79c978f20df0f178/merged
overlay                           193G   25G  159G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/73034525e50a1794c6ea849b0a1e595cca9ceaf592243c2f969025eb39a0653a/merged
overlay                           193G   25G  159G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/a0b81d42d80ede5627b0f29f9352c9664ae8e3a33791269903188060a9ff2a85/merged
overlay                           193G   25G  159G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/35b60600b6e354127b36965bb2930545fc4356cd7d9e464713fd7122b9a9250e/merged
overlay                           193G   25G  159G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/71c1393dd3d2b40e72bc5c6081d1123c9635d2fcd012d58b39e7727f8f38d80b/merged
overlay                           193G   25G  159G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/97a5e015299c33a497f3d4649333ea0981abb49f7f60b2646b879c894f15273b/merged
overlay                           193G   25G  159G  14% /home/kevindd992002/docker/var-lib-docker/overlay2/8c75355d40afdac24635870bc92fae92927b02e020718c1ec63887ea85fba062/merged
tmpfs                             3.2G     0  3.2G   0% /run/user/1000

@danny @noris The OOM killer just killed the process a few hours ago and here’s a new set of logs:

Again, sda1 is unmounted here so you should not see any errors related to it. The last error I saw in /var/log/messages for /dev/sda1 was yesterday before I unmounted and ran fsck against it:

Apr 14 21:20:18 nuc kernel: [184845.834857] EXT4-fs (sda1): last error at time 1647358208: ext4_empty_dir:3005: inode 95815045

If you grep “nvme” (my OS disk), you shouldn’t see any errors either:

root@nuc:~# cat /var/log/messages | grep nvme
Apr 11 04:57:11 nuc kernel: [    2.021043] nvme nvme0: pci function 0000:3a:00.0
Apr 11 04:57:11 nuc kernel: [    2.030490] nvme nvme0: 12/0/0 default/read/poll queues
Apr 11 04:57:11 nuc kernel: [    2.033344]  nvme0n1: p1 p2 p3 p4 p5 p6
Apr 11 04:57:11 nuc kernel: [    3.604254] EXT4-fs (nvme0n1p2): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 11 04:57:11 nuc kernel: [    3.905338] EXT4-fs (nvme0n1p2): re-mounted. Opts: errors=remount-ro. Quota mode: none.
Apr 11 04:57:11 nuc kernel: [    4.065875] Adding 1000444k swap on /dev/nvme0n1p4.  Priority:-2 extents:1 across:1000444k SSFS
Apr 11 04:57:11 nuc kernel: [    4.107595] EXT4-fs (nvme0n1p5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 11 04:57:11 nuc kernel: [    4.123826] EXT4-fs (nvme0n1p3): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 11 04:57:11 nuc kernel: [    4.216924] EXT4-fs (nvme0n1p6): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 11 20:02:49 nuc kernel: [    1.397951] nvme nvme0: pci function 0000:3a:00.0
Apr 11 20:02:49 nuc kernel: [    1.412712] nvme nvme0: 12/0/0 default/read/poll queues
Apr 11 20:02:49 nuc kernel: [    1.415762]  nvme0n1: p1 p2 p3 p4 p5 p6
Apr 11 20:02:49 nuc kernel: [    2.971730] EXT4-fs (nvme0n1p2): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 11 20:02:49 nuc kernel: [    3.250802] EXT4-fs (nvme0n1p2): re-mounted. Opts: errors=remount-ro. Quota mode: none.
Apr 11 20:02:49 nuc kernel: [    3.406679] Adding 1000444k swap on /dev/nvme0n1p4.  Priority:-2 extents:1 across:1000444k SSFS
Apr 11 20:02:49 nuc kernel: [    3.439309] EXT4-fs (nvme0n1p5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 11 20:02:49 nuc kernel: [    3.440938] EXT4-fs (nvme0n1p3): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 11 20:02:49 nuc kernel: [    3.441442] EXT4-fs (nvme0n1p6): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 12 17:59:32 nuc kernel: [    1.396130] nvme nvme0: pci function 0000:3a:00.0
Apr 12 17:59:32 nuc kernel: [    1.412406] nvme nvme0: 12/0/0 default/read/poll queues
Apr 12 17:59:32 nuc kernel: [    1.415251]  nvme0n1: p1 p2 p3 p4 p5 p6
Apr 12 17:59:32 nuc kernel: [    2.955983] EXT4-fs (nvme0n1p2): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 12 17:59:32 nuc kernel: [    3.304213] EXT4-fs (nvme0n1p2): re-mounted. Opts: errors=remount-ro. Quota mode: none.
Apr 12 17:59:32 nuc kernel: [    3.471838] Adding 1000444k swap on /dev/nvme0n1p4.  Priority:-2 extents:1 across:1000444k SSFS
Apr 12 17:59:32 nuc kernel: [    3.496214] EXT4-fs (nvme0n1p5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 12 17:59:32 nuc kernel: [    3.496695] EXT4-fs (nvme0n1p3): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 12 17:59:32 nuc kernel: [    3.499688] EXT4-fs (nvme0n1p6): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 15 01:52:55 nuc kernel: [    1.437639] nvme nvme0: pci function 0000:3a:00.0
Apr 15 01:52:55 nuc kernel: [    1.453691] nvme nvme0: 12/0/0 default/read/poll queues
Apr 15 01:52:55 nuc kernel: [    1.456548]  nvme0n1: p1 p2 p3 p4 p5 p6
Apr 15 01:52:55 nuc kernel: [    3.012376] EXT4-fs (nvme0n1p2): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 15 01:52:55 nuc kernel: [    3.286245] EXT4-fs (nvme0n1p2): re-mounted. Opts: errors=remount-ro. Quota mode: none.
Apr 15 01:52:55 nuc kernel: [    3.427827] Adding 1000444k swap on /dev/nvme0n1p4.  Priority:-2 extents:1 across:1000444k SSFS
Apr 15 01:52:55 nuc kernel: [    3.466271] EXT4-fs (nvme0n1p5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 15 01:52:55 nuc kernel: [    3.468530] EXT4-fs (nvme0n1p3): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 15 01:52:55 nuc kernel: [    3.470367] EXT4-fs (nvme0n1p6): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 15 02:14:21 nuc kernel: [    1.382223] nvme nvme0: pci function 0000:3a:00.0
Apr 15 02:14:21 nuc kernel: [    1.397743] nvme nvme0: 12/0/0 default/read/poll queues
Apr 15 02:14:21 nuc kernel: [    1.400574]  nvme0n1: p1 p2 p3 p4 p5 p6
Apr 15 02:14:21 nuc kernel: [    2.967067] EXT4-fs (nvme0n1p2): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 15 02:14:21 nuc kernel: [    3.296254] EXT4-fs (nvme0n1p2): re-mounted. Opts: errors=remount-ro. Quota mode: none.
Apr 15 02:14:21 nuc kernel: [    3.452466] Adding 1000444k swap on /dev/nvme0n1p4.  Priority:-2 extents:1 across:1000444k SSFS
Apr 15 02:14:21 nuc kernel: [    3.478669] EXT4-fs (nvme0n1p3): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 15 02:14:21 nuc kernel: [    3.480230] EXT4-fs (nvme0n1p5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 15 02:14:21 nuc kernel: [    3.480586] EXT4-fs (nvme0n1p6): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.

Is this info enough for us to say that this isn’t an OS issue?

Hi @Kevin_Mychal_Ong ,

Thanks for the further checks here.

It is possible that something about that drive impacted the Roon database. Can you please confirm: if you set up a fresh database and hold off on importing content from the sda1 drive, does Roon still have the OOM issue?

Please confirm with a small library of content not on that drive, and then, after confirming, try to import content from the sda1 drive and see whether the system is still stable or the OOMs start then.