Roon Server became unstable

Core Machine (Operating system/System info/Roon build number)

Ubuntu 18.04.3 running on a 24-core machine with 32 GiB of RAM.
The latest version of Roon Server as of November 18, 2019.
Roon server 1.6 (build 416)

Network Details (Including networking gear model/manufacturer and if on WiFi/Ethernet)

The server is on a wired 10 Gb network behind a Lenovo enterprise switch.
Clients are on both wireless and wired connections; the same issue persists on both.

Audio Devices (Specify what device you’re using and its connection type - USB/HDMI/etc.)

N/A - server is unusable

Description Of Issue

“Lost Connection …” appears most of the time. Even when I manage to reach the initial screen, the connection is lost almost instantly. The issue seems to have started around November 11. The Ubuntu server where roon-server runs is updated regularly, and nothing else seems to be broken. The service is running, but I see errors in the systemctl status output:

root@roon-server:/var/roon/RoonServer# systemctl status roonserver.service
● roonserver.service - RoonServer
   Loaded: loaded (/etc/systemd/system/roonserver.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2019-11-19 03:33:33 UTC; 45min ago
 Main PID: 13137 (start.sh)
    Tasks: 15 (limit: 1014)
   CGroup: /system.slice/roonserver.service
       ├─ 2733 /opt/RoonServer/RoonMono/bin/RoonAppliance --debug --gc=sgen --server RoonAppliance.exe -watchdogport=44003
       ├─ 2734 /opt/RoonServer/Server/processreaper 2733
       ├─13137 /bin/bash /opt/RoonServer/start.sh
       └─13150 /opt/RoonServer/RoonMono/bin/RoonServer --debug --gc=sgen --server RoonServer.exe

Nov 19 04:18:30 roon-server.ian.red start.sh[13137]: Running
Nov 19 04:18:41 roon-server.ian.red start.sh[13137]: Error
Nov 19 04:18:43 roon-server.ian.red start.sh[13137]: Initializing
Nov 19 04:18:43 roon-server.ian.red start.sh[13137]: Started
Nov 19 04:18:44 roon-server.ian.red start.sh[13137]: aac_fixed decoder found, checking libavcodec version...
Nov 19 04:18:44 roon-server.ian.red start.sh[13137]: has mp3float: 1, aac_fixed: 1
Nov 19 04:18:46 roon-server.ian.red start.sh[13137]: Running
Nov 19 04:18:56 roon-server.ian.red start.sh[13137]: Error
Nov 19 04:18:58 roon-server.ian.red start.sh[13137]: Initializing
Nov 19 04:18:58 roon-server.ian.red start.sh[13137]: Started

Effectively, I can no longer use the service.
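
To put a number on the restart loop in the status output above, here is a minimal Python sketch. It assumes the journal lines keep the format shown in this post; the names `parse_events` and `seconds_running_before_error` are made up for illustration and are not part of Roon or systemd. The year is passed in because the journal timestamps omit it.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Nov 19 04:18:41 roon-server.ian.red start.sh[13137]: Error
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+"
    r"start\.sh\[\d+\]:\s+(?P<msg>.*)$"
)

def parse_events(lines, year=2019):
    """Return (timestamp, message) pairs for every start.sh journal line."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip headers, blank lines, and anything else
        ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
        events.append((ts, m.group("msg")))
    return events

def seconds_running_before_error(events):
    """For each 'Error', return the seconds elapsed since the last 'Running'."""
    durations = []
    last_running = None
    for ts, msg in events:
        if msg == "Running":
            last_running = ts
        elif msg == "Error" and last_running is not None:
            durations.append((ts - last_running).total_seconds())
            last_running = None
    return durations

# The journal excerpt from this post.
SAMPLE = """\
Nov 19 04:18:30 roon-server.ian.red start.sh[13137]: Running
Nov 19 04:18:41 roon-server.ian.red start.sh[13137]: Error
Nov 19 04:18:43 roon-server.ian.red start.sh[13137]: Initializing
Nov 19 04:18:43 roon-server.ian.red start.sh[13137]: Started
Nov 19 04:18:44 roon-server.ian.red start.sh[13137]: aac_fixed decoder found, checking libavcodec version...
Nov 19 04:18:44 roon-server.ian.red start.sh[13137]: has mp3float: 1, aac_fixed: 1
Nov 19 04:18:46 roon-server.ian.red start.sh[13137]: Running
Nov 19 04:18:56 roon-server.ian.red start.sh[13137]: Error
Nov 19 04:18:58 roon-server.ian.red start.sh[13137]: Initializing
Nov 19 04:18:58 roon-server.ian.red start.sh[13137]: Started
"""

if __name__ == "__main__":
    print(seconds_running_before_error(parse_events(SAMPLE.splitlines())))
```

On the excerpt above, the two Running-to-Error intervals come out as 11 and 10 seconds, which matches the "lost almost instantly" symptom.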

Hi @Ian_Matyssik
Check this topic
Roon Server, Ubuntu, super high CPU usage, no access to Roon
I have the same issue, I believe.
Hi @noris - can you confirm this is the same problem?

Yep, it seems like a similar problem. I am not sure how to go about fixing it. Restoring from backup fixes the issue for 5 minutes, and then it goes back to the same high CPU usage and lost connection. The server does not use all of the available resources and is still unusable.

For now I am using a clean database while I wait for my issue to be resolved. This situation is pushing me toward the ROCK solution :sunglasses:

Hi @Ian_Matyssik,

Can I please request that you send me a manual copy of your Roon logs by using these instructions right after this behavior occurs? The best way to get them over to me would be via a shared Dropbox / Google Drive link.

What is the model/manufacturer of your router?

I sent you a link to the logs (please check your direct messages), and I am using a Ubiquiti Networks router (UniFi Security Gateway 3P). Thanks, and let me know if you need any other information.

I just tried installing the server on a fresh install of Debian 10.2, and after restoring the backup I see the same issue. I suspect the issue is not with the OS but with Roon Server. Please address it, since some of us are losing paid time while this issue is ongoing. I do not have a perpetual subscription; it is annual.

@noris please let me know if you were able to download the logs

Hi @Ian_Matyssik,

Thank you for sending the logs over. I can confirm I downloaded them successfully and have requested feedback on them from our QA team. I will be sure to let you know once I hear back. Thanks!

Hi @Ian_Matyssik,

I appreciate your patience here while I had a chance to discuss your case further with QA. I have some feedback for you here:

  1. Can you please let me know if the same behavior still occurs after updating to Roon v1.7 (build 500)?

  2. We should try eliminating some of the variables here. Can you confirm if the behavior is the same if you temporarily disable your watched storage locations?

  3. Do you have any further system-level logs which display this issue?

  4. You mentioned you started with a fresh database previously. Can you confirm that using a fresh database is actually stable for a day or two? This way we can narrow the behavior down to something in the database.

Thanks!

Hi @noris,

Here are the answers to your questions and some more info:
1. Yes, even after the upgrade to 1.7 I see the same behavior, i.e., Roon Server crashes (as can be seen in the logs) and is restarted by the process manager.
2. I can confirm that even if I unmount the music library, I see no improvement in the situation.
3. I checked all of the logs across the system (dmesg, system logs, etc.) and see nothing that even remotely correlates with the Roon crash/start cycle.
4. Starting with a fresh DB solves the issue but loses all of the library customization. When I mentioned that the new DB reverted to the crash/start cycle, I failed to mention that this happened only after I restored from the backup. Today I am using the same library in the same environment with a fresh DB (no restore), and it is stable. When I tried restoring over the past few days, the server started crashing almost immediately, with only a few minutes of stability. During the last restore, Roon Server was stable long enough for me to open Settings; I tried backing up the restored DB, and it reverted to the crash/start cycle.

In addition, I tried eliminating different settings in the Registry folder in the DB, removing all the files associated with the sound devices and everything that looked like endpoint_*, raat_*, zone_*, but that did not help.

I think you are right that it has something to do with the DB, but the logs are not telling much. What I know so far is that Roon Server stopped working properly on November 11; I know this because I have a daily DB backup and 11/11 is the last backup available. I only discovered that the server was not working a few days later, since I was away during that time.

I just tried to restore from a different backup (the oldest available) and got the same result. I tried the newest available as well and noticed at least one somewhat useful log entry that might help:

11/23 17:01:02 Info: Is     Upgrading @   at System.Environment.get_StackTrace () [0x00000] in <370a0c27f4b74d1a81431037df6d75bf>:0
      at Sooloos.Broker.Music.Library.OnUpgrading () [0x00000] in <6e36cf2d655a42efaa0f88059dbc8db3>:0
      at Sooloos.Broker.Music.Library.<Init>b__129_0 (Sooloos.Broker.Music.MusicDatabase <p0>) [0x00000] in <6e36cf2d655a42efaa0f88059dbc8db3>:0
      at Sooloos.Broker.Music.MusicDatabase.OnUpgrading () [0x00000] in <6e36cf2d655a42efaa0f88059dbc8db3>:0
      at Sooloos.Broker.Music.MusicDatabase.TryGetValue (System.Int64 performerid, Sooloos.Broker.Music.PerformerLiteData& data, System.Boolean cache) [0x00000] in <6e36cf2d655a42efaa0f88059dbc8db3>:0
      at Sooloos.Broker.Music.Library._GetPerformerByIdUnmapped (System.Int64 performerid, System.Boolean quiet) [0x00000] in <6e36cf2d655a42efaa0f88059dbc8db3>:0
      at Sooloos.Broker.Music.Library+<_GetAllPerformerIds>d__181.MoveNext () [0x00000] in <6e36cf2d655a42efaa0f88059dbc8db3>:0
      at Sooloos.Broker.Music.Library.GetPerformerById (System.Int64 performerid, System.Boolean quiet) [0x00000] in <6e36cf2d655a42efaa0f88059dbc8db3>:0
      at Sooloos.Broker.Music.Library.TryGetCredit (Sooloos.Broker.Music.CreditData credit, System.Boolean quiet) [0x00000] in <6e36cf2d655a42efaa0f88059dbc8db3>:0
      at Sooloos.Broker.Music.LibraryUtils.CopyCreditList (Sooloos.Broker.Music.LibraryMutationEnv env, Sooloos.Broker.Music.CreditData[] credits, System.Nullable`1[T] removes) [0x00000] in <6e36cf2d655a42efaa0f88059dbc8db3>:0
      at Sooloos.Broker.Music.LibraryTrack._Load (Sooloos.Broker.Music.LibraryMutationEnv env, Sooloos.Broker.Music.TrackLiteData track) [0x00000] in <6e36cf2d655a42efaa0f88059dbc8db3>:0
      at Sooloos.Broker.Music.LibraryTrack..ctor (Sooloos.Broker.Music.LibraryMutationEnv env, Sooloos.Broker.Music.TrackLiteData track) [0x00000] in <6e36cf2d655a42efaa0f88059dbc8db3>:0
      at Sooloos.Broker.Music.LibraryMutationEnv.Finish () [0x00000] in <6e36cf2d655a42efaa0f88059dbc8db3>:0
      at Sooloos.Broker.Music.Library.EndMutation () [0x00000] in <6e36cf2d655a42efaa0f88059dbc8db3>:0
      at Sooloos.Broker.Music.Module.ev_exit () [0x00000] in <6e36cf2d655a42efaa0f88059dbc8db3>:0
      at Sooloos.SynchronizationContextThread.OnExit () [0x00000] in <6218ae34a77f4dc6afc6bb5d7ff309a3>:0
      at Sooloos.SynchronizationContextThread._Dispatch (Sooloos.SynchronizationContextThread+SendOrPostWrapper& ret) [0x00000] in <6218ae34a77f4dc6afc6bb5d7ff309a3>:0
      at Sooloos.SynchronizationContextThread._Go () [0x00000] in <6218ae34a77f4dc6afc6bb5d7ff309a3>:0
      at System.Threading.ThreadHelper.ThreadStart_Context (System.Object state) [0x00000] in <370a0c27f4b74d1a81431037df6d75bf>:0
      at System.Threading.ExecutionContext.RunInternal (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state, System.Boolean preserveSyncCtx) [0x00000] in <370a0c27f4b74d1a81431037df6d75bf>:0
      at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state, System.Boolean preserveSyncCtx) [0x00000] in <370a0c27f4b74d1a81431037df6d75bf>:0
      at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state) [0x00000] in <370a0c27f4b74d1a81431037df6d75bf>:0
      at System.Threading.ThreadHelper.ThreadStart () [0x00000] in <370a0c27f4b74d1a81431037df6d75bf>:0
    11/23 17:01:02 Info: [loadstatus] IsLibraryUpgradingDatabase False => True
    11/23 17:01:03 Trace: [library] finished with 12163 dirty tracks 846 dirty albums 13465 dirty performers 3598 dirty works 4405 dirty performances 1257 dirty genres 573 dirty auxfiles 104 dirty countries 7 dirty periods 39 dirty forms 1710 dirty places 825 dirty creditroles 341 dirty labels 0 clumping tracks, 0 clumping auxfiles 0 compute tracks, 0 deleted tracks, 12163 tracks to (re)load, 0 tracks to retain, 573 auxfiles to (re)load, 0 auxfiles to retain, and 35734 changed objects
    11/23 17:01:03 Trace: [dbperf] flush 0 bytes, 0 ops in 2522 ms (cumulative 0 bytes, 0 ops in 2522 ms)
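
To make the `[library] finished with ...` trace line easier to compare between restore attempts, here is a small Python sketch that pulls the "dirty" counters out of it. It assumes the wording of the trace line stays as shown above; the helper name `parse_dirty_counts` is made up for illustration and is not part of Roon.

```python
import re

def parse_dirty_counts(line):
    """Map each '<N> dirty <kind>' pair in a trace line to {kind: N}."""
    return {kind: int(n) for n, kind in re.findall(r"(\d+) dirty (\w+)", line)}

# Leading part of the trace line from the log excerpt above.
TRACE = (
    "11/23 17:01:03 Trace: [library] finished with 12163 dirty tracks "
    "846 dirty albums 13465 dirty performers 3598 dirty works "
    "4405 dirty performances 1257 dirty genres 573 dirty auxfiles "
    "104 dirty countries 7 dirty periods 39 dirty forms 1710 dirty places "
    "825 dirty creditroles 341 dirty labels 0 clumping tracks"
)

if __name__ == "__main__":
    # Print the counters from largest to smallest.
    counts = parse_dirty_counts(TRACE)
    for kind, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{kind:12s} {n}")
```

The non-"dirty" counters on the same line (clumping, compute, deleted, and so on) are deliberately ignored by this pattern.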

I think I know what the problem is. It seems that Roon Server is trying to migrate the database to a new version (or something like that), fails on some entries, and goes into the crash/start loop. Since I have a backup, I tried deleting the largest hash-named directory under RoonServer/Database/Core. It contained two directories: a smaller one (which I assume is the destination) and a bigger one (the source). Deleting the bigger one clearly wipes the database; Roon simply re-imports the tracks from the storage location, but it experiences no issues and runs stably.

So my suggestion is to check the migration path in Roon and see whether it handles the old version correctly. Deleting the images_1 directory in the DB neither helps nor shows any adverse effects.
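
For anyone who wants to look at the same directory layout without deleting anything, here is a minimal read-only Python sketch that lists the immediate subdirectories of a folder by total size. The function names `dir_size` and `subdir_sizes` are made up for illustration, and the path in the usage note is only an assumption based on the location mentioned above.

```python
from pathlib import Path

def dir_size(path):
    """Total size in bytes of all regular files below path."""
    return sum(f.stat().st_size for f in Path(path).rglob("*") if f.is_file())

def subdir_sizes(root):
    """Return (name, bytes) for each immediate subdirectory, largest first."""
    root = Path(root)
    sizes = [(p.name, dir_size(p)) for p in root.iterdir() if p.is_dir()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)
```

It could be called as `subdir_sizes("RoonServer/Database/Core")`, adjusting the path to wherever the database actually lives on the machine. Nothing is modified; the sketch only reads file sizes.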

Hi @Ian_Matyssik,

Thank you for the further information, it sounds like the issue is related to the current database.

Thanks for taking a look at this, but please don’t try deleting parts of your database like that. It can cause issues with the database, and if you start removing pieces of it, we can’t guarantee future stability.

Let’s proceed as follows:

  1. Clear out your current Roon Database (rename RoonServer -> RoonServer_old)

  2. Restore your most recent database (one where you did not touch the database folders themselves)

  3. Send me a copy of your database by using these instructions
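
As a rough illustration of step 1, the rename could be scripted as below. This is only a sketch under stated assumptions: Roon Server is stopped first, and the parent directory passed in is the one that contains the `RoonServer` folder on your machine. The function name `set_aside_database` is made up for illustration. The sketch refuses to overwrite an existing `RoonServer_old`.

```python
from pathlib import Path

def set_aside_database(parent, name="RoonServer", suffix="_old"):
    """Rename <parent>/<name> to <parent>/<name><suffix> and return the new path.

    Assumes Roon Server has been stopped before this is called.
    """
    src = Path(parent) / name
    dst = Path(parent) / (name + suffix)
    if not src.is_dir():
        raise FileNotFoundError(f"{src} does not exist or is not a directory")
    if dst.exists():
        # Never clobber an earlier copy that was set aside.
        raise FileExistsError(f"{dst} already exists; move it away first")
    src.rename(dst)
    return dst
```

After the rename, starting the server again should create a fresh `RoonServer` folder, and the old one stays available for the restore and for sending to support.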

Once I have this database, I’ll request QA to take a look to see if they are able to reproduce the behavior on their end.

You reported the issue 5 days ago while on build 416, so no updating should have been done then. I believe the 1.7 update just tries to re-update the DB which is why you are seeing the updating database traces.

Please send the database over when you have a chance and I’ll add it to the queue. Thanks!

Hi,
Some other folks have the same issue, including me:

I believe it’s not a coincidence.

Hi @Ian_Matyssik,

I didn’t receive the old database from you, but we’ve recently released Roon 1.7 (Build 511) which includes changes that should improve this behavior. Please try loading up your old database and give it a try and let us know how it goes!

You can read the full release notes here:

Thanks,
The Team at Roon Labs

@noris apologies for the delay, I was away and later busy for a while. It is great to see that you made some progress, really appreciate it. I will try to get back to Roon this week and check if the old backup is working. If I see further issues with the new version, I’ll share the backup with you. Sorry again for the delay and appreciate you making progress on the issue. Thanks.

Hi @Ian_Matyssik,

No worries. We investigated a few similar cases and from the symptoms this does sound like it very well might be the same issue. Do let me know how it goes with restoring your backup, thanks!