Recurrent system hangs in Euphony OS with Roon

Core Machine (Operating system/System info/Roon build number)

Euphony OS (Linux-based) / Ryzen 1700, Asus ROG Strix B450-I Gaming, 16 GB Crucial DDR4-2666 memory, Samsung 960 Evo NVMe boot drive, Western Digital 1 TB M.2 SSD music storage / Roon version 1.7 Build 521

Network Details (Including networking gear model/manufacturer and if on WiFi/Ethernet)

Gigabit wired network with Netgear and Uptone audio switches in use

Audio Devices (Specify what device you’re using and its connection type - USB/HDMI/etc.)

Euphony OS endpoint with RoonBridge, DietPi with RoonBridge. No audio devices are directly connected to the Roon Core

Description Of Issue

This represents my last attempt to solve an irritating issue. After previously running Roon Core on a Windows Server OS for years without problem, I migrated to Euphony OS. I love it in every way except one: every 24 to 48 hours, the system running Core will hang. By “hang,” I mean the computer is powered on but unresponsive, and all network activity, as indicated by the activity lights on the mainboard, has stopped. The passively cooled case is quite warm after being in this state for a while.

In troubleshooting thus far, I have replaced every piece of hardware in the machine except the processor and hard discs (this includes an external linear power supply, the DC-ATX converter, the memory, and the mainboard). The exact behavior has persisted through all of these changes. I would add that the problem had previously occurred when using a high-end active CPU cooler, before moving to the passively cooled case, so the problem cannot be cooling.

I have reinstalled a fresh copy of Euphony using a clean, new downloaded image, and the problem persists. I temporarily went back to Windows Server 2012, and the system ran fine.

I have contacted Euphony support and provided logs. I am told that these do not provide any clues as to the cause. I was also told that it was suspected Roon was the problem.

I really like the Euphony OS and would rather not move to something else, but I can’t have an unstable machine running Roon Core. Therefore, I would like to see if anyone at Roon can offer any suggestions before I give up.

Thank you.

Hi @Ryan_Fajardo,

Just to verify, if you left the machine running for the same amount of time without running Roon does the same problem occur?

I have had it running a different audio player (Stylus from within the OS) for up to two days without a crash. Admittedly, I would probably need to run it for longer to be 100% sure.

This is a good point, and I will turn Roon off again and see if I can go a week.

Personally, I think it would be unlikely Roon itself were to blame. Have you heard of any such behavior before?

Hi @Ryan_Fajardo,

Generally, Roon shouldn’t cause the entire machine to hang/crash. It is possible that Roon is using drivers that have an issue / are corrupt that other applications aren’t using (we’ve seen this happen on Windows), but the underlying issue in these cases isn’t Roon, just what Roon is trying to interact with.

Hi Dylan,

I appreciate your helping me with this issue, and your response makes sense.

I have had Roon turned off on the system since my last post, and so far no hiccups. I will continue to update once I reach the 3 to 4 day mark, as it has never gone that long without crashing when Roon is running.

A general question to you: how likely do you think it will be to find and correct the issue? As much as I like Euphony, I am more wedded to Roon. If you think this will be a long, drawn out process or unlikely to succeed, I would probably look into other Linux OS’s or go back to a Windows environment.

In the interim, I may post the problem on the AudiophileStyle forums, as there are some people using the Euphony OS there. I can also send you the logs I have generated thus far once we get to that point.

Thanks again, and more soon.

Hi @Ryan_Fajardo,

I appreciate the update here. As for what the cause may be, and tracking it down, it’s hard to say for sure. I’ve not used Euphony OS myself, but a search of our forums suggests that some customers are running this type of setup successfully. It’s not something we regularly test with, but our technical team might be able to provide some further feedback on what to look out for here.

Here’s what I suggest — The next time you notice this happen, let me know the time in your local timezone that you see it occur. I’ll enable diagnostics so we can see what was happening in Roon at the time. While this seems like a system-level issue, we might be able to learn something about what Roon was trying to use that caused the problem. It’s hard to say for sure if this will have the information we need since the diagnostics report will lack system-level information, but it’s worth a shot. I’ll then bring this to my next meeting with the technical team and see what feedback they have.

Hi Dylan,

I appreciate your willingness to help me out, as I recognize this issue is in a grey zone related to inter-operability of two systems. I have had no crashes since turning Roon Server off. I am in the eastern time zone.

I discovered another user on the AudiophileStyle forums who has had the same problem as I have, and also experiences it when using Audio Linux and Roon Server. It seems this may affect a small number of users… who knows, it makes me wonder if there is a setting I need to adjust in the BIOS.

Regardless, I would like to solve it, if for nothing else to help out another user. So, I will re-enable Roon Server and let you know the next time the hang happens.

Thank you for going above and beyond.

Apologies Dylan, this has been a busy week for me.

I have been intermittently troubleshooting without any real additional information. The system was fine with Roon Server off. I tried to run Euphony OS directly off of a thumb drive, and it hung around 36 hours later (with Roon Server on). So, I just performed a fresh installation. I would expect another hang over the weekend sometime, so perhaps we can strategize to turn on diagnostics sometime next week if that works for you.

Ryan

I have discovered one possibility that I wanted to run by you before we go to diagnostics.

In moving my core to the Euphony OS, I performed a database restore. I have repeated this database restore each time I performed a reinstall. The previous core was set up in the same way as the current system, but drive letters were different. I finally went into the Library portion of the settings menu yesterday and saw the “clean up library” button. When I clicked on this, it showed some 6000 files that were not associated with a storage location plus around 30 deleted files. I cleaned these up and the system has been running for around a day and a half now (not long enough, but encouraging). Do you think this could account for a system hang?

Fingers crossed heading into the week.

Ryan

Hi @Ryan_Fajardo,

It seems unlikely that this was contributing to things. Clean Up Library removes any database records that are no longer tied to a storage location, but there isn’t much that this would change. Roon isn’t trying to reach those files or anything; it’s just a record of the files that used to exist.

Well, rats. That means I am likely to have a hang in the near term. Once that reoccurs this week, I’ll let you know.

I had a system hang sometime between 1 pm and 5 pm EST. I realize that is a broad range, but that is the window between when my wife last checked for me and when I arrived home and found it hung.

If that range is too broad, I can try to catch you on the next one.
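
One way to narrow that window next time would be a small heartbeat log: a script writes a timestamp to disk at a fixed interval, and after a hang the last entry shows roughly when the machine stopped. The sketch below is only an illustration. It assumes Python 3 is available on the Core machine and that the OS allows running a user script on a schedule; the function names and the log path are made up for this example.

```python
from datetime import datetime, timezone
from pathlib import Path
import os


def append_heartbeat(log_path, now=None):
    """Append one ISO-8601 UTC timestamp to the log and force it to disk."""
    now = now or datetime.now(timezone.utc)
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(now.isoformat() + "\n")
        fh.flush()
        # A hung machine never flushes its page cache, so sync explicitly.
        os.fsync(fh.fileno())


def last_heartbeat(log_path):
    """Return the final timestamp in the log, or None if it is missing or empty."""
    path = Path(log_path)
    if not path.exists():
        return None
    lines = [ln.strip() for ln in path.read_text(encoding="utf-8").splitlines() if ln.strip()]
    return datetime.fromisoformat(lines[-1]) if lines else None
```

Calling `append_heartbeat("/var/tmp/heartbeat.log")` once a minute (for example from a cron job) and reading `last_heartbeat(...)` after the reboot would bring the uncertainty down to about a minute, converted to local time when reporting it.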

Ryan

Thanks, @Ryan_Fajardo — I’ve enabled diagnostics and once the report comes in I’ll have the team take a look and will follow up with you soon.

Hi @Ryan_Fajardo,

Would you kindly use the directions found here and send us over a set of logs using a shared Dropbox link? Thanks!

Yes…busy at work this week but should be able to do this tonight or by the weekend.

Dylan,

Here is a link to the support package in my Google Drive. If you have any issues viewing, please let me know.

Thanks for sending that over, @Ryan_Fajardo — I’ve passed it along to our team and will follow up with you soon.

Thank you.

I will not be surprised if you do not find anything. I have come across some information suggesting that there may be a problem with C-states or Power Supply Idle Voltage on at least some AMD processors when used with Linux. Purely for interest, I will include a link to the thread with the best information on the topic:

https://bugzilla.kernel.org/show_bug.cgi?id=196683#c194

Some of the behavior is eerily similar to what I am experiencing. I am currently testing out some BIOS settings, so I won’t be sure for a few days. Obviously, these things should affect the system regardless of whether Roon is the software in use. I can only surmise that I didn’t have an alternative player running long enough.

Regardless, I will look forward to what you come up with in the logs.

Hi @Ryan_Fajardo,

I spoke with the team about this and we reviewed the logs you shared. Unfortunately, as you guessed, there wasn’t much in the logs that seemed to point to anything specific on the Roon side of things.

Yes, the issue that you’ve shared here definitely seems like it might be related to what you’re experiencing. Definitely let us know how things go with the BIOS changes!

Hi Dylan.

Today marks the longest interval the system has gone without a hang. I have disabled global C-state control in the BIOS and enabled a “performance enhancement” setting which increases the Vcore and presumably clock speeds a bit. I may have achieved system stability at the cost of increased energy consumption, but a step in the right direction.
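
For anyone wanting to confirm from within Linux what the BIOS change did, the kernel exposes per-CPU idle states under sysfs. The following is a minimal sketch, assuming the standard cpuidle layout (`/sys/devices/system/cpu/cpu0/cpuidle/stateN/` with `name`, `disable`, `usage`, and `time` files). How the firmware's global C-state setting maps onto these entries on this particular board is something I have not verified, and the function name is just for illustration.

```python
from pathlib import Path


def read_idle_states(cpu_dir="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Return one dict per idle state the kernel exposes for this CPU."""
    states = []
    base = Path(cpu_dir)
    if not base.is_dir():
        return states  # cpuidle is not exposed here (or this is not Linux)
    # Sort numerically so that state10 comes after state2.
    for state_dir in sorted(base.glob("state[0-9]*"), key=lambda p: int(p.name[5:])):
        def field(name):
            f = state_dir / name
            return f.read_text().strip() if f.exists() else ""
        states.append({
            "state": state_dir.name,
            "name": field("name"),
            "disabled": field("disable") == "1",
            "usage": int(field("usage") or 0),     # times this state was entered
            "time_us": int(field("time") or 0),    # microseconds spent in it
        })
    return states


if __name__ == "__main__":
    for s in read_idle_states():
        print(s)
```

Comparing the output before and after the BIOS change would show whether the deeper states are still listed and whether the CPU is still accumulating time in them.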

As far as why this happens when Roon is activated and not another player? That is a curiosity at this point. The two options are that either the instability is still there, but I didn’t leave the system in that state long enough to detect it, or there is something different with idle processor utilization with Roon - for that, I’d have to defer to you.

Some people have used some kernel parameters to mitigate or resolve the issue in the attached thread, but that is not your issue per se. I mainly point it out for completeness.
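
As a rough illustration of how to check which such parameters a running system was booted with, the sketch below parses the kernel command line (normally readable from `/proc/cmdline`). The names in `WATCHED` are real kernel options that touch idle behavior, but choosing these three is my assumption for the example; the linked thread is the place to see which ones people actually tried.

```python
def parse_cmdline(text):
    """Split a kernel command line into a {parameter: value} dict.

    Simplification: values containing quoted spaces are not handled.
    """
    params = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None  # bare flags such as "quiet" map to None
    return params


# Illustrative selection only; an assumption, not a recommendation.
WATCHED = ("idle", "processor.max_cstate", "rcu_nocbs")


def report(text, watched=WATCHED):
    """Return the value of each watched parameter, or '<not set>'."""
    params = parse_cmdline(text)
    return {name: params.get(name, "<not set>") for name in watched}


if __name__ == "__main__":
    try:
        with open("/proc/cmdline", encoding="utf-8") as fh:
            print(report(fh.read()))
    except OSError:
        print("Could not read /proc/cmdline on this system")
```

This only reports what is currently set. Changing kernel parameters goes through the bootloader configuration, which differs between distributions.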

I will update this thread with any new major changes or discoveries.

Thank you for your efforts in helping me out!
