[[ N.B. Original thread title “Roon Server suddenly stopped working reliably ~Wednesday December 16, 2018” ]]
Server is an Ubuntu 18.04 (was 17.04 till this morning), AMD Ryzen 1500X, with 32GB of 2667MHz memory, 512GB NVMe Samsung 960 root, 8TB Western Digital Red Pro.
Roon Bridge is an old Logitech Squeezebox Touch.
I’ve been using Roon successfully without any problems for a bit over a year and suddenly it’s completely flaking out. I first noticed an issue on Wednesday night when it refused to play a DSD64 DSF file. Obviously this needed to be downsampled to PCM 96kHz, but it had done that successfully for the last year. Things got progressively worse to the point where it wouldn’t even play normal CD FLAC files (44.1kHz/16bit). This morning nothing worked and I noticed that my old Ubuntu 17.04 was no longer supported. I upgraded to Ubuntu 18.04 and the Roon Server started working again but rapidly jammed. I’ve rebooted my server a couple of times and each time the Roon Server works for a while (less than 20 minutes) before jamming up and becoming unreachable by the Roon Controllers.
Is there an easy way to restart the Roon Server without having to reboot my server?
Where can I find Error Logs for the Roon Server so I can provide more diagnostic information?
Where can I find Upgrade Logs so I can see if a new Roon Server was automatically installed sometime this week?
What more information is needed to help diagnose this problem?
In /var/roon/RoonServer/Logs/RoonServer_log.txt I see this near the start:
Starting RoonServer v1.5 (build 363) stable on linuxx64
Also, I was watching that file when apparently it restarted on its own. I saw a “stack trace” go by in the file I was watching and by the time I got around to trying to copy it out, it appears to have rolled over to the RoonServer_log-06.txt:
(sigh) The system won’t let me post the section of log with the stack trace because it thinks that it’s full of links and, as a new user, I’m only allowed to post two links … Let me know how you’d like me to get you the stack traces and/or entire log files …
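On the question of whether a new build was installed automatically: the start-up banner quoted above appears each time the server starts, so listing every banner across the current and rotated logs should show whether the build number changed. This is only a sketch. The function name `roon_builds` is made up for illustration, and the default directory is simply the one mentioned in this post.

```shell
# Print every start-up banner found in the current and rotated Roon Server logs.
# Assumption: logs live in the directory mentioned above and rotated files keep
# the RoonServer_log prefix (e.g. RoonServer_log-06.txt). Pass another
# directory as $1 if yours differs.
roon_builds() {
    logdir="${1:-/var/roon/RoonServer/Logs}"
    # -h drops the file name prefix so only the banner lines are printed
    grep -h "Starting RoonServer" "$logdir"/RoonServer_log*.txt
}
```

Running `roon_builds` with no argument and comparing the build numbers across lines would show whether the server moved to a different build during the week.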
At the shell prompt:
sudo systemctl stop roonserver
sudo systemctl start roonserver
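The two commands above can also be collapsed into a single restart followed by a status check. A minimal sketch, assuming the systemd unit is named `roonserver` as in the commands above; the function name `bounce_roon` is just an illustration.

```shell
# Restart the Roon Server unit and report whether it came back up.
# Assumption: the unit is called "roonserver", as in the stop/start commands above.
bounce_roon() {
    sudo systemctl restart roonserver || return 1
    # is-active prints the unit state ("active" when running) and
    # exits non-zero if the unit is not running
    systemctl is-active roonserver
}
```

If the unit's output is captured by the systemd journal on your install, `journalctl -u roonserver` may also show what happened around a restart.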
Thanks! At least that’ll make that part of the debugging easier!
So how can I provide expanded feedback on this issue, including the log which the Forum Software insists on interpreting as links? I’m completely dead in the water now. Nothing is working.
Let me tag @support for you, that’s how Roon Support gets to know what they need to look at. Typically, after getting the details of your configuration, they might provide a link where you can upload your logs. Good luck!
Thanks for contacting us regarding this issue. I see that you have been upgraded to Trust Level 1, so you should be able to post links directly to this page. You can also format log files using three accent marks (this character: `) for cleaner formatting.
I have just tried enabling diagnostics mode for your account, but since Roon Server is not operational, I fear that the report may not come in. I will PM you shortly with an alternate upload method to get the logs over to us. First, though, I would kindly ask you to reproduce this issue once more and note the exact local time at which it occurs (e.g. 11:12 PM); only after noting this timestamp should you upload the logs. That way we can take a look around that timestamp for any strange behavior on Roon’s end.
Please let me know the timestamp and I will PM you on how to get those log files over to us. Thanks!
Thank you again. I had put Support Tag in the subject but someone named “Greg” edited it out. I presume he is from Roon Labs and did so because I hadn’t done it right. I’ll do it correctly in the future. I see that your use of the Support Tag has got “Noris” helping me. I’ll start interacting with him!
Thanks @Noris! I’ll respond to your PM!!
Thanks for sending those logs over, I can confirm they reached our servers and I have taken a look.
I am noticing a few DNS errors in the logs. As a first step, I would try changing your DNS server to Google DNS or Cloudflare, just to get those out of the way.
As for the stack traces that you have mentioned, those do indeed seem strange. I will need QA to take a look and I have started a case for you with them.
Other than the DNS issues I saw, I also noticed your endpoints disconnecting randomly, and this leads me to believe that there is a networking issue here as well. Can you please provide a full topology overview of your network, including model/manufacturer for your router and for any switches, range extenders, powerline adapters, etc.? How is your Core connected to the network: directly to the router via Ethernet, or by some other means?
Please let me know this info when possible and I will get it into your case for QA to review.
My internal network is pretty simple. I have a flat NAT 10.13.100.0/24 network behind a single ASUS RT-AC68U on Comcast. The RT-AC68U supplies wireless to the house and hardwired connections via a 16 Port 1Gb/s switch. The Roon Server and Logitech Squeezebox Touch are both hardwired.
The Roon Server machine is also my DHCP and DNS Server. The DNS Server is configured as a caching server, authoritative for my local network 100.13.10.in-addr.arpa and domain casacowper.com. It falls back to all the standard Root DNS Servers. I could add Google DNS and Cloudflare as we debug further, but that doesn’t feel like the direct issue here.
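Before changing the resolver configuration, one way to see whether the local caching server answers differently from the public ones is to ask each of them for the same name. This is a rough sketch, assuming `dig` is installed and that the local BIND instance listens on 127.0.0.1 (an assumption, adjust as needed); 8.8.8.8 and 1.1.1.1 are the Google and Cloudflare public resolvers. The function name `compare_dns` is made up for illustration.

```shell
# Ask the local resolver and two public resolvers for the same name and
# print one line per resolver so the answers can be compared side by side.
# Assumption: the local caching server is reachable at 127.0.0.1.
compare_dns() {
    name="$1"
    for resolver in 127.0.0.1 8.8.8.8 1.1.1.1; do
        # +short prints only the answer data; +time and +tries keep an
        # unresponsive resolver from stalling the loop for long
        answer=$(dig +short +time=2 +tries=1 "@$resolver" "$name" | paste -sd' ' -)
        printf '%s: %s\n' "$resolver" "${answer:-no answer}"
    done
}
```

For example, `compare_dns example.com` prints three lines; a resolver that times out or returns nothing stands out immediately.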
Nothing in any of this has changed in the last year that the Roon Server has been working successfully.
One thing that I should mention is that the local music collection is HUGE. There are 111,940 tracks currently in the system. Most of these are CD FLAC files, but there are an increasing number of High Resolution FLAC files and DSDs. I was thinking that I should try to trash the Roon Server database and rebuild it from scratch in case it’s become corrupted but I couldn’t see any controls to force that in the Roon Controller.
Let me know if I’ve skipped providing anything.
Tonight I’ll restart everything, reproduce the issues, and then send you new log files with the timestamp that I did all of this.
Thanks for uploading those new logs to us. From your latest PM it appears that the system has stabilized? Have you gone ahead and changed the DNS and this is possibly the reason why it has stabilized?
I recall seeing a similar thread to yours where an Ubuntu setup had seemingly gone awry, and it was narrowed down to the Core’s RAM being faulty. It might be worth checking in that area as well if the issue comes and goes.
If the issue still persists, I agree with you here that trying with a fresh database in place will also give us a good data point. To do this on your Ubuntu setup, the following steps should work:
- Create a Backup of your current database and save it somewhere safe
- Exit Roon/RoonServer
- Locate your Database Folder and the folder called “Roon” or “RoonServer”
- Rename your Roon or RoonServer folder to “Roon_old” or “RoonServer_old”
- Start Roon or RoonServer again and see if you are able to reproduce the issue with a fresh DB
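On a Linux install the rename step above can be sketched as a small shell function. This is a sketch only: the default location `/var/roon` is an assumption taken from the log path quoted earlier in this thread, and the function name `set_aside_roon_db` is made up for illustration. It should be run only while RoonServer is stopped and after a backup has been made.

```shell
# Move the existing database folder aside so RoonServer creates a fresh one
# on its next start. Assumption: the folder lives under /var/roon, as the log
# path mentioned earlier in the thread suggests; pass another root as $1.
set_aside_roon_db() {
    root="${1:-/var/roon}"
    if [ ! -d "$root/RoonServer" ]; then
        echo "no RoonServer folder in $root" >&2
        return 1
    fi
    # Refuse to overwrite an earlier copy that was already set aside
    if [ -e "$root/RoonServer_old" ]; then
        echo "$root/RoonServer_old already exists" >&2
        return 1
    fi
    mv "$root/RoonServer" "$root/RoonServer_old"
}
```

Renaming `RoonServer_old` back to `RoonServer` (again with the server stopped) restores the previous database.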
More info regarding how Roon is installed on Linux can be found here
Please let me know your findings when possible.
We’ll have to see if the system has “stabilized”. It’s started “working” again several times only to fail again within a day. Last night after several restarts it suddenly started playing files that it had refused to in the previous restart. I left it running in Radio Mode while going to dinner and it was still playing tonight. I’ll do more testing tonight when I get home … It’s so weird because everything was working properly for a bit over a year before the problems started last week.
If I do get to the point of rebuilding my Roon Database from scratch, I’ll let you know.
(sigh) As expected, it stopped working again. I’ll PM you with more logs.
Thanks for letting me know those timestamps and uploading those logs @Casey_Leedom. I have added that information to your case and it is pending review by the QA team, I will be sure to let you know once their report has been completed and they pass it back to me.
I had a meeting with the QA team regarding your case today and we have a few follow-up questions to better understand this issue. Can you please let me know:
- Does the issue with your Squeezebox Touch happen for non-DSD256 content as well?
- Does this issue occur for any of your other non-squeezebox zones?
- Are you able to play DSD256 content to a Roon Remote such as your Macbook Pro?
- Have you tried going ahead and switching DNS servers to Google or Cloudflare, and has the issue remained the same?
Please let me know your answer to these questions when possible.
Thanks again for following up on this @noris!
For some of your team’s questions, I’ll have to answer them when I get back home tonight, but I’ll tackle those that I can now:
The issue with my Squeezebox Touch as a Roon Bridge does happen with non-DSD256 content. I’ve had it happen with plain CD Redbook 44.1kHz/16bit FLAC files. And again, note that all of this was working fine for just a bit over a year since I bought my lifetime Roon Subscription at the end of 2017. That is, playing DSDs, high-resolution FLAC PCM files (e.g. 96kHz/24bit), etc. all worked flawlessly. (Have I mentioned yet how much I love Roon and your work? If not I apologize! It’s an awesome product with a beautiful architecture. (I’m a Software Engineer.))
I’ll test things out on my MacBook Pro as a Roon Bridge again. I tried this once last week and once the Roon Core Server was jammed, the MacBook Pro Roon Bridge also didn’t work.
I haven’t yet tried changing/adding new DNS Servers to my current list. My current /etc/bind/db.root contains [A-M].ROOT-SERVERS.NET. I can add the Google and Cloudflare servers as well.
And, last night after I sent you the latest logs (RoonServer-dec19-logs.zip) I bounced the Roon Core Server again with @Fernando_Pereira’s systemctl stop/start roonserver commands and suddenly it was working again for the rest of the evening. Weird. I will be unsurprised if it’s hanging again tonight …
Thanks for providing that info and for the kind words. I have gone ahead and brought your case to the devs as well, and these were their remarks:
This seems like some kind of hardware failure. Especially since the issue is intermittent, it feels like the RAM or CPU is partially faulty. Would you be able to run a RAM test on that machine? I have found this guide on how to check the RAM on Ubuntu
What is the behavior of the squeezebox if you try to temporarily host the Core on another machine? Does it behave the same? I would try running a full build of Roon on your Macbook (and not just as a RoonBridge device) and see if the behavior is the same.
Do you still have DHCP active on your router or are you using your current Core to manage all of the DHCP connections on your network? This might be some kind of double NAT issue where the requests are not making it through because of the DHCP & DNS hosting on your primary Core.
Please let me know when you have had a chance to run the above tests as they would provide some needed clarification as to what could be going on here. I look forward to your feedback and thanks again for working with us here to troubleshoot the issue!
I doubt that there are any Hardware Errors. Nothing is showing up in the kernel dmesg logs. But sure, I’ll try the memory test you pointed at.
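For anyone wanting to repeat the dmesg check, a filter along these lines narrows the kernel log to lines that commonly accompany hardware trouble. It is a sketch under assumptions: the pattern list is illustrative rather than exhaustive, and the function name `scan_hw_errors` is made up. An empty result does not prove the hardware is healthy, which is why a dedicated memory test is still worthwhile.

```shell
# Filter kernel log text on stdin for lines that often indicate hardware
# trouble. The patterns are illustrative assumptions, not a complete list.
scan_hw_errors() {
    # -i ignores case, -E enables the alternation syntax
    grep -iE 'machine check|mce:|edac|hardware error|i/o error'
}
```

Typical use would be `sudo dmesg | scan_hw_errors`; like `grep`, it exits non-zero when nothing matches.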
Hosting the Roon Core on a different machine will be … “difficult” … the /music 8TB volume is a Linux ext4 File System. The only real option would be to grab a random x86 box and install the “disks” into that (the 512GB / (root) volume is an NVMe as well). “Interesting” challenge …
DHCP is disabled on the Asus RT-AC68U. Only the host running the Roon Core, “Hypatia”, is doing DHCP and DNS. [[ And, I’ll give you a dollar if you know the origin and appropriateness of “Hypatia” without having to look it up … ]]
Hypatia has always been one of my heroes. She was a pagan philosopher and mathematician in the 4th century. An intellectual and a strong, independent woman before it was trendy and without any support structures. Although she was generally sympathetic to the Christian beliefs, she was torn apart by a mob of Christian monks under the flimsy excuse that she was a witch. In fact, she was caught in the middle of a political struggle between two Christian clerics. Nothing really changes.
You owe me $1.