Since my last post, another restart sadly broke playback permanently. So for 12 of the last 13 hours, no playback (neither local nor Qobuz) has succeeded. Another restart just now provided no relief.
UPDATE Random observations to add:
ROCK refuses to reboot some of the time while Roon is failing to play back on some device.
Then when rebooting succeeds, it still fails playback.
Meanwhile, ROCK being inscrutable makes it impossible for me to track any system health indicators beyond the Roon log files.
Loading the home screen takes about 2 minutes after a restart. Oh darn, it's still not fully loaded. It apparently loads top-down in sequence.
History also fails to load, maybe I'll let that sit for some time too: Nope. Not even after 10 minutes.
No rescans stuck this time. Maybe it's because of the failed backup? Redirecting my backup to a different location... Just gets stuck doing database snapshot. Yup. Restarting the client shows it stuck in exactly the same state as usual: no playback, no artist page load, no history page load. No new backups have been made.
To rule out complete hardware failure, let me move this entire party to different hardware. It would be nicer if I still had my old Linux installation so I could actually diagnose it live.
@benjamin That's why I told you guys... I assumed you got that the first time you replied.
I've been running with the faulty DIMM removed, but without restoring any database. Looks fine now. However, I realized that a LOT of work went into tidying up my Roon library aside from the parts I already mentioned before (playlists, MUSE profiles, history and statistics). So I'll probably select a suitably old backup (I should have one from early 2024) and re-test with that sometime next week. If that works then I should be out of the woods.
I've run with a new database-from-scratch for a few days, realizing that I was missing a LOT of album edits (identifying box sets and customizing cover art between different re-releases) as well as all the playlists and MUSE profiles.
So I went back to the oldest of my recent backups (December 25th 2024). It ran well for about 8 days, but then disaster struck and the database was found corrupt at a restart.
So now, I'm trying my luck with a very old backup that I found lingering on an old medium (Jan 13th 2023). That's really old. But at least it contains some of the work I've done.
Sadly, I still have intermittent glitches:
covers not loading until I zoom the art in Album View
specific library albums missing (e.g. "Viola Sonata, Shostakovich")
the search for that specific string consistently crashes the Windows client and locks up the Android client (until force closed).
Worse, both these clients got stuck in a crash loop because a restart brings back the same search screen. The only way to get out was to reinstall the clients.
Interestingly, the iOS client executes the same search without problems (literally repeated it from the "recent searches" drop down, no server restarts in between) and shows the album that was missing from the play history (no album art, "album not found" when clicked)... Even AFTER that, the Android and Windows clients still kept crashing on the same search.
I don't know what to do here.
[PS. I've repeated about an hour of memtest86 a few days back just to make sure no new failures had appeared with the remaining DIMM - it ran completely clean, can't be paranoid enough]
Sorry to hear you're still running into issues. Could you please share a set of Roon logs from the affected Windows machine? Please follow the directions found here and send over a set of logs to our File Uploader.
@benjamin Thanks for the interest. I've not been able to reproduce the crashes (I think (?) I've updated server from 1510 to 1517 in the meantime). However, the weird album art glitches still present themselves:
We've taken a look at the diagnostics you shared from the client-side and compared them to logs from RoonServer during the corresponding time period. Here are the patterns we observe:
RoonServer is logging generic network failures when requesting content from Qobuz's servers. Qobuz playback requests occasionally time out. This occurs when playing to System Output.
Network reachability changes interrupt the connection between RoonServer and the upstream servers responsible for certain background processes.
Certain image requests from Qobuz or Roon's discovery service return 404. There aren't any caching issues.
Roon doesn't show signs of corruption events during indexing.
Reviewing this thread, it seems you'd like to target either RAM/resource constraints or latent corruption as the underlying problem here and not focus on loading failures.
For background: Roon attempts to detect corruption "on the fly" during background indexing by checksumming every page in the database, on a single core. That doesn't guarantee that tiny changes can't accumulate.
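As a rough illustration of the general idea of page-level checksumming (this is not Roon's actual implementation; the page size, the hash function, and the names `page_checksums` and `find_corrupt_pages` are all assumptions made for this sketch):

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size for illustration, not Roon's actual value


def page_checksums(data: bytes, page_size: int = PAGE_SIZE) -> list[str]:
    """Return one SHA-256 digest per fixed-size page of `data`."""
    return [
        hashlib.sha256(data[i:i + page_size]).hexdigest()
        for i in range(0, len(data), page_size)
    ]


def find_corrupt_pages(
    data: bytes, expected: list[str], page_size: int = PAGE_SIZE
) -> list[int]:
    """Return indices of pages whose digest differs from the stored digest."""
    actual = page_checksums(data, page_size)
    bad = [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]
    # Pages that exist on only one side count as mismatches too.
    shorter, longer = sorted((len(actual), len(expected)))
    bad.extend(range(shorter, longer))
    return bad
```

A check like this only flags pages that changed after their checksum was recorded. Data that was already wrong when the checksum was computed (for example, written through faulty RAM) would pass, which is one way small errors could go unnoticed.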
The only way to ensure that a DB doesn't contain latent corruption at this point would be to fully recreate it; we recognize how time consuming and frustrating this can be. However, a current hardware failure inducing corruption independently in each restored backup is also a possibility.
What do you see in the RoonOS Web UI when you reboot and it fails? What happens if you reinstall RoonOS itself?
Also, we understand you're reluctant to troubleshoot your network. But what is the basic topology serving this ROCK?
I don't remember the exact message, but the web page dryly informs me that "the roon server failed to reboot" (or similar message). I'm not currently considering re-implementing ROCK (since it doesn't give me the basic means of system monitoring or maintenance; I prefer to keep my system up-to-date and be able to monitor for health).
Not really hesitant, but I had the feeling it was an unlikely culprit. I think the finding of the faulty RAM DIMM confirmed that hunch. The network topology is really simple: it's all directly-attached 1Gbit LAN cables to the source router (Asus RT-AX92U, which is 2.5Gbit capable) connected to a fiber-optic WAN operating at 1Gbit.
The LAN is rock solid (the non-wired clients use a WiFi mesh with 3 WiFi6 access-points, bridging to the same subnet as the LAN, but the server and desktop are connected by cable anyways).
That's good to hear. I'll probably keep running with the current DB then. I have a suspicion that the particular album I focused on to show "missing artwork" is somehow no longer available on Qobuz. The fact that the search triggered a crash is no longer reproducible so I'll write it off as a freak incident.
The only reachability changes should be:
A daily router restart at 4:30am (the router firmware has a bug in their TZ database so for one week that would appear as 3:30am in the logs). I'm unlikely to be listening at that time, but it does happen.
We've gotten an upgrade enabling 8Gbit upstream fiber, which left us with no internet for a few hours on April 1st (not a joke, or at least not a good one).
I might have tinkered a bit with the firewall setup (to allow for ARC) but that was longer ago, I think. I'll leave the network config stable for a few weeks at least, so we can see that the only "holes" are at 4:30am now.
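For anyone wanting to verify the "only holes are at 4:30am" claim independently of the Roon logs, here is a minimal sketch of how that could be checked from a separate machine. Everything in it is an assumption for illustration: the function names, the TCP probe on port 443, and the 4:25 to 4:40 window are my own choices, not anything Roon or the router provides.

```python
import socket
from datetime import datetime, time


def is_reachable(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def outages(samples: list[tuple[datetime, bool]]) -> list[tuple[datetime, datetime]]:
    """Collapse (timestamp, reachable) samples into (first_failure, last_failure) spans."""
    spans = []
    start = last = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts  # first failed probe of a new outage
            last = ts
        elif start is not None:
            spans.append((start, last))
            start = None
    if start is not None:  # outage still ongoing at the end of the samples
        spans.append((start, last))
    return spans


def unexpected(spans, window_start=time(4, 25), window_end=time(4, 40)):
    """Return outage spans that begin outside the expected daily restart window."""
    return [s for s in spans if not (window_start <= s[0].time() <= window_end)]
```

In practice one would call `is_reachable` once a minute from a scheduler, append `(datetime.now(), result)` to a list or file, and then run `unexpected(outages(samples))` after a few weeks; an empty result would support the idea that the router restart is the only gap.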
Slight update to keep this alive.
Another update made things unstable again. I've now gone back to regenerating the entire database from scratch. Will post conclusions later.
Keeping it alive for a bit. It's stable, but I lost all the data I would have liked to preserve. I still owe you a list of observations of how things kept looking/acting when unstable. I'll get around to it.
Sorry to hear you've lost some data; that's really frustrating. Even if this thread closes before you're able to follow up, don't worry: we'll gladly reopen it and link any related threads so everything stays clear and connected.
Some time has passed since our last correspondence, but I wanted to follow up with you here, as there was a bug fix for the specific error type you encountered in our last release. If you wish to try your old backup again, you are free to do so.