Roon library corrupted

Core Machine (Operating system/System info/Roon build number)

Ryzen 2700 running RoonServer under WIN10

Network Details (Including networking gear model/manufacturer and if on WiFi/Ethernet)

Ethernet from Core to RPi

Audio Devices (Specify what device you’re using and its connection type - USB/HDMI/etc.)

RPi==(Toslink)==>Mutec 3==Coax==>iFi Pro iDSD==PL pre-amp, etc., etc.

Description Of Issue

For the 3rd time in a week, I have found, in the morning, that my Core machine has rebooted overnight.
Whether this is because of a Windows update, RoonServer bringing the machine down, or something else, I don’t know.

The problem is that every time this has happened, the Roon library has been corrupted.
I get screens like the following -
image

One night, as an experiment, I set Windows to shut down at a specific time and the Core machine to start at a set time the next morning. The library was also corrupted then.

It seems like if the Core machine is shut down while RoonServer is still running then there is corruption.

Dropbox link created for Roon logs sent in a PM to @noris.

An addendum -

I restored from an August 1st backup. NFG. Still the same problem.

I deleted all my Roon folders and downloaded a fresh copy of RoonServer.
Restoring from the same August 1st backup worked this time.

WTF?

Hi @xxx,

I took a look through the logs and I’m seeing traces of corruption from last month, 7/9/20, so I don’t believe this database has been in a stable state for a while.

Re-occurring corruption like this sounds like you could be having an issue with the hardware. Have you tried performing a disk check and checking the RAM using Memtest86? Have you tried using a different Core to rule this machine out altogether?

Do the Windows Event Viewer logs have any more context regarding the reboot?
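
For anyone who would rather check this from a script than click through Event Viewer, here is a minimal Python sketch. It assumes the standard Windows `wevtutil` tool and the System log event IDs usually tied to restarts (1074, 6008, 41). The function names are made up for illustration and none of this is Roon-specific.

```python
import subprocess

# Event IDs commonly associated with shutdowns and restarts in the System log:
#   1074 = a process or user requested the shutdown/restart
#   6008 = the previous shutdown was unexpected
#   41   = Kernel-Power: the system rebooted without shutting down cleanly
SHUTDOWN_EVENT_IDS = (1074, 6008, 41)


def build_query_command(event_ids=SHUTDOWN_EVENT_IDS, max_events=20):
    """Build a wevtutil command listing the newest matching System events."""
    id_filter = " or ".join(f"EventID={i}" for i in event_ids)
    xpath = f"*[System[({id_filter})]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",
        f"/c:{max_events}",
        "/rd:true",  # newest events first
        "/f:text",   # human-readable output
    ]


def recent_shutdown_events():
    """Run the query (Windows only) and return the raw text output."""
    result = subprocess.run(
        build_query_command(), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `print(recent_shutdown_events())` on the Core machine should list the most recent matching events. A 1074 entry names the process that asked for the restart, which would help tell a Windows update apart from something else.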

I’ll check…

Ya see, that is my freaking complaint about Backup.

It’s been backing up every day (and then some) without a whimper, but you’re telling me that the library has been corrupted since July 9th.

If you can find that there is corruption then the Backup code should be able to do the same.

I’ve done a lot of curating in the past month. :rage:

@danny, the Roon Backup logic is unacceptable, especially for what seems to be a fragile database.

I understand your frustration here @xxx.

I know dealing with corruption is painful, especially re-occurring corruption, but when we have investigated similar behavior, especially after starting with a fresh database, we have often found hardware issues on the affected users’ machines, whether RAM, HDD, or other environmental factors (improper shutdowns, power surges, etc.).

There is a feature request regarding this aspect and I have actively mentioned this feature request to the QA and Technical team in our meetings:

This is not to say that this feature will be implemented, as I cannot make such decisions, but this is to let you know that we’re taking your feedback here seriously and that the team is aware of your request for better backup checking.

Well, there are a couple of maintenance things I would love to have

  1. A database integrity check that is either manual or automatic
  2. Details on the library page showing what files were deleted, or associated with removed storage
    locations. To review before cleaning the library.

Am I to understand that if a WIN10 machine that is running Roon or RoonServer is shut down, whether because of a restart for a WIN10 update or because there was a shutdown scheduled in Task Scheduler, this is an improper shutdown that can result in corruption of the Roon library?

I might add that there is no way to have a timed, and therefore proper, shutdown of Roon or RoonServer.

Also, why is this thread unlisted? Other people might want to be aware of the same thing.

When you unlist a thread like that it seems like there is something to hide.


If the shutdown is a normal, proper shutdown where all the apps close before the PC is rebooted, there shouldn’t be any issues with Roon exiting.

But if you press and hold the power button, or there’s a power surge and the Core loses power while Roon is in the middle of writing database info, corruption can occur.
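
To illustrate the mid-write failure mode in general terms (this is not a description of how Roon stores its database), here is a small Python sketch of the common write-to-temp-then-rename pattern that many programs use so that an interrupted write leaves the old file intact. The function name `atomic_write` is hypothetical.

```python
import os
import tempfile


def atomic_write(path, data: bytes):
    """Replace `path` with `data` so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory so the final rename
    # stays on the same volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # swap the new file in with a single rename
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If power is lost before the rename, the original file is untouched and only a stray temp file is left behind. This narrows the window for a half-written file; it does not protect against a failing disk or bad RAM.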

You posted to #support, and here we help users resolve issues. We unlist threads when we want to work one-on-one with the user to resolve an issue and not have others jump in with unrelated comments.

You have an issue here of re-occurring corruption and I want to help you get to the bottom of it. My initial suggestion of checking the hardware aspects first and foremost still stands, especially if this occurred on a fresh database.

If this is just about providing feedback about our backup process, you can certainly do so in #roon:feature-requests or #roon as those sections are meant for a discussion.

OK, accepted.

Before I go about restoring my library as of July 9th, I have a final question that I need clarity on.

If I schedule my Core machine to shut down thru Task Scheduler, is this considered a ‘normal’ shutdown, since RoonServer will still be running when WIN10 shuts down?

One other thing -

A robust database design won’t have that problem or will be able to recover from it, but so be it.

How so? It backs up with incremental changes. If the database goes corrupt for some reason, the backup cannot know that, and it will continue to back it up. Luckily it is incremental, so you should have a previous backup state that works from before it got corrupt.

If your backups only go back N days, but your corruption happened N+1 days ago, you probably need to change N or notice earlier that it went corrupt.
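
The N versus N+1 point can be made concrete with a short Python sketch. The corruption date of 9 July 2020 comes from the logs mentioned earlier in this thread; the list of backup dates is made up for illustration, and `latest_clean_backup` is a hypothetical helper, not a Roon feature.

```python
from datetime import date


def latest_clean_backup(backup_dates, corruption_date):
    """Return the newest backup taken strictly before corruption_date, or None."""
    clean = [d for d in backup_dates if d < corruption_date]
    return max(clean) if clean else None


# Illustrative backup dates (assumed); corruption traced to 9 July 2020.
backups = [date(2020, 7, 8), date(2020, 7, 20), date(2020, 8, 1)]
print(latest_clean_backup(backups, date(2020, 7, 9)))  # 2020-07-08
```

If retention had already dropped the 8 July snapshot, the function would return `None`: every remaining backup would postdate the corruption.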

As for the fragility, it is quite uncommon for the database to go corrupt. We are talking about less than 0.001% of our users. You will hear about them all on the community site, but that doesn’t mean the problem is widespread or that the database is corrupting for everyone.

If support can scan the logs and find corruption, then the backup logic can scan the logs, but this point has already been made in the Feature Request referenced above.

I’m not asking that the corruption be fixed, just that it be recognized and reported in a timely manner.
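
As a rough illustration of what such a check might look like, here is a Python sketch that scans log lines for a marker string and reports the first hit. The marker `corrupt` and the sample lines are assumptions for illustration; the actual wording in Roon's logs is not shown in this thread.

```python
def first_corruption_line(log_lines, marker="corrupt"):
    """Return (line_number, text) of the first line mentioning marker, or None."""
    needle = marker.lower()
    for number, line in enumerate(log_lines, start=1):
        if needle in line.lower():
            return number, line.rstrip("\n")
    return None


# Invented sample lines, not real Roon log output.
sample = [
    "08/01 02:00:01 Info: backup started",
    "08/01 02:00:05 Warn: database block is CORRUPT",
    "08/01 02:00:09 Info: backup finished",
]
print(first_corruption_line(sample))
# (2, '08/01 02:00:05 Warn: database block is CORRUPT')
```

A backup job could run a check like this first and put a warning in front of the user on the first hit, rather than a month later.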

This is true, but according to @noris, after looking at the logs the corruption happened on July 9th and yet my apparent troubles didn’t start until this week. I did a lot of work over the past month and now, when I restore as of July 8th, all the work is gone.

Fool me once shame on you, fool me twice …

Therefore, my current backup scheme -

Doesn’t really solve much. It probably will save me from completely losing my curations, but when existing library corruption doesn’t show up for a month, then not so much.

Still would like an answer to this -

How do we notice? If there has been no reason to restore for longer than N then we are affected by the corruption with no redress.


The short and simple version of my complaint.

I’ve asked @mike to make sure this gets into the core: if we detect corruption, we should not try to recover and continue. I’d argue we should crash/shut down and prevent backups from backing up corrupt stuff.
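
The idea of refusing to back up a database that fails a health check could be sketched like this in Python. Everything here is hypothetical: `is_healthy` stands in for whatever integrity check the Core would run, and a plain directory copy stands in for Roon's real incremental backup.

```python
import shutil


class CorruptDatabaseError(RuntimeError):
    """Raised when the source fails its health check, so no backup is written."""


def backup_if_healthy(source_dir, dest_dir, is_healthy):
    """Copy source_dir to dest_dir only if is_healthy(source_dir) returns True."""
    if not is_healthy(source_dir):
        # Fail loudly instead of quietly copying a damaged database.
        raise CorruptDatabaseError(f"refusing to back up {source_dir}")
    # copytree requires that dest_dir does not exist yet and returns its path.
    return shutil.copytree(source_dir, dest_dir)
```

The design choice is that a failed check stops the backup outright, so older good snapshots are never followed by copies of a corrupt state without anyone being told.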


Thank you for the consideration on this matter.

Peace.
