DB consistently corrupts after 1.1 upgrade

Hi,

Last night I upgraded Roon to 1.1 with success. Beautiful version by the way!
I then decided it was a good idea to back up my library, of course after letting all file rescanning to complete. I closed Roon, backed the folder up and called it a night.
This morning I launched Roon and got the message about the DB being corrupt and that I should contact support. About 1 second after that message appeared, Roon crashed.
I tried using the backup I did last night of the 1.1 folder, but got the crash again. I guess I backed up an already corrupt DB.

Luckily I had a backup from the 21st. I installed it, launched Roon and my library was back, and Roon told me that the update is available. I updated, waited for the rescans to complete, then closed Roon and launched again to see if it worked this time. The DB was corrupt again and Roon crashed.
I repeated this experiment with the 8/21 backup one more time, and this time Roon did not crash after loading, but I still had an empty library and also no configured audio devices.

Also, each time I launched after an upgrade, Roon gave the message of “upgrading DB due to software update” before telling me that the DB is corrupt.

It seems that something consistently goes wrong when my DB gets upgraded. Should I send you guys logs or should I just send you my library for you to fix?

Thanks

I posted about this same issue after I did my original update. I ended up with nothing showing in my library and no audio devices. Solution was for me to go back to an old Roon backup. At first I just replaced the database folder. That did not work. So I replaced the full Roon Library folder. Then when I reopened Roon it went through the database upgrade procedure which take around an hour. So I recommend replacing the whole Roon library folder and see if that gets you back in business. All library info, playlists and audio devices all then OK. All OK for the past day since I did this. Be sure to keep a good backup copy of your full Roon library folder. I use a Mac Mini running 10.9.5 for my core. I suspect they’ll send you a PM to help as they did for me. good luck.

Hi @PRE, thank you for replying. When I went back to my old backup I did replace all folders and not just the database folder. I did that twice, but still after everything gets upgraded again and is rescanned, once I shut Roon down it won’t start up again.

@mike, here are a few log traces from the first time Roon crashed while starting up this morning, that I found suspicious. Maybe this will give you a clue about what’s happening:

08/28 11:10:23 Warn: could not find profile from 1 profiles in account

08/28 11:10:34 Info: Is Upgrading @ at System.Environment.GetStackTrace(Exception e, Boolean needFileInfo)
at System.Environment.get_StackTrace()
at Sooloos.Broker.Music.Library.OnUpgrading()
at Sooloos.Broker.Music.LibraryMutationEnv.Finish()
at Sooloos.Broker.Music.Library.EndMutation()
at Sooloos.Broker.Music.Module.ev_exit()

08/28 11:10:34 Info: [loadstatus] IsLibraryUpgradingDatabase False => True
08/28 11:10:35 Trace: [clumping] clumping 183 tracks
08/28 11:10:41 Trace: [library/compute] computing 27 tracks
08/28 11:10:41 Critical: Library.EndMutation: System.InvalidOperationException: Album 2018863 not found
at Sooloos.Broker.Music.Library.ComputeEnv.GetAlbumById(Int64 id)
at Sooloos.Broker.Music.Library.ComputeTrack.get_Album()
at Sooloos.Broker.Music.Library.b__93(ComputeTrack x)
at System.Linq.Enumerable.Any[TSource](IEnumerable1 source, Func2 predicate)
at Sooloos.Broker.Music.Library.PerformWorkPartMunge(ComputeEnv env, List1 album_tracks, IDictionary2 load_tracklitedata)
at Sooloos.Broker.Music.Library.Compute(IList1 tracks, IDictionary2 load_tracklitedata)
at Sooloos.Broker.Music.LibraryMutationEnv.Finish()
at Sooloos.Broker.Music.Library.EndMutation()
08/28 11:10:41 Trace: [library] endmutation in 16105ms

08/28 11:10:42 Critical: while dispatching events: System.InvalidOperationException: Operation is not valid due to the current state of the object.
at Base.Result`1.get_Value()
at Sooloos.Client.Models.AlbumFocus.<>c__DisplayClass13.<restore_from_bookmark>b__12(Result`1 res)
at Sooloos.Broker.Music.LibraryApi.<>c__DisplayClass29f`1.b__29d()
at Sooloos.Concurrency.Version.DispatchPostHandlers(Object o)
at Sooloos.Concurrency.SharedState`1.FlushEvents()

08/28 11:10:42 Error:
Local Time: 08/28/2015 11:10:42 +03:00
Device Serial Number: 11DF89AE-C5E6-40DA-A0EC-D7D6E7849AEF
User Id: 1b8018e8-d3d9-4c91-9b6b-814689f7b689
Sooloos Version: 1.1 (build 51) stable
OS Version: Microsoft Windows NT 10.0.10240.0

Application Domain: Roon.exe
Assembly Codebase: file:///C:/Users/Lior/AppData/Local/Roon/Application/Roon.exe
Assembly Full Name: Roon, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null

CPU0 Description: Intel64 Family 6 Model 23 Stepping 6 3000 MHz (64-bit)
CPU0 Num Cores: Physical: 2 Logical: 2

GPU0 Description: NVIDIA GeForce 9500 GT 512 MB
GPU0 Video Mode: 1920 x 1200 x 4294967296 colors
GPU0 Driver Version: 9.18.13.4174

SCREEN0 Device Name: \.\DISPLAY1 (primary)
SCREEN0 Bounds: {X=0,Y=0,Width=1920,Height=1200}

Exception Source: Base
Exception Type: System.InvalidOperationException
Exception Target Site: Result`1.get_Value
Exception Message: Operation is not valid due to the current state of the object.
Exception Data: none

–[ Stack Trace ]------------
Base.Result`1.get_Value()
Base.DLL, IL 13, N 48
Sooloos.Client.Models.AlbumFocus/<>c__DisplayClass13.<restore_from_bookmark>b__12(Result`1 res)
Roon.Client.BrooModels.DLL, IL 0, N 31
Sooloos.Broker.Music.LibraryApi/<>c__DisplayClass29f`1.b__29d()
Roon.Broker.Core.DLL, IL 0, N 67
Sooloos.Concurrency.Version.DispatchPostHandlers(Object o)
Roon.Broker.Concurrency.DLL, IL 56, N 123
Sooloos.Concurrency.SharedState`1.FlushEvents()
Roon.Broker.Concurrency.DLL, IL 84, N 223
Sooloos.Broker.State.BeginAccess(List`1 out_mutated_objects)
Roon.Broker.Core.DLL, IL 38, N 83
Sooloos.Broker.State.EndAccess()
Roon.Broker.Core.DLL, IL 37, N 91
Sooloos.Broker.ApiRoot.EndAccess()
Roon.Broker.Core.DLL, IL 0, N 10
Broo.Engine.DeviceTarget.GuiMainLoop()
BrooEngine.DLL, IL 426, N 1257
Broo.Engine.Engine.GuiMainLoop()
BrooEngine.DLL, IL 0, N 10
Sooloos.Application.Main(String[] argv)
Roon.exe, IL 1531, N 2847

That last trace is the last line of the log file.
I’m also wondering why Roon would “clump” and “compute” anything if no changes were made to the library since the last time it was shut down?

Thanks

This is what I get after getting the “couldn’t load your library” message.

Anyone?

Another log snippet from now:

08/29 11:28:47 Info: Is Upgrading @ at System.Environment.GetStackTrace(Exception e, Boolean needFileInfo)
at System.Environment.get_StackTrace()
at Sooloos.Broker.Music.Library.OnUpgrading()
at Sooloos.Broker.Music.LibraryMutationEnv.Finish()
at Sooloos.Broker.Music.Library.EndMutation()
at Sooloos.Broker.Music.Module.ev_exit()
at System.Action.Invoke()
at Sooloos.SynchronizationContextThread.OnExit()
at Sooloos.SynchronizationContextThread._Dispatch(SendOrPostWrapper& ret)
at Sooloos.SynchronizationContextThread._Go()
at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()
08/29 11:28:47 Info: [loadstatus] IsLibraryUpgradingDatabase False => True
08/29 11:28:48 Debug: server said:
08/29 11:28:48 Trace: [clumping] clumping 183 tracks
08/29 11:28:54 Trace: [library/compute] computing 27 tracks
08/29 11:28:55 Critical: Library.EndMutation: System.InvalidOperationException: Album 2016303 not found
at Sooloos.Broker.Music.Library.ComputeEnv.GetAlbumById(Int64 id)
at Sooloos.Broker.Music.Library.ComputeTrack.get_Album()
at Sooloos.Broker.Music.Library.b__93(ComputeTrack x)
at System.Linq.Enumerable.Any[TSource](IEnumerable`1 source, Func`2 predicate)
at Sooloos.Broker.Music.Library.PerformWorkPartMunge(ComputeEnv env, List`1 album_tracks, IDictionary`2 load_tracklitedata)
at Sooloos.Broker.Music.Library.Compute(IList`1 tracks, IDictionary`2 load_tracklitedata)
at Sooloos.Broker.Music.LibraryMutationEnv.Finish()
at Sooloos.Broker.Music.Library.EndMutation()
08/29 11:28:55 Trace: [library] endmutation in 19349ms

It still tries to clump 183 tracks and compute 27. But now it fails on a different album ID.
I don’t know if it’s the same album or not because I don’t know if these album IDs are global or local.

Hey @Chikolad – sorry for the trouble here. I know we’ve looked at the corruption before, but obviously something is still wrong here.

Can you zip up the logs folder, send it over, and let me know when it’s uploaded? I only need the logs for now, not the complete database.

We’ll look into the logs and hopefully then we can get this fixed once and for all. Appreciate your patience here @Chikolad. Talk to you soon!

I managed to “fix” the issue temporarily and to create a v1.1 backup folder that loads successfully. However, when I shutdown and restart, it fails again.

I took several steps to reach this stable state, so I don’t know which one(s) worked. This is after upgrading from a 1.0 folder, before shutting down:

  • First I went to Settings->Setup and saw that there were ~100 files still being audio-analyzed but it was stuck on the first one. I changed the speed setting to “fast” and it quickly processed all of them. I then turned background analyzing off completely so that it won’t try to analyze anything on startup.
  • I did a library cleanup of deleted files from the settings menu.
  • I went through all albums and grouped together albums that were spliced to 2 or more parts. This happens to me from time to time. I figured maybe if Roon had to “clump” together less tracks on startup maybe it wouldn’t fail.

I closed Roon and copied the folder in case it worked, so that I have a stable copy, which I now have.
When I start Roon from that copy (I have done this 3 times now and it’s always the same), I get the “updating due to software upgrade” message and then loading finishes. When I click on the turning circle in the upper right corner I see that there are 824 tracks left to rescan, and then that number drops to 808.

When I go to setting and turn “Background Audio Analysis Speed” back on, there are 389 tracks left to analyze.

Also, each time I start from this stable copy, a different track shows up as the currently playing one (i.e. the last track that was played before the shutdown). On the screenshot above it says “Funky Bounce” by “Zapp”. I never played that one. Obviously something is wrong with the track IDs if I get a different one each time.

Finally, each time I start from this copy I get the same critical trace in the log:

08/29 19:10:20 Trace: [clumping] clumping 1505 tracks
08/29 19:10:20 Trace: [library/compute] computing 336 tracks
08/29 19:10:21 Critical: Library.EndMutation: System.InvalidOperationException: Album 2012719 not found
at Sooloos.Broker.Music.Library.ComputeEnv.GetAlbumById(Int64 id)
at Sooloos.Broker.Music.Library.ComputeTrack.get_Album()
at Sooloos.Broker.Music.Library.b__93(ComputeTrack x)
at System.Linq.Enumerable.Any[TSource](IEnumerable1 source, Func2 predicate)
at Sooloos.Broker.Music.Library.PerformWorkPartMunge(ComputeEnv env, List1 album_tracks, IDictionary2 load_tracklitedata)
at Sooloos.Broker.Music.Library.Compute(IList1 tracks, IDictionary2 load_tracklitedata)
at Sooloos.Broker.Music.LibraryMutationEnv.Finish()
at Sooloos.Broker.Music.Library.EndMutation()
08/29 19:10:21 Trace: [library] endmutation in 1421ms
08/29 19:10:21 Trace: [library] endmutation in 61ms

It’s always the same album ID: 2012719

So now I have a copy of the folder that consistently fails with the same album, and I believe that’s what causes Roon to not load once I shutdown.

Any way to know which album is 2012719?

Hey @mike.
Sure, I’ll send the logs over. Where should I send it to?

I’ll send you 2 zips. One with logs from a folder that doesn’t load at all, and one from this stable copy I created.
Also, look at my findings above that I wrote before seeing your reply.

Thanks!

@mike, I emailed the zipped log files to contact@roonlabs.com

Thanks again!

I’ve got them @Chikolad – will take a look today and follow up. Sorry for the trouble!

Thanks @mike, I’m curious to find out what’s causing this problem.

Hey @mike, any findings?

Hey @Chikolad – I’m discussing this with the team, but like I mentioned a few weeks ago, this is really starting to look like a bad hard drive, or some other hardware issue.

Since May, we’ve only seen 2-3 other users experience this issue more than once, and one of them was ultimately traced to a failing drive. I’m happy to look at your database for the 5th or 6th time, but since we suspected this was hardware-related a few rounds ago, my recommendation again would be to check your hard drive and consider replacing it. Then, try running from one of your earlier backups/repaired databases again.

Like I said, I’m happy to look into the database again, – feel free to upload the Roon folder somewhere, PM me the link, and I will have it looked at as soon as possible.

I know you’ve had a number of issues here, and I really do apologize for the trouble @Chikolad – it’s definitely not what we want or expect. Because it’s so outside the norm, I just want to be clear what the signs point to, as opposed to wasting your time waiting for another repaired DB, that could go bad again once it’s running on your system.

Let me know if that makes sense @Chikolad. Thanks!

Hi @mike, thanks for your reply.
Yes, that makes sense to me and I will take the steps to check my hard drive(s). I was thinking it was about time to get a new desktop anyway, so I might just do that. However, I do think that you guys should put an effort to find out what kind of error that is and whether you can protect yourself from it. I’m not the only one who experienced this and I wonder if all others have failing disks too. If it’s a random disk failure error, why is it that every time I upgrade the db from 1.0 to 1.1 I get the same error consistently? The only way that makes sense to me is if the 1.0 db already has a corruption that doesn’t keep it from functioning well. I’m not saying that you’re wrong in your assumption, I just think it’s worth examining closely.

Any suggestions about how to find out if my HDs are failing?
I ran chkdsk /R (thorough test) on my system drive and these were the results (Not sure if OK or not):

Checking file system on C:
The type of the file system is NTFS.

A disk check has been scheduled.
Windows will now check the disk.

Stage 1: Examining basic file system structure …
272384 file records processed. File verification completed.
7114 large file records processed. 0 bad file records processed.
Stage 2: Examining file name linkage …
406120 index entries processed. Index verification completed.
0 unindexed files scanned. 0 unindexed files recovered to lost and found.
Stage 3: Examining security descriptors …
Cleaning up 660 unused index entries from index $SII of file 0x9.
Cleaning up 660 unused index entries from index $SDH of file 0x9.
Cleaning up 660 unused security descriptors.
Security descriptor verification completed.
66869 data files processed. CHKDSK is verifying Usn Journal…
40569600 USN bytes processed. Usn Journal verification completed.

Stage 4: Looking for bad clusters in user file data …
272368 files processed. File data verification completed.

Stage 5: Looking for bad, free clusters …
42840808 free clusters processed. Free space verification is complete.
CHKDSK discovered free space marked as allocated in the volume bitmap.

Windows has made corrections to the file system.
No further action is required.

312567807 KB total disk space.
140718724 KB in 168987 files.
96704 KB in 66870 indexes.
0 KB in bad sectors.
389143 KB in use by the system.
65536 KB occupied by the log file.
171363236 KB available on disk.

  4096 bytes in each allocation unit.

78141951 total allocation units on disk.
42840809 allocation units available on disk.

Internal Info:
00 28 04 00 56 99 03 00 db a7 06 00 00 00 00 00 .(…V…
d7 3f 00 00 2f 00 00 00 00 00 00 00 00 00 00 00 .?../…

Windows has finished checking your disk.

I also ran the Western Digital Data LifeGuard Diagnostics tool and the shallow tests show that the SMART status for all drive is OK.
I’m now running the extended tests for my system drive and for the drive where all my music is. These take a while and I’ll report here if I see any problems.

I would be happy to receive other suggestions about how to check if my drives are the problem for Roon.
Thanks

@mike, I just read your post in another thread about “latent 1.0 issues” and as I wrote above I believe this is what happened with my 1.1 upgrade.
While I try to check for problems with my hard drives, and until I get a new desktop, I think I will try installing a clean 1.1 and build my library from scratch. I will then see if it stays stable.
Would still love suggestions on how to test my hard drives.

Thanks again.

Sorry for the slow response here @Chikolad – there are a lot of ways to test this, but in the past, I’ve used Windows Event Viewer to look for issues. I was able to find some drive errors under Windows Logs > System, so maybe start there and let me know what you’re seeing ok?

I’ll discuss with the team and try to get some more feedback here as well. Thanks @Chikolad!

Thanks @mike, I looked in the Event Viewer now and didn’t find any errors or warnings that seemed relevant to HDD or file errors. The HDDs also passed all exhaustive tests I ran on them so far.

But now that I have installed a clean 1.1 the problem hasn’t recurred yet. Roon did crash once while idle with a Windows message “The application has stopped”, and when I looked at the logs I found some NullReferenceException(s) near the crash. The trace level was debug so I’m not sure if that’s an “allowed” exception or not. I can look for it if you want.

Cheers