Database Corruption After Unclean Shutdown Causing Backup Failures (ref#WP76NR)

Tell us what's going on

· Database Corruption After Unclean Shutdown — Backup Fails with VacuumOperation Error
Summary
After an unclean system reboot, my Roon Server database became corrupted. Roon functions normally (playback, browsing, library shows 93,319 tracks), but attempting a backup crashes with DatabaseCorrupt. I've exhausted all DIY repair options and need Roon's internal tools to fix the remaining corrupt records.

Account Details

User ID:
Machine ID:
Core Name:

Roon Version: v2.62 (build 1641) production on linuxx64
Platform: Docker on Unraid, BTRFS cache drive (NVMe)
What Happened
Unclean system reboot caused a BTRFS checksum failure on one LevelDB SST file (3236066.ldb) in broker_4.db
This file was unrecoverable (I/O error at filesystem level, confirmed by BTRFS scrub: 1 uncorrectable error)
I repaired broker_4.db by copying all 62 million readable key-value pairs to a fresh LevelDB (zero LevelDB-level errors)
Normal Roon operations work perfectly - 93K tracks, all storage online, playback and browsing fine
Backup fails — VacuumOperation.Run() → Database.Validate() hits corrupt FSE-serialized records and throws InvalidOperationException
Exact Error Stack Trace
[broker/database] corruption detected: Operation is not valid due to the current state of the object.
at FSE.BinaryReader.Read()
at FSE.BinaryReader.ReadUntilMarker()
at Messaging.FSEMessageDecoder.DecodeOneMessage(…)
at Messaging.FSEMessageDecoder.DecodeCommon(…)
at Messaging.FSEMessageDecodeSession.DecodeBinaryMessage(ByteBuffer s)
at Sooloos.Broker.Music.MusicDatabase.FSEDecodeWrap[T](Func`1 cb)
at Sooloos.Broker.Music.MusicDatabase._TryGetValue[T](Object traceid, Byte[] key, T& msg)
at Sooloos.Broker.Music.MusicDatabase.TransactionCache`3.TryGetValue(Sooid id, METADATA& msg, Boolean cache)
at Sooloos.Broker.Music.Library.NeedsUpdate(Sooid metadatasooid, Int64 contenthash, Nullable`1 type)
at Sooloos.Broker.Music.UpdateMetadata._UpdateMetadata
What I’ve Already Tried
LevelDB repair (plyvel.repair_db) - fixed structural issues but not FSE-level corruption
Deep-cleaned broker_4.db (copied all 62M keys to brand new DB, compacted, zero errors)
Deleted Cache directory
Deleted Orbit database (was showing Invalid Type: 10 on init)
Deleted all transport zone databases
Applied chattr +C to disable BTRFS CoW
Ran BTRFS scrub (clean except the one unrecoverable file, now deleted)
Excluded full disks (disk1, disk2) from Backup share to fix “No space left on device”
Docker image is latest available
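
For readers who want to try the same deep-clean, here is a minimal Python sketch of the copy step. It is an illustration under stated assumptions, not the exact script used: the function name `copy_readable_pairs` and the paths in the usage comment are hypothetical, and the function is written against any iterable of key-value pairs so it can be tried without a real database. How LevelDB behaves when the iterator reaches a damaged block depends on the database state, so treat the result as "whatever could be read".

```python
from typing import Iterable, Protocol, Tuple


class KVWriter(Protocol):
    """Minimal write interface; a plyvel.DB exposes put(key, value)."""

    def put(self, key: bytes, value: bytes) -> None: ...


def copy_readable_pairs(source: Iterable[Tuple[bytes, bytes]], dest: KVWriter) -> int:
    """Copy every (key, value) pair the source yields into dest; return the count."""
    copied = 0
    for key, value in source:
        dest.put(key, value)
        copied += 1
    return copied


# Hypothetical usage with plyvel (placeholder paths; stop Roon Server first and
# work on a copy of the database directory, never the live one):
#
#   import plyvel
#   src = plyvel.DB("/path/to/broker_4.db")
#   dst = plyvel.DB("/path/to/broker_4.fresh.db", create_if_missing=True)
#   print(copy_readable_pairs(src.iterator(), dst), "pairs copied")
#   dst.compact_range()
#   src.close()
#   dst.close()
```

For tens of millions of keys, writing through `dst.write_batch()` in chunks would likely be faster than individual `put()` calls; the loop above favours clarity over speed.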
Root Cause Analysis
The lost SST file contained FSE-serialized music metadata records. After the deep-clean repair, the LevelDB structure is perfect, but some remaining records contain FSE binary data that references objects that were in the lost file. When MusicDatabase._TryGetValue tries to deserialize these records via FSEMessageDecodeSession.DecodeBinaryMessage, FSE.BinaryReader.Read() hits invalid data and throws InvalidOperationException.

The corruption is in the FSE-encoded values within broker_4.db, not in the LevelDB structure. Only Roon’s internal tools can identify and remove/fix these specific records.
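
To illustrate why a record can be perfectly valid at the LevelDB level and still fail to decode, here is a toy tagged-binary decoder in Python. The format below is invented for this example and is NOT Roon's FSE format; the tag values and the function name `decode_fields` are assumptions. It only shows the general failure mode: one unexpected type byte inside an otherwise intact value is enough to make the reader throw.

```python
import struct

# Toy type tags, invented for this illustration (not Roon's FSE DataType values).
TAG_INT = 0x01    # followed by a 4-byte big-endian signed integer
TAG_BYTES = 0x02  # followed by a 1-byte length, then that many bytes


def decode_fields(buf: bytes) -> list:
    """Decode a sequence of tagged fields; raise ValueError on a bad tag or truncation."""
    fields = []
    pos = 0
    while pos < len(buf):
        tag = buf[pos]
        pos += 1
        if tag == TAG_INT:
            if pos + 4 > len(buf):
                raise ValueError(f"truncated int at offset {pos}")
            fields.append(struct.unpack_from(">i", buf, pos)[0])
            pos += 4
        elif tag == TAG_BYTES:
            if pos >= len(buf):
                raise ValueError(f"missing length at offset {pos}")
            length = buf[pos]
            pos += 1
            if pos + length > len(buf):
                raise ValueError(f"truncated bytes at offset {pos}")
            fields.append(buf[pos:pos + length])
            pos += length
        else:
            raise ValueError(f"unknown type tag 0x{tag:02x} at offset {pos - 1}")
    return fields


good = bytes([0x01, 0, 0, 0, 7, 0x02, 2]) + b"hi"
print(decode_fields(good))  # [7, b'hi']

corrupt = bytearray(good)
corrupt[5] = 0x54  # overwrite the second field's tag with an invalid value
try:
    decode_fields(bytes(corrupt))
except ValueError as exc:
    print(exc)  # unknown type tag 0x54 at offset 5
```

The key-value store happily returns the corrupt value because its own checksums and structure are fine; only the application-level decoder notices.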

What I Need
Can Roon support use internal diagnostic tools to:

Identify the specific corrupt FSE records in my database
Remove or repair them so backup/vacuum can complete
I have the database available and can provide remote access or upload it if needed.

Backup Status
Latest Roon backup: October 19, 2025 (5 months old - not acceptable to restore)
I have a filesystem-level backup of the current database state
All music files are intact (93K+ tracks on array)
Log Files
Logs are available at /data/RoonServer/Logs/ inside the container. The relevant entries are in RoonServer_log.02.txt and the current RoonServer_log.txt. I can enable diagnostics if needed.

Tell us about your home network

· Asus AX88U, static IP

Restore a backup from before the unclean shutdown?

Update: Fixed it myself by reverse-engineering the FSE binary format

For anyone who hits this in the future, here’s what worked:

  1. The original problem was a BTRFS checksum failure on one LevelDB SST file (3236066.ldb) inside broker_4.db after an unclean shutdown. The file was unrecoverable at the filesystem level.
  2. I copied all readable key-value pairs from the damaged broker_4.db into a fresh LevelDB database using Python/plyvel. This fixed the LevelDB structural corruption but the backup still crashed with the VacuumOperation/FSE.BinaryReader error.
  3. I decompiled Messaging.dll and Roon.Broker.Core.dll using ilspycmd to understand the exact FSE binary serialization format that Roon uses internally.
  4. I wrote a full recursive FSE parser in Python that validates every record the same way Roon’s VacuumOperation.Run() does, checking document version, DataType bytes, field names, Sooid internal datatypes, and nested element structures.
  5. The scanner found exactly ONE corrupt record out of 62 million: an album metadata entry (MW0000205539) with an invalid DataType byte (0x54) at position 524 in its FSE-encoded value. This single 7KB record was crashing the entire backup system.
  6. Deleted that one record using plyvel. Roon re-fetches album metadata from its cloud service so nothing was permanently lost.
  7. Backup now completes successfully. 3GB uploaded, zero errors.
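
The scan-then-delete part of the steps above can be sketched as follows. This is a rough outline under stated assumptions, not the actual scanner: `find_undecodable_keys` is a hypothetical name, and the `validate` callable stands in for the FSE parser from step 4, which is not reproduced here. The function only reports; deletion is left as a separate, deliberate step.

```python
from typing import Callable, Iterable, List, Tuple


def find_undecodable_keys(
    pairs: Iterable[Tuple[bytes, bytes]],
    validate: Callable[[bytes], object],
) -> List[Tuple[bytes, str]]:
    """Return (key, error message) for every value that makes validate() raise."""
    bad = []
    for key, value in pairs:
        try:
            validate(value)
        except Exception as exc:  # any decoder failure marks the record as suspect
            bad.append((key, str(exc)))
    return bad


# Hypothetical usage with plyvel (placeholder path; validate_record is whatever
# record validator you have written; keep a filesystem copy before deleting):
#
#   import plyvel
#   db = plyvel.DB("/path/to/broker_4.db")
#   suspects = find_undecodable_keys(db.iterator(), validate_record)
#   for key, error in suspects:
#       print(key, error)      # inspect before touching anything
#   # db.delete(suspects[0][0])  # only once you are sure the record is expendable
#   db.close()
```

Reporting first and deleting second matters here: a validator that is stricter than Roon's own decoder would flag healthy records, so a long list of suspects is more likely a parser bug than real corruption.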

Preventive steps taken:

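  • Applied chattr +C to disable BTRFS copy-on-write on the Roon data directory (CoW is often blamed for database trouble on BTRFS; note the flag only affects files created after it is set, and it also disables data checksums for those files)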
  • Ran BTRFS scrub to confirm no other filesystem corruption

TL;DR: If your database is “corrupt” but Roon works fine for playback, the corruption is likely in a small number of FSE-serialized records, not the whole database. A targeted scan and delete can fix it without losing your library.
