Can Roon's database be loaded from a ramdisk?

evand · April 19, 2015, 4:00pm

Due to the size of my library, and because I want my music interface to be as quick as possible, I have taken to creating a ramdisk, copying my Logitech Media Server database to it, and pointing the server at it on startup. On shutdown of the server, an rsync copy of changed files only is made to a disk drive. This has proven particularly useful for scanning operations and for general browsing of and interaction with my library (instantaneous!). I’m wondering whether it will be possible to do the same with Roon’s database?

ncpl · April 19, 2015, 4:07pm

I also use a ramdisk, so I’m interested in whether this would be viable.

danny · April 19, 2015, 5:14pm

Sure, nothing is stopping you from doing this. However, it’ll only impact your speed on startup, when we load the DBs. @brian can speak more about this, but most of Roon’s performance comes from the fact that we already use RAM-based indexes aggressively to make things fast.

brian · April 19, 2015, 6:57pm

As @danny said, there’s nothing stopping you, but I don’t think it’s worth the trouble.

1. Unless the data is coming over the internet, Roon is usually serving up data out of RAM anyways.
2. Most pauses experienced in the app are CPU-related, not I/O-related. A ramdisk won’t help with these.
3. When Roon is loading data from disk, it’s coming out of a memory-mapped file, which has very favorable OS-level caching characteristics.
4. Ramdisks aren’t durable: you lose data in a power cut. This is probably a bigger deal with Roon than it was with other software in the past.

The first point is the most important one: everything else I’m about to say concerns the less common cases when we’re not serving data out of RAM, because those are the only cases where a ramdisk could theoretically help. Also, this is going to get pretty technical. Don’t say I didn’t warn you.
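As a rough illustration of the memory-mapped file point in the list above, here is a minimal Python sketch of reading a slice of a file through a read-only memory map. It is not Roon's code; the function name `read_slice` and its parameters are made up for this example. The idea is that the OS pages the file in on demand, so a second read of the same region is normally served from the filesystem cache rather than the disk.

```python
import mmap


def read_slice(path, offset, length):
    """Return `length` bytes starting at `offset`, read via a memory map.

    Hypothetical helper for illustration only. The file must not be
    empty, because mmap cannot map a zero-length file.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the map read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Slicing touches only the pages that cover this byte range.
            return mm[offset:offset + length]
```

Calling `read_slice("some.db", 4096, 64)` (with a placeholder file name) would fault in just the page or pages covering those 64 bytes, not the whole file.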

Ramdisks seem tempting because of a bit of a false dichotomy. It seems like the comparison is between “ramdisk” and “ssd” or “magnetic media”, but for read-dominated apps like Roon, the comparison is between “ramdisk” and “filesystem cache”. This is a much less obvious choice, since both are backed by RAM, so you’re really comparing two sets of implementation strategies for RAM-based disk caching, which is pretty murky stuff, even for an expert.

In case you’re unfamiliar with the concept of filesystem caching, here’s the short version: On all modern operating systems, unused RAM is automatically allocated by the kernel as a cache for disk accesses. When an application accesses the disk, it asks the kernel “give me data”. The kernel checks with the filesystem cache to see if that data is already cached in RAM. If it is, it returns the data to the application with no disk access. If not, the kernel talks to the disk, gets the data, caches it in RAM, then returns it to the application. When other applications or disk accesses need RAM, the kernel ejects the least recently used data from the cache. It’s a pretty fair bet that the majority of the RAM on whatever machine you’re using right now is acting as a cache for your disks.
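The hit/miss/evict behavior described above can be sketched as a toy read-through cache in Python. This is only a model of the idea: real kernels use more elaborate approximations than strict least-recently-used eviction, and the class name `PageCache` and the `read_from_disk` callback are assumptions made for this example.

```python
from collections import OrderedDict


class PageCache:
    """Toy read-through LRU cache, loosely modelled on a filesystem cache."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity              # max number of cached pages
        self.read_from_disk = read_from_disk  # slow path: callable(page_id)
        self.pages = OrderedDict()            # ordered oldest -> newest use
        self.hits = 0
        self.misses = 0

    def read(self, page_id):
        if page_id in self.pages:
            # Hit: serve from RAM and mark the page as most recently used.
            self.hits += 1
            self.pages.move_to_end(page_id)
            return self.pages[page_id]
        # Miss: go to the "disk", then keep a copy in the cache.
        self.misses += 1
        data = self.read_from_disk(page_id)
        self.pages[page_id] = data
        if len(self.pages) > self.capacity:
            # Evict the least recently used page to make room.
            self.pages.popitem(last=False)
        return data
```

With a capacity of two pages, reading pages 1, 2, 1, 3 in that order gives one hit and three misses, and page 2 is the one evicted, since page 1 was touched more recently.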

The problem with caches is that they can be “hot”, meaning the data that you need is in RAM right now, or “cold”, meaning that the cache needs to fetch the data from the drive.

On application startup, the cache is cold. This is why @danny mentioned ramdisks helping at startup time–copying the databases into a ramdisk will get the data into RAM, and then the application will read it out of RAM.

However, knowing what you now know about filesystem caching, think about what would happen if you simply read the same data from the disk and discarded it right before starting the application: the filesystem cache would be “warmed up”. The databases would be sitting in RAM anyways, and the app would load as quickly as if the databases were stored on a ramdisk.

This technique is called pagewarming. You’re taking advantage of the fact that sequentially accessing data is faster than accessing it in random order, so you load a bunch of data that you need sequentially from disk and then throw the data away. This has the side effect of warming up the filesystem cache. Then, when you go back to access the same data later using a random access pattern, it’s already in the filesystem cache, so it goes much more quickly. For random-access workloads, the combined time spent pagewarming + accessing the data is usually quite a bit quicker than if you’d simply accessed the data without pagewarming.
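A minimal sketch of pagewarming in Python, assuming the databases are ordinary files under some directory: read each file sequentially in large chunks and discard the bytes. The function name `pagewarm` and the chunk size are choices made for this example, not anything Roon ships, and whether the data stays cached afterwards depends on how much free RAM the machine has.

```python
import os


def pagewarm(path, chunk_size=1024 * 1024):
    """Sequentially read a file, or every file under a directory, and
    discard the data. Returns the total number of bytes read.

    Hypothetical helper: the only goal is the side effect of pulling
    the data into the OS filesystem cache.
    """
    if os.path.isfile(path):
        file_paths = [path]
    else:
        file_paths = [
            os.path.join(root, name)
            for root, _dirs, files in os.walk(path)
            for name in files
        ]

    total = 0
    for file_path in file_paths:
        with open(file_path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)  # sequential read, front to back
                if not chunk:
                    break
                total += len(chunk)         # count the bytes, then drop them
    return total
```

Running something like `pagewarm("/path/to/databases")` (a placeholder path) right before launching the application would play the role of the "read and discard" step described above.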

I’ve seen “hot” filesystem caches backed by spinning media outperform ramdisks on random access reads. This may seem counter-intuitive, but remember: in both cases the data is coming out of RAM without talking to the disk. It’s just a question of how optimized the kernel is in each area. Now think about how busy kernel developers are going to spend their limited attention: are they going to focus on the filesystem cache or on optimizing ramdisk infrastructure? Filesystem caching is the most significant consumer of RAM on most people’s PCs, and ramdisks are a specialty tool used by a tiny fraction of users.

Now, back to Roon. I have an item on my todo list that’s been there for a while: “investigate pagewarming the Roon databases at application startup.” This isn’t motivated by slowness when using the app once it’s up and running (since the startup process warms the caches anyways as a side effect of loading your collection data into RAM), but it will probably make the app start up a bit faster, especially for large collections.

To sum it up:

1. Roon serves most data out of RAM anyways, so ramdisk-style optimization mostly impacts startup time for the app and not the actual user experience once you’re inside.
2. Ramdisks don’t always improve performance for workloads like ours, and sometimes do more harm than good.
3. Roon should be fast enough without having to resort to trickery like this.

rovinggecko · April 19, 2015, 7:44pm

Nice explanation @Brian, nice food for the inner techie.
I guess my conclusion is that, where in the past ramdisks were a differentiator, now with the advent of SSDs and modern operating systems they are less so.

Of course, another angle is whether you keep the hardware running (and thus rarely have to boot up) or start it up for each listening session.

ncpl · April 19, 2015, 9:43pm

I think I followed that…thanks Brian.

evand · April 19, 2015, 10:37pm

Thanks @brian, that’s great to know. Now the geek in me is wondering which DBMS Roon uses?

brian · April 19, 2015, 10:49pm

We don’t use a traditional DBMS. All of our indexing, queries, etc. are custom. We use LevelDB as a low-level key-value store and build up from there.

Akimo · April 25, 2015, 3:08pm

Brian,
Thanks very much for the explanation. I may be over-interpreting, but this sounds like we can optimize performance by maxing out the RAM on the machine to keep as much of our libraries in cache at any given time. Or is that only effective for very small libraries, because the indexes don’t take up enough memory that extra RAM (beyond some useful minimum, 8 GB?) would help?

brian · April 25, 2015, 4:11pm

Akimo, yes, having enough RAM so that the databases can be effectively cached helps performance. Even at 8 GB, you have already surpassed that point for a very large collection.

Akimo · April 25, 2015, 4:56pm

Thank you. That helps spec out what I’m going to put together.

evand · December 9, 2015, 8:00am

Having just re-read this thread I think @brian’s post should be in the FAQ.

brian · December 9, 2015, 8:43am

@mike, take note