Request for Database Table Size to Optimize Roon Performance (ref#NOAX9S)

Full form submission

What’s happening? Something else
How can we help? None of the above

Other options:

Due to the slow nature of Roon, I'd like to know the database table size so I can optimise database reads and writes on disk. I run ZFS, and one of its features allows me to align the record size to the table size on a per-dataset basis.
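For context, the adjustment I have in mind is just ZFS's per-dataset record size; a minimal sketch, where the pool and dataset names are illustrative and the 4K value is the assumption I'm trying to confirm:

```
# Hypothetical ZFS-side adjustment; pool/dataset names are illustrative,
# and 4K assumes the database's block size (the thing I'm asking about).
zfs create -o recordsize=4K tank/roon-db   # new dataset sized for small I/O
zfs get recordsize tank/roon-db            # confirm the property took effect
```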

Where did you get the idea that you can access the Roon database internals?

I didn’t say I could; not sure what you mean. I’m just after the database table size. I vaguely remember Roon uses some unusual database from Google or something. I never did find out the table size; databases typically have one, and if we know it, I can optimise my disk to align to it. This is a pretty everyday performance optimisation that yields a far bigger improvement than simply adding RAM or CPU (which I have plenty of anyway). This is because of problems such as write amplification, where misalignment can make your disk write or read far more data than is actually being requested. This is a standard performance-tuning query.

It’s LevelDB. Does that even have tables that have sizes?

ZFS is a file system; I think you may be confusing performance optimisations that are realistically attainable across different domains. They are linked only through the storage of data. I suspect Roon loads most of its own DB into RAM on start and that writes are minimal, occurring only with library updates. I doubt any ZFS optimisations you plan to do will make any difference to Roon dialogue and UI performance, if that is what you are hoping for. In short, I don’t think ZFS is where you are going to get the performance gains you want. Let me know if I misunderstood the premise.

Nope, I’m not confusing it at all. I’m in the tech industry, and I know for a fact that optimising to the table size is an important performance optimisation. It’s actually bigger than that, depending on what your system is: LUN, host OS, guest OS, etc. I have an uncle who is a database guy, now at retirement age, and like so many things in IT, when you understand the tech all the way back to its beginnings, you understand things differently from the younger generation and so have a better grasp of fixing the problem. Most young people just add RAM and CPU. He aligns what ZFS calls the record size to the table size, and he always gets asked how he sped the system up so much when heaps of RAM and CPU had already been thrown at it. He told me it’s always the same thing: align the table sizes to the disk. You can literally get 100x the performance, depending on how much misalignment and amplification there is. I have worked at a large corporate and literally seen all its public-facing infrastructure fall to its knees because of this problem, and yes, I’ve literally seen this done with ZFS as well; it truly does make a difference. How much depends on the various factors mentioned above.

So, for the sake of finding out what the table size is, it’s worth a try, given that Roon is very, very slow, right?

What is true, though, is that I have no idea how LevelDB works yet. I assumed this form would get an official response from Roon saying that its table size is x, or that LevelDB works differently for whatever reason, but perhaps I’m wrong about that?

One of the reasons I wanted an official answer is that table sizes can be changed from the default. I’m not a database guy, so it’s not easy for me to find out. For those following along, though: the default page size for MySQL (InnoDB) is 16 KB, and the default page size for PostgreSQL is 8 KB. Throwing in something extra for you to think about: the default cluster size for a qcow2 file (basically a VM disk file) is 64 KB.
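In case anyone wants to verify those defaults on their own system, these are the usual ways to ask each tool (stock installs assumed):

```
# Check the page/cluster sizes mentioned above; assumes default installs.
mysql -e "SELECT @@innodb_page_size;"          # MySQL/InnoDB: 16384 by default
psql  -c "SHOW block_size;"                    # PostgreSQL: 8192 by default
qemu-img info disk.qcow2 | grep cluster_size   # qcow2: 65536 (64 KiB) by default
```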

Now think about this: I have a disk with a default record size of 128 KB; let’s say I have a qcow2-based virtual machine with a cluster size of 64 KB; then let’s say I have PostgreSQL with an 8 KB page size. I write 8 KB worth of data to the database. How much am I actually writing to disk? What happens if I need to write 16 KB, or worse, several MB? This is where read and write amplification comes in. For simple, small transactions you can end up absolutely hammering your disk, and for no reason too; a few simple database transactions should not hammer disks like that, but this kind of misconfiguration sees it happen all the time.
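To make that arithmetic concrete, here is a worst-case back-of-the-envelope sketch (it deliberately ignores caching and write coalescing, which soften these numbers considerably in practice):

```
# Worst-case amplification through the stack described above, ignoring
# caching/coalescing. Sizes in KiB; purely illustrative arithmetic.
DB_PAGE=8        # PostgreSQL page
QCOW2_CLUSTER=64 # qcow2 cluster
ZFS_RECORD=128   # ZFS record
echo "qcow2 layer: $(( QCOW2_CLUSTER / DB_PAGE ))x"    # 8 KiB write touches a 64 KiB cluster
echo "ZFS layer:   $(( ZFS_RECORD / QCOW2_CLUSTER ))x" # which lands in a 128 KiB record
echo "end to end:  $(( ZFS_RECORD / DB_PAGE ))x"       # 16x the data actually requested
```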

Hopefully that’s interesting reading for someone; I thought I’d just add a little scenario at the end. It’s always fun to discuss this stuff!

Marshalleq

This page seems to allude to the default size being 4 KB. I don’t know if I’m reading it right or not, but if so, that might explain some of the performance issues everyone is having, as it’s quite small.

Mac OS (APFS) has a default block size of 4 KB.
Windows (NTFS) has a default cluster size of 4 KB for most volume sizes; it only grows for very large volumes.
Linux’s most popular file system, ext4, has a default block size of 4 KB, I think.
ZFS is a little confusing, with a default record size of 128 KB, though it’s variable (in a downwards direction). Nevertheless, I do know from performance testing that, despite being variable, providing a match does improve performance, much as on other file systems. (Commands to check these defaults are sketched below.)
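If anyone wants to confirm these on their own machines, the usual commands look something like this (device and pool names are illustrative; the second line is run from a Windows prompt):

```
# Ways to check the defaults listed above; device/pool names are illustrative.
diskutil info / | grep "Block Size"       # macOS/APFS
fsutil fsinfo ntfsinfo C:                 # Windows/NTFS ("Bytes Per Cluster")
tune2fs -l /dev/sda1 | grep "Block size"  # Linux/ext4
zfs get recordsize tank                   # ZFS (128K unless changed)
```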

It wouldn’t surprise me if Roon had overlooked this, because LevelDB, apparently being based on Google’s Bigtable design, is made for a fairly large kind of hosting that is not going to be replicated on a home computer. It’s possible this is causing some of the issues. It seems like ext4 might be quite optimal, though; I am assuming a 4 KB table size, which could be wrong.

I may just create a dataset with a 4 KB record size and run a performance test to see what happens.
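If I do, the experiment would be roughly along these lines; the pool name and paths are made up, and a proper benchmark tool (e.g. fio) would be more trustworthy than a stopwatch:

```
# Rough shape of the experiment; pool/paths are hypothetical.
zfs create -o recordsize=4K tank/roon-test  # dataset aligned to the assumed 4K blocks
rsync -a /roon/database/ /tank/roon-test/   # copy the DB onto the new dataset
# ...point Roon at the copy, then compare search/browse times before and after.
```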

There’s a difference between table size, which is by definition variable (depending on how much data is in that table), the (max) row size of any given table, and page size, or whatever it is that LevelDB uses as its minimum I/O unit. And since LevelDB is not really a tabular/relational database, the applicability of this optimisation is questionable anyway.

Given the confusion, mucking with file system parameters without knowing LevelDB’s and Roon’s access patterns is far more likely to result in everything working worse than before.

2 Likes

Well, that’s a false premise for your theories right off the bat. Some people, under specific circumstances, are having issues; these seem to be structure-based (tags, large track pools, etc.), given yesterday’s release to address them.

2 Likes

This — table size is a variable, not a constant; and

This — not everyone (this person included) is finding Roon to be slow at all.

I suspect the OP might be referring to record sizes of the most common writes and not table sizes, and those, too, are quite variable.

One of the things I find humorous in forums like this is the armchair engineering that assumes an organisation that makes a choice like LevelDB, and that gets performance complaints on occasion, does nothing whatsoever to consider these things in its work and design. I’ve led and been part of a large number of software engineering teams, and nothing could be further from the truth.

6 Likes

Here is the performance monitoring from my Proxmox VM running Roon Core (for the month). I just don’t see disk I/O being the issue you think it is. I refer to my memory-usage comment above: note how little disk goes into the read cycle. The writes are the only big events, and those are library updates I forced over the last few weeks. This is why I think much of the LevelDB store data is held in RAM for faster performance.
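For anyone who wants to watch this live rather than from monthly graphs, something like the following would show whether reads actually hit the pool (the pool name is illustrative):

```
# Watch pool I/O live; if Roon's DB is served from RAM/ARC, the read columns
# should stay near zero outside library updates. Pool name is illustrative.
zpool iostat -v tank 5   # per-vdev read/write ops and bandwidth, every 5 s
arcstat 5                # ARC hit rate, if the arcstat utility is installed
```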

4 Likes

:+1:t3: It is in RAM; if you search for leveldb and @Danny, you will see posts from the CTO about it.

4 Likes

Hello,
A very interesting topic, and thanks for bringing it up, but please clarify the term “table”: what exactly do you mean when you use it?
There may be a kind of temporary table; do you mean that class of tables, or something else?

In other words, when writing “table”, do you mean one of the database tables, or perhaps the unit in which the database file is recorded on disk?

With ZFS this is a very small adjustment to make; it’s not a big deal at all. And you make my point for me: this is why I am asking the question, so that I do know these things. Questions are OK, aren’t they? Trying to get more performance is OK too, isn’t it? Why bash someone trying to make something better with a simple question? It’s not hurting you, is it?

Ok, poor choice of words. It’s meant to be ‘The performance issues that people are having’.

Good info, do you have a performance problem though?

I’m talking about database table size. I think this might be interchangeable with a few different terms, for example database page size, but I’m not really sure what the exact terminology is, not being a database guy. As such, I’ve emailed my uncle to see if he can clarify for me. Along with that, I’ve asked him whether I’ve got it all wrong, just in case, though I’ve seen other people doing this as recently as a week ago. I shall post back what he replies.

Questions are fine. Coming in throwing around a lot of (completely misunderstood) technical jargon, appealing to the authority of “being in the tech industry” and “my uncle optimises (completely different database systems),” and implying that Roon developers do not know what they are doing isn’t, really.

No, it’s not the same, and it is not interchangeable. In any case, with Roon’s access patterns (illustrated above) and ZFS’s capabilities, there should be no need to mess with file-system settings. The default sizes were chosen by the ZFS developers for good reasons, and one needs to be doing something far more intense and esoteric than running a Roon server before any changes there are warranted.

5 Likes

May I ask what your background is to make these kinds of accusations, e.g. that I am completely misunderstanding technical jargon, and to pass judgement on these other points, that it’s not the same, etc.? I’m just asking a question here; other people are raising other points, which is requiring me to defend the fact that I’m asking a question, which is a bit silly. The way this form was presented, I thought it was actually an official Roon support form; now the whole world is jumping on me for asking a question and attempting to answer others’ questions, and denying that it’s even a valid question. I don’t get it.

Somebody above is even calling me an armchair engineer, which is offensive and unhelpful, and you are saying that I don’t understand technology (completely untrue). Why all the hate? It’s just a question, and I’m just attempting to answer your questions, perhaps in a rush, and I need to read them a bit more closely, but I’m doing my best here.

Anyway, I’ve had an initial response from my uncle which I’ll post in a second.

This is a copy and paste with just the names removed. I’ve gone back to him with the sizes of my database (2.x GB for the main DB and 1.x GB for the image cache) along with an explanation of how much metadata the system pulls in. This will help in understanding whether there is an opportunity for LevelDB to have this kind of optimisation or not. I’ve also turned his block-size numbers into a quick sketch after the email.
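(For anyone wanting to pull the same numbers, the sizes came from something like the following; the install path varies by platform, so treat it as illustrative:)

```
# Measure the on-disk size of the Roon database directories; the paths
# below are illustrative and differ between Roon installs.
du -sh /var/roon/RoonServer/Database
du -sh /var/roon/RoonServer/Cache
```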

Hi!

All confusing? Yes! See: Google BigQuery vs. LevelDB Comparison.

LevelDB is not a relational database; it’s a key-value store.
LevelDB is not Google BigQuery (which is relational).

Never heard of it. I’m generally aware of the type (though, strangely, I haven’t worked on one of this kind), but I know the performance aspects in general.

The ‘block size’ of 4 KB = I/O fetch size = page size in most databases.
So you want to be sized so that at least 66% of the time the ‘record’ resides in one block. The size of the record is ‘key size’ plus ‘data payload’ size.
It is compressed, so it probably achieves a reduction of about 10x, i.e. a 100-character string is compressed to 10 characters and stored at that size within a block.

These types of databases are popular for rapid web development, JSON, configuration details, etc.
They are fairly simple databases; performance would depend on the performance of the machine, and more memory probably means faster performance. It’s really a database for a specific use, not a general one. That is a technical statement: as a key store it’s great, i.e. grabbing a pair of strings; as a data warehouse, not so great, since you have to read all the strings, get the pieces you need, aggregate them, etc., which is all hard on I/O. (I don’t see any structures to assist with that.)

I see the block structure has an attached index (i.e. inline with the data). That suggests it tends to scan through the blocks to find data, possibly using a jump-scan across the index pieces: read the index, find the next index offset, jump to that, and so on.
It mentions it keeps like keys on the same or adjacent blocks. It doesn’t specify how it maintains this (if it is a volatile database with many inserts/deletes/updates), but it looks like it maintains the index while transacting; not necessarily slow, but it would cause some hesitations at times when it does the Mexican-shuffle thing.

So the real question is: what are you going to use it for? If it is less than 100 MB, don’t waste time trying to performance-tune it; the hardware will make it so fast that you are unlikely to gain anything from a tune.

Regards
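Me again. Putting his block-size and compression numbers together, a rough fit check looks like this; the key and value sizes are pure guesses on my part, and the ~10x compression figure is from his email:

```
# Does a typical record fit one 4 KiB LevelDB block? Key/value sizes below
# are guesses for illustration; the ~10x compression figure is quoted above.
KEY=64; VALUE=1500
RECORD=$(( KEY + VALUE ))
COMPRESSED=$(( RECORD / 10 ))
echo "record ~${RECORD} B raw, ~${COMPRESSED} B compressed, vs a 4096 B block"
# => many compressed records per block, so per-record alignment may matter little
```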

1 Like