System architecture comments: a tale of 3 systems

I’m new to roon, so this is going to sound a bit presumptuous… but I have decades of experience as a system architect on large, mission critical systems, parts of which had near real-time constraints, so maybe this perspective is useful.
I have a very large (> 10,000 album) music collection. I’ve used many music management/playback systems, and only a couple have scaled adequately. Roon has done an excellent job with its database. The last system I used before roon I’ll call S1. I liked it a lot, but hit a scalability wall. The user interface became very sluggish: sometimes it could take several seconds to respond to a control input, even one as simple as “pause playback”. But even at its most sluggish, it would continue to play the current track without glitches or interruption. I could re-index my large collection without interfering with playback in progress (though the user interface response suffered greatly).
So, how about roon? first attempt: did a trial with the following setup:


Core: Dell Inspiron laptop, 64 bit Windows 10 - dedicated to roon.
WINDOWS-HCUU2NA, 64 bit
Intel Core i5-4200U, 1.6 GHz, 6 GB RAM
Music repository: smb mount from pi4 dedicated fileserver
network: switched 100/1000 ethernet, cat 6 cabling and linksys/intel switches
ROPIEEE bridge running on Raspberry Pi 4 Model B Rev 1.1
The ROPIEEE sw version is 3.094, kernel 5.4.83-5-SPCKFSH-v7l+
output: Benchmark DAC3 HGC, usb connected to the pi/ropieee bridge (no HAT/daughter board)
control: android tablet and various


I realized the core was likely underpowered, given my repository size. The fileserver, DAC and network were well tested over a couple of years of previous use, including hours of DSD playback. I expected sluggish performance. In fact this setup was completely unusable: playback often stopped within 20 seconds of initiation, and system activity caused the client to lose connection to the core.

so, second attempt: ROCK on roon labs suggested high end intel NUC10i7FNH with 1 TB M2 SSD:


core: As stated above, NUC hardware, roon optimized core kit, server Version 1.8 (build 790) stable, OS Version 1.0 (build 227) stable
network and fileserver: same as above
Same benchmark DAC, but no bridge: DAC connected directly to NUC via USB.
control: same android tablet


This system is fast to respond and has performed flawlessly once my large collection was imported. focus and search are fast.
So (if you have made it this far) - what’s my message? Roon is a great system with a leading approach to database, but some basic architecture work could improve performance when there is resource contention or the system is overloaded. My pre-roon system (S1 above) did not scale because of poor database design, but even under severe overload it continued to play music well. To me, when I push “play”, we should have a contract. However busy the system becomes, playback should not be affected: this is a “near real time process with hard time constraints”. Background processing, audio analysis, database updates etc - let them slow down or pause. S1, whatever its limits, managed this. With all its sophistication, roon should too. When resources are overloaded, it’s ok to slow or halt background work, and even to accept slower user response, but loss of core connection and interrupted playback should not occur. My initial roon system had enough resources to get bits from the fileserver to the DAC, and should have done so under load/resource constraints, no matter if other processes had to suffer.
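The “contract” above can be illustrated with a toy strict-priority scheduler. This is only a sketch of the general idea, not a description of how roon or S1 actually schedule work: the function names, the job names and the abstract “work units” are all hypothetical.

```python
from collections import deque

def run_tick(playback, background, budget):
    """Spend `budget` work units in one tick, serving playback before background."""
    done = []
    for queue in (playback, background):  # strict priority order
        while queue and budget >= queue[0][1]:
            name, cost = queue.popleft()
            budget -= cost
            done.append(name)
    return done

def simulate(ticks, budget_per_tick, background_jobs):
    """Queue one audio chunk (cost 1) per tick and count what gets delivered."""
    playback = deque()
    background = deque(background_jobs)
    delivered = 0
    for t in range(ticks):
        playback.append((f"chunk-{t}", 1))
        done = run_tick(playback, background, budget_per_tick)
        delivered += sum(1 for name in done if name.startswith("chunk-"))
    return delivered, len(background)

# Hypothetical background work: each job costs 2 units.
jobs = [("scan", 2), ("analysis", 2), ("metadata", 2)]

overloaded = simulate(ticks=5, budget_per_tick=2, background_jobs=jobs)
comfortable = simulate(ticks=5, budget_per_tick=3, background_jobs=jobs)
print(overloaded)   # (chunks delivered, background jobs still waiting)
print(comfortable)
```

In the overloaded run every audio chunk is still delivered while the background jobs simply wait; with a little more budget the background work completes as well. That is the behaviour being asked for: overload degrades the background, not the playback.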

15 Likes

I think this is a very useful observation, Jim. The main purpose of the Roon core is to send audio data to an endpoint. This should take priority over information systems tasks such as catalogue updates.
Hopefully Roon staff will see this and consider this as a possible priority for future developments.

2 Likes

Hopefully
@support

2 Likes

What type of disk is in there? Also, ethernet or wifi?

There is nothing fancy about the dell inspiron laptop I used for the second system. It’s an off the shelf dell inspiron 3000, purchased new, and I’ve never even opened the case. No SSD, just a 500 GB SATA disk. It’s 5 years old. I wasn’t expecting good performance, but neither was I expecting playback issues.

The network is cat 6 cabled switched ethernet using linksys/intel 100/1000 switches. The only wireless in the system is to the android tablet controller.

S1 ran on a 4GB raspberry pi direct attached to my benchmark DAC. That system used the identical fileserver, music files and network as I am using with the roon system. In 2 years I hardly ever heard a playback glitch (but as my music repository grew, the user performance became terrible).

Not to belabor things, but my poor little laptop, fileserver, network, and the ROPIEEE bridge had plenty of resources to maintain a music stream. (I have streamed using other software directly from the laptop to the music system). Since nothing else was running, the streaming processes in the core were only competing with other roon components (and the inevitable windows overhead). It’s my personal view that the playback should be sacrosanct, and this little system did have enough resources to manage that. This is why I questioned the process design/resource allocation/scheduling basics. More than most, I understand how difficult this is: on the mission critical systems I worked on, this is where the best engineers earned their pay. You can throw hardware at this kind of problem (I did in this case, very successfully: the NUC/roon system is the best music system I’ve used by far), but generally it’s not the best approach, especially if you want to grow your market into less technically inclined users and continue to add features… I’m suggesting firewalling playback and (to a lesser extent) basic user control.

4 Likes

You are focused on the wrong thing in your system.

This is why you are having issues. Your Dell’s 4th gen i5 is fine for Roon, (maybe a bit slow for 100k+ tracks, but nothing unusable) but your Roon database on a spinning disk is horrible for Roon in all ways. I/O stalls will cause massive slowness in Roon, and it will affect playback if it’s slow enough. Everything will get hung up at some point. It could be avoided, but you are also talking about systems that are below our minimum specs (not the network/cpu/etc, but the lack of ssd).

A $25 SSD part for that Dell would make it run well:

ok, thanks. I mostly agree with you and will close my comments on this.
I probably won’t be retesting my dell, since I have moved on to the ROCK system I mentioned. The recommendations in your knowledge base were spot on there.
To your point, I overlooked the explicit recommendation provided clearly by Roon for an SSD for large collections: my bad.

I do think my comments on architecture deserve consideration however. As you state, “it will affect playback if it’s slow enough”. In this respect, roon underperformed on playback compared to my S1 system running on a pi4. If the system as a whole has enough resources to deliver a bit stream with reasonable timing from server to audio system - and that’s not a lot of resources with bit streams of a few Mbits/sec max - then it should do so regardless of other activity. If you can modify the architecture to that effect, you will improve user satisfaction over a broader range of hardware, enable users to continue to have a good experience as their collections grow in size, and enable new features without unacceptable impact on playback. 'nuff said…
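A back-of-envelope sketch of that bandwidth claim, assuming uncompressed stereo streams (losslessly compressed files such as FLAC need less). The function name and the formats chosen are illustrative only; the 100 Mbit/s figure is the slower link speed of the switches described earlier.

```python
def stream_bitrate_mbps(sample_rate_hz, bits_per_sample, channels=2):
    """Raw bit rate of an uncompressed audio stream, in Mbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1_000_000

cd = stream_bitrate_mbps(44_100, 16)        # CD quality, about 1.41 Mbit/s
hires = stream_bitrate_mbps(96_000, 24)     # 96 kHz / 24 bit, about 4.61 Mbit/s
dsd64 = stream_bitrate_mbps(2_822_400, 1)   # DSD64, about 5.64 Mbit/s

link_mbps = 100  # slowest wired link speed in the setup described above
for label, rate in (("CD", cd), ("96/24", hires), ("DSD64", dsd64)):
    print(f"{label}: {rate:.2f} Mbit/s, {rate / link_mbps:.1%} of a 100 Mbit/s link")
```

Even the heaviest of these streams uses only a few percent of a 100 Mbit/s link, which is why raw throughput alone should not be the limiting factor for playback.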

3 Likes

final comment, promise.

the question danny answered (very well) is: why did my windows system fail?

the question I am raising is more general: what are the failure modes of the system?

4 Likes

What are failure modes?

And where is the third system in your tale?

A Systems Approach to Failure Modes

4 Likes

@danny I am afraid the issue can not be reduced to one subscriber who happened to use a system that is below minimum specs. Over the last six weeks, 35 (!) threads have been posted about skipping: Search results for 'skipping in:title after:2021-03-15' - Roon Labs Community. Add to that those who commented and confirmed having the same problem. Add to that those who did not mention ‘skipping’ in the title of their topic but are experiencing the same problem. To quote @Felix_P: “Flawless playing without random outages and skips is a MUST and not a NICE TO HAVE for ALL of US!” (Roon Refuses to Play AGAIN - #24 by Felix_P)

2 Likes

@Francois_De_Heel, you’ve done a disingenuous thing here – I clicked on your link and just looked at them.

This one was solved by removing a powerline network adapter that was causing the network to be flakey.

This one and the one split off it have zero feedback. Probably a network issue.

This one is about the playback stopping in the middle of the track and then moving on to the next track (most likely a network issue). The logs would show why the tracks were skipped.

This one is complaining that when the user manually skips the track by hitting the “next track” button, they want the Roon Radio questions to not pop up.

etc… etc… none of this has to do with @jim_hamilton’s question.

Even in your thread, the log posted shows the system getting I/O errors, most likely a failing drive or network that lost connection. You also refer to someone having TIDAL tracks being skipped, but the problem goes away when they use a VPN from their native Brazil to the US. That would make TIDAL deliver content to them via their US servers, and not their Brazilian servers. Sounds like the TIDAL servers in Brazil are having issues getting content to listeners (their network probably).

Most of the issues users have with “skips” in Roon have to do with either networking (LAN and/or WAN) not being able to keep up, or under-spec’d hardware.

4 Likes

system 1: a non-roon music system I used for a couple of years. It played back beautifully, but as my music collection grew the user interface became intolerably slow (though playback once started still worked very well)
system 2: my first roon install as outlined previously, using a dell laptop for core. unusable. danny has pointed out that this is due to a lack of SSD given the size of my music.
system 3: A Rock system using a roon-recommended NUC model. does everything well.

Danny, do you really want to be of any help for ROON-clients? @enno @Carl

Oh, great, the first sign of life from @danny , after very long time yet again in another thread and not where the questions have been placed.
I am not happy at all with the arrogant, defensive style of your reply, Danny, which treats the ROON clients as idiots who have inflicted their problems on themselves. I understand that networks are a tricky issue. But the MINIMUM I expect from ROON is a good system of error messages giving hints on where and how the problem arose. I do not want to retrieve logs in a complicated way and then get a poor response to them. In this case of skipping, 1.7 provided at least one error message; with 1.8 there is no message at all while the system is stuck in an endless loop.

I would like to see the responsible head of department or the CEO look into the unsatisfactory way in which ROON customer service is dealing with customer problems.
Communicating this way between ROON and clients who are let down by technical problems is not acceptable anymore, danny!
Danny, please help, but do not ridicule ROON’s clients!
Do you want me to warn potential new clients about ROON or do you want a satisfied community?

Whenever I see reports of skipping tracks, in 80% of them I believe I can walk into that home and point out where the problem is or even fix it. That is after a significant time trying different things after having those issues myself. That is the reality. And it has to be said that as distressing as it is for the users with the problem, 35 examples of skipping tracks in a product with an estimated user base of over 100k users is actually pretty good. Being able to see every support thread raised can create a false illusion of fragility.

3 Likes

@Danny. Thank you for looking into this. It was not my intent to do “a disingenuous thing” and I am sorry if I did. But is it “a disingenuous thing” to try to make clear that there are tens of users experiencing playback issues? I honestly don’t think so.

Are most of the skips linked to either networking or under-spec’d hardware? I am sure some of them can be explained that way. But users with fast internet and hardwired connections to their core and endpoint plus decently spec’d hardware are experiencing the problem too.

Has none of this to do with Jim Hamilton’s question? Well, I can be mistaken, but it seems to me that skipping of tracks during playback is a perfect example of the problem that Jim states in his original post: “interrupted playback should not occur”.

I believe everyone would benefit if the recurring playback/skipping issue was treated as a problem that seems at least partially connected to Roon and that should therefore be addressed by the Roon team.

Just to try to draw your attention to the fact that users with a (seemingly) fine hardware configuration and network, experience the problem too, here are some examples:

Roon skipping tracks since last week This one experiences skipping with Tidal tracks and uses an Intel Core i5 with SSD and ethernet connections. He mentions that he does not use multiple zones.

Roon Radio stops or skips tracks This one reports skipping with Roon Radio and Tidal. He uses an Innuos Zen Mini Mk.3 and experiences it (among others) on a Naim Mu-so 2nd gen endpoint connected via ethernet. Another user confirms having the same problem in the same thread.

Tracks Not Completing and Skipping To Next Track Here you will find 9 users experiencing the playback/skipping issue. The original poster uses an iMac Intel Core i5… but wifi. However, at least one of the users who report the same problem uses a 1Gb business grade internet and wired connections (Tracks Not Completing and Skipping To Next Track - #3 by Robert_Goodhope)

Occasionally Song abruptly stops and skips to next song A subscriber with a Sonictransporter as Roon core and ethernet connection to a Lumin T2 network player experiences songs abruptly stopping and going to the next song.

Playback skips to next track A subscriber experiencing glitches with playback (MacMini i7 core, ethernet connection to PS Audio DS Jr).

Titles Skipping constantly A subscriber with a Roon Nucleus and ethernet connection to Lumin U1 Mini & Bluesound Node 2i experiences skipping with Qobuz and Tidal tracks.

Etc… etc…

Maybe the basic thing to consider is:
Does Roon want to keep on repeating the mantra that if a user or tens of users experience glitches and skipping in playback… the problem is at the (nagging) user side of things?
Or does Roon want to solve the apparent problem using the subscribers and community posters who report the problem as allies who are more than willing to provide all the possible details in trying to help?

2 Likes

I’m not a habitual contributor to user forums, and would like to get off this thread, but I feel a need to address danny’s latest response to me. I appreciate the comments from others including francois…
this is a bit confusing since related content is contained in both this thread and the “playback skipped” thread, and danny’s latest response to me is there.
danny, peace. You stated that Roon’s software is “way better than you seem to think it is”. well, I’m not making any critical global comments, I’m trying to address a fairly specific shortcoming. As for my opinion of Roon, let’s set it straight:

  • I have taken the time to install roon on two systems, and to incorporate my large music repository on both. despite problems with the windows system, I spent $800 on a NUC dedicated to Roon, and am about to end my trial with a year’s subscription. In these threads, I have repeatedly complimented roon for its database approach, which is the correct way to do this, as explained in the knowledge base, and based on my experience. I have taken the time to read pretty much the entire knowledge base. (I confess to not yet having learned enough from the forum). Roon is ambitious and doing a great job.
  • My NUC/roon system is fantastic. It is responsive, I have not experienced any errors, and the sound quality from the NUC USB output to my fairly high end system (benchmark dac/mark levinson monoblock amps/martin logan full range CLX electrostats) is the best I have obtained.

so let’s not be prickly. I’m just trying to help, and I have a lot of experience with complex software and networks. (distinguished engineer, VP of software, chief network designer, Software research manager, etc). Of course, I know little about your code.

In my last message on “playback skipped”, I tried to present a couple of cases. case one was:
“the system slows or stops scanning and background audio analysis. metadata retrieval is sluggish. A scan of my repository takes 5 times as long as it should. My tracks still play, though search and selection are slower than usual.”
your response was:
“It does exactly this. It runs at a very low priority and uses a single core by default. Unlike playback, which is run at a higher priority and usually single core”

no. in my scenario the playback continued. In real life, with an overloaded core with no SSD, playback repeatedly stopped. Sometimes it resumed by itself, most often I would hit play and it would continue from where it left off, and sometimes I would get a message that communications with the core were lost (and restored after a pause, though the playback had to be restarted from the beginning). This problem was exactly my point. Please explain why the absence of an SSD should cause a playback stream of a few hundred kilobits/sec to be interrupted. My poor previous system, using a pi “core” and struggling with my large repository, never did this. To your second comment, as I indicated here, the remote did not always die when the audio stopped. I also want to mention that there was no DSD processing involved. All the core had to do was to get those bits to the bridge/DAC. This is not a lot to ask, and other systems do this.

You seem to feel that all the users who report skipping or playback problems are under resourced or misconfigured. You even called someone trying to address this issue “disingenuous”, which seems really unfair.

I have simply asked for a look at how to isolate playback from background processing so overload does not impact listening as frequently. The evidence clearly shows this is not currently the case. I thought I was initiating a small thread to point this out and suggest improvements to a system I really like, not start an adversarial conversation.

5 Likes

FWIW, i host Roon on an i7 Mac Mini 2012 with a 1 terabyte SSD. All my content is on a Synology server. I also experienced playback skipping until I increased RAM from 4 to 16GB.

1 Like

interesting point, but similar symptoms can have multiple causes. I have little experience with WAN streaming, but can certainly see how a VPN might help by changing and stabilizing the data path.
I think the process and resource management in the core software is a more basic issue (and more under roon’s control). I’m new to this forum and to roon, but suspect that a lot of the frustration with 1.8 that I read may be traceable to a lot of new features which have exposed these underlying architecture issues, resulting in a greater likelihood of user visible errors on a wider range of platforms, especially towards the lower end.

2 Likes