System architecture comments: a tale of 3 systems

In this user’s situation, using a VPN from Brazil would use a totally different set of TIDAL servers. This solution working demonstrates no problem with anything at his location (Roon, hardware, endpoint, network, etc). It’s all about which TIDAL server he streamed from.

1.8’s hardware requirements are actually lower than 1.7’s because we did a bunch of optimization work in the local pieces, and most of the new “heavy” functionality is being driven from our cloud services.

It’s not the overall hardware requirements I’m addressing (though it’s great that Roon did the optimization work), nor is it why my modest Windows box failed (though that’s nice to understand).
It’s **what the system does when it runs into problems**.
Here are 2 hypothetical results of an overload or other problem:

  1. The system slows or stops scanning and background audio analysis. Metadata retrieval is sluggish. A scan of my repository takes 5 times as long as it should. My tracks still play, though search and selection are slower than usual.
  2. I’m showing off my spiffy system to friends, and while we are listening, my Miles Davis “Kind of Blue” DSD abruptly stops. The core disconnects and my user interface is gone.

For some failures, both outcomes are inevitable, but for many overload conditions, proper prioritization and sandboxing can preserve critical functions (which I consider to be playback and core access).

Which door do you choose? In many cases, with a properly architected solution, an overload or other resource availability issue need not bring the entire system to its knees. I have seen no recognition of this anywhere in this discussion, yet I feel it would be of value to users and to Roon itself to look at this. It’s routinely done on mission-critical systems (which I worked on for decades), and it was done on my previous music system.
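To make the "door 1" behaviour concrete, here is a minimal sketch of priority-based admission: critical tasks get capacity first and background tasks are deferred when the box is overloaded. This is not how Roon is implemented; the `Task` type, the task names, and the cost numbers are all hypothetical and only illustrate the general technique.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    cost: int       # hypothetical work units needed this tick
    critical: bool  # True for playback and core access


def schedule(tasks, capacity):
    """Return the names of the tasks that run this tick.

    Critical tasks are admitted first. Background tasks only get
    whatever capacity is left and are deferred otherwise.
    """
    running = []
    remaining = capacity
    # sorted() is stable and False sorts before True, so critical
    # tasks come first and keep their original relative order.
    for task in sorted(tasks, key=lambda t: not t.critical):
        if task.cost <= remaining:
            running.append(task.name)
            remaining -= task.cost
    return running


tasks = [
    Task("library_scan", 5, critical=False),
    Task("audio_analysis", 4, critical=False),
    Task("playback", 2, critical=True),
    Task("core_access", 1, critical=True),
]

healthy = schedule(tasks, capacity=10)
overloaded = schedule(tasks, capacity=4)
```

With plenty of capacity the scan runs alongside playback; with a capacity of 4 only `playback` and `core_access` run, and the scan and analysis simply wait.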

It does exactly this. Background work runs at a very low priority and uses a single core by default, unlike playback, which runs at a higher priority and is usually single core.

If the remote dies at the same time your audio stops, it means your Core died (power or crash) or the network failed (and you are streaming with minimal buffer).
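The "minimal buffer" point can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: time advances in whole seconds, the buffer starts full, and playback consumes exactly one second of audio per second. The function name and the arrival pattern are made up for illustration and say nothing about Roon's actual buffering.

```python
def simulate(buffer_capacity, arrivals):
    """Return the seconds at which playback underran.

    buffer_capacity: seconds of audio the endpoint can hold.
    arrivals[i]:     seconds of audio received during second i.
    """
    level = buffer_capacity  # assume the buffer starts full
    dropouts = []
    for second, received in enumerate(arrivals):
        # Incoming audio beyond the buffer's capacity is not stored.
        level = min(buffer_capacity, level + received)
        if level >= 1:
            level -= 1       # one second of audio is played
        else:
            dropouts.append(second)
    return dropouts


# The network delivers in real time, stalls for 3 seconds, then recovers.
arrivals = [1, 1, 0, 0, 0, 1, 1]

small_buffer = simulate(1, arrivals)
large_buffer = simulate(5, arrivals)
```

With a 1-second buffer the three stalled seconds are all audible dropouts; with a 5-second buffer the same stall is absorbed without any.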

Because it’s basic software development practices for a system as complex as Roon. It’s not perfect, but it’s way better than you seem to think it is.

You’ve made a decision that Roon does nothing like this because you hit SSD vs HDD issues for a large db that blew its write locks and everything hung up. Roon isn’t an RTOS, but if you use reasonable and reliable hardware, it works well. If it didn’t, the # of people complaining about skips would be tens of thousands, not tens.

I’ve merged those posts into this topic as I feel they are more fitting here than in someone else’s #support topic … Apologies if the timeline looks a little odd now, I’ll see if I can fix it.

The Support Team is working on those issues and neither you nor I know what the issue is. I’m looking at patterns of identified issues, and they are almost always resources or network.

I’m quite active, I just have nothing useful to say in the HRA thread that hasn’t already been said.

When statements are made about the Roon population as a whole based on a few forum posts, I’m correcting that misunderstanding. The Support Team is dealing with support issues, not me. I was addressing a tale of 3 systems where one was under spec for Roon. @jim_hamilton wants Roon isolated with CPU time slice and thread guarantees the way an RTOS has. I’m telling him it is designed with this stuff in mind, but without full RTOS-like guarantees. None of this is ridicule.

No one but you has said “idiots” and yes, most issues of this type are self-inflicted. There are other areas (like metadata) where this is not true, but “skips” are almost always environmental, and usually due to setup or to something beyond the user’s system (TIDAL servers).

Well, all those examples where “neither you nor I know what the issue is” come from the “disingenuous thing” I did, according to you. To be called disingenuous for pointing in the direction of an unsolved and recurrent issue… it is not the way I am used to being treated as a client, nor as a human being.

You are painting a picture… The reality is there are a similarly small number of complaints at all times, and they generally get resolved.

Your situation will be resolved by support, but you’ve decided to get up on a soapbox. As a person, that behavior is antisocial and would get you physically removed from most establishments.

No one is saying you don’t have an issue. No one is minimizing your issue. You are feeling my response to your soapboxing.


Seems to me like a case of shooting the messenger. But I will leave it at this and start some introspection about my disingenuous and antisocial behavior.

This thread has run its course. Perhaps the prudent thing is just to let it, but I feel a summary might be useful, since a number of points were raised, and relevant posts were made in 2 different threads. The moderator nobly attempted to merge, but the flow is now confused, and my original summary is now in the middle of the thread. Here’s a coherent (ok, you be the judge)… summary:

I’m pretty new to Roon, but I’m pretty experienced with many digital music management/playback systems. My initial look at Roon, as already described, was on a simple Intel laptop, which as it turns out did not meet the clearly stated Roon minimum system requirements because it lacked an SSD. I did not expect this system to perform well. Part of this thread focuses on the shortfalls of this system, and on fixing these shortfalls by migrating to a much more capable system. This misses the point. If Roon had simply performed poorly on this system, I would not have posted. My point was the way Roon failed: by completing database population and allowing connection and control, but then failing during playback.
The fact that the system had glitches is somewhat moot, given that it was under-resourced by Roon standards. The fact that it failed during a mission-critical operation (playback), which itself should require minimal resources, was the point. As an experienced system architect, I know this can be improved using known techniques. I had hoped that highlighting this would be helpful. From the comments, a number of forum members understood this. My observation is that Roon had not isolated playback from other system issues, could do so, and that this would provide many benefits: current satisfaction, future growth, and the ability to use systems with fewer resources. Here are the key takeaways as I see them:

  1. @henry stated that 80% of playback problems are user issues that are easily diagnosed and fixable by a knowledgeable person. I completely agree. But that leaves playback problems that may point to areas of improvement for the system.

  2. @Bob_Lindstrom reported that he resolved playback issues by increasing his system RAM. I suspect there are more cases where people had issues and resolved them by using more capable hardware. System architects have long had a name for this: throwing hardware at the problem. You can mask many architecture issues by simply over-resourcing the system. Sometimes it’s even the right answer. However, as your system features grow, or user collections grow, this may come back to bite you.

  3. Somewhere in the now somewhat confused thread merger it was pointed out that problems with playback have affected only a small percentage of users. I have no way of evaluating this. Even if true, it does not mean that insulating playback from background system activities would not be beneficial.

  4. Finally, since I obviously know little about Roon architecture and software, how do I know playback is not well sandboxed currently? Well, it only takes a couple of test cases, and I feel I inadvertently performed one. My Windows system would repeatedly (several times in a listening session) stop playback, usually about 10-20 seconds into a track. Sometimes it would recover on its own; most often I would hit play and it would resume (from the proper location); and occasionally the problem was likely more serious, since my controller lost connection to the core (@danny suggested the core crashed in this case) and I had to restart the track after reconnection. I was playing a mixture of FLAC 16/44, FLAC 24/96, and DSD64, and the problem occurred for all of these. To @henry’s point that most problems are user configuration:
    a. I was doing no DSP (verified by Roon), and no WAN streaming.
    b. My fileserver had been used on multiple systems pre-Roon and I had hardly heard a dropout in 2 years. The Cat 6 switched 100/1000 Ethernet delivering the bits has been solid for years. The RoPieee bridge could be questioned, but this seems very unlikely.
    c. The Roon system was not being used for other applications at the time of playback.

Put the above together with what is required for glitch-free playback in this case: the core’s only job (e.g. for 16/44 FLAC) is to deliver, in “near real-time”, a bit stream of at most about 1.4 Mbit/s across switched Ethernet to the bridge. A Raspberry Pi Model 1 could do this. I was doing this in 1980 on corporate networks. Many of the tens of playback systems I have used over the years did this (although they mostly failed in database performance, unrelated to playback, and became unusable).
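For a sense of scale, the arithmetic can be written out. The sketch below computes uncompressed stereo bit rates for the three formats mentioned in this thread; FLAC is losslessly compressed, so the uncompressed PCM figure is an upper bound on what a FLAC file needs on the wire. The helper name `bitrate` is just for illustration, and the 100 Mbit/s figure assumes the slower of the two link speeds mentioned above.

```python
def bitrate(sample_rate_hz, bits_per_sample, channels=2):
    """Uncompressed audio bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


cd_quality = bitrate(44_100, 16)   # 16/44.1 PCM
hi_res = bitrate(96_000, 24)       # 24/96 PCM
dsd64 = bitrate(64 * 44_100, 1)    # DSD64: 1-bit samples at 64 x 44.1 kHz

# Share of a 100 Mbit/s link used by the heaviest stream, in percent.
link_bps = 100_000_000
dsd64_share = 100 * dsd64 / link_bps

print(f"16/44.1: {cd_quality / 1e6:.2f} Mbit/s")
print(f"24/96:   {hi_res / 1e6:.2f} Mbit/s")
print(f"DSD64:   {dsd64 / 1e6:.2f} Mbit/s ({dsd64_share:.1f}% of 100 Mbit/s)")
```

Even DSD64 needs under 6 Mbit/s, a small fraction of a 100 Mbit/s link, which is why raw network throughput is rarely the limiting factor for a single stream on a healthy wired LAN.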

I apologize for the lengthy message belaboring the point. Roon is a magnificent achievement. On the right hardware (NUCi7model10, the highest performing server listed in the knowledge base recommendations) it has handled my 17,000 album database with ease, provides snappy control, and, using direct USB to my benchmark DAC, gives me the clearest DSD playback I have achieved to date. Sandboxing playback (or at least providing more guaranteed resources if the system is busy) has many benefits. It’s not trivial, but the techniques to isolate critical functions are well known. @danny says Roon already does this. Then why did my playback fail? Clearly, in this case an under-resourced core failed to transfer bits. Pity.