Trying to re-open an unsolved case the support system closed automatically for the second time. Please let me know if you are actively investigating it. Thanks
Reference:
Hey @Marcelo_Bulgueroni,
We're sorry to hear you're still having issues here! And apologies if this has already been covered - does it happen with one zone at a time? Where are the zones themselves connected, near the primary router?
If you could reproduce the issue and share another specific track name, we'll gather a fresh diagnostic report to share with senior development. Thank you!
We just wanted to follow up with one more question from our developers, in addition to what my colleague @benjamin asked earlier. Could you let us know where the affected zones are located in relation to your router?
Thanks for getting back. I apologize in advance, but as I understand it, your last message said you were analyzing the logs from the repeated crashes I experienced.
We already have, through extensive tests:
1 - a very consistent way to repeat the crash, using the playlist I created with DSP enabled;
2 - lots of tests involving wi-fi grouped endpoints, ethernet grouped endpoints, and mixed grouped endpoints, all behaving in the same manner, as per my post Roon skips and stops during mixed RAAT zone playback (ref#GRVTPQ) - #6 by Marcelo_Bulgueroni
3 - tests with and without the Zorloo, as per your requests;
4 - a description of my network setup: a mesh network in which most routers are connected to each other through ethernet (only one is connected through wi-fi to the router, and in some tests I connected even this access point to ethernet). My house runs almost entirely on ethernet through a gigabit TP-Link switch; only three endpoints are wi-fi, and the three endpoints are VERY CLOSE to each mesh router with excellent signal quality (but again, the problem happens even on endpoints connected through ethernet).
5 - the problem occurs with multiple songs, and we already have a 24-hour test I made to demonstrate that.
My feeling is that there is a constant, specific focus on finding the problem in my network and my devices, and even though we went through this extensive A/B testing, not once did I see any consideration that maybe the RAAT protocol is the problem. This is kind of tiresome, respectfully, considering that I have been in this process for almost two months, have had my tickets closed twice, and when I reopen them am faced with similar kinds of questions.
I would greatly appreciate it if you could help me understand why the tests and information provided are not enough for a diagnosis. It is OK as well to have an answer like "RAAT cannot deal with this currently" so I can finally give up on the investment I made in "upgrading" to Roon's native protocol. What is not okay is feeling that the information is not being interpreted as a whole, since we are not talking about a wi-fi problem here, as was thoroughly demonstrated and tested.
Again, I really appreciate your help and hope to get this fixed with your help and I am fully available to test new variables or information.
Thank you for understanding.
Thank you again for your patience. RAAT's uncompressed protocols rely on multicast traffic - you're not necessarily encountering something unresolvable here, but we will likely need to reconfigure more settings. We also need to pinpoint what's happening with the Zorloo.
We need to clarify two points to proceed:
When you say "crash," do you mean the Roon GUI itself actually freezes or closes down? Or are you referring to a dropout in the audio transport (stream gets paused or stops playing)? The team needs to clarify the precise symptom you're experiencing with the Zorloo.
The line-of-sight between mesh nodes and routers isn't the problem. RAATServer distributes audio between endpoints that must also communicate with one another to synchronize clocking and playback. If the mesh network is attempting to optimize bandwidth distribution and packet handling, it might intentionally delay or re-route some of this traffic between endpoints. This can wreak havoc on RAAT and it will affect Grouped, not individual Zone, playback.
The logged dropouts are systemic and always include a second of audio (precisely 96000 samples). The dropouts occur simultaneously across each Zone, even when RAATServer itself is online and distributing audio actively. The endpoints aren't receiving the data that RoonServer is sending to them over the network, and they all stop receiving the data at the same time even though Roon keeps sending it. They eventually receive it again.
In the vast majority of cases, this is due to the STP implementation in any managed switches or the multicast settings of the mesh network. It's for this reason that we're asking about network settings, not just network topology.
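For readers following the arithmetic: a gap of 96000 samples lasts exactly one second only if the stream is running at 96 kHz, which is assumed here from the figure quoted above. A minimal sketch of that conversion (the function name is hypothetical and is not part of Roon or RAAT):

```python
def dropout_duration_seconds(missing_samples: int, sample_rate_hz: int) -> float:
    """Return how long a gap of `missing_samples` lasts at `sample_rate_hz`."""
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    return missing_samples / sample_rate_hz


# Assumption: the logged stream was playing at 96 kHz.
print(dropout_duration_seconds(96000, 96000))  # 1.0

# The same sample count at a 44.1 kHz rate would be a longer gap.
print(dropout_duration_seconds(96000, 44100))
```

The point of the sketch is only that a gap of exactly one sample-rate's worth of samples means a whole second went missing, which is more suggestive of a timing or connection interruption than of random packet loss.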
Have you tried setting up manual ethernet backhauls between all of the mesh nodes instead of relying on the mesh network?
Try disabling Beamforming, if applicable, in the Deco M5 settings admin page.
We'll watch for your response.
Hello @connor
Thanks for the detailed explanation and questions!
Concerning the points you raised:
Sorry if my wording has been misleading. What I call a "crash" is only the behavior described since the beginning: Roon skips the next song entirely, then plays 3 or 4 seconds of the song after it and stops playing. The GUI remains fully functional, and as soon as I press play the song resumes playing. It plays fully, and some time later a different song presents the same behaviour. It only happens when transitioning from one song to another.
Yes, I have already done that, using only ethernet connected router points in order to avoid the backhaul chatter through wi-fi, but the problems continued. I can try that again if you want.
Tried that as well and the problems remained, but will try again. I also disabled fast roaming but it seems unnecessary due to the fact that I set each endpoint to connect to a specific router in order to avoid being tossed around different routers.
A point on which I would appreciate your insight: it is clear that a wireless setup has its challenges in any home. However, the problems appear even when grouping ethernet-only endpoints. The difference is that they are less frequent, but they still happen easily when testing. Shouldn't this rule out the wireless investigation and bring the focus to another technical point? Please forgive me if I am missing something in my interpretation of the current facts.
If you think it is productive I can run a battery of tests only with ethernet-connected endpoints so you can better compare the logs. Just let me know!
Best,
Marcelo
Thank you for precisely clarifying these points. We're making progress pinning this down and we're very grateful for your patient and diligent troubleshooting so far.
RAAT can be sensitive to network topology and settings, but it's an industry-standard, robust group playback protocol that has matured over more than a decade of harsh QA and user testing. We haven't received reports of protocol-level failures with RAAT in many years, which is why we're being so infuriatingly scrutinous of your network here.
Let's summarize what we know is not happening.
We've eliminated the possibility of WiFi-related interference or dropouts. We've discounted the possibility that the router or mesh nodes are failing to forward multicast commands.
The precise symptom you've reported is a dropout of 1-4 seconds, accompanied by track transition issues and overall playback failure (you need to hit play).
With this information in mind, and considering the symptom, we'd like to focus on two possibilities:
The Deco M5 can still mishandle RAAT clock synchronization traffic, particularly during track transitions, even when nodes are on ethernet backhaul. Deco firmware doesn't allow users to customize STP or multicast handling. This might not be a factor if all your endpoints (RoonBridges) are behind the unmanaged switch in your network setup. But let's set that aside for now, because we can likely test this without rearranging your topology.
A single Raspberry Pi might be slightly underpowered; resource constraints on that machine could be causing a slow response to RAAT clock sync during track transitions. This clock sync failure would then cascade across the grouped Zone.
What we recommend as a robust test for both possibilities:
Create a new grouped Zone, starting with whichever Zone uses the Raspberry Pi with the most processing power in your setup.
Play a queue containing 96 kHz content in the same file format.
Add only a single Zone at a time. As soon as you encounter a dropout, note the Zone name that you just added to the group and share it here along with the track that was playing and stopped. Please also describe how this Zone connects to the network relative to your RoonServer.
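The add-one-Zone-at-a-time procedure above amounts to a simple linear isolation search. A rough sketch of the logic, assuming a hypothetical `group_drops_out` callback that stands in for a person listening to the group and reporting whether it dropped out (nothing here talks to Roon, and the zone names are placeholders):

```python
from typing import Callable, List, Optional, Sequence


def first_zone_causing_dropout(
    zones: Sequence[str],
    group_drops_out: Callable[[List[str]], bool],
) -> Optional[str]:
    """Grow the group one zone at a time and return the zone whose
    addition first coincides with a dropout, or None if none does."""
    group: List[str] = []
    for zone in zones:
        group.append(zone)
        # Hypothetical check: in practice this is a manual listening test.
        if group_drops_out(list(group)):
            return zone
    return None


# Placeholder zone names; the lambda simulates "dropouts start once Zone C joins".
suspect = first_zone_causing_dropout(
    ["Zone A", "Zone B", "Zone C", "Zone D"],
    lambda group: "Zone C" in group,
)
print(suspect)  # Zone C
```

This only identifies the first addition that coincides with a dropout; it cannot tell whether that Zone is at fault on its own or whether the group simply became large enough to expose the problem, so the result is a lead rather than a diagnosis.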
Thank you and we'll watch closely for your response.
Dear @connor
Considering the possible power investigation, I did tests for almost 30 hours, with the following setup:
Started with an ethernet-only group, led by the Khadas Tone, then added Coreaudio and the Meridian Explorer (NUC) - everything went smoothly, and continued OK when adding the three wi-fi endpoints. It ran fine with these six endpoints for hours.
After that, adding the RPI2 caused problems almost instantly.
Judging that I had pinpointed the issue to power problems, I started another group, this time with the Meridian Explorer (NUC) first. Then the problems started to appear regardless of the components of the group.
Went back to starting the group with the Khadas Tone Pro. Problems started happening again, with every configuration of zones possible.
Rebooted: whole network, roon core, endpoints. Problems continued.
Tried again to adjust re-sync intervals and buffers, to no avail. I simply cannot establish a good baseline in almost 48h of continual tests.
The main problem is the transition from a 96 kHz song to a 44.1 kHz song on Tidal. This is when the general failure happens most consistently, so you will find in the logs that I focused many times on this moment to see if the transition would happen or not.
I hope the logs can help bring some light to this crazy situation...
Thank you for the above testing - we very much appreciate your thoroughness in your process and your reporting!
We reviewed a fresh set of diagnostics around the timestamp of your testing, and it looks like this could be related to timing problems. Your system is consistently missing chunks of audio that line up with exact time intervals, like one second or one-tenth of a second.
This usually happens when timing sync signals between devices are delayed or lost, which can cause the audio to drop out or stutter, particularly at track transitions, which also cause the buffer to flush and refill with new data.
As another step in troubleshooting, try upsampling everything via MUSE to 96 kHz first and try to drop out on
Hello @benjamin
Did the tests as directed.
When zones were alone no problem occurred (which is normal, it never skips on isolated zones, only on groups).
At 96 kHz it took a LONG while for the skip to happen. Many hours. It happened eventually, and only when I used a zone with more than 4 endpoints grouped together (all set to a fixed 96 kHz) - I am sure that if I added a convolution equalizer to any of them, for example, the skips would happen much earlier.
At 44.1 kHz the behavior was more or less the same. It needed a zone with more endpoints (3 or more) to reproduce the skips, but it seemed to happen faster than at 96 kHz.
Tests were run over approximately the last 36 hours.
Thanks!
Let's summarize what we've learned from these tests:
The dropouts occur when you play hi-res files (large files) on distributed endpoints across your mesh network. This includes Ethernet-only groups, but all are still managed by one of the two Deco mesh routers.
The dropouts include a full second of sample loss in each case.
While this might be exacerbated by underpowered endpoints (like the Raspberry Pi 2), resource constraints at the endpoint level are not the sole source of the dropouts.
While you haven't bypassed the mesh network entirely, you have replaced one Deco mesh router with another Deco router using a different generation of firmware. The issues still occurred.
RAAT requires tight timing coordination and low jitter across zones. Older Pis or unexpected multicast handling by the network can break this - you're up against an environmental limit of using RAAT within this ecosystem.
It sounds like you've been able to pinpoint certain configurations that won't manifest the symptoms for some time.
However, even when relying on ethernet backhaul, the Deco is likely imposing its own traffic management to the nodes. This affects multicast traffic handling and can impact Roon - this is why you see the symptoms on Grouped Zones that are communicating across the network with one another and with Roon.
The strongest recommendation would be to rely on longer cable runs and connect all of your endpoints to the unmanaged switch, which is in turn connected directly to the router. In other words, bypass all of the mesh nodes (even those backhauled). This is obviously a challenging topology to impose in a residential environment.
Where would you like to proceed from here? Your testing has been incredibly detailed, and it's clear you've taken every logical step to isolate this problem. However, based on logs and behavior, this appears to be a real-world edge case; RAAT's group playback synchronization is sensitive to both network timing jitter and endpoint processing latency with a Group this large and files of this sample rate on this specific network. Older Pis and Deco mesh firmware are likely amplifying these effects. Unfortunately, developers have uncovered no architectural flaws with Roon's clock sync in the logs we've reviewed extensively over the last three months. What we do see are environmental vulnerabilities on this network and with the older endpoints. There's not a clear fix we can release that would improve or relieve this behavior for RAAT with the combination of hardware, topology, and firmware that you're currently using. However, the recommendations and best practices we've attempted to outline can hopefully optimize performance and prevent dropouts even at higher-resolution playback and across large Groups.
Please let us know if we can elaborate further. Thank you.
Hello @connor
Thanks again for the reply. Concerning this:
This is already done for each of the ethernet-connected endpoints I referred to. They all connect directly to the unmanaged gigabit switch, and the main router (Deco X60) connects to the same switch - the other Deco M5s only provide connection to the three wi-fi endpoints. Maybe that helps in reviewing the logs, maybe not, but I think it is important that this "optimal" setup is already used for the majority of endpoints, and the problem appears as well when using these all-cabled endpoints connected to the same switch (although less frequently than when we add the wi-fi endpoints, where I am sure the mesh setup contributes negatively to overall stability).
The moment I decided to look for help was exactly when, through A/B testing, I found that even the "best" hardwired endpoints would struggle sometimes, that DSP would amplify the problems, and that it is specifically worse with Tidal. I expected problems with wi-fi; I did not expect endpoints connected to the same switch to be a challenge to the protocol.
Thank you for providing such clear context. Logs in the last several days have illuminated some more context that may provide a next step.
Both the RAATServer instance and the RoonServer instance on the Mac Mini show instability during low-level TCP sessions. Sockets are closing while RoonServer is still sending data. This is wreaking havoc across connections to endpoints on this network because RoonServer has to reset the connection entirely, sometimes during playback.
The fact that some of these TCP errors occur during upstream cloud handshakes indicates that multicast handling and/or RAAT clock sync might not be the causal factors. Both cloud and RAAT TCP connections are getting killed while Roon is still sending them data.
Again, it's natural to suspect the protocol when facing persistent issues. However, RAAT's clock synchronization and multi-zone streaming have been rigorously designed and widely proven in complex setups. The consistent one-second dropouts and connection errors you're seeing are classic signs of network-related interruptions when Roon has to reset a network connection to one of the many active Zones. This is why we're focusing on the Deco mesh's handling of multicast and TCP connections.
Please double-check that Cloudflare (1.1.1.1), Quad9 (9.9.9.9), or another reliable DNS server is assigned in the router.
In the background, the Cast and Shairport-based devices are constantly broadcasting their availability. This broadcast/multicast traffic isn't competing with RAAT but it raises the floor for both a) bandwidth saturation and b) resource constraint across the network, particularly when you have the WiFi-based endpoints involved.
Just for due diligence, please power down all Chromecast devices during the next test or disable their network access temporarily. Please also disable Shairport Sync on all Pis, or turn off the Airplay receivers on any endpoints. This will greatly reduce background noise and traffic on the network and in logs.
Next, perform the same test of playing high-res Tidal content and adding in a single Zone at a time to the group. Please, per our usual cadence at this point, note the name of the track that first dropped out (or the approximate time).
We're very eager to help see this through and restore a reliable playback setup. Thanks again for your patience.
Hi @connor !
Thanks again for the thorough explanation!
It took me some time to be able to carry out this test since I needed to manually turn off the Chromecasts. I also learned that RoPieee does not make it easy to turn off Shairport, so instead I turned off all the wi-fi Pis running RoPieee as well.
I started only with the following endpoints, all connected through ethernet to the same switch, each with a different DSP setting (on purpose, to "force" the issue).
Played back and forth for a while with no problems.
Then I turned on and joined in the group:
6. Pi Zero 2 W with RoPieee - wi-fi 5 GHz (bedroom)
7. Pi Zero 2 W with RoPieee - wi-fi 5 GHz (kitchen)
8. Pi Zero 2 W with RoPieee - wi-fi 5 GHz (bathroom)
No problems happened.
Decided to turn on all Chromecasts again to see if any "noise" would force the issue to appear; again everything was smooth.
However, when I decided to stop testing, something interesting happened:
I moved Zone "1" (Mac Mini coreaudio) out of the group in order to use the audio of the computer. Problems appeared instantly and could be reproduced all the time.
Intrigued, I added Zone "1" to the group again. Problems disappeared.
Removing Zone "1" would make the problems appear again, instantly.
For reviewing in the logs:
20250811 - 3h29 pm - skips and stops after removing coreaudio
20250811 - 3h32 pm - plays fine with coreaudio included
20250811 - 3h36 pm - skips and stops after removing coreaudio
It seems the combination of endpoints is playing a major part in the problem. Maybe this could open another line of investigation? Let me know. And thanks again!
Hello @Marcelo_Bulgueroni,
Thank you for your detailed testing and for sharing your observations. We apologize for the delay in our responses.
To help us investigate further, could you please provide a bit more detail:
This information will help us understand the interaction between endpoints and grouping, and may guide further investigation into the behavior you observed.
Thank you again for your thorough testing and patience!
Hello @vadim,
Thanks. Here are the answers:
As per my previous post the group used comprises 1 to 8 of the endpoints described on my post.
I also tested now with a group comprising only of endpoints 1 to 5 to keep out the wi-fi ones.
Yes, just tested, the same issue appears, both with the 1-8 group and 1-5 group. When the coreaudio endpoint is included in the group there are no issues. As soon as the endpoint is not part of the group all issues come back
Today, August 25, at:
04:46 PM (grouped 1-8)
04:47 PM (grouped 1-8)
04:51 PM (grouped 1-5)
04:51 PM (grouped 1-5)
04:53 PM (grouped 1-5)
Thank you for your thorough report and testing.
We've examined these timestamps and see a continued pattern of precisely half the generated samples dropping before they reach endpoints. What we don't see is any evidence of packet loss directly. We'd like to try and zero in on clock sync from here.
Adding the Mac Mini provides a RAATServer instance to serve as the lead clock sync instance. Removing this Zone - after having created the rest of the grouped Zone - forces RAAT to pick a new clock master. RAAT doesnât always pick the second Zone added to the group chronologically as the next clock lead, but it often will. What happens if you repeat the creation of the group Zone 1-8 (with Mac Mini as the first Zone, Rpi 4 to Khadas Tone Pro 2 as the second), but this time remove both of those Zones, leaving the remaining six?
The intention here resembles that of our earlier effort when we were testing to isolate an "underspec'd" Raspberry Pi with inconclusive results. We want to try to isolate (with your current topology, don't worry about rewiring at the moment) which Zone is not working well as a clock lead.
If you'd like, you can continue to remove Zones in the order in which they were added, hammering playback at 96 kHz with DSP like before.
Let us know the results - we're eager to ensure we can resolve this issue for you. Thanks again for your patience.
Dear Connor,
Testing now and will be back with results. Thanks!
Dear @connor
Can't say it will be of much use, but here are the new tests. I kept trying to find a group combination without coreaudio that would work, in order to experiment with the clock sync, but I was mostly unsuccessful.
Just keeping here for reference:
Tests conducted TODAY, 09.08.2025, times are Brazilian Time.
I will put the "order" in which I added the zones to the group
Group 3-2-4 : problems 10:05 / 10:06
As soon as I add the zone 1 problems stop
Group 3 - 2 - 4, freshly formed group - no problems
Added all the other zones except 1: problem repeats at 10h20
Went back to 3 - 2 - 4 - problem now happened at 10h21 AM and 10h23 AM - less errors though
Adding Zone 1 (core audio) instantly solves problems
Removing Zone 1 and keeping 3 - 2 - 4: problem at 10h25 AM
Group 8 - 7 - 6 (wi-fi only!) - worked fine at 96 kHz with 3s crossfade and audio leveling
Added convolution equalizer to zone 8 - problem appears at 10h32 AM
Removed convolution from zone 8 - plays fine
Added zone 1 (thus 8 - 7 - 6 - 1): plays fine (expectedly)
Removed zone 1 (8 - 7 - 6) - still plays fine
Added zone 2 (8 - 7 - 6 - 2) - problem at 10h40 AM
Turned off convolution equalizer on zone 2 (keeping only 96 kHz upsampling for all zones): played fine
Added zone 3 (8 - 7 - 6 - 2 - 3) - all at 96 kHz - problem appeared at 10h48 (it is much less frequent when DSP is "lighter")
Adding other zones (8 - 7 - 6 - 2 - 3 - 1 - 4 - 5) and enabling convolution for zones 1, 2 and 3: consistent problem at 10h58 AM, 10h59 AM
This with core audio included
Removing convolution from all zones and keeping upsampling at 96 kHz: seems to play fine A LOT of times, but ended up having the same problem at 11:06 AM
This shows that even having core audio among the zones has its limits...