Hi everyone!
We appreciate the feedback you’ve provided on the changes introduced in Build 610. I spoke with the development team in the hope of providing some further insight into the changes that were made, as well as the work that is still to come.
The findings from our investigation
Through the acquisition of hardware, in-depth QA testing, and investigations into customer-reported issues, we’ve made a lot of progress in understanding some of the reasons for the high CPU and GPU utilization that users have reported, but this remains an open investigation.
The most important thing to understand from our investigations up to this point is that there is no single issue behind the high resource utilization. Rather, there are a few issues that can, together or separately, manifest as high CPU and/or GPU usage.
Some of these appear to have resulted from changes to macOS or Apple hardware over time. Obviously Roon needs to be made to work in Apple’s constantly changing environment, but it can sometimes take time to figure out what they have changed and why.
What changed in Build 610
Build 610 contains some mitigations for potential causes of the high GPU utilization reported by certain users.
In alpha testing (and now in release), it looks like one mitigation in particular is helping to reduce the most severe GPU utilization issue. This mitigation introduces an additional mechanism to synchronize graphics operations with monitor refresh, independent of traditional vsync. We have not yet uncovered why vsync appears to be buggy on these machines, and we have yet to reproduce this specific condition internally. We are continuing to look into this.
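For readers curious what pacing frames independently of vsync can look like in general, here is a minimal sketch. It is purely illustrative and is not Roon's actual code: the `FramePacer` class, its parameters, and the 60 Hz default are all hypothetical, and a real renderer would likely take its timing from the display rather than from a fixed rate. The idea is simply that the render loop waits on its own clock-based deadline, so it cannot spin faster than the refresh interval even if vsync fails to block.

```python
import time


class FramePacer:
    """Hypothetical software frame pacer (illustration only).

    Limits a render loop to roughly one frame per refresh interval using a
    monotonic clock, without relying on the driver's vsync to block.
    """

    def __init__(self, refresh_hz=60.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / refresh_hz   # seconds per frame
        self.clock = clock                 # injectable for testing
        self.sleep = sleep                 # injectable for testing
        self.next_deadline = clock()       # first frame may run immediately

    def wait_for_next_frame(self):
        """Block until the next frame deadline; return the time slept."""
        delay = max(0.0, self.next_deadline - self.clock())
        if delay > 0:
            self.sleep(delay)

        # Advance from the previous deadline (not from "now") to avoid drift.
        self.next_deadline += self.interval

        # If a slow frame put us behind schedule, resynchronize instead of
        # rendering a burst of catch-up frames.
        now = self.clock()
        if now > self.next_deadline:
            self.next_deadline = now + self.interval
        return delay
```

A render loop would call `wait_for_next_frame()` once per iteration before drawing. The resynchronization step is a design choice in this sketch: after a stall, it skips the missed frames rather than drawing them back to back, which would briefly spike GPU load.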
Independent of changes intended to target this specific set of issues, Build 610 also includes optimizations intended to reduce graphics-related CPU/GPU utilization in general, as well as to reduce Roon’s consumption of GPU memory and GPU bandwidth.
What is still to come
It is clear that some recently made Macs are still exhibiting exceptionally high GPU utilization even with these optimizations in place. This suggests that there’s still something happening on affected hardware that we have yet to diagnose.
Our team continues to tease apart other aspects of this issue, including networking stack involvement, meaningful commonalities shared among affected hardware/OS combinations and usage patterns, as well as the sequence of events that leads to the problems in the first place.
So far, we have not been able to make these problems happen on demand, and they do not reliably affect people inside Roon Labs, even those with the same hardware and OS configuration as affected users. This has significantly slowed our progress, since we cannot watch the problem happen in front of us and debug it directly. Any information that might help us reproduce it on demand would be a big help.