Try option 2 first.
In his master’s voice:
Roon transmits LPCM to Squeezeboxes.
ffmpeg is not used to decode FLAC (only for mp3/aac, because those require codec licenses).
The next major release of Roon will support on-the-fly FLAC encoding for Squeezeboxes as a setting, to reduce networking demand (at the expense of CPU usage on both sides doing the extra encode/decode).
@brian We may be on the verge of an “ah ha!” moment here. How does Roon do the FLAC to LPCM transcode? That may be a place to look. Is it an OS-provided component?
LMS has a user option to send native FLAC or transcoded PCM (WAV) from the server to the client. I seem to remember that there were issues with Enhanced Digital Output with PCM streams. In any case, I think prevailing practice was/is to send FLAC to the Touch natively. That’s the way my LMS was set to do it. Harry? Mind, this may be moot, because Roon is doing just fine running on Mac, Arch, and Ubuntu/Kubuntu, and it’s sending an LPCM stream in all those cases.
On a more general note, it appears that we are one or two experiments away from confirming that Harry and I can solve our own particular problems by rebuilding our servers on new OSes, or we can press on until we really figure this out. I see two things that argue against the first approach. For one: It would be, in my case anyway, an awful lot of just plain work. Two: It would leave the users and Roon with an unsolved issue that could easily bite again. I vote we keep going until we know what’s going on.
For my part, I’ll look to see if I have a Fedora 22 image, and if I do, I’ll whip up a VM and we’ll see if it behaves any different from Fedora 22 on hardware.
@evand Yes, did that. So did Harry, I think. And I did quite a lot to pretty thoroughly eliminate client malfunction from suspicion. If this is something the client is doing, it’s something the client is supposed to be doing that just isn’t playing nicely, like maybe an inappropriately sized buffer or something.
[quote=“Carl_Seibert, post:95, topic:14000”]
like maybe an inappropriately sized buffer or something.
[/quote]
You didn’t ever experiment with all the sbt tweak toolkits that some did, did you?
We ship our own copy of libFLAC (libFLAC 1.3.1–the current version). I’m having a hard time seeing a relationship between the FLAC decoder and this issue, though.
My guess is, the “falling to static” behavior happens when audio data doesn’t make it to some part of the driver/hardware in time. It’s in a class of symptoms that I generally associate with poorly written drivers (in this case, the audio drivers on the squeezebox). Symptoms like this point to a performance issue somewhere, but don’t indicate where, since the problem is happening at the far end of the chain.
Squeezebox streaming is TCP based, and flow control/buffering as well as hardware buffer size selection is all done within the SB Touch, so there’s no buffer sizes to tweak in Roon’s world.
I think transmitting as FLAC might resolve this since it does shift performance costs around the system, and it’s also what LMS does (so that approach has a lot more hours of reliable use behind it). I can’t be sure because we haven’t pinpointed the cause of this issue yet, which is the case because we haven’t been able to produce those symptoms here.
The moment when we see the problem here, progress will become much easier and much more certain. What you’re doing–eliminating variables one by one–is the best way to get to that point.
We’re going to deploy the FLAC change regardless of what happens in this thread, so worst case, you’ll have something to try when that release comes out. I’d be more comfortable if we could figure out how to reproduce this and be able to fully validate this as a fix sooner rather than later.
Cue the spooky Halloween music…
I built a Fedora 22 VM, no updates, just Fedora 22 straight from the ISO, and I installed ffmpeg when the Roon install script complained about it. And it worked. No fail-to-noise, no missing queue entries. It just worked. (With some network issues.) I don’t know what’s more frustrating: when something doesn’t work that’s supposed to or when something does that isn’t.
Then, just to make sure the planets hadn’t realigned, I went back to Fedora 22 directly on the hardware. And it failed. Instantly now.
What. The. Heck.
Harry mentioned hardware and I sort of dismissed the idea at the time because all the host’s hardware is used by the VM. But that was then and now is now. I think Harry is also making a VM with the same OS as his hardware. I guess we’ll see if the same strangeness happens there.
Short of something where the virtual hardware works and the real hardware doesn’t, the only (sort of) plausible explanation I can posit would be that something installed on the real machine, that isn’t on the clean VM, might be causing the issue. But I have no clue at all where to start looking.
Played around again with my hardware box, trying to understand what the difference is with the VM.
I’ve become a little bit suspicious about my CPU freq scaling that I see on the hardware box, which is not present (for obvious reasons) on the VM.
On the hardware box the governor was set to ‘powersave’ instead of ‘ondemand’ or ‘performance’, so I switched it to ‘performance’. Now it seems that it takes longer before the noise kicks in. I say ‘seems’ because I don’t have hard evidence, and in the end the noise is still there.
I’m going to investigate this more. Strange thing is that I’ve monitored load extensively, and at no point in time is my system running out of breath, at least from where I’m sitting.
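For anyone who wants to check the same thing on their own box, here is a minimal Python sketch that reads the active governor for each core from sysfs. It assumes the standard Linux cpufreq layout under /sys/devices/system/cpu. The function name `read_governors` and its `cpu_root` parameter are made up for illustration and have nothing to do with Roon itself.

```python
from pathlib import Path


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: governor} for every CPU that exposes cpufreq.

    cpu_root is a hypothetical parameter so the function can be pointed
    at a different directory; the default is the usual Linux sysfs path.
    """
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for gov_file in sorted(Path(cpu_root).glob(pattern)):
        cpu_name = gov_file.parent.parent.name  # e.g. "cpu0"
        governors[cpu_name] = gov_file.read_text().strip()
    return governors
```

Calling `read_governors()` on real hardware should give something like one entry per core. On a machine (or VM) that does not expose cpufreq at all, the result is simply an empty dict, which would fit with the scaling not being visible inside the VM.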
Ah well, the story continues, but possibly only in the weekend.
Perhaps run memtest on your boxes and confirm your RAM is good.
So I created a VM, based on openSUSE Tumbleweed, and ran Roon.
Guess what: this works beautifully with 192K content.
So this means we can take a distribution-specific issue off the table as well.
This also means that unfortunately I’m running out of options on what could be the issue, besides real hardware problems. And even then, I still cannot explain why those would kick in only when playing 192K and not with other content. And believe me, my Roon setup has put in quite some hours over the last year.
I’m still gonna follow up on my remark about processor throttling. See if I can create a reproduction scenario on my VM.
On a side note: @brian while setting up I remembered you mentioning sometime ago working on a Roon specific Linux distro. And I was thinking when doing all this stuff…
- how about a Docker container? Not because it makes sense, but purely because it’s fun. Do you know if someone already did this? Because otherwise I’m gonna give it a go.
- the tiny Roon distro… is this not something we can turn into a simple community driven project?
This is why I’m suggesting a memtest to verify your ram is good throughout its addressable range.
You’ve read my remark that the issue only appears when playing 192K content and not with anything else, right?
If it would be memory then it seems logical that the issue would appear ‘all over the place’.
But, in general, you’re right and I need to test it to rule it out.
sorry to have been absent. Was fighting another battle. Synology took 4 days to incorporate 2 new 3TB disks, everything went dark with dashboard Volume at 100%.
Roon was inaccessible, switched it off eventually.
Now that everything is up again, the NAS dashboard is totally calm, no excessive use of anything, and still the white noise persists.
I need to re-read and chew on the previous posts, all way too technical for me.
Investigated memory, investigated CPU throttling and performance in general. Investigated networking/infrastructure… everything ok.
For now I revert back to 96K max.
Actually, I think that is exactly where @spockfish is at the moment. What Motherboard/CPU/Memory are you using?
Just shoot me now.
I installed a second hard drive in the machine that hosts my files. I installed Ubuntu Server 16.04 on it. Installed RoonServer and practically nothing else. Then I mounted the original hard drive as a data drive. It’s at the same mount point in both OSes, so I didn’t have to reconfigure a ton of Samba stuff.
And it fails. But a little differently now. Now I’m seeing failure to noise followed by an interval of music, failure to noise for a while, then music again for a while. Seconds, a minute or so tops.
Now this machine is not very powerful, compared to the Fedora machine on which I’ve been trying to run RoonServer.
[root@asok ~]# lscpu
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
On-line CPU(s) list: 0,1
Thread(s) per core: 1
Core(s) per socket: 2
Vendor ID: AuthenticAMD
CPU family: 15
Model name: AMD Athlon™ 64 X2 Dual Core Processor 3800+
CPU MHz: 2008.927
L1d cache: 64K
L1i cache: 64K
L2 cache: 512K
It has only two gigs of RAM.
At first, I was getting noise on both 96K and 192K files. I had the data mounted via CIFS. I changed that to a local file system mount and things improved a little, to where 96K files play alright, but no change at 192K. This felt like lack of resources to me, so I tried lessening processor and memory load a little by turning off background processing and reducing memory reserved for pictures in Roon. Might have helped a little, but didn’t resolve the issue.
I tried setting bit rate limiting. If I limited the bit rate to 48K, my 192K file played perfectly. Despite the increased processor load, go figure, right? Limiting bit rate to 96K didn’t help - same pattern of failure and music in intervals.
Whether playing a 192K file, a 96K file, with bit rate conversion on or off, system load looked like this:
root@asok:/home/carl# ps -aux | grep RoonServer
root 2003 0.0 0.0 12540 1900 ? Ss 16:35 0:00 /bin/bash /optRoonServer/start.sh
root 2009 0.0 0.4 544288 13920 ? Sl 16:35 0:01 /opt/RoonServe/Mono/bin/RoonServer --debug --gc=sgen --server RoonServer.exe
root 2020 80.5 33.6 1762752 1036668 ? Sl 16:35 80:05 /opt/RoonServe/Mono/bin/RoonAppliance --debug --gc=sgen --server RoonAppliance.exe -watchdogport=33452
root 2027 0.0 0.0 7328 780 ? S 16:35 0:00 /opt/RoonServe/Server/processreaper 2020
root 2050 0.0 0.2 830992 8836 ? Sl 16:36 0:03 /opt/RoonServe/Mono/bin/RAATServer --debug --gc=sgen --server RAATServer.exe
root 4262 0.0 0.0 14228 980 pts/0 S+ 18:15 0:00 grep --color=auto RoonServer
That’s CPU hovering just under 80% and memory at about 30%, give or take a point or two.
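If you want to track those numbers over several runs instead of eyeballing them, a small Python sketch like the one below can total the %CPU and %MEM columns from pasted `ps aux` output. It assumes the usual 11-column `ps aux` layout (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND); `summarize_ps` and `name_filter` are illustrative names only.

```python
def summarize_ps(ps_output, name_filter="Roon"):
    """Sum %CPU and %MEM over `ps aux` lines whose command contains name_filter.

    Lines for the grep process itself are skipped, as is anything that
    does not look like a full 11-column `ps aux` row.
    """
    total_cpu = 0.0
    total_mem = 0.0
    for line in ps_output.splitlines():
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        command = fields[10]
        if name_filter not in command or command.startswith("grep"):
            continue
        try:
            cpu = float(fields[2])
            mem = float(fields[3])
        except ValueError:
            continue  # header row or otherwise non-numeric columns
        total_cpu += cpu
        total_mem += mem
    return total_cpu, total_mem
```

One caveat worth keeping in mind: the %CPU figure that `ps` prints is CPU time divided by how long the process has been running, so it is an average over the process lifetime and not an instantaneous reading. Something like `top` is a better fit for watching what happens at the moment playback falls over.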
Now as I write this, I’m playing to an Oppo HA-2, through my laptop as a Roon endpoint, and a 192K file plays perfectly, with a nice purple “lossless” signal path. Which suggests to me that the meager power of this server machine isn’t really the issue. UPDATE: I just checked and indeed, streaming to the Roon endpoint does take less CPU - 56.6%, compared to ~80%. Memory use is actually up a little, at 33.4%. Hmmm. But before I get excited, I remember the Fedora machine has ample resources.
I suppose I could go back to one of the virtual machines and cut back memory and CPU capacity in increments until it fails to see if I can make the same failure mode.
EDIT: If I could make Roon fail on a VM, I’d be happy to send you guys the whole VM, so you’d have it. But so far, it seems bulletproof on VMs. At least I suppose I’ve shown that it is possible that Harry’s machine and my original one are not the only two in the world with this problem. Maybe you will be able to replicate.
Yeah, I thought “network” again. But there’s nothing really in common between Harry’s environment and mine and both fail. Everything in the network where I’m getting failures is also there (and more) when I put a VM on that same network. So that thought exercise didn’t go anywhere.
Hopefully, I’ve at least injected a few new data points into the mix and maybe something will ring a bell with somebody. Hopefully. To tell the truth, I did this so I might be able to just listen to music for a while without having to mess with IT work every time I sat down to listen. But that was not to be, I guess.
And in case… here are all the log files from RoonServer on the new machine:
And another promising thought leads nowhere…
@brian Your post got me thinking.
Based on what we’ve seen, especially the fact that it doesn’t appear that I’m having resource issues on the server end, and that we’ve eliminated the network as a suspect (although a 192K PCM stream stresses my setup right to the max), I figured it would be a reasonable guess that we are overwhelming the I/O capabilities of the Squeezebox. Then the least little thing happens and kerplewy.
So the least little thing appears to be happening after RoonServer and before the NIC of the server’s hardware.
So I thought “buffer”. It turns out that there are buffer settings on the client side, exposed in the EDO UI. I couldn’t find any documentation as to exactly WHAT buffer, but what the heck. So I tried all of the available settings. And no joy.
So then I set the transmit buffers on the server box to about 8MB (about double the old max value and some 40x the default, which was 212K-ish). And. Still no joy.
Now, what I know about buffers wouldn’t fill one, so I could be on the right track but not going in the right direction. So this line of thought might still be of value.
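A quick way to see what the kernel actually hands out after a change like that is to open a TCP socket and ask. This is only a sketch in Python: `probe_send_buffer` is a made-up helper, and it says nothing about what RoonServer itself requests for its own sockets, which I don’t know.

```python
import socket


def probe_send_buffer(requested_bytes):
    """Request a TCP send buffer size and report what the kernel grants.

    Returns (default_size, granted_size) in bytes for a fresh,
    unconnected TCP socket.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        default_size = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested_bytes)
        granted_size = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return default_size, granted_size


if __name__ == "__main__":
    default_size, granted_size = probe_send_buffer(8 * 1024 * 1024)
    print(f"default: {default_size} bytes, granted: {granted_size} bytes")
```

On Linux the granted value is capped by net.core.wmem_max, and the kernel doubles the requested number for its own bookkeeping, so the figure read back will not normally equal the figure asked for. If the granted size comes back far below 8MB, the new maximum probably did not take effect.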
And the lower level reason may yet provide a clue: We know that we’ve never made a failure on a virtual machine, even when the virtual machine is the same OS version as the host it’s running on, and RoonServer directly on the host fails. So, our signal path, so to speak, is going from RoonServer through the virtual machine’s kernel and network stack, through some virtual machine magic that functions as a router, then from the virtualization software to the host machine and out its network stack, and on through the same NIC as when the system fails. And despite all that complexity and latency, it works. In other words, passing through the OS once fails, but twice succeeds! Thus my thought that doubling the size of the transmit buffer might help. It didn’t, at least the way I did it. But what else could there be, through which two passes instead of one might be doing us some good?
BTW, I think that streaming as FLAC may be of great help. We know that the little processor in the Touch can decode FLAC at 192K. But I/O-wise, 24/192 PCM is four times more than the device was designed for, and about ten times more than was probably routinely sent through it. (At least at my house. I stopped sending PCM from the server when the Touch came out. It was a benefit with the old SB-3s, but with the more powerful Touch, the only effect seemed to be network load.)
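To put rough numbers on the I/O side, here is the back-of-the-envelope arithmetic as a small Python sketch. It assumes stereo audio with samples packed at their nominal bit depth and ignores all protocol overhead, so treat the results as a lower bound on what goes over the wire; `pcm_bitrate_mbps` is just an illustrative name.

```python
def pcm_bitrate_mbps(sample_rate_hz, bit_depth, channels=2):
    """Raw PCM payload rate in megabits per second (no protocol overhead)."""
    return sample_rate_hz * bit_depth * channels / 1_000_000


if __name__ == "__main__":
    for label, rate, depth in (
        ("16/44.1", 44_100, 16),
        ("24/96", 96_000, 24),
        ("24/192", 192_000, 24),
    ):
        print(f"{label}: {pcm_bitrate_mbps(rate, depth):.3f} Mbit/s")
```

By this estimate 24/192 stereo PCM is 9.216 Mbit/s of payload, double the 4.608 Mbit/s of 24/96, and it has to arrive continuously. FLAC would cut that by some content-dependent amount, which is the whole attraction of streaming it natively.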
I hope maybe something here rings a bell.
In the meantime, back to cleaning up. One of my cats barfed on my favorite headphones.
I agree regarding the FLAC streaming.
My impression is, without FLAC streaming, the SB Touch is very near “the edge” of its networking capabilities at 192k. When it goes over the edge, the symptoms happen. Moving to a VM, and the other things that you guys are testing, push it a little closer to or further from the edge, but they’re not central to the issue.
From our perspective, the real next step here is to test again once FLAC streaming support rolls out with 1.3.
That sounds sensible. Agreed about the Touch working at the edge of its capabilities.
I would love to know why on earth we’re OK with virtual machines and not real ones. But if it’s to become a moot point, it’s just one of those things. We could put our effort to better use elsewhere in the interim. (Like adding Focus to Radio.)
Any timeframe for the release of 1.3?