This aligns well with my observations. From also looking at core logs, I’ve formed the hypothesis that Roon Core 234 somehow mistakenly senses that an Android Remote is unreachable, and then sets some internal state that blocks any further handshake with that Remote.
This is driving me to strange, desperate lengths to try to get my handheld Roon controller back.
Just in case something somewhere in this communication was allergic to the Asus WiFi access point I’ve been using (reliably with everything except Roon Remote), I thought I’d give a Ubiquiti UniFi access point a try.
I got it configured, switched my Pixel XL to its wireless network, and… Roon Remote worked!
For a while. Then it seized up again.
I have no idea what made it work, and what made it stop working. Powering off and rebooting the access point didn’t make Roon Remote connect. Stopping and re-starting RoonServer didn’t make a difference. Rebooting my phone didn’t make Roon Remote work. I didn’t try shutting down and restarting the host RoonServer runs on, because that’s a giant pain in the ass which didn’t make a difference the last time I tried it.
Note that when the Android phone went from not working to working, RoonServer had been running continuously for many days; all that had happened was tearing down and rebuilding the WiFi network… but see my failure to recreate this effect via power cycling.
Since I’m guessing few people’s networks include copper<->fiber media converters at each end of a fiber run on the way to the WiFi access point, as ours does, I began suspecting them (even though laptops running Roon and phones running everything but Roon use that network path successfully). So I strung a long, messy run of Cat6 to feed the access point, making the whole path between the Roon server host and the access point plain copper.
Is there anything we can identify as common across setups where Roon Remote is failing? Are we all using WiFi access points which are separate from our routers and bridged to the rest of the LAN? Are there commonalities in Android versions or Android-device hardware? [The latter seems a bit unlikely, because even just the Android hardware I’m trying spans a few flavors of hardware and software.]
Did the UDP-to-TCP protocol change that was mentioned as part of the recent Roon release affect communication with the Roon app, or just communication with playback zones?
Is the server not just trying to connect to the app as a controller, but also to connect to it as a potential playback zone? I think I see hints of that in the logs. If the Core is trying to hook into the Roon Remote app as a potential playback zone, could something about that negotiation be poisoning the well?
Is there any way to configure the Roon app to force it to act as purely a controller, with no attempt to be a playback zone?
These are the things I keep thinking about as I flail around trying to see if I can get functionality back.
Do I need to buy an iPad? That would truly be an act of desperation.
Your experiments and analysis match mine. I’ve been using a UniFi AP all along, which worked perfectly for years. Like you, I have a funny feeling that the UDP-to-TCP change in RAAT together with the remote being treated as a potential playback zone might be tickling some bug in Android networking or the Android Mono implementation that causes the core to decide the remote has gone AWOL.
And yes, my iPad works perfectly as a remote. If I had not got an iPad last month for other reasons (long story), I’d be as stumped as you are.
I haven’t found a systematic way to get the Android app to connect to the core. Sometimes just rebooting the core fixes it, sometimes I have to reboot the core multiple times and/or reboot the Android device (Samsung S6 on Android 7) to get it to reconnect…
It’s set to always. I haven’t tried disabling and re-enabling the WiFi; I might try that next time it happens.
First off, we appreciate everyone’s patience as we’ve worked through this issue. When we get a clear set of steps to reproduce an issue, our team can almost always identify the issue quickly, and in many cases we can resolve it for our next release.
In this case, the reports have stretched across a number of releases, platforms, and configurations, and to complicate things further it’s not something anyone inside our company has seen with any regularity, despite the fact that more than half our team uses Android devices in their homes.
With many, many users running our Android app without issue, obviously there is something unique about the affected environments that is triggering this behavior, and we need to figure out what this is so we can investigate this issue in a controlled environment and resolve it.
Our team has been actively discussing what could be causing these connectivity issues in certain configurations, and our QA team has spent time trying to identify the conditions that trigger this issue. Particularly because these reports often appear to be intermittent, the results have been largely inconclusive…
After numerous internal discussions, we’ve decided to take a more methodical tack and gather all the reports and troubleshooting steps in one place. We are aware that, for some of you, this will feel like you are repeating yourselves, but the goal here is to be comprehensive – to get all the setup details in one place, and to get all the troubleshooting results in one place, so we can identify any patterns that might help us consistently reproduce this issue in-house.
At your earliest convenience, it would be great if everybody could fill out the survey found here. We really appreciate everyone taking some time to run through the questions and steps listed in the form, even if you’ve already provided this information to us elsewhere.
Looking forward to making progress on this and getting it resolved soon. Thanks all!
I did the survey, but it is a bit shaky; take, for example, the last question, where you shut down the remote and try to reconnect. That worked the first time, i.e. I answered that it worked. After finishing the survey I shut down the remote again, and then it could not reconnect …
Also the question “Have you tried disabling all firewalls”: when you have no firewalls to disable … then you need to answer “No I have not disabled the firewalls” …
Going through the survey was interesting. For one thing, I learned that the particular sequence of:
- shut down RoonServer
- start Roon Remote on the Android device
- while the above is still feeling around for the Core, restart RoonServer
resulted in the remote connecting for the first time in pretty much forever. So, fascinating.
It also resulted in my assigning myself the project of trying to figure out if there could be any jumbo-frame action between RoonServer and Roon Remote. Seems unlikely, but when I get back home I’ll at least double-check what MTU the Ethernet port in use on the server is set to use. I’m expecting to find it at (the default of?) 1500, but will check. Or is there some other setting or some characteristic of intermediate equipment I should be looking at?
I just double-checked MTU from my Ubuntu NUC:
$ traceroute --mtu 192.168.2.69
traceroute to 192.168.2.69 (192.168.2.69), 30 hops max, 65000 byte packets
 1  192.168.2.69 (192.168.2.69)  357.827 ms F=1500  71.934 ms  2.449 ms
192.168.2.69 is an Android device (Pixel) that is unable to talk to the core unless the core is stopped, the Android app is restarted, and then the core is restarted.
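For anyone who wants to repeat this MTU check against several devices, here is a minimal Python sketch that pulls the `F=` value (the MTU that `traceroute --mtu` reports) out of the command's text output. The function names are made up for illustration, and the sample text is simply the output quoted above; it assumes the `F=NNNN` format shown there.

```python
import re
from typing import List, Optional

# Sample taken from the traceroute --mtu output quoted in this thread.
SAMPLE = (
    "traceroute to 192.168.2.69 (192.168.2.69), 30 hops max, 65000 byte packets\n"
    " 1  192.168.2.69 (192.168.2.69)  357.827 ms F=1500  71.934 ms  2.449 ms\n"
)


def mtu_values(traceroute_output: str) -> List[int]:
    """Return every F=<mtu> value found in the output, in order of appearance."""
    return [int(m) for m in re.findall(r"\bF=(\d+)\b", traceroute_output)]


def path_mtu(traceroute_output: str) -> Optional[int]:
    """Return the smallest reported MTU, or None if no F= marker is present."""
    values = mtu_values(traceroute_output)
    return min(values) if values else None


if __name__ == "__main__":
    print(path_mtu(SAMPLE))
```

A result of 1500 here only says the path is not using jumbo frames; it does not by itself rule out other network causes.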
Well, I did it. I bought a freakin’ iPad so I can control Roon conveniently while resolution of this issue is playing out. It’s good to check out the other popular mobile technology family every half-decade or so, I guess.
I’ve confirmed that I still pretty strongly dislike iOS - even in its very latest flavor on current hardware, its clunky user interface just makes devices feel like duller tools than up-to-date Android devices - but I guess its network stack and apps involve a different codebase from Android, and it’s been working solidly as a Roon remote.
So now I’m experiencing less daily frustration and aggravation from the current bug, but I’ll of course continue to do whatever I’m asked to do to help debug it.
I agree. I got my current iPad when I left my Nexus 7 in the seat pocket of an intercontinental flight and I could not find a satisfactory Android replacement. Two months later, the airline found the lost Nexus 7 and returned it, but by then it was too late. However, the iPad has a beautiful screen.
Some good news, and some confusing news.
The good news: my Android Roon app seems to be connecting quite happily to RoonServer.
The confusing news: I saw that a new Roon Remote app was available to install, so I tried the old one again just to confirm that it didn’t work, before trying the new one.
And the old app (build 233) seemed to work fine!
I installed the new one (build 242) and it’s working great as well.
This was all still before updating the build 234 Core.
So this doesn’t provide as much useful information as hoped.
In between when build 233 was not working for me and now, when it is… the only significant change I can think of along the path between app and Core is that I replaced our faithful central Juniper switch with one from Ubiquiti. I didn’t actually re-test the Android Roon app after that change because I was sunk so far into despair that I’d given up on life and begun using an iPad as my remote.
So… huh. I’m sorry that I don’t have good data to give you about what impact the new app build has on a busted setup.
The last build (233 ?) had become stable for me - with my 'phone only rarely losing contact, but regaining it quite soon afterwards.
Since updating to 242 this morning - 1 (NUC) core & 2 (Windows PC) remotes - my 'phone has failed to connect.
I’ve rebooted everything in the chain, but to no avail - still looking at the ‘Remote Connection’ screen.
Yes, for me also: 242 totally killed my Android Remote. The only way to make it work for a very short time is to restart the ROCK server … not a very good user experience, needing to restart the server every time you want to change track.
@mike please tell us there is some good news on a fix coming in the near future.
No difference for me with the 242 update.
I have also noted that the Android device connected to my DAC is still available to play music through, even though it shows the ‘Remote Connection … Waiting for remote core’ screen and doesn’t appear connected to the core.
Same here, 242 makes no difference on my Pixel, still stuck.
@support Do you happen to have URLs for previous versions of the APKs of the Android apps, so we can go back to the ones that work for us until you get the issues fixed?
My setup is the latest Roon core on an Arch Linux machine. The core goes through HQPlayer, running on the same machine. It connects with my DAC through NAA via microRendu. The remote is a Windows 10 laptop. All of that is working well. The remote app on my Meizu Android 6.0 phone doesn’t work. It can’t find the core, even when I let it scan by IP address. When my laptop is still on, the app sees the Windows 10 remote, but never the Arch Linux core. My VNC app on the phone contacts the Arch Linux machine without any problem.
I checked all the settings on the phone, but can’t find any strange things. Now I am out of ideas. Do you have a clue?
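One thing that can help narrow this kind of situation down is a plain TCP reachability test against the core machine, run from another computer on the same WiFi network. Below is a minimal Python sketch of such a test. The helper name is made up for illustration, and the host and port in the usage comment are placeholders: this thread does not say which ports the Roon core listens on, so you would have to substitute a port you know to be open (for example, the one your VNC server uses).

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example usage (placeholder address and port, not Roon-specific values):
#   print(tcp_reachable("192.168.1.10", 5900))
```

A successful connection only shows that the network path and that one port are fine; it says nothing about whether the Roon discovery or handshake itself succeeds.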
I moved your post into this thread. Be sure to read this message above by @mike and fill out the survey – it will help getting this issue resolved more quickly:
Thank you for placing my post in this topic. And thanks for the link to the survey. I filled it in… Let’s hope they will figure this out. In the meantime I noticed that starting or restarting the core while the app is running on the phone is a good workaround.