Android app having trouble finding Linux Core

I have not had any issues on my Samsung Galaxy S8 for a while now (at least a couple of weeks), since two things happened:

  • update to Android 8.0,
  • add Roon to Settings/Device Maintenance/Battery/Unmonitored Apps.

Note that it was in there before the update to 8.0, but it made no difference.
No other changes in my network that I remember, except maybe a software update to the Google WiFi pucks.

Curious, some others who have reported this problem at some point, including me, have UniFi APs and Ubuntu too… OTOH, my Pixel 2 has been connecting well for quite a while.

I rebooted my APs yesterday which had no effect on the situation. I rebooted the Roon Core afterwards. Android 6x. devices are still not able to connect. My Roon Remote on Android 7.x tablet (which was working without problems before) got stuck in the “intializing” screen (never had that beefore). After rebooting the tablet I was able to connect again. My windows client is working all the time. I will reboot my router tonight and see if this helps. If not, there is nothing left to reboot and I am pretty clueless what to do next…

I restarted my router but this didn’t change anything. I started to dig a little deeper and found one strange thing:

The device with the Roon Remote is a Sony Xperia Z3 Compact (modelno. D5803) running Android 6.0.1. The IP address is assigned via DHCP and is 192.168.0.204 in my internal network. This IP address can be seen in the “About the phone” menu in Android and is also visible in my UniFi management tool for the WiFi.

When I start the Roon Remote app I can see the following lines in the RoonServer Logs

03/21 19:07:35 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.218:48149] lost client connection. Retrying
03/21 19:07:35 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.218:48149] connecting (attempt 1)
03/21 19:07:35 Warn: [rnet/RnetJsonClient] failed to connect Connection refused
03/21 19:07:35 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.218:48149] client connection failed. Retrying in 500ms
03/21 19:07:36 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.218:48149] connecting (attempt 2)
03/21 19:07:36 Warn: [rnet/RnetJsonClient] failed to connect Connection refused
03/21 19:07:36 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.218:48149] client connection failed. Retrying in 750ms
03/21 19:07:37 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.218:48149] connecting (attempt 3)
03/21 19:07:37 Warn: [rnet/RnetJsonClient] failed to connect Connection refused
03/21 19:07:37 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.218:48149] client connection failed. Retrying in 1125ms
03/21 19:07:38 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.218:48149] connecting (attempt 4)
03/21 19:07:38 Warn: [rnet/RnetJsonClient] failed to connect Connection refused
03/21 19:07:38 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.218:48149] client connection failed. Retrying in 1687ms
03/21 19:07:40 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.218:42344] connecting (attempt 5)
03/21 19:07:40 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.218:42344] connected
03/21 19:07:40 Trace: [rnet/RnetJsonClient] SENT {"request":"enumerate_devices","subscription_id":"0"}
03/21 19:07:40 Trace: [rnet/RnetJsonClient] GOT NONFINAL {"status": "Success", "devices": [{"auto_name": "Jörg's D5803", "device_id": "default", "is_system_output": true, "type": "android", "name": "Default Output", "auto_enable": true}]}
03/21 19:07:40 Info: [raatserver] GOT DEVICE 854862d3dbc3ce7a970100efbe4705a9::default Type=android Name=Default Output 

The IP address in the logfile is part of my DHCP address pool but is not identical to the one currently assigned to the phone. In my UniFi management tool I see that no other device has this ip address assigned at the moment.

I don’t understand why this is happening but it seems obvious that server and client cannot communicate with each other using different addresss.

I am a newbie to Linux and Ubuntu but I checked the arp cache (arp -n) on the server which shows those entries:

192.168.0.204 ether c4:3a:be:a9:a3:e5 C eno1

which is the Xperia phone with the actually assigned adress.

192.168.0.218 (incomplete) eno1

which is the currently unused address the Roon server is trying to talk to.

Any thoughts on this ? Any help is highly appreciated…

The Roon server is now detecting the correct ip address for my Xperia phone but the connection still doesn’t work (see log below)

03/21 21:30:04 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.204:41152] connecting (attempt 1)
03/21 21:30:14 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.204:41152] client connection failed. Retrying in 500ms
03/21 21:30:15 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.204:41152] connecting (attempt 2)
03/21 21:30:15 Info: [stats] 2568mb Virtual, 850mb Physical, 136mb Managed, 0 Handles, 36 Threads
03/21 21:30:25 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.204:41152] client connection failed. Retrying in 750ms
03/21 21:30:26 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.204:41152] connecting (attempt 3)
03/21 21:30:30 Info: [stats] 2568mb Virtual, 850mb Physical, 137mb Managed, 0 Handles, 34 Threads
03/21 21:30:36 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.204:41152] client connection failed. Retrying in 1125ms
03/21 21:30:37 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.204:41152] connecting (attempt 4)
03/21 21:30:45 Info: [stats] 2568mb Virtual, 850mb Physical, 137mb Managed, 0 Handles, 36 Threads
03/21 21:30:47 Trace: [raatserver] [RaatServer Jörg's D5803 @ 192.168.0.204:41152] client connection failed. Retrying in 1687ms

I checked again for IGMP snooping settings but the Technicolor router of my ISP has no option to turn this feature on or off. One of my switches is unmanaged (so no configuration options there). On my managed UniFi switch I turned off all spanning tree features.

I also did firmware upgrades on my UniFi Switch and APs.

@ProperlyDeafened: you made my day :slight_smile:
restarting the roonserver with the commands you posted did the trick

I don’t get it… what is the difference between rebooting the server or just stopping/starting the service ?

Is it possible that the boot order is the problem on my Linux system ? Is Roon starting too early or at a point where there some unfinished network tasks on the OS-level ? (I only know that my KODI media center has a configuration option which basically says “wait for network before KODI is launched”)

If you go back to the initial discussions of this problem last year, you’ll find a suggestion (I think from Roon) to do exactly that: stop and restart the Roon Core binary while the Android remote is trying to connect. That (almost) always works, but it is a nuisance. I’m not convinced the problem is a rece condition between Core start and Linux network stack start, though, the Core starts at the very tail-end of the boot process:

Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0231] device (eno1): link connected
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0235] device (eno1): state change: unavailable -> disconnected (reason 'carrier-changed') [20 30 40]
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0241] policy: auto-activating connection 'Wired connection 1'
Mar 17 23:13:24 rishiri kernel: [    8.023755] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Mar 17 23:13:24 rishiri kernel: [    8.023800] IPv6: ADDRCONF(NETDEV_CHANGE): eno1: link becomes ready
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0251] device (eno1): Activation: starting connection 'Wired connection 1' (11d20ead-7450-349f-9a38-3a6c841578f4)
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0252] device (eno1): state change: disconnected -> prepare (reason 'none') [30 40 0]
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0253] manager: NetworkManager state is now CONNECTING
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0265] device (eno1): state change: prepare -> config (reason 'none') [40 50 0]
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0270] device (eno1): state change: config -> ip-config (reason 'none') [50 70 0]
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0295] dhcp4 (eno1): activation: beginning transaction (timeout in 45 seconds)
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0335] dhcp4 (eno1): dhclient started with pid 912
Mar 17 23:13:24 rishiri dhclient[912]: DHCPDISCOVER on eno1 to 255.255.255.255 port 67 interval 3 (xid=0xf711744e)
Mar 17 23:13:24 rishiri dhclient[912]: DHCPREQUEST of 192.168.2.58 on eno1 to 255.255.255.255 port 67 (xid=0x4e7411f7)
Mar 17 23:13:24 rishiri dhclient[912]: DHCPOFFER of 192.168.2.58 from 192.168.2.1
Mar 17 23:13:24 rishiri dhclient[912]: DHCPACK of 192.168.2.58 from 192.168.2.1
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0586]   address 192.168.2.58
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0586]   plen 24 (255.255.255.0)
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0586]   gateway 192.168.2.1
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0586]   server identifier 192.168.2.1
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0587]   lease time 86400
Mar 17 23:13:24 rishiri avahi-daemon[754]: Joining mDNS multicast group on interface eno1.IPv4 with address 192.168.2.58.
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0587]   nameserver '192.168.2.1'
Mar 17 23:13:24 rishiri avahi-daemon[754]: New relevant interface eno1.IPv4 for mDNS.
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0587] dhcp4 (eno1): state changed unknown -> bound
Mar 17 23:13:24 rishiri avahi-daemon[754]: Registering new address record for 192.168.2.58 on eno1.IPv4.
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0596] device (eno1): state change: ip-config -> ip-check (reason 'none') [70 80 0]
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0601] device (eno1): state change: ip-check -> secondaries (reason 'none') [80 90 0]
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0604] device (eno1): state change: secondaries -> activated (reason 'none') [90 100 0]
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0606] manager: NetworkManager state is now CONNECTED_LOCAL
Mar 17 23:13:24 rishiri dhclient[912]: bound to 192.168.2.58 -- renewal in 42263 seconds.
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0636] manager: NetworkManager state is now CONNECTED_GLOBAL
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0637] policy: set 'Wired connection 1' (eno1) as default for IPv4 routing and DNS
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0638] dns-plugin[0x17d94f0]: starting dnsmasq...
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.0654] dns-mgr: Writing DNS information to /sbin/resolvconf
Mar 17 23:13:24 rishiri dnsmasq[923]: started, version 2.75 cache disabled
Mar 17 23:13:24 rishiri dnsmasq[923]: compile time options: IPv6 GNU-getopt DBus i18n IDN DHCP DHCPv6 no-Lua TFTP conntrack ipset auth DNSSEC loop-detect inotify
Mar 17 23:13:24 rishiri dnsmasq[923]: DBus support enabled: connected to system bus
Mar 17 23:13:24 rishiri dnsmasq[923]: warning: no upstream servers configured
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.1044] device (eno1): Activation: successful, device activated.
Mar 17 23:13:24 rishiri nm-dispatcher: req:2 'up' [eno1]: new request (1 scripts)
Mar 17 23:13:24 rishiri nm-dispatcher: req:2 'up' [eno1]: start running ordered scripts...
Mar 17 23:13:24 rishiri NetworkManager[769]: <info>  [1521353604.1067] dnsmasq[0x17d94f0]: dnsmasq appeared as :1.15
Mar 17 23:13:24 rishiri dnsmasq[923]: setting upstream servers from DBus
Mar 17 23:13:24 rishiri dnsmasq[923]: using nameserver 192.168.2.1#53(via eno1)
Mar 17 23:13:24 rishiri systemd[1]: Reloading OpenBSD Secure Shell server.
Mar 17 23:13:24 rishiri systemd[1]: Reloaded OpenBSD Secure Shell server.
Mar 17 23:13:24 rishiri systemd[1]: Reloading OpenBSD Secure Shell server.
Mar 17 23:13:24 rishiri systemd[1]: Reloaded OpenBSD Secure Shell server.
Mar 17 23:13:25 rishiri avahi-daemon[754]: Joining mDNS multicast group on interface eno1.IPv6 with address fe80::7e7e:d73e:6427:264c.
Mar 17 23:13:25 rishiri avahi-daemon[754]: New relevant interface eno1.IPv6 for mDNS.
Mar 17 23:13:25 rishiri avahi-daemon[754]: Registering new address record for fe80::7e7e:d73e:6427:264c on eno1.*.
Mar 17 23:13:29 rishiri NetworkManager[769]: <info>  [1521353609.0041] manager: startup complete
Mar 17 23:13:29 rishiri systemd[1]: Started Network Manager Wait Online.
Mar 17 23:13:29 rishiri systemd[1]: Reached target Network is Online.
Mar 17 23:13:29 rishiri systemd[1]: Started crash report submission daemon.
Mar 17 23:13:29 rishiri systemd[1]: Started RoonServer.

To update my own post:

After paying closer attention to what I’m doing, I see that to restore the connection to my Android remote, I hit “Save” in the Ethernet section on the ROCK home page. (I use an Ethernet connection.) That’s all. I’m mystified why this works, but it seems to work reliably. And I like it better than restarting Roon, because it doesn’t interrupt the music.
I wondered if there are others for whom this does not work. And if anyone has a theory why it works for me.

My theory, for what’s worth, is that when you hit “Save,” Roon Core resets ROCKs network stack to a well-defined state. My suspicion about this bug, from back when it started and I spent a lot of time trying to understand it, is that there is an interaction between core and remote network stacks and router/switch configuration around multicast that makes the core believe that the remote is not reachable. In my network’s case, the problem pretty much went away when I reconfigured a managed Netgear switch to stop doing IGMP snooping. But others have tried that without success.

1 Like

Hi all & @support,
After making the leap from Apple to Android I’m disappointed to find that I am suffering with something similar to that described at length above!
I have a Pixel 2, updated to latest version.
I run Core/remote (not server) (latest build) on Windows on an Intel NUC i5 - not Rock though.
I run devolo homeplugs and a BT Homehub 5 does DHCP (nothing static).
I have tried firewalls and battery optimisation. I have yet to try any network changes or simplification.
Happy to add another data point to a clearly challenging situation to get this resolved as it’s earily quiet!
I can get it working by clicking the setting in Roon that accepts remote connections but as soon as my phone sleeps I’m back to square 1!

This issue came back today after being gone (not solved) for months. I have not touched any network configuration to fix or to make the issue reappear.

Core runs on a win10pro nuc. Auto update enabled.
Pixel 2, Sony tablet and iPads as clients. All machine always updated to most recent version. iPads have always been unaffected.

Recalled a tip in one of these threads to disable igmp snooping in a managed switch. Did that and voila all is working again!

Weird though that it worked for months with igmp snooping enabled…

Now the problem spread to my iPad and Win10 Notebook, too. It seems Roon on Linux server is getting really bad now.

I have this same problem with the Android App on my Google Pixel 2. I just tried it for the first time on this device. I am running Roon Server inside a Bhyve VM on FreeBSD.

If I restart my RoonServer while the phone is on with the roon app running, it can find my server. After that if I kill and restart the app it cannot connect.

Windows 10 Desktop <-> switch <-> Roon Server - Works
Windows 10 Laptop <-> Ubiquiti AP <-> switch <-> Roon Server - Works
Google Pixel 2 <-> Ubiquiti AP <-> switch <-> Roon Server - Cannot Connect

Do you have any documentation on what the connection network protocol is supposed to be? A simple dump of the network traffic between Server and Phone should identify the problem.

If only it was that easy… If you look at this whole thread, you’ll find a lot of evidence that the problem has been very hard to track down. In my own system (which curiously also has an Ubiquiti AP), what has worked for the last several months was to turn off IGMP snooping on the switch that connects the Roon Core server to the rest of my network.

So I just tried it with my wife’s phone and it works. On my network its only my Google Pixel 2 that exhibits this problem.

Google Pixel 2 (Android 8.1) <-> Ubiquiti AP <-> switch <-> Roon Server - Cannot Connect
Samsung Galaxy S7 (Android 7.0) <-> Ubiquiti AP <-> switch <-> Roon Server - Works

This could be a bug in the google pixel phone or Android 8.1. Or some kind of configuration incompatibility. Or maybe the roon android app is doing networking slightly incorrectly and the Samsung phone’s sloppy design lets it work while the Pixel doesn’t.

I wonder if the guys at roon have any google pixel phones as part of their test environment?

IGMP Snooping has always been disabled on my switch.

Anyway, I can reproduce the error 100% so if roon devs want me to do some tests on my end let me know.

I did some more digging. It looks like the roon server is constantly multicasting out mdns queries and ssdp queries.

I installed SSDPTester app and verified my phone does see the SSDP packets.

I also tried this test app for multi-cast:
https://cafbit.com/post/testing_multicast_support_on_android/

Within that app, I see the multi-cast mdns packets coming from my roon core to my phone.

So it looks like the multicast traffic from roon core is indeed reaching my Google Pixel 2 phone over wifi. Either the app itself is not correctly receiving the packets from the Android OS or it receives them but then something goes wrong when it tries to send a reply back to the roon server to establish a connection.

I checked my RoonServer logs. When I start the app on my phone, I see that the roon core is able to connect to RAAT on my phone. (IP address hidden)

04/30 11:58:00 Trace: [raatserver] [RaatServer Matt Fioravante's Pixel 2 @ XXX.XXX.XXX.XXX:39387] connected

When I open roon from another machine and go to the audio settings, I can see my Pixel phone listed there as a networked device. I can go in and change its settings and rename it like any other audio device. Despite that, I cannot actually tell roon to send audio to it. The device is missing from the list in the main gui.

After that I looked at some packet captures. I’ve been so far unable to get a full capture on the actual phone, so I ran tcpdump on my roon server vm.

When I start the roon app on my phone, I can see UDP traffic going to port 9003 from my phone. From what I understand I believe that’s the RAAT network traffic which appears to be sort of working.

Next I filtered on TCP and noticed a specific difference in behavior.

When I open the roon app with my Windows client or with the Samsung Galaxy S7, the client initiates a TCP connection on port 9101 to the server. Some of the packets contain ASCII text saying “Sooloss.Msg.DistributedBroker.ConnectRequest” so this looks like the necessary traffic to establish a connection to the core server.

However when I open the roon app with my Google Pixel, I never see any TCP traffic over port 9101. Either the roon app on my phone never tries to establish this connection, or it does and then the TCP packets get lost somewhere (very unlikely as this is a normal unicast TCP connection).

Now finally one more piece of data. When I start the roon app on windows I can connect to my linux server as normal. Now after the windows app is running, if I try to start the roon app on my phone it will locate the windows machine but not the linux server. When I look at a packet dump, I can see my phone establishing that TCP connection over port 9101 to the windows machine. So somehow it works with windows but not linux.

Finally, going back to the logs. For windows and samsung, I see this:

04/30 13:15:42 Info: [brokerserver] Client connected: XXX.XXX.XXX.XXX:43367

I never see the above log message with my pixel. I never see any logs with the [brokerserver] tag.

So in summary, my google pixel 2 phone somehow does not try to establish the TCP connection over port 9101 to my roon core server on linux. I believe this is what is causing the problem. Without additional debugging/logging in the android app itself I’m not sure what else I can do to diagnose further. I have no idea why my google pixel exhibits the problem but my wife’s samsung galaxy S7 does not.

@support, let me know privately if you’d like me to try a debug package or want copies of my logs, packet dumps, or anything else. I’d really like to be able to use roon from my phone.

EDIT: One more data point.

There is one strange sequence of events that actually works. First I start the roon app on my pixel 2. It doesn’t find any servers. Next I reboot my server vm while leaving the app open. Once the roon server starts up, the pixel 2 finds it and is able to connect. In a packet dump, I see the TCP port 9101 traffic. Next I kill the roon app on my phone and restart it. After restarting I’m unable to connect indefinitely.

I hope this is enough information to help get at the bottom of this issue.

EDIT 2:

I was able to run tcpdump on the ubiquiti AP itself. The AP never receives any TCP packets on port 9101 from the Pixel 2. Very unlikely its a wifi or network configuration related issue.

3 Likes

My Pixel 2 worked as a remote for a day after installing 1.5, and although I saw nothing in the release note I was pleasantly suprised!

It appears to be unrelated though as it’s back to the same problem.

This is driving me mad!

Update:

I changed the Devolo settings not to use the DHCP settings from the router, I have not changed the DHCP settings on the Devolo but this has resolved my issue and I have two functional remotes again!

@mike Any updates on the work?