Intermittent loss of connectivity to Roon Server and music stops

@dylan - it has now been 8 days since I last recorded a crash in my log files, the music has not stopped playing once without human interaction either.

Today I reconfigured the networking setup on my Ubuntu Server box to use NetworkManager both for the ethernet and WiFi interfaces, but using a static IP and turning off DHCP4 for the ethernet interface. I believe there is no problem using NetworkManager, as long as DHCP4 of active interfaces is turned off. The coming days will tell…

The Netplan configuration is as follows:

network:
   version: 2
   renderer: NetworkManager
   ethernets:
      enp3s0:
         dhcp4: false
         addresses: [10.0.4.116/24]
         gateway4: 10.0.4.1
         nameservers:
            addresses: [10.0.4.116, 8.8.8.8, 8.8.4.4]
   wifis:
      wlp2s0:
         dhcp4: true
         access-points:
            "your SSID":
               password: "your password"

It’s now 72 hours since I reconfigured the ethernet interface of my Roon core server back to be managed by NetworkManager. I can report that all is well, there have been no more crashes of RoonAppliance. It seems that so long as the interface doesn’t use dhcp4 there is no problem.

So, for those who run their Roon core on Ubuntu Desktop or other distributions derived from Ubuntu or dependent upon NetworkManager, the simple recommendation is to configure the ethernet interface with a static IP, gateway and DNS servers, effectively turning off the use of DHCP4 for this interface.

On Ubuntu, you can do this simply by modifying the Netplan configuration in /etc/netplan. E.g., this does the job:

network:
   version: 2
   ethernets:
      enp3s0:
         dhcp4: false
         addresses: [10.0.4.22/24]
         gateway4: 10.0.4.1
         nameservers:
            addresses: [8.8.8.8, 8.8.4.4]

Use the correct device name of your ethernet device and your corresponding IP and gateway addresses. Be careful with the indentation and use spaces, not tabs, to indent. Try the configuration with sudo netplan try and, if all is well, install and activate it with sudo netplan apply.
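
Since a stray tab is easy to miss, here is a minimal sketch of a check for the "spaces, not tabs" rule, assuming Python 3 is available on the machine. The function name and the sample text are only illustrative and are not part of Netplan:

```python
from typing import List


def tab_indented_lines(text: str) -> List[int]:
    """Return the 1-based numbers of lines whose indentation contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Everything before the first non-whitespace character is the indent.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad.append(number)
    return bad


if __name__ == "__main__":
    # Illustrative sample: the third line is indented with a tab.
    sample = "network:\n   version: 2\n\tethernets:\n"
    print(tab_indented_lines(sample))
```

Pass it the contents of your own file under /etc/netplan; an empty list means no line is indented with a tab. It only looks at indentation, so sudo netplan try is still the real test of the configuration.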

Users on Ubuntu Server should not be affected and need not reconfigure the ethernet interface.

Many thanks for the continuing investigation and reporting.

Hi @dylan , what has become of this?

Hi @Andreas_Philipp1

This is something I spoke about with the team this morning. We are still working on this, but I don’t have any specific updates I can provide just yet. We’ll be sure to provide more information as soon as we can. Thanks!

Hello @dylan ,

I wonder if you could supply an update on this reported issue and perhaps specifically answer a few quick but important questions?

  1. Has the root cause been identified? If so can you share?
  2. What is the suspected scope of impact? Is this just Ubuntu or are other platforms susceptible? How about Nucleus? Sonic Transporter? Are there any Linux based platforms that are less prone to be impacted?
  3. Do you have a specific timeline for possible resolution?

I’m a new Roon user (as of today) and I’m seeing something similar. I am running Ubuntu 20.04.2 LTS server (no GUI/Desktop, fresh install today) in a VM on a Mac mini and I’ve had music playback stop abruptly on me 3-4 times since setting everything up. It seems to happen when SSHing into Ubuntu after a while.

Can we please get an update @support?

I seem to have fixed my issue by removing the network share mount that I created in Roon server during initial setup and mounting the same share directly in Ubuntu and then pointing to it in Roon server as if it were a local directory.

Thanks for reaching out about this one! I can confirm that we have a ticket in with our team, but I can’t provide any specific timelines just yet. It’s something we are actively working on, though, and we’ll be sure to let everyone know when we have something available. We appreciate your patience as we continue looking into this.

Hi @dylan,

Can you please attempt to answer the questions that I posed?

Regards.

Hi @Robem

Sorry for the late followup here.

I don’t have the specific technical reasoning behind it, but ultimately we have a memory leak in some cases seemingly stemming from certain repeated network calls.

I’ve asked the team if they can provide some information on this. I’ll follow up when I hear back from them.

I cannot comment on timelines just yet, but we’ll be sure to let everyone know when we have news here.

@dylan I have to be honest with you, my annual subscription which has been active for a few years now is up for renewal here shortly and I am seriously considering not renewing.

This issue was handed to Roon Labs on a plate with analysis, evidence, workaround and everything that is needed for your team to recreate the issue, identify root cause and implement a resolution. Even now, over six months later, there is no resolution in sight, minimal communication and repeated vagueness in all attempts at dialogue. This is totally unacceptable, and it is not the only example of an inadequate support model that I have direct experience with. This lack of support for paying customers is now presenting itself as a trend that I am not normally prepared to live with.

Why can’t or won’t you answer the questions posed?
When can we expect a resolution?
Why should I renew my annual subscription that is about to expire when your support is almost non-existent when things get complicated?
Why do Roon Labs struggle to communicate effectively? (These forums are riddled with examples).

I now wonder if Roon Labs will be able to rescue a customer service concern that will result in an annual subscription slipping away or whether they will just allow that to happen.

Sincerely,
Robem

@dylan to add some context to the unacceptable response received. In my workplace this is what we call a CII (change induced incident) that was introduced when 1.7 667 was released. These types of incidents are the most critical in nature and should receive the absolute highest priority. Instead of addressing this, Roon Labs has seen fit to release a major upgrade to the interface and introduce even more bugs that have taken priority over fixing what was broken back in October.

The most important question that I am looking to get answered is what other Linux platforms are there that are not susceptible to this memory leak?

Please advise as a matter of priority.

Hey @Robem, I know how frustrating it must be to run into trouble like this. You were getting along fine, then things went sour.

I wanted to reveal a bit more about how we approach resolving problems like this so that you have better insight into what we (the support team) can do for you today.

Our first step is always to work with affected customers to gather data about their systems. Then we raise the issue with the QA team who does further research to figure out the nitty-gritty of what’s going on, determine the scope of customers affected, and provide the development team with reliable steps to reproduce the issue.

The support team doesn’t write the code, but we do make sure that issues like yours get appropriately reported and escalated. From there, the product and development teams consider it against fixing other bugs, developer bandwidth, building new features, company goals & roadmap, etc., and then they have to go actually spend time fixing the thing.

For this particular issue, the scope of customers affected is small, and there is a known workaround for those who are. This should be enough of a stop-gap until a proper fix is prioritized, built, and released.

All that said, it doesn’t mean that resolving this for you isn’t important to us (it’s something we regularly bring up and advocate for). I know this doesn’t change the fact that the issue isn’t fixed, and it’s not meant to be an excuse, but I hope this at least gives you more helpful context.

Hello @kevin

Thank you for your response, however it doesn’t address the questions asked a few times now. Perhaps I should give some background on the importance of getting answers to at least one of the questions that I posed.

This issue is acknowledged as stemming from a memory leak. I also have another issue on these forums that is also unresolved and also appears to be related to memory handling.

My questions regarding platforms impacted by this/these memory leaks would be useful in guiding me towards an alternative platform that I could install core on that would be more stable. As of yet, not one single attempt has been made to provide an answer to me and I am getting discouraged by this.

I have about 10 days left on my subscription and I am struggling to make a decision on whether I should renew. The most challenging part of this decision is not the issues with the software as much as it is the quality of support provided. In all honesty, it seems to me that as an organization you guys are inundated on the support front and are unable to respond in a timely manner to customer needs. This is the factor that I am struggling with the most when I am considering whether or not I should pay for another 12 months.

Sincerely,
Robem