Possible API issue


(Harry ten Berge) #1

Hi @support

In RoPieee we use the Roon API for getting updates to display on the screen. Furthermore we use it for remote controls. So there’s an extension that talks to the Roon API, and this extension exposes an interface that is used by the RoPieee software.

Since a month or so I’m receiving reports of sudden stops: the information on the screen isn’t updated anymore. I’ve tried to figure out what it could be, suspecting my extension to be broken. From the logs from my users it appeared as if the extension all of a sudden does not receive updates anymore.

What makes it difficult that all this time I only had this situation myself twice. However, in most cases a restart of Roon core solves the problem. So lately I’m wondering if something could trigger Roon core to stop sending out updates.

Today I ran into the issue myself. Saved the logs of Roon core, and restarted it. This resolved the problem (where restarting the extension does not).

So, when everything is working I see this in the logs:

12/01 10:50:15 Trace: [roonapi] [apiclient 192.168.1.181:49584] CONTINUE Changed {"zones_seek_changed":[{"zone_id":"1601ba4eefe2dbc281c82643f9480c9e52bf","queue_time_remaining":26,"seek_position":569}]}
12/01 10:50:16 Trace: [Hugo2] [Enhanced, 16/44 FLAC => 32/44] [100% buf] [PLAYING @ 9:30/9:55] Happiness Is a Warm Gun - Tori Amos
12/01 10:50:16 Trace: [roonapi] [apiclient 192.168.1.138:42670] CONTINUE Changed {"zones_seek_changed":[{"zone_id":"1601ba4eefe2dbc281c82643f9480c9e52bf","queue_time_remaining":25,"seek_position":570}]}
12/01 10:50:16 Trace: [roonapi] [apiclient 192.168.1.181:49582] CONTINUE Changed {"zones_seek_changed":[{"zone_id":"1601ba4eefe2dbc281c82643f9480c9e52bf","queue_time_remaining":25,"seek_position":570}]}

And so on.

During the night something happened in my infrastructure, causing endpoints to disconnect from Roon server. This looked like this:

2/02 01:24:35 Info: [stats] 4475mb Virtual, 2079mb Physical, 830mb Managed, 0 Handles, 444 Threads
12/02 01:24:50 Info: [stats] 4476mb Virtual, 2080mb Physical, 831mb Managed, 0 Handles, 444 Threads
12/02 01:25:04 Trace: [roonapi] [apiclient 192.168.1.138:43512] CONNECTION TIMEOUT
12/02 01:25:04 Info: [roonapi] [apiclient 192.168.1.138:43512] disconnected
12/02 01:25:04 Trace: [roonapi/registry] removing [apiclient 192.168.1.138:43512] => [org.ropieee.remote, RoPieee Remote Control V2 [ropieee] v2.0.6 by Spockfish <htenberge+ropieee@gmail.com>]
12/02 01:25:04 Trace: [rnet/RnetJsonClient] no data received for >10000ms. Killing connection.
12/02 01:25:04 Trace: [raat_ll/client] [Metrum Acoustics Ambre @ 192.168.1.121:36585] no data received for >10000ms. Killing connection.
12/02 01:25:04 Trace: [push] restarting connection (Unable to read data from the transport connection: interrupted.)
12/02 01:25:04 Trace: [rnet/RnetJsonClient] no data received for >10000ms. Killing connection.
12/02 01:25:04 Trace: [push] retrying connection in 38151ms
12/02 01:25:04 Trace: [rnet/RnetJsonClient] no data received for >10000ms. Killing connection.
12/02 01:25:04 Trace: [raat_ll/client] [snd_allo_digione] no data received for >10000ms. Killing connection.
12/02 01:25:04 Trace: [raatserver] [RaatServer prince @ 127.0.0.1:33285] lost client connection. Retrying
12/02 01:25:04 Trace: [raatserver] [RaatServer prince @ 127.0.0.1:33285] connecting (attempt 1)
12/02 01:25:04 Trace: [raat] [Metrum Acoustics Ambre @ 192.168.1.121:36585] => Disconnected
12/02 01:25:04 Trace: [raat] [Metrum Acoustics Ambre @ 192.168.1.121:36585] lost client connection. Retrying
12/02 01:25:04 Trace: [raat] [Metrum Acoustics Ambre @ 192.168.1.121:36585] connecting (attempt 1)
12/02 01:25:04 Info: [transport] destroyed zone ambre was playing? False
12/02 01:25:04 Trace: [zone ambre] Suspend
12/02 01:25:04 Trace: [raatserver] [RaatServer ropieeetouch @ 192.168.1.181:33479] lost client connection. Retrying
12/02 01:25:04 Trace: [raatserver] [RaatServer ropieeetouch @ 192.168.1.181:33479] connecting (attempt 1)
12/02 01:25:04 Trace: [raatserver] [RaatServer ropieee @ 192.168.1.138:42129] lost client connection. Retrying
12/02 01:25:04 Trace: [raatserver] [RaatServer ropieee @ 192.168.1.138:42129] connecting (attempt 1)
12/02 01:25:04 Info: [transport] destroyed zone My Digione was playing? False

And then it starts to reconnect, but I see this error in my log:

12/02 01:26:25 Critical: scx: in OnExit: System.NullReferenceException: Object reference not set to an instance of an object
 at Sooloos.Broker.RoonApi.TransportService_2.EndpointToJson (Sooloos.Broker.Api.Endpoint ep) [0x00028] in <bbda6a553eb048f1b5c5c7aa2e58a1a5>:0
 at Sooloos.Broker.RoonApi.TransportService_2.<ZoneToJson>b__37_2 (Sooloos.Broker.Api.Endpoint x) [0x00000] in <bbda6a553eb048f1b5c5c7aa2e58a1a5>:0
 at System.Linq.Enumerable+SelectIListIterator`2[TSource,TResult].MoveNext () [0x00048] in <ae162b7061064bfaa55021254699ac67>:0
 at System.Collections.Generic.List`1[T]..ctor (System.Collections.Generic.IEnumerable`1[T] collection) [0x00077] in <370a0c27f4b74d1a81431037df6d75bf>:0
 at Base.JList..ctor (System.Collections.Generic.IEnumerable`1[T] l) [0x00000] in <8e2bbba085c347469d381b30fe73c82a>:0
 at Sooloos.Broker.RoonApi.TransportService_2.ZoneToJson (Sooloos.Broker.Api.Zone zone, System.Boolean for_diff) [0x000e6] in <bbda6a553eb048f1b5c5c7aa2e58a1a5>:0
 at Sooloos.Broker.RoonApi.TransportService_2._ActuallyChanged (Sooloos.Broker.Api.Zone zone) [0x00000] in <bbda6a553eb048f1b5c5c7aa2e58a1a5>:0
 at Sooloos.Broker.RoonApi.TransportService_2.<_update_subscriptions>b__25_0 (Sooloos.Broker.Api.Zone x) [0x00000] in <bbda6a553eb048f1b5c5c7aa2e58a1a5>:0
 at System.Linq.Enumerable+WhereEnumerableIterator`1[TSource].ToList () [0x0001b] in <ae162b7061064bfaa55021254699ac67>:0
 at System.Linq.Enumerable.ToList[TSource] (System.Collections.Generic.IEnumerable`1[T] source) [0x0001f] in <ae162b7061064bfaa55021254699ac67>:0
 at Sooloos.Broker.RoonApi.TransportService_2._update_subscriptions () [0x000d1] in <bbda6a553eb048f1b5c5c7aa2e58a1a5>:0
 at Sooloos.Broker.RoonApi.TransportService_2.OnThreadExit () [0x00034] in <bbda6a553eb048f1b5c5c7aa2e58a1a5>:0
 at Sooloos.Broker.RoonApi.Module.ev_exit () [0x0000b] in <bbda6a553eb048f1b5c5c7aa2e58a1a5>:0
 at Sooloos.SynchronizationContextThread.OnExit () [0x0000a] in <474fc6b46f7646c587f524cda12acaf9>:0

This error is then flooding my logs. Nothing else, only this. Music playing works obviously, but everything related to the API seems broken. Or more precise: updates on the transport subscription are not send anymore.

I’ve copied the logs, so if you need them ping me. Hopefully this gives you a hint or something what could be wrong.

Thanks!


(Dylan Caudill) #2

Hi @spockfish,

I’ll get this over to the team so they can take a closer look. Can you send me the logs as well so I can include them in my report to the team?

Thanks!


(Harry ten Berge) #4

Sure!

How would you like me to do that?
Do you have an email addr for me?

Thanks


#5

Same here. Once a zone gets powered down while grouped, the core starts spamming that unhandled exception until the core is restarted. @support Any progress addressing this?


(Harry ten Berge) #6

Yeah an update would be great @support


#7

For the record - I run into this too yesterday. Roon complained once or twice every second:

02/01 11:04:51 Critical: scx: in OnExit: System.NullReferenceException: Object reference not set to an instance of an object
  at Sooloos.Broker.RoonApi.TransportService_2.EndpointToJson (Sooloos.Broker.Api.Endpoint ep) [0x00028] in <dcbcf4f91fec401e800a83ba373c6044>:0 
  at Sooloos.Broker.RoonApi.TransportService_2.<ZoneToJson>b__37_2 (Sooloos.Broker.Api.Endpoint x) [0x00000] in <dcbcf4f91fec401e800a83ba373c6044>:0 
  at System.Linq.Enumerable+SelectIListIterator`2[TSource,TResult].MoveNext () [0x00048] in <ae162b7061064bfaa55021254699ac67>:0 
  at System.Collections.Generic.List`1[T]..ctor (System.Collections.Generic.IEnumerable`1[T] collection) [0x00077] in <370a0c27f4b74d1a81431037df6d75bf>:0 
  at Base.JList..ctor (System.Collections.Generic.IEnumerable`1[T] l) [0x00000] in <1e9187d858a94158b2a2c81586b0ef84>:0 
  at Sooloos.Broker.RoonApi.TransportService_2.ZoneToJson (Sooloos.Broker.Api.Zone zone, System.Boolean for_diff) [0x000e6] in <dcbcf4f91fec401e800a83ba373c6044>:0 
  at Sooloos.Broker.RoonApi.TransportService_2._ActuallyChanged (Sooloos.Broker.Api.Zone zone) [0x00000] in <dcbcf4f91fec401e800a83ba373c6044>:0 
  at Sooloos.Broker.RoonApi.TransportService_2.<_update_subscriptions>b__25_0 (Sooloos.Broker.Api.Zone x) [0x00000] in <dcbcf4f91fec401e800a83ba373c6044>:0 
  at System.Linq.Enumerable+WhereEnumerableIterator`1[TSource].ToList () [0x0001b] in <ae162b7061064bfaa55021254699ac67>:0 
  at System.Linq.Enumerable.ToList[TSource] (System.Collections.Generic.IEnumerable`1[T] source) [0x0001f] in <ae162b7061064bfaa55021254699ac67>:0 
  at Sooloos.Broker.RoonApi.TransportService_2._update_subscriptions () [0x000d1] in <dcbcf4f91fec401e800a83ba373c6044>:0 
  at Sooloos.Broker.RoonApi.TransportService_2.OnThreadExit () [0x00034] in <dcbcf4f91fec401e800a83ba373c6044>:0 
  at Sooloos.Broker.RoonApi.Module.ev_exit () [0x0000b] in <dcbcf4f91fec401e800a83ba373c6044>:0 
  at Sooloos.SynchronizationContextThread.OnExit () [0x0000a] in <7f0a74b68d2a4a0ba3084b62b8028591>:0

I cannot say when it started because Roon had been running out of log files already but I think it was the same root cause: powering down and starting up again selected or all devices from grouped zones in between.

First I thought to post into this thread:

… but it looks here things are more recent. :wink:


(Dylan Caudill) #8

Hi all,

Apologies for the delay here. I checked in with the team on this and they’re still actively investigating the cause of what you’re seeing. We appreciate your patience while the investigation continues. I’ll be sure to update you ASAP when I have more information from the team.


(Nathan Wilkes) #9

Thanks @dylan. I can corroborate that a reboot of ROCK is the only way to consistently fix the problem when it occurs, at least in my setup (4 RoPieee endpoints, 3 RoPieee displays, ROCK core).

Only for context, I have gone to significant lengths to remove any possibility of networking issues (everything now wired, everything around 1ms or less).


(Dylan Caudill) #10

Hi everyone,

I wanted to reach out and let you know that we resolved an issue here in our latest update, Build 401. You can read more about the update here:

Please give it a try and let us know how it goes!


#11

@dylan - seems like it didn’t go away. Had this spamming the logs again two or three days ago:

03/25 00:06:57 Critical: scx: in OnExit: System.NullReferenceException: Object reference not set to an instance of an object
  at Sooloos.Broker.RoonApi.TransportService_2.EndpointToJson (Sooloos.Broker.Api.Endpoint ep) [0x00028] in <75d52ccbda9342358cc2e0d9eacb8d79>:0 
  at Sooloos.Broker.RoonApi.TransportService_2.<ZoneToJson>b__37_3 (Sooloos.Broker.Api.Endpoint x) [0x00000] in <75d52ccbda9342358cc2e0d9eacb8d79>:0 
  at System.Linq.Enumerable+SelectListIterator`2[TSource,TResult].MoveNext () [0x00048] in <ae162b7061064bfaa55021254699ac67>:0 
  at System.Collections.Generic.List`1[T]..ctor (System.Collections.Generic.IEnumerable`1[T] collection) [0x00077] in <370a0c27f4b74d1a81431037df6d75bf>:0 
  at Base.JList..ctor (System.Collections.Generic.IEnumerable`1[T] l) [0x00000] in <021a9476948a45d9ad15e766804a7ee7>:0 
  at Sooloos.Broker.RoonApi.TransportService_2.ZoneToJson (Sooloos.Broker.Api.Zone zone, System.Boolean for_diff) [0x00102] in <75d52ccbda9342358cc2e0d9eacb8d79>:0 
  at Sooloos.Broker.RoonApi.TransportService_2._ActuallyChanged (Sooloos.Broker.Api.Zone zone) [0x00000] in <75d52ccbda9342358cc2e0d9eacb8d79>:0 
  at Sooloos.Broker.RoonApi.TransportService_2.<_update_subscriptions>b__25_0 (Sooloos.Broker.Api.Zone x) [0x00000] in <75d52ccbda9342358cc2e0d9eacb8d79>:0 
  at System.Linq.Enumerable+WhereEnumerableIterator`1[TSource].ToList () [0x0001b] in <ae162b7061064bfaa55021254699ac67>:0 
  at System.Linq.Enumerable.ToList[TSource] (System.Collections.Generic.IEnumerable`1[T] source) [0x0001f] in <ae162b7061064bfaa55021254699ac67>:0 
  at Sooloos.Broker.RoonApi.TransportService_2._update_subscriptions () [0x000d1] in <75d52ccbda9342358cc2e0d9eacb8d79>:0 
  at Sooloos.Broker.RoonApi.TransportService_2.OnThreadExit () [0x00034] in <75d52ccbda9342358cc2e0d9eacb8d79>:0 
  at Sooloos.Broker.RoonApi.Module.ev_exit () [0x0000b] in <75d52ccbda9342358cc2e0d9eacb8d79>:0 
  at Sooloos.SynchronizationContextThread.OnExit () [0x0000a] in <cd1db2e4c22841c29dacd496dc8d6afd>:0

Since I had rebooted the ROCK core in between / had it offline I may still have the logfile showing what happened before the exception spamming started. Seems like using an nvidia ShieldTV as an aditional zone in between triggered this.

BTW I was wondering if you have anything in ROCK which checks for stuff like this in the logs and eventually triggers a core restart if the exceptions don’t disappear themselves? Kind of a “hard self-healing”?


(Harry ten Berge) #12

Yeah I’ve got more users complaining the issue still exists…


(Dylan Caudill) #13

Hi @u_gee and @spockfish,

Thanks for the info update here. We’d like to gather some more info about what you’re experiencing so I can pass it along to the team:

  1. Can you verify that you (or affected users) are running the latest version of Roon?
  2. How exactly are things failing now? Is it the same as before? Is information not reaching extensions or is something else occurring?
  3. Are there any patterns that you’ve notice that tend to lead to Roon getting in this state?

Thanks!


(Nathan Wilkes) #14

Yes, always latest Roon, RoPieee versions.

The display screen will freeze (not update). Or, not start displaying anything when music starts for the first time. This is the same behaviour as before, starting with the Roon update late last summer (if I remember correctly).

It seems to happen when no music has been playing for awhile, it seems. It only happens with grouped zones, at least for me. And, the fix (always) is a reboot of ROCK (no change to RoPieee displays or end points required, or indeed helpful).

Harry (@spockfish) instituted some functionality a few RoPieee versions ago to restart the extension under certain conditions, and in my environment the normally grouped two zones always display a lower number of minutes since the last restart than a zone that is normally single (again, in my environment).

Does this help?


(Harry ten Berge) #15

The pattern is as follows:

People use RoPieee in combination with a screen. The screen uses an extension (and thus the API) for gathering information to display. As long as this is on a single zone it’s fine.

It goes wrong as soon as people group zones together. Then, after a certain amount of time, the extension fails to retrieve information.


#16

The latest one available - currently 1.6 - 401. :slight_smile:

Except from the log spamming I did not yet notice any impact. But then my ROCK NUC regularly gets shut down and I do only check log files now and then. Also, I’m not using any extensions and am posting here just because the issue gets tracked here.

Something with grouped zones together with ungrouped zones is playing a role, so it seems. I’ve saved two core logs from the last time I noticed the log spamming to happen - let me know if you want those.


(Nathan Wilkes) #17

One more thing: restarting the RoPieee extension usually reloads a snapshot of the album art / title etc, but this is frozen (no updates while music plays). Restarting core is the only solution.

Occasionally we experience the problem with an ungrouped zone, as happened tonight. Fortunately it is easy to initiate a restart of core via a URL.

Thanks for investigating.


(Dylan Caudill) #18

Thanks for the info here, everyone.

I spoke with the team, and we’d like to take a look at some individual reports here so we can enable diagnostics and take a look at what is occurring.

In Build 401 of Roon, we made some changes that resolved an issue causing a similar type of behavior. The issue was centered around endpoints becoming unavailable due to networking issues. There may be other things causing the same behavior, so we are hoping to gather a bit more data.

The next time this occurs please make a note of the time in which you experience this behavior. We can then enable diagnostics on your account so the team can take a look. Please also include some information about your setup and the endpoints that are in use here.

Thanks!


(Nathan Wilkes) #19

@dylan

Arrived home after a long weekend. Problem immediately occurred initiating play on grouped zone (Kitchen Pi and Living Room Pi), 18:50 Pacific, March 31. Music played, but display did not work. Restart of core fixed the issue (my normal course of action).

What further information do you require? 4 audio zones, 3 RoPieee displays, all connected via Ethernet. Ping times captured via rrdtool indicate no issues (between .5 and 1ms with almost no variation or abnormalities). ROCK core.

Also use the great Alarm extension, but not on the same zone as the normally grouped zones. Occasionally the alarm fails to stop the alarm (the second action), but never the first (start playing).

Music on Synology NAS, on own RAID1.

Let me know what else would be helpful, with thanks.


(Dylan Caudill) #20

Thanks, @Nathan_Wilkes.

I’ve enabled diagnostics on your account so the team can take a look. Can you provide some additional information about your current networking setup? What networking hardware is in use?


(Nathan Wilkes) #21

Hello @dylan . Here is a simple network diagram and a sample of ping times (times are typical – “Lemon Pi” is wireless and used for non-Roon things. Let me know if anything is unclear.