Started/Not responding/Error loop

I was running RoonServer in a Docker container; however, given the discussion about the current Docker images, I’ve moved it to running on a Linux machine. I’m suddenly seeing the same behavior with both the container and the bare metal install, though – the service is caught in a loop of starting, erroring out, and restarting:

Aug 08 06:40:23 UbuntuOne start.sh[20244]: Initializing
Aug 08 06:40:23 UbuntuOne start.sh[20244]: Started
Aug 08 06:40:26 UbuntuOne start.sh[20244]: Not responding
Aug 08 06:40:49 UbuntuOne start.sh[20244]: Error
Aug 08 06:40:51 UbuntuOne start.sh[20244]: Initializing
Aug 08 06:40:51 UbuntuOne start.sh[20244]: Started
Aug 08 06:40:52 UbuntuOne start.sh[20244]: Not responding
Aug 08 06:40:58 UbuntuOne start.sh[20244]: Running
Aug 08 06:41:08 UbuntuOne start.sh[20244]: Not responding

I’ve reviewed the RoonServer logs: there’s a gap within a log file where the “Not responding” occurs, and a gap between log files where the “Error” occurs.

Not responding at 06:40:26:

08/08 06:40:25 Info: [stats] 1033mb Virtual, 41mb Physical, 0mb Managed, 0 Handles, 16 Threads
08/08 06:40:36 Debug: [broker/filebrowser/volumeattached] initial listing found drive mounted at /

Error at 06:40:49:
(RoonServer_log.07.txt)

08/08 06:40:41 Trace: [fiveaccountserver] GET https://accounts5.roonlabs.com/accounts/3/profileslist?token=ea857cf8-0062-424c-b6be-c37758b70add
08/08 06:40:41 Trace: [fiveaccountserver] GET https://accounts5.roonlabs.com/accounts/3/userinfo?token=ea857cf8-0062-424c-b6be-c37758b70add
08/08 06:40:41 Trace: [broker/accounts] updated token. New expiration is 9/7/2017 6:40:41 AM
08/08 06:40:41 Trace: [broker/accounts] Data updated. AccountStatus=LoggedIn MachineStatus=Licensed UserId=abc7fa18-ddb4-4ab5-a358-bbbcfe19fa91
08/08 06:40:41 Info: [brokerserver] Client connected: 192.168.1.9:3702
08/08 06:40:41 Trace: [SOOD] Adding User IP 192.168.1.9

(RoonServer_log.06.txt)

08/08 06:40:51 Info: Starting RoonServer v1.3 (build 247) stable on linuxx64
08/08 06:40:51 Trace: Checking if we are already running
08/08 06:40:51 Trace: Nope, we are the only one running
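
Gaps like the ones described above can be located mechanically instead of by eye. The following is a minimal Python sketch, not a Roon tool: it assumes every log line starts with an `MM/DD HH:MM:SS` timestamp as in the excerpts above. The function name `find_gaps`, the 5-second threshold, and the assumed year are all illustrative choices.

```python
from datetime import datetime, timedelta

ASSUMED_YEAR = "2017"  # the log timestamps carry no year, so one is assumed for parsing


def find_gaps(lines, threshold=timedelta(seconds=5)):
    """Return (previous, current, gap) for consecutive timestamps more than threshold apart."""
    gaps = []
    previous = None
    for line in lines:
        try:
            # The first 14 characters are expected to look like "08/08 06:40:25".
            current = datetime.strptime(
                ASSUMED_YEAR + "/" + line[:14], "%Y/%m/%d %H:%M:%S"
            )
        except ValueError:
            continue  # blank lines, file-name headers, stack traces, etc.
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current, current - previous))
        previous = current
    return gaps


sample = [
    "08/08 06:40:25 Info: [stats] 1033mb Virtual, 41mb Physical",
    "08/08 06:40:36 Debug: [broker/filebrowser/volumeattached] initial listing",
]
for before, after, gap in find_gaps(sample):
    print(f"{gap.total_seconds():.0f}s of silence between {before.time()} and {after.time()}")
```

Feeding it the lines of several log files in chronological order would also surface the silence between the end of one file and the start of the next.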

This is very similar to the thread “Build 102 RoonServer Not Responding, Error, Started, Running - Loop”, but there’s no resolution or further discussion there. How do I go about resolving this?

Perhaps an @support might be the order of the day.

Hi @Neil_Carpenter ---- Thank you for the report and for sharing your observations with us. The feedback is appreciated.

Moving forward, may I kindly ask you to provide the following:

  1. A brief but accurate description of your current setup.

  2. Using the instructions found here, please upload a complete set of logs.

-Eric

Sure thing.

I’m running a ThinkPad T430s (https://browser.geekbench.com/v4/compute/755427, upgraded to 16GB RAM and a 1TB SSD). It’s running Ubuntu 17.04. Music is stored on a NAS, with folders mounted via SMB.

Logs are available at https://www.dropbox.com/sh/2ncm8d7bw1760k2/AACCUuDdigkhe13QnKc5Bsxqa?dl=0.

Awesome, thanks @Neil_Carpenter! Confirming that the logs have been received and are in our queue to be evaluated by a member of our tech team.

Once my report has been updated with the team’s thoughts/findings, I will provide you with an update promptly. Your patience is very much appreciated!

-Eric

I was playing with the new steefdebruijn/docker-roonserver docker container this morning, which initially worked until I configured my library. Shortly thereafter, I saw the fault below, then Roon went into the same loop.

Logs – https://www.dropbox.com/s/eko3b6k4qlu7m3t/DockerLogs.7z?dl=0

*** Error in `/app/RoonServer/Mono/bin/RoonAppliance': corrupted double-linked list: 0x00007fe3810798f0 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x70bcb)[0x7fe3a0a57bcb]
/lib/x86_64-linux-gnu/libc.so.6(+0x76f96)[0x7fe3a0a5df96]
/lib/x86_64-linux-gnu/libc.so.6(+0x77338)[0x7fe3a0a5e338]
/lib/x86_64-linux-gnu/libc.so.6(+0x78dca)[0x7fe3a0a5fdca]
/lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x54)[0x7fe3a0a61f34]
/app/RoonServer/Mono/bin/RoonAppliance[0x8aebee]
/app/RoonServer/Mono/bin/RoonAppliance(mono_mempool_alloc+0x10f)[0x641bff]
/app/RoonServer/Mono/bin/RoonAppliance[0x61a3b2]
/app/RoonServer/Mono/bin/RoonAppliance[0x5dd765]
/app/RoonServer/Mono/bin/RoonAppliance[0x5dfb59]
/app/RoonServer/Mono/bin/RoonAppliance(mono_class_get_methods+0x39)[0x5eff5b]
/app/RoonServer/Mono/bin/RoonAppliance[0x5ffa93]
/app/RoonServer/Mono/bin/RoonAppliance[0x5ffcf1]
/app/RoonServer/Mono/bin/RoonAppliance[0x4c8492]
[0x4175ac68]

Got a SIGABRT while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.


Hi @Neil_Carpenter ---- Thank you for the update! Confirming that I have received the latest set of logs and attached them to my report which is with our techs.

-Eric

Hi @Neil_Carpenter —— Thank you for your patience while our techs have been reviewing the information found in the provided sets of logs.

Moving forward, before we go further in trying to figure out what is causing this behavior, can you please clarify how you are currently trying to set up RoonServer? In your initial report you mentioned the following:

“I was running RoonServer in a Docker container; however, given the discussion about the current Docker images, I’ve moved it to running on a Linux machine.”

Then in your most recent, you stated:

“I was playing with the new steefdebruijn/docker-roonserver docker container this morning, which initially worked until I configured my library. Shortly thereafter, I saw the fault below, then Roon went into the same loop.”

Are you back to running on the Linux machine, or are you still using Docker?

-Eric

I’ve tried both approaches and seen the same results. Presumably the root cause is in what they share (my library, Roon) rather than in their differences (bare-metal Linux or Docker container).

Currently, I’m using neither because the service won’t start and stay up long enough for me to do anything. It’s now been that way for 4+ days.

Hi Neil,

Thanks for your patience here. Based on what we’re seeing in the logs, this looks like it could be some kind of memory corruption – that doesn’t rule out a bug, but we’ll need to figure out how to get this reproducible for our developers and QA before we can move forward.

The fact that you’re seeing this whether Docker is involved or not is interesting – are you able to get a new install up and running stably in both cases? What happens if you only add a subset of your collection?

Can you confirm that all the dependencies listed here have been met? Can you also confirm what version of glibc you’re running? I would ask about resource limits in the container, but since your issue isn’t limited to Docker that’s probably not it.
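
One quick way to answer the glibc question is sketched below in Python, using only the standard library. Note the assumption: `platform.libc_ver()` reports the C library that the Python interpreter itself is linked against. On a stock Ubuntu install that is normally the system glibc, but it is not read from RoonServer. Running `ldd --version` in a shell is another common check on glibc-based systems.

```python
import platform

# libc_ver() inspects the running Python executable to work out its C library.
lib, version = platform.libc_ver()
print(f"C library: {lib or 'unknown'}, version: {version or 'unknown'}")
```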

Hopefully your answers to the above help us reproduce whatever’s going on in-house, or at least give you a steer about what might be happening in your environment – this isn’t something we see very often, so something is unique in this case, and we just need to figure out what it is. Thanks again for your patience @Neil_Carpenter!

I can bring up a fresh install, on either platform, and it seems to be stable (albeit useless) until shortly after I add my collection.

I have not tried a subset of my collection because, as tests go, that’s a pretty unworkable test. Even if, say, half my collection works, then I’m stuck ratiocinating through increasingly smaller fractions of my collection until I find something amiss. If there’s a data or metadata problem in the collection, this is a thing that Roon should be catching – I would much rather have the problem logged, indicating the problematic media, than have to manually move thousands of files around until I get lucky and find the one that’s broken.

All dependencies are installed and this was previously working.

Yeah, finding bad media this way is no fun – trust me, we know :wink:

My suggestion was more that you just watch a single album/folder or similar. It would be a good data point to know whether that was stable or not, but you’re right – if things are stable with a different set of media, doing a binary search of your content might be the next step.
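
For reference, the binary search mentioned above can be sketched as follows. This is an illustration under stated assumptions, not a Roon feature: `find_bad_folder` and `is_stable` are hypothetical names, exactly one folder is assumed to be at fault, and `is_stable(subset)` stands in for the manual step of watching only those folders on a fresh install and seeing whether the server stays up.

```python
def find_bad_folder(folders, is_stable):
    """Narrow a list of folders down to the single one that makes is_stable() fail.

    Assumes at most one folder is at fault and that is_stable(subset) is repeatable.
    Returns None when the full list is already stable.
    """
    candidates = list(folders)
    if not candidates or is_stable(candidates):
        return None
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        # If the first half is stable on its own, the culprit is in the second half.
        candidates = second if is_stable(first) else first
    return candidates[0]


# Simulated run: pretend "album7" is the folder that triggers the crash.
folders = [f"album{i}" for i in range(10)]
print(find_bad_folder(folders, lambda subset: "album7" not in subset))
```

Each round halves the search space, so even a few thousand folders need only about a dozen trials. If more than one folder is bad, or the crash is intermittent, this simple version can point at the wrong folder.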

I would rather it be logged as well, but there are times when crashes bring down the app in a way that logs aren’t captured, and these are often the trickiest issues to debug.

I absolutely understand the frustration here @Neil_Carpenter – we’ll discuss this internally and see if there’s additional guidance we can offer about what might be causing your system to perform differently than other Linux installs.

My questions above were an attempt to understand what might be unique here, so we can make this issue reproducible for our developers. We’ll work at this until the problem’s resolved, but the sooner we can identify what makes your environment different, the sooner we can get this resolved for you.

Is there any way to enable debug logging in whatever code indexes/scans the library?

Hey @Neil_Carpenter – you can try running Roon with these flags and then dump us a zip file of the entire logs folder:

-storagetrace -searchindexreplay

I would start by adding a folder or two, and then add more of your collection until this issue recurs. Thanks for your patience here!
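
Packaging the whole logs folder into one zip file can be scripted. The Python sketch below covers only that step and says nothing about how the flags above are passed to Roon. The helper name `zip_logs` and both paths in the usage line are hypothetical, so substitute wherever your install actually keeps its logs.

```python
import shutil
from pathlib import Path


def zip_logs(logs_dir, archive_base):
    """Create <archive_base>.zip with everything under logs_dir and return its path."""
    logs_dir = Path(logs_dir)
    if not logs_dir.is_dir():
        raise FileNotFoundError(f"No such logs folder: {logs_dir}")
    # make_archive appends ".zip" to archive_base; keep the archive outside logs_dir.
    return shutil.make_archive(str(archive_base), "zip", root_dir=str(logs_dir))
```

Usage would look like `zip_logs("/path/to/RoonServer/Logs", "/tmp/roon-logs")`, with both paths being placeholders.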