Network shares unreliable after reboot

I am running RoonCore on Ubuntu Linux 16.04 server. All my music is stored on an SMB file share that is served up by a very old machine. I am looking at upgrading my storage solution here to a Synology NAS, but for now, this is a Windows Home Media server 2003 solution.

Windows Home Media server 2003 only supports SMBv1. Here is what I have noticed:

  • When I initially add my share in roon, it takes a while but eventually works
  • While roon is working on getting the share mounted I can see that it tries to mount with newer versions of SMB first. I can see this by running ps xa | grep cifs from the RoonCore CLI. It starts with version=2.1. When that fails it falls back to version=2.0. When that fails, it eventually tries version=1.0 and is successful
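The check described above can be sketched as a small helper that lists each running mount.cifs attempt with its PID, how long it has been running, and the SMB version it is trying. This is only an illustration: the function names are made up, and it assumes the version shows up as vers=X in the process arguments.

```shell
#!/bin/bash
# Hypothetical helpers (names are not from Roon or cifs-utils).

# Reads "PID ELAPSED COMMAND..." lines on stdin and prints
# "PID ELAPSED vers=X" for every mount.cifs line that carries a vers= option.
parse_attempts() {
  awk '/mount\.cifs/ && match($0, /vers=[0-9.]+/) {
         print $1, $2, substr($0, RSTART, RLENGTH)
       }'
}

# Feeds the live process table into the parser.
list_attempts() {
  ps -eo pid,etime,args | parse_attempts
}
```

Running list_attempts a few times during startup should show the version changing from 2.1 to 2.0 to 1.0, and a large elapsed time next to vers=2.1 would point at the stuck case.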

However, if roon reboots for any reason, it can get stuck. For example, I had a power outage recently. RoonCore came back up, the NAS came back up, but roon would never remount the share. I had to leave town and when I came back 4 days later it was still not mounted and the mount.cifs process trying version=2.1 was still running.

Oftentimes, the only thing I can do is delete the share completely, then re-add it. That of course is a real pain since it then has to scan my entire 20,000 track library again. Deleting the share and re-adding it is also still problematic, as it still has to try version 2.1, version 2.0 first and fail before it can finally mount my drive.

What can I do?

Since you can see that Roon is just running mount.cifs, can you reproduce without Roon?

I’m not exactly sure what you mean. Other machines like my Mac can access the share OK. If I manually run the same mount.cifs command Roon is running but I specify version=1.0 it mounts the share

It hangs if you specify 2.1?

Yes, of course. As I said, my windows server only supports smbv1
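One way to reproduce this outside Roon without waiting minutes per attempt is to wrap each manual mount in a time limit. The sketch below is an assumption-laden example: the share path, mount point and credentials file are placeholders, try_vers is a made-up name, and a mount that is stuck inside the kernel may not respond to the signal that timeout sends.

```shell
#!/bin/bash
# Placeholder values; replace with your own share, mount point and credentials file.
SHARE=${SHARE:-//homeserver/Music}
MOUNTPOINT=${MOUNTPOINT:-/mnt/music-test}
CREDS=${CREDS:-/root/.smbcredentials}
MOUNT_BIN=${MOUNT_BIN:-mount.cifs}

# try_vers VERSION [SECONDS]: attempt one mount with the given SMB version,
# giving up after SECONDS (default 30), and report what happened.
try_vers() {
  local vers=$1 limit=${2:-30} rc
  timeout "$limit" "$MOUNT_BIN" "$SHARE" "$MOUNTPOINT" -o "vers=$vers,credentials=$CREDS"
  rc=$?
  case $rc in
    0)   echo "vers=$vers: mounted" ;;
    124) echo "vers=$vers: still running after ${limit}s, gave up" ;;  # timeout's exit code
    *)   echo "vers=$vers: failed (exit $rc)" ;;
  esac
}
```

Run as root, something like try_vers 2.1 20 followed by try_vers 1.0 20 should show whether the newer version hangs or fails cleanly against the old server.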

Can someone from @support help? My Roon system is not very useful when this happens constantly. To summarize:

  • Initially adding a share works after some time
  • On reboot, it may eventually re-map or it may not
    • Best case scenario, the mount attempts using SMBv2.1 and SMBv2.0 fail, and a subsequent attempt using SMBv1.0 is successful. Note this takes about 10 minutes

    • Worst case it just never remaps at all after reboot or more than one reboot. At that point, deleting, re-adding, rescanning is my only option

It would be great to add a drop down during network share add to set the SMB version if you want to. Perhaps there is a way to do that manually on the core?

At the moment, I love Roon when it works, but having to mess around with SMB mounts on every reboot is getting tiring especially on a product I pay for.

Thanks

we are looking into why the 2.1 mount is hanging… it shouldn't hang, it should just exit with an error.

I already suggested this to the developer working on this, if we can not find a solution to the hanging problem.

if you are comfortable writing a new shell script that mimics mount.cifs – then you could make it alter the version to 1.0 always, and call the real mount.cifs with that altered commandline.

understood… we are looking into this – in the meanwhile, the script workaround above should avoid the hang.

Thanks. Is there a script that roon core uses to handle the mounting via mount.cifs on startup? If so, if you could kindly point me in the right direction as to what the script name is and where it is located on the roon core filesystem, I’d be happy to look at that script, copy it, and modify it to use v1.0 initially.

Also, a bit more information. When roon initially tries the mount using mount.cifs and v2.1 it does eventually fail with an error logged to syslog. This takes about 5 minutes though. Then, it repeats the same command, but tries v2.0. This again fails with an error involving CIFS logged to /var/log/syslog, but again this takes 5 more minutes. Finally, after both those fail, it tries with version 1.0 and that works. So basically, on a reboot, it takes 10+ minutes to get back up and running. The other day I was in a situation where that didn’t work at all though. Days went by, and it was still trying v2.1. A reboot didn’t fix it either, and I was forced to delete and re-add the share.

it uses mount.cifs command directly… I would do something like this:

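cd "$(dirname "$(which mount.cifs)")"
mv mount.cifs mount.cifs-orig
# \$@ is escaped so that the wrapper, not this shell, expands it
echo '#!/bin/sh' > mount.cifs
echo "exec $PWD/mount.cifs-orig \"\$@\"" >> mount.cifs
chmod +x mount.cifs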

then you can edit the mount.cifs script you just created to filter out the 2.1

Sorry Danny, I’m struggling here. It looks like your script above basically would:

  • backup the original executable mount.cifs, renaming it as mount.cifs-orig
  • Create a new mount.cifs executable (not exactly sure about the "$@" part of the echo command)

Wouldn’t roon just then call the new mount.cifs with the same parameters as it does now? I’m not getting something about the solution. Are you saying roon would then call mount.cifs which in reality would be a shell script I would have to write that in turn calls the REAL mount.cifs with the version 1.0 parameter?

This seems like a lot of work to go through just to get roon to work as intended.

Another thing I noticed is that watched folders don’t really work for me with SMBv1. I think this may be a limitation of SMBv1 in general not updating the OS with changes to the directory like it should. Perhaps the long term solution for me is to simply upgrade my storage to something newer that supports SMB v2+

Joe

yes, but once you have this working, you can write something in the script to intercept the args and rewrite them.
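A minimal sketch of what such an intercepting wrapper might look like follows. It assumes the real binary was renamed to mount.cifs-orig as in the commands above, that it lives in /sbin (adjust REAL_MOUNT_CIFS if not), and that Roon passes the version as a vers= entry in the comma-separated string after -o. The function names are illustrative only.

```shell
#!/bin/bash
# Assumed location of the renamed real binary; override via the environment if needed.
REAL_MOUNT_CIFS=${REAL_MOUNT_CIFS:-/sbin/mount.cifs-orig}

# Rewrite any vers=X entry in a comma-separated option string to vers=1.0.
rewrite_vers() {
  local parts part out=()
  IFS=',' read -ra parts <<< "$1"
  for part in "${parts[@]}"; do
    [[ $part == vers=* ]] && part='vers=1.0'
    out+=("$part")
  done
  (IFS=','; printf '%s\n' "${out[*]}")
}

# Pass every argument through untouched, except the option string after -o.
main() {
  local arg prev='' args=()
  for arg in "$@"; do
    [[ $prev == -o ]] && arg=$(rewrite_vers "$arg")
    args+=("$arg")
    prev=$arg
  done
  exec "$REAL_MOUNT_CIFS" "${args[@]}"
}

# Only hand off to the real binary when arguments were supplied.
if [[ $# -gt 0 ]]; then
  main "$@"
fi
```

With this in place of the one-line wrapper, a call such as mount.cifs //server/share /mnt -o rw,vers=2.1 would reach the real binary as -o rw,vers=1.0, so the 2.1 and 2.0 attempts would never hit the old server.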

we are working on the proper solution – the above was an interim hack to not have to wait.

this is the best idea :)

If you are planning a server replacement, let me suggest FreeNAS (https://www.freenas.org). You can build a very nice FreeNAS server for about $1000 plus disks. Mine has been in service for 6 months w/o issue. The server filesystem is ZFS, very robust with data and metadata error correction. Only ZFS offers this level of integrity checking at the moment (discounting EMC professional stuff). I have 2 jails up for UniFi network management and Plex, and Debian-Roon in a VM. All without a hiccup. The server has file systems for each VM and Jail, a shared file system for iTunes, Photos, and Capture One, and a shared filesystem for TimeMachine.

I did an extensive review of commercial SOHO NAS and found none I was confident would really protect my data. Most were built straight out of Linux plus a web server offering proprietary management interfaces. None could assure me that file system metadata was error correcting or that they would detect and correct bit rot of data sitting on the device. At the time, only Netgear was using BTRFS. The others were all built on EXT4, a journaled non-ECC file system. All used the Linux Logical Volume Manager (LVM) to combine disks and all were using the Linux kernel RAID libraries.

If building home brew is not your thing, iXsystems offers 2 small file servers for SOHO use for a small premium over the parts cost. I elected to home brew because I had the collateral tasks of running music servers, etc. and wasn’t sure if I’d occasionally run some interactive stuff, so I wanted a modest current Xeon CPU.

Hello. I wanted to share with the community a temporary quick and dirty fix I came up with to solve my specific problem.

So, in my case, as I explained, my NAS is old and only supports SMBv1. What I found by playing around is that on reboot, or service stop/start, roon will try to remount its network shares using mount.cifs. First, it tries using SMB v2.1 (vers=2.1 parameter). That will fail in my case after 2-3 minutes. At that point, it tries SMB v2.0. Again, 2-3 minutes later, that will fail. Finally, it tries SMB v1.0, and that is successful.

So, any time my roon core reboots or the service is stopped/started, I am looking at about a 5-6 minute delay to get back up and running.

Now, what I figured out is that if you kill the mount.cifs processes on the roon core when it is attempting the v2.1 mount, it will immediately respawn a new process and try it with v2.0. If you kill those processes, it again respawns and immediately tries v1.0. So, if I could find a way to kill the processes in an automated way, it would speed up the whole thing.

Here is my solution. Basically, it looks at all the running processes on the system (Ubuntu 16) for the string “vers=2.0” or “vers=2.1”. If it finds one, it writes the process ID (PID) of that process to a variable, then kills that process. This repeats until there are no more matching processes running.

#!/bin/bash
# This script will kill any mount.cifs processes that are trying to mount SMB shares using SMB v2.1 or v2.0 so that roon will fall back to trying v1.0 without waiting for the other attempts to fail first

# Get the PID of the first matching mount process. While one exists, kill it and look again.
# The pattern is quoted so the shell cannot glob-expand it, and the dot is escaped so it only matches a literal "."
PID=$(ps -eaf | grep 'vers=2\.[01]' | grep -v grep | head -n1 | awk '{print $2}')
while [[ -n $PID ]]; do
  echo "killing $PID"
  kill -9 "$PID"
  sleep 5
  PID=$(ps -eaf | grep 'vers=2\.[01]' | grep -v grep | head -n1 | awk '{print $2}')
done

Then, what I do is put this in my root crontab. When the system reboots, we wait 10 seconds (to make sure the mount.cifs processes have started “trying”), then run the above script. I also do this daily at 6:00 AM which is when my backups run. My backup job is set to another SMB share destination. This also seems buggy. It seems like Roon doesn’t even attempt to mount the share for the backup until it is time to do the backup. Thus, at 6:00 AM, what would happen is that Roon would try to mount my backup share, and fail for the same reasons, or at least be delayed a good 5-6 minutes. This should make sure the backup share gets mounted right away.

@reboot sleep 10 && /home/roon/scripts/smbfix.sh
0 6 * * * /home/roon/scripts/smbfix.sh

Cheers,

Joe