General discussion: What is killing boot SSDs?

This happened to my Nucleus, and I was so happy Roon fixed it and had it back to me quickly. I didn’t know it was a thing that was going around. Now I’m worried that the next time I lose power it can happen again. To my understanding the only real difference between a NUC running ROCK and a Nucleus is the case and the SSD partitioning. Pretty scary.

@Rugby @wklie @Graeme_Finlayson Thank you for your recommendations, I will keep them in mind in case I need to purchase yet another drive.

The thing is, though, I bought a Nucleus because I wanted a turn-key solution. I did not want to go through all the effort of building my own server, including all the research required. This is also why I ordered the same drive Roon Labs decided to use for the Nucleus. What I am wondering now is why Roon Labs are continuing to use a drive with this kind of track record?

I am not completely sure what you mean by this, but there are several cases of SSD failures mentioned in this topic already and you can find more by searching this forum.

You can’t find the support topic about my specific case though, because customer support made the initially public topic a personal message for some reason. And I can’t seem to find another support topic I replied to just recently, I wonder if the same thing has happened to that as well.


I purchased the Nucleus back in February and today had my THIRD Nucleus SSD failure in five months. I was away for a few days, tried to start up the app, and the whole thing fizzled yet again… This time I’m done. I don’t want a product where I’m holding my breath wondering if it will work.

Like some, I bought the Nucleus because I thought it would be an elegant turnkey solution, since I don’t have the expertise to build a server. Big mistake. It would work for a month, then there would be a month of back-and-forth with @support and getting a replacement SSD, then it would work for a month, break again, another month of back-and-forth with support with a whole new unit, another month of music, another break…

I tried external USB storage, internal SSD storage, changing cables, changing routers, giving it more breathing room in case it was overheating. And it’s not like there’s anything special at all about my setup. Ethernet from a router, Bluesound Node from same router, decent power conditioner, regular Apple apps. Nothing fancy or customized. Just a completely unreliable brick of a product.

I tried cloning the boot drive from a ROCK drive just to find out if I could. Didn’t succeed. What was your technique?

I was thinking about using an AppleTV Gen 1 boot clone program. It used Ubuntu and got all of the partitions and data straight. I’ll have to dig up that program off of one of my old Mac minis, though.

I used a Mac. First, this to work out which disk was the source and which was the target:

diskutil list

Then this to copy, with diskX as the source and diskY as the target:

sudo dd if=/dev/diskX of=/dev/diskY conv=noerror,sync

then loaded the copy into GParted on Ubuntu and expanded the largest partition to fill the SSD (in my case from 64GB to 128GB)

you need appropriate drive cradles/carriers etc.
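One caveat with the dd line above: conv=noerror,sync carries on past read errors and pads the affected blocks with zeros, so a copy can look fine and still differ from the source. A rough way to check is to compare checksums afterwards. The sketch below is only an illustration: the helper name verify_clone is made up, it is written against image files, and it hashes only the leading part of the target because the target may be bigger than the source. On raw devices you would need root, and it reads the whole source twice.

```shell
# Hypothetical helper (not from the thread): compare a clone against its source.
# Usage: verify_clone SOURCE TARGET
verify_clone() {
    src=$1
    dst=$2
    # Size of the source in bytes (wc -c reads the whole source once).
    size=$(wc -c < "$src" | tr -d ' ')
    # Checksum the source, then only the matching leading region of the target,
    # since the target drive or image may be larger than the source.
    src_sum=$(cksum < "$src")
    dst_sum=$(head -c "$size" "$dst" | cksum)
    if [ "$src_sum" = "$dst_sum" ]; then
        echo "match"
    else
        echo "MISMATCH"
        return 1
    fi
}
```

cksum is a plain CRC, which is enough to catch a botched copy; swap in a stronger hash if you prefer.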

if this is too much to comprehend then maybe it’s above your capabilities and I suggest you seek geek expertise :wink:

This sounds really really bad… Can you please share the brands, models and capacities of the drives in question?

It’s a Kingston A2000, 250GB. I don’t know what the other two were since they’re long gone. I didn’t bother to look since with the first two fails I was using an external USB for storage.

This failure appears far more catastrophic than the first two. This time you can’t even boot it or flash it from a USB.

They’ve been replacing them obviously, but each time it worked fine for about a month before it died. I guess it’s possible it was overheating, but I hadn’t used it in days so that would seem to be an issue on its own. It’s a shame because I wanted to love it. Cutting my losses on it and trying a NUC.

Just to be sure I’m not misunderstanding something, Roon Labs sent this to you as a replacement?

This is what happened to me too. Could not even enter BIOS, and the Nucleus kept power cycling on its own.

Correct. This Kingston SSD was provided by Roon in a replacement Nucleus unit.

At the first failure, I shipped the unit back to Roon. They returned the same unit with a replacement SSD. When the Nucleus with the replacement SSD failed, I then received an entirely new Nucleus. I don’t know what model either of the first two SSDs was because I was using exclusively external storage. I never opened the case.

When I received the new replacement unit I switched to internal storage to see if it would help. The Kingston was in the new replacement Nucleus unit from Roon. It appears to have failed in a different way from the other two occasions so I wouldn’t be surprised if it was a different brand, but that’s just speculation.

Obviously Roon has been helpful getting replacements shipped and so forth. But there’s a limit to one’s patience. Waiting on the RMA to return it for the final time. Unfortunate all around.


dd command is not too difficult to comprehend since that was my first method to try. I wonder if I had a bad drive cradle that caused an error mid-copy when I tried it. It “looked” like it copied well. I’ll give it another go with different equipment and see if I have a better outcome.

Thanks for the info.

Clone it to a Samsung 970.

Previous track record for this particular brand and model:

I would not use it even if someone gave it to me for free. I would keep it for emergency use or testing only.

If one cannot wait for replacement, or is unable to clone it, just install ROCK. Then ask support for Nucleus firmware by providing proof of purchase and serial number.

I reiterate the point that one should only use an SSD from a manufacturer that manufactures its own NAND chips.

Set this thread to “Watching”, in case it gets unlisted.


I’m sure Roon are totally excellent with warranty replacements, but I don’t think I want the trouble. At least with Rock, you can just download the image and quickly get back up and running.

I opened my support topic on May 19 and am still waiting for my case to move on. So while Roon Labs customer support might be excellent in many ways, I would not say speed is one of them.

I was able to get my Nucleus up and running by installing ROCK OS on a new drive and restoring from a database backup. I was told by customer support that if I can get ROCK OS installed, they “could connect remotely and apply the Nucleus OS”. Everything seems to work fine, but since I don’t know about the differences between ROCK OS and Nucleus OS, I have not wanted to use my Nucleus before customer support gets back to me.

Had a look at that topic and another one linked from it. Absolutely horrifying stuff. How can so many companies be producing such bad hardware? It is hard to accept that it is just bad M.2 drives on their own that are the problem.

I would love to know 1) whether M.2 failures are just as common on non-NUC hardware and 2) whether M.2 failures are just as common on NUC hardware not running Roon OS.

It’s not a problem of NUC, although heat may be related. It’s not a problem of Roon, which is software. Quality matters.

If that is the case, and I have no evidence to prove otherwise, then what puzzles me is why Roon Labs keeps using these poor quality drives for the Nucleus. It does not seem to make any sense.

It is not impossible for poorly written code or a bug, especially in the OS layer, to cause issues with hardware. I am not saying I suspect Roon OS to be the problem here, but I don’t think it can be ruled out so easily?

Firstly, there should not be any software means to actually kill hardware, unless you include the firmware inside the SSD itself. (SSD firmwares have a horrific history, but that’s another story.) No matter how many Windows crashes you experience, your hardware was never killed by those crashes. SSDs have a limited write lifetime, so in theory an infinite loop doing writes could do that, but the failures you see are not an exhaustion of the write endurance.

Secondly, Roon OS is based on Linux, which I believe is the most used OS on this planet.
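On the write-endurance point, anyone who wants to check it on a drive that still responds can look at the wear figure the drive reports about itself. As a rough sketch, assuming an NVMe drive and smartmontools installed, smartctl -a prints a "Percentage Used" line in its health section, and a small filter can pull the number out. The helper name and the device path in the comment are assumptions for illustration only.

```shell
# Hypothetical helper (not from the thread): read `smartctl -a` output for an
# NVMe drive on standard input and print the "Percentage Used" wear figure.
nvme_percentage_used() {
    # Split on the colon, then strip spaces and the percent sign from the value.
    awk -F: '/^Percentage Used/ { gsub(/[ %]/, "", $2); print $2 }'
}

# Example call (device path is an assumption; needs root):
#   sudo smartctl -a /dev/nvme0 | nvme_percentage_used
```

A low figure on a drive that has just died would fit the argument that these failures are not worn-out flash.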

This fairly recent blog post, for example, seems to describe similar symptoms with Linux and an SSD brand and model discussed here: Fixing NVME SSD Problems On Linux – TEKBYTE

While that particular bug might not have caused physical damage to the SSD, all major OSs, including Linux, seem to have had bugs shortening the lifetime of SSDs, for example by defragmenting the drive on every boot.

I agree that software most likely is not the problem, just would not like to automatically rule that possibility out.

This is a timing issue that can cause a hang or a kernel panic; it cannot kill the SSD. The description “kept power cycling on its own” points to a dead SSD, not a timing hang.

I believe the patch mentioned in this blog was already incorporated into Roon OS in August 2020.

Let’s assume it’s an OS issue as you say. In that case, one could simply install Windows and run Roon on it and it’ll work.

@wklie Thanks, after your comments and more research I am happy to accept that software cannot actually kill SSDs at the speed, and with the symptoms, that many in this forum have reported.

I have used my older laptop with an SSD drive for 8 years, and have also used many other laptops with SSDs for many years, and this is the first time I have come across any kind of SSD problems. So the shockingly bad quality of some hardware today has been truly eye-opening for me.

Knowing what I know now, I’m not just wondering why Roon Labs are using these poor quality drives for the Nucleus, I’m wondering why they want to be in the consumer hardware business at all, with their limited resources. At the very minimum I would radically change the way the Nucleus is marketed.

I’m using Seagate Nytro 1551 and Ironwolf 110 SSDs for Roon on my Synology Rackstation. So far, flawless performance.

FWIW, some of the SSDs on the market are little more than expensive paperweights or are destined to become so in short order.

I know Enterprise class drives cost more, but they’re so worth it for the reduction in hassle compared to the cheaper options.

Look closely at the DWPD spec (drive writes per day: the number of times the entire drive capacity can be written per day over the warranty period) and the warranty period. 3 DWPD with a 5-year warranty (which is offered by the best in class SSDs) usually means that the drives will live longer than you!
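To put a number on that, the usual back-of-envelope conversion is total terabytes written = DWPD × capacity × 365 × warranty years. Here is a small sketch of the arithmetic. The helper name is made up, the 480 GB capacity is just an example figure and not the spec of any drive mentioned here, and 1 TB is taken as 1000 GB.

```shell
# Hypothetical helper (not from the thread): terabytes written implied by a
# DWPD rating. Usage: dwpd_to_tbw DWPD CAPACITY_GB WARRANTY_YEARS
dwpd_to_tbw() {
    awk -v d="$1" -v c="$2" -v y="$3" \
        'BEGIN { printf "%.1f\n", d * c * 365 * y / 1000 }'
}

# Example: 3 DWPD, an assumed 480 GB drive, 5-year warranty.
dwpd_to_tbw 3 480 5   # prints 2628.0
```

That is roughly 2.6 PB of rated writes, far more than a music server's database is ever likely to generate.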
