Has a core died in my dual-core processor?

Just bought a brand new NUC7i5 and installed ROCK a few days ago. Everything was working fine. Out of curiosity I wanted to check a few processing speeds to compare to the processing speeds I was getting from my late 2012 i5 Mac mini running Core. Using NUC+ROCK, upsampling 44.1kHz/16bit WAV files to 176.4/24bit showed a processing speed of 95x to “off the chart” (i.e. > 100x). Which was as expected as the Mac mini displays a processing speed of about 65 - 70x for the same task.

Yesterday, I checked the processing speed again, same task, and it’s running at about 31x. Also, when all was running well a few days ago, I checked DSD64 to PCM conversion and the processing speed was about 20x, and that’s now about 6.5x.

What could have happened to cause the processing speed to drop so dramatically, to the point where it’s now significantly worse than my elderly Mac mini?

Wondering if one of the NUC’s processor cores had died, I went into the BIOS and switched to 1 core only to see what impact that has on processing speeds, hoping it would drop significantly again, and proving both cores were working as expected. However, not really much difference in processing speeds, oddly, a little higher if anything, but not sure if this is a valid test or not as I understand that DSP is normally dedicated to a single core anyway.

Is there anything I can do in Roon Remote or ROCK, e.g. some form of diagnostic, that would indicate that both processor cores are operating normally? I’m a bit blind trying to get any info directly from ROCK or the NUC. Would the BIOS report an issue if one of the cores was faulty?

Another possibility, maybe there’s some form of simple operating system I can burn to a USB drive that I can boot from that has a CPU performance monitor?

Obviously the reduced processing speeds I’m seeing now are still adequate for what I need and are not causing an actual problem, but it’s bugging me that something may have gone wrong somewhere that can be rectified or there’s a fault with my new NUC.

What is your setting for “Parallelize Delta-Sigma Modulator”?
Switching it on should enable more than one core. Was it on before and off now?

If one core is dead, the whole processor should be, I suppose. In the BIOS you can select how many physical cores you use, the clock speed and other factors. Have you applied some kind of optimization on the machine, such as Fidelizer Pro or AO?

Unlikely.

Enable Parallelize Sigma Delta Modulator, and disable Native DSD Processing.

If you have different processing speeds for the same settings at different times, make sure you’re testing with music placed in internal storage (not from network). I’d also check whether the drive has SMART errors, and clean and re-apply the thermal paste on the CPU.

Thanks all for the suggestions so far.

I’ve enabled conversion of PCM to DSD128 and compared the processing speed with Enable Parallelize Sigma Delta Modulator set to Yes and No. Worryingly, both settings have roughly the same processing speed, approx 2.8x.

I notice that with Parallelize Sigma Delta Modulator set to either yes or no the processing speed seems to start off at a higher value and settles down over a duration of maybe 10 or 20 seconds to a lower value. With the setting set to Yes, it seems to start at a higher value than if set to No, but both settle to about the same figure.

As per @wklie’s suggestion regarding thermal paste on the CPU, it could possibly be a thermal issue. I’ve completely shut the NUC down for now to let it cool. When I get a chance later I’ll start it up and immediately start playing something with heavy processing and try to observe how the processing speed varies. I expect it’ll start off quite respectably and drop to about half or a third of the initial value over the course of a minute or so. I don’t want to dismantle the NUC just yet, especially if I need to return it.

If returning it is a possible option, then don’t dismantle it. Install Windows and some suitable software, then you can monitor the individual core frequency and temperature, and the fan speed, in real time. You can even run a stress test like Prime95.

Just to throw this out there: many of us have experienced severe slowdowns with Roon since 1.6 that are not originating from any sort of broken hardware. Rebooting the core is a temporary solution and the problem can recur. If you can reboot and fix it temporarily, you may be seeing the same thing.

Right. This reminds me @Allan_P should also check that it’s not caused by (normal) background analysis.

Thanks again for updates. I don’t think it’s background analysis, as that finished a few days ago and I’m not seeing any further activity.

I think another possibility is that it’s a BIOS issue (I’m on the latest version). I did notice on the first evening when processing speeds were healthy that the fan picked up a fair bit of speed and was obviously audible later in the evening. Since noticing the lower processing speeds I’ve also noticed the fan has been on constantly but at a fixed lower speed. I assumed it was just doing less processing so less heat being generated. I’ve read a suggestion that I should reset the BIOS with an F9, redo necessary changes, then save and exit. Apparently, if this is the issue, this should get the fan running to a variable and appropriate speed again.

Will report back with more info after I’ve tried a few things tonight

The processing speed is relative to a single core’s usage. review : http://kb.roonlabs.com/DSP_Engine

and:

So, your processing at 31x is using about 3 percent of one core. Before, at 60x, it was 1.5% ish. Not sure I’d worry about that minor amount of change. Likewise the change in CPU usage between 20x and 6.5x is roughly 9% for the single core. If you read the 2nd link, you’ll see I’d commented to the Roon dev that it would actually be more useful to display the core usage percentage.
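For anyone who wants to redo this arithmetic with their own readings, here is a minimal sketch. It assumes, as the figures above do, that the displayed processing speed is the reciprocal of the fraction of a single core in use (so 1.0x would mean one core fully busy). The function name is my own, not anything from Roon.

```python
def core_usage_percent(processing_speed: float) -> float:
    """Approximate share of ONE core used, given a processing speed of 'Nx'.

    Assumption: speed is the reciprocal of the fraction of a single core
    in use, so 1.0x means that core is 100% busy.
    """
    if processing_speed <= 0:
        raise ValueError("processing speed must be positive")
    return 100.0 / processing_speed


if __name__ == "__main__":
    readings = [
        ("Upsampling, second evening", 31.0),
        ("DSD64 to PCM, first evening", 20.0),
        ("DSD64 to PCM, second evening", 6.5),
    ]
    for label, speed in readings:
        print(f"{label}: {speed}x is about {core_usage_percent(speed):.1f}% of one core")
```

On these assumptions 20x works out to 5% of one core and 6.5x to a little over 15%, so the drop looks dramatic as a multiplier but is a small change in actual load.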

I did not get a good sense if you were comparing the mac mini numbers to the new core in the 20x to 6.5x example, or, was it that your new core last week was 20x and now is 6.5x this week?

@Rugby here are the DSP processing speeds I recorded

2012 i5 Mac-mini
Upsampling 44.1/16 to 176.4/24: 65 - 70x
DSD64 to PCM: TBD

NUC+ROCK, first evening:
Upsampling 44.1/16 to 176.4/24: 95x to “off the chart”, i.e. >100x
DSD64 to PCM: 20x

NUC+ROCK, second evening:
Upsampling 44.1/16 to 176.4/24: 31x
DSD64 to PCM: 6.5x

I’ve just had a frustrating evening trying to sort out what’s going on but I think I have a much better idea now. It’s not a BIOS issue. Reloading the BIOS made no difference, but I was able to adjust the fan settings so they were much more sensitive to CPU temp and picked up speed quite readily as the NUC just sat there in the BIOS, warming up from cold. At idle, in the BIOS, after CPU temperature stabilised the fan was running at approx 4,000 RPM.

I then went into ROON as normal and just left things idling. The fan got nowhere near 4,000 RPM as it had when sat in the BIOS. When playing music with heavy duty DSP requirements the fan still refused to speed up from its very low default speed. I decided my ROCK install had gone wrong somehow and was responsible for the slow running of the fan and inhibiting fan speed up. As my build was done quite recently, with no database edits yet, I decided I had nothing to lose by starting over and re-installing ROCK from a fresh download and USB stick. The fan was whirring away nicely during the database build, but once that had finished and I was playing music with heavy DSP applied, things hadn’t improved: exact same fan behavior (running slowly and not speeding up) and the same slow processing speeds as recorded above on the second evening.

I was about to give up on it for the evening but I just happened to let background file analysis carry on running, which I’d stopped in order to check the processing speed figures. The fan burst into life, I rechecked the processing speeds and they were back to the figures I’d recorded above for the first evening. So clearly when I recorded the first evening’s figures, background analysis was still running and the fan would have been spinning up to speed freely. On the second evening, background analysis would have been completed and I was simply playing music. Despite the heavy DSP loading, the fan would have been in an inhibited state, running slowly and refusing to spin up to the speed needed for cooling, hence the CPU was overheating, resulting in the poor processing speed figures.

I’ve played around with this now in various ways and concluded that Roon is taking control of fan speed, running the fan at quite a slow speed by default, which is fine, but DSP processing on its own is insufficient to allow the fan to come out of default fan speed despite the amount of cooling required, resulting in overheating. Introduce some other process, like Background File Analysis, and the brakes come off, up spins the fan and all is normal.

This must be a bug in ROON’s software. I’m planning on rehousing the NUC board into a fanless case so the problem will go away then anyway, but I’d like to report it and provide whatever details I can. I’m new around here, what’s the best way of doing that?

It’s very unlikely that Roon is interested in controlling fan speeds. The fans should follow the cooling needs of the processor (processor speed) which follows system load.

There’s just not enough system load!

Why run the processor at full speed when e.g. half that speed is still more than enough? Modern PC systems try to save as much energy as possible and thus reduce processor speed when it’s not needed. I guess this is what leads to the different readings in Roon.
A passive cooling solution would hardly be possible if the processor constantly (and maybe most of the time needlessly) ran at full speed.

What makes you believe so? Has the NUC turned off because of overheating?
If not, I would say that all behaves as expected.

I doubt either statement is true. I agree with @BlackJack this may simply be not enough load.

Before you do any hardware changes this is the time you need to use Windows to verify your hardware by running stress tests. You may also check how Roon behaves when Windows is set to High Performance mode.

@BlackJack, @wklie Let me describe a few scenarios which lead me to think there is an issue with the CORE software:

Scenario A

  1. The BIOS fan speed settings have been adjusted so that the fan is quite audible (4,000RPM) just sat in the BIOS, idling, no other activity, CPU temp 43/44 degC
  2. Start Core, no activity, the fan has slowed right down and is barely audible

Possible conclusions: CORE is overriding the BIOS settings and setting a lower fan speed, or, the CPU temp when running CORE, idling, is significantly less than 43/44 degC (doubtful, I think). Perhaps what I’ll try as another test is adjusting the fan settings in the BIOS some more so that the fan is constantly running quite high, even at room temp for the CPU, then start CORE and see if the fan slows right down. That would indicate if CORE is having some direct influence over fan speed

Scenario B

  1. With no background activity running, start music playing with some intensive DSP (I’m using convert PCM to DSD128)
  2. Processing speed starts initially at 4.9x, but gradually drops to 2.8x over the course of a minute
  3. Observe that fan never picks up speed from low default setting

Conclusions: CPU is overheating and throttling back to prevent burn-out, or, CORE is dynamically adjusting the processor load to prevent the need for the fan to speed up and become audible. I’m not sure which of these possible conclusions is correct. It would be very clever of Roon if it’s the latter, but potentially shortening the life of the CPU if it’s the former

Scenario C

  1. As Scenario B, but with background file analysis running
  2. Fan speed is quite dynamic, speeding up and slowing down as needed
  3. Processing speed is consistently high

Conclusions: As fan now appears to be running normally, CPU is no longer overheating, or, CPU load is simply higher, but why would processing speed with background file analysis running be higher than when it’s not running?

I appreciate that in all scenarios processing speed is always > 1x, so there’s no actual problem, but my concerns are that a) the CPU could be overheating, leading to premature failure, and b) if I add further processes to the DSP chain in the future, e.g. a convolution filter, I might run into problems with insufficient processing speed. If that’s being caused by a cooling related issue because the fan won’t spin up, then I won’t be getting the horsepower I’ve paid for.

I guess for that second concern, I could try creating some additional load on the DSP engine and see if I can get it to drop below 1x. If I can do that and the fan is not running up to speed, then that’s an issue. Further, if I switch on background file analysis, the fan speeds up and the processing speed then exceeds 1x then that would point to a bug somewhere.

I’d love to try Windows on the NUC and run Core on that and see what happens, but I don’t have a copy. Would I have to pay for a licence just for temporary use?

To answer your original question, if a core had died your machine would very likely fail to boot.

Ok, I’ve done a few more tests and I’m feeling happier now.

Test 1

I went back into the BIOS and set a really aggressive cooling policy: for every 1 degree above 25 degrees CPU temperature, increase the duty cycle on the fan by 10%. It’s very sunny here today and 23 degrees indoors. This had the fan whirring into a frenzy (about 6,000 RPM) when just sat in the BIOS. I started up CORE and the fan went into a similar frenzy. I couldn’t measure the speed but it was definitely much faster than usual. So I’m happy that CORE isn’t controlling fan speed, the BIOS settings are.
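To show why that policy pins the fan at full speed almost immediately, here is a rough sketch of a linear fan curve of the kind described. It is only an illustration, not the NUC BIOS’s actual logic: the function name is made up, and the 20% minimum duty cycle is an assumed figure since the real minimum isn’t recorded above.

```python
def fan_duty_percent(cpu_temp_c: float,
                     start_temp_c: float = 25.0,
                     step_per_degree: float = 10.0,
                     min_duty: float = 20.0) -> float:
    """Duty cycle (%) for a simple linear fan curve.

    Below start_temp_c the fan sits at min_duty (an assumed value).
    Above it, duty rises by step_per_degree per degree C, capped at 100%.
    """
    degrees_over = max(0.0, cpu_temp_c - start_temp_c)
    return min(100.0, min_duty + step_per_degree * degrees_over)


if __name__ == "__main__":
    for temp in (23, 25, 30, 33, 43):
        print(f"{temp} degC -> {fan_duty_percent(temp):.0f}% duty")
```

With a 10% step per degree the curve reaches 100% only 8 degrees above the starting point, so a CPU idling in the low 40s would already be at maximum duty under these assumptions.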

Test 2

I set up my usual test of converting PCM to DSD128 and noted that the DSP processing speed was the same as in previous tests (about 2.8x after settling). This was with the fan whirring like crazy, so it would definitely be cooling the CPU if it needed to. So I’m happy that the low fan speeds I was seeing before weren’t causing the DSP processing speed to slow right down due to lack of cooling and consequent overheating.

Test 3

I added two more end points and tasked them with some heavy duty DSP. This didn’t even make a dent in the DSP processing speed I was seeing in Test 2. So I’m happy that the CPU speed is being dynamically adapted to suit the load, to try and minimise power consumption and heat generation.

What I still don’t fully understand, but it’s not really an issue, is why the DSP processing speed more than doubles when Background File Analysis is running.

So overall I’m comfortable now that my NUC is working as it should. I’m still going to rehouse it in a fanless case and now fully expect that I’ll see pretty much identical behavior, i.e. slower DSP processing speeds than the NUC is ultimately capable of, but CPU horsepower will be bumped up as needed to adapt to the load at hand.

The CPU-intensive nature of file analysis triggered the CPU to run at a higher frequency, or even turbo boost. Without the additional load the CPU may run at a low frequency to save energy and reduce heat. If necessary, energy saving C-states can be disabled in the BIOS, and Windows can be set to run in High Performance mode to keep the CPU frequency high.
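If you’d rather watch this happen without Windows, a general-purpose Linux booted from a USB stick exposes the current clock of each core through the cpufreq files in sysfs. Below is a minimal sketch that reads them. It assumes a Linux system with a cpufreq driver loaded, and the function name is my own. Run it once at idle and again under load to compare.

```python
from pathlib import Path


def read_core_frequencies_mhz(sysfs_root: str = "/sys/devices/system/cpu") -> dict:
    """Return {"cpu0": MHz, ...} read from the Linux cpufreq sysfs files.

    Returns an empty dict when the files are absent (not Linux, or no
    cpufreq driver loaded).
    """
    freqs = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_cur_freq"
    for freq_file in sorted(Path(sysfs_root).glob(pattern)):
        core_name = freq_file.parent.parent.name   # e.g. "cpu0"
        khz = int(freq_file.read_text().strip())   # sysfs reports kHz
        freqs[core_name] = khz / 1000.0
    return freqs


if __name__ == "__main__":
    for core, mhz in read_core_frequencies_mhz().items():
        print(f"{core}: {mhz:.0f} MHz")
```

If every core reports a frequency, and the figures climb when you add load, that is reasonable evidence that all cores are alive and the clock is simply being scaled to demand.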

I have noticed my system running ROCK is showing lower processing speeds than it did. I think this is down to some change somewhere and it’s using more CPU cycles. Before 1.6 I was getting 3.7x without sigma delta enabled. Now I am lucky to get above 2.5x, so something’s changed either in ROCK or Roon Server.