Performance and ecores (offloading DAC corrections)

The state of Intel…

I have an Intel Core i7-12700K that is out of cycles, so I cannot turn on DAC correction. I push everything to DSD512, usually with one of the ASDM 512+fs modulators. This machine also has a 3090 in it, so filters are offloaded. It appears the modulator uses one core per channel, and that’s where it falls over. I’ve run the cores at 5.3 GHz, which makes the machine slightly unstable. I’m currently pushing them to 5.1 GHz. DAC correction at DSD512 also produces some level of dropout.
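As a rough sanity check on the one-core-per-channel observation, here is a back-of-the-envelope sketch in Python. It assumes only that DSD512 means 512 times the 44.1 kHz or 48 kHz base rate; the function names are made up for illustration, and the model ignores SIMD, caches and whatever the modulator really does internally, so treat the result as an order of magnitude only.

```python
def dsd_rate_hz(multiple: int, base_hz: int) -> int:
    """Output sample rate of a DSD stream, e.g. 512 x 48000 for DSD512x48."""
    return multiple * base_hz


def cycles_per_sample(clock_hz: float, rate_hz: int) -> float:
    """Clock cycles one core has per output sample if it handles one channel alone."""
    return clock_hz / rate_hz


if __name__ == "__main__":
    for base in (44_100, 48_000):
        rate = dsd_rate_hz(512, base)
        for ghz in (5.1, 5.3):
            budget = cycles_per_sample(ghz * 1e9, rate)
            print(f"DSD512x{base / 1000:g}k = {rate / 1e6:.4f} MHz, "
                  f"{ghz} GHz core: about {budget:.0f} cycles per sample")
```

At DSD512x48 and 5.1 GHz that works out to a little over 200 cycles per output sample per channel, which would explain why a 200 MHz overclock barely moves the needle.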

With the state of Intel, just swapping in a new CPU on the Alder Lake architecture does not seem like a good idea, as it’d be only a small boost in performance. Additionally, Intel’s latest shipping architecture feels too dated for me to jump in before new architectures start arriving this year and next.

Now, I kind of goofed when I bought this thing and have a DDR4 board. While the board supports overclocking, it’s not as sophisticated as a ROG or other proper gaming motherboard.

For those with experience (I have none), I could use some advice… Should I sit back and wait, or upgrade the motherboard? Or maybe there is a magic Alder Lake processor I’m not finding that I could drop in? If, say, I had a budget of $500, could I get to a configuration with DAC correction? Thanks

@jussi_laako Can the DAC correction be offloaded? Even to e-cores? My e-cores are doing nothing.

Also, in my current configuration, setting sinc-Mx will lock up the machine right at the start of play: it plays, then LOCKS within a couple of seconds. However, sinc-MGa works fine. I was chasing the best transients and found a couple of filters that just caused the machine to lock. I can reproduce this, and it requires a hard power cycle to reboot the machine. Nothing on screen. No logs I can find after reboot. It just locks hard. Then, to get Roon happy again, I have to delete the audio interface and re-add HQPlayer. Very odd.

My 13600K, 64 GB DDR4, 3080 Ti can run DSD512x48 with DAC correction with just about any filters and modulators (other than sinc-L, which I don’t use anyway).

On paper I don’t know why that chip and my chip would be all that different. I’m starting to think something is just broken with my system. Is it just the L2 cache?

I think I can drop a 13th-gen i7-13700K chip into my motherboard. It might be worth a try. So confused. Come on Intel, get your stuff together.

After the microcode 0x12b BIOS flash, my system could run DSD1024x48 7EC super with default filters too. No DAC correction though.

Yeah, that’s been loaded (I think; I did that a couple of weeks ago. I will double-check).

Hello, reporting a good performance update with 5.10.0:

I now achieve stable DAC correction @ 1024x44.1 with AMSDM7EC 512+fs and psghrlp, without interruption while browsing.

This required:

current build: 14900K / 4070 Ti Super / DDR5 8200 MHz CUDIMM

performance mode in Windows 11

max performance in the power management of the NVIDIA app, with overclock via auto tuning that resulted in +117 and 200 MHz VRAM

ecores pool

nblocks 7

hardware buffer minimum 100 ms

It does not work with 48k at the moment. Testing is ongoing.

It is partially offloaded, to the extent it makes sense. For better transients, I would look into some of the shorter poly-sinc-gauss filters, at most poly-sinc-gauss-long. This will also free up some GPU resources and may help achieve DAC correction too, depending on where the bottleneck is.

E-core offload is sort of an alternative to the GPU one. So you could try setting E-cores to filter and then CUDA offload to grayed.

Another data point. My i9-14900K + DDR5 7200 + HQPE/Ubuntu + no GPU can do 192k/24 → ASDM7EC-super @ DSD512x48 + DAC correction + convolution + most filters other than the Sinc-x family. Alternatively, it can do the same at DSD1024x48 but without DAC correction. It is right on the edge, but it works well and I don’t see a need for a GPU at this time.

Remind me what grey is in hqplayerd.xml?

Yes, I’m chasing the bottleneck… nvidia-smi doesn’t show much past 15% utilization and there is plenty of RAM available. The only problem I ever see is very high utilization on 2 P-cores and no utilization of the E-cores.

Oh darn. I just realized I don’t have an “ecores” setting. Time to test!

Thank you, as always.

Still drop-outs:
sinc-MGa, ASDM7EC-light 512+fs, source is 24x48, DSD512x48 with DAC Correction

I see stuff like this on my system, where it ramps way up and then falls… Dropouts are somewhat inconsistent but there. Only two cores, 0 and 8, get into the 90%+ utilization range. I am seeing some utilization on the E-cores now:

03:25:41 PM 	CPU 	%usr
03:25:42 PM 	all 	9.96
03:25:42 PM 	0 	88.00
03:25:42 PM 	1 	0.00
03:25:42 PM 	2 	0.00
03:25:42 PM 	3 	0.00
03:25:42 PM 	4 	0.00
03:25:42 PM 	5 	0.00
03:25:42 PM 	6 	0.00
03:25:42 PM 	7 	0.00
03:25:42 PM 	8 	84.16
03:25:42 PM 	9 	0.00
03:25:42 PM 	10 	0.00
03:25:42 PM 	11 	0.00
03:25:42 PM 	12 	0.00
03:25:42 PM 	13 	0.00
03:25:42 PM 	14 	0.00
03:25:42 PM 	15 	0.00
03:25:42 PM 	16 	13.00
03:25:42 PM 	17 	0.00
03:25:42 PM 	18 	13.13
03:25:42 PM 	19 	0.00
  	 	
03:25:42 PM 	CPU 	%usr
03:25:43 PM 	all 	10.29
03:25:43 PM 	0 	76.00
03:25:43 PM 	1 	0.99
03:25:43 PM 	2 	0.00
03:25:43 PM 	3 	0.00
03:25:43 PM 	4 	0.00
03:25:43 PM 	5 	0.00
03:25:43 PM 	6 	0.00
03:25:43 PM 	7 	0.00
03:25:43 PM 	8 	73.47
03:25:43 PM 	9 	0.00
03:25:43 PM 	10 	0.00
03:25:43 PM 	11 	0.00
03:25:43 PM 	12 	0.00
03:25:43 PM 	13 	0.00
03:25:43 PM 	14 	0.00
03:25:43 PM 	15 	0.00
03:25:43 PM 	16 	25.00
03:25:43 PM 	17 	3.03
03:25:43 PM 	18 	2.00
03:25:43 PM 	19 	26.00
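
For anyone wanting to spot this pattern over a longer capture, here is a minimal Python sketch that reads per-core output in the trimmed form pasted above (timestamp, CPU number, %usr) and lists the cores that ever cross a threshold. The function names and the 70% default are hypothetical choices for illustration, and full mpstat output has more columns than this, so the pattern would need adjusting.

```python
import re
from collections import defaultdict

# Matches lines like "03:25:42 PM   0   88.00"; "all" rows and headers are skipped.
LINE = re.compile(r"^(\d\d:\d\d:\d\d(?: [AP]M)?)\s+(\d+)\s+([\d.]+)$")


def per_core_usr(text):
    """Return {core_number: [%usr samples]} from trimmed per-core output."""
    usage = defaultdict(list)
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            usage[int(match.group(2))].append(float(match.group(3)))
    return dict(usage)


def hot_cores(usage, threshold=70.0):
    """Cores whose peak %usr reached the threshold at least once."""
    return sorted(core for core, vals in usage.items() if max(vals) >= threshold)


if __name__ == "__main__":
    sample = (
        "03:25:41 PM \tCPU \t%usr\n"
        "03:25:42 PM \tall \t9.96\n"
        "03:25:42 PM \t0 \t88.00\n"
        "03:25:42 PM \t8 \t84.16\n"
        "03:25:42 PM \t16 \t13.00\n"
        "03:25:43 PM \t0 \t76.00\n"
        "03:25:43 PM \t8 \t73.47\n"
        "03:25:43 PM \t16 \t25.00\n"
    )
    print(hot_cores(per_core_usr(sample)))  # prints [0, 8]
```

Two pegged cores with everything else idle is consistent with a per-channel workload that cannot be spread further, rather than a machine that is out of total CPU.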

“convolution”: these days it is more what I’d call “large operations”, but I didn’t want to make version-incompatible changes to the configuration file.

I would change this to the practically equivalent poly-sinc-gauss-xla. This will change the offloading pattern too.

This gave dropouts all over. If I could send the filter plus corrections to the GPU I’d be fine, but this option appears to have pulled the filter and modulator back onto the CPU, and that fell over big time.

I think I’ve got plenty of GPU. I’ve either got a CPU bottleneck or RAM that is too slow, but I have no idea how to determine if it’s a RAM / IO issue.

I’m just assuming I need to go 13th gen.

If you want everything possible on GPU, then having CUDA offload checked (cuda="1" on Embedded) does this.

There is only one (sensible) way to do partial offload.

Results also depend on which of the two buckets the filter choice goes to.

Could it be getting into thermal throttling? Do you have a way to monitor CPU temps?

I don’t see it go above 65ish. AIO cooled big radiator. But I’m only looking at 2 sensors.
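
To look at more than two sensors on a Linux box, a minimal sketch along these lines should work. It assumes the kernel exposes temperatures under /sys/class/hwmon as temp*_input files in millidegrees Celsius, which is the standard hwmon layout; the function name read_temps is made up for illustration, and which chips and labels show up depends on the loaded drivers.

```python
from pathlib import Path


def read_temps(root="/sys/class/hwmon"):
    """Return {"chip/label": degrees_C} for every readable hwmon temperature input."""
    readings = {}
    for chip in sorted(Path(root).glob("hwmon*")):
        name_file = chip / "name"
        chip_name = name_file.read_text().strip() if name_file.exists() else chip.name
        for inp in sorted(chip.glob("temp*_input")):
            # temp1_input pairs with an optional temp1_label such as "Core 0".
            label_file = inp.with_name(inp.name.replace("_input", "_label"))
            label = label_file.read_text().strip() if label_file.exists() else inp.name
            try:
                readings[f"{chip_name}/{label}"] = int(inp.read_text()) / 1000.0
            except (OSError, ValueError):
                continue  # some sensors refuse reads or return junk
    return readings


if __name__ == "__main__":
    for sensor, temp_c in sorted(read_temps().items()):
        print(f"{sensor}: {temp_c:.1f} C")
```

Running it in a loop during playback would show whether any individual core, rather than just the package, gets near the throttling point when the dropouts happen.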