I am having an issue getting the poly (non-2s) filters to play successfully with Roon/HQP or with HQP alone. If I select a poly 2s filter (or minringFir), playback starts almost immediately, Task Manager shows CPU usage around 30%, and Core Temp reports average CPU temps of 45-50 C. The computer runs Win 10 Pro as a stand-alone system with music on a USB-attached HDD; the processor is an i7-4790K, the MB is a Gigabyte Z97, and it has 8 GB of 2133 MHz RAM. When I select a poly filter (non-2s), the CPU pegs at 100% for about 45 seconds, and then sometimes music begins playing but stutters constantly; other times music will not play. While stuttering, Task Manager shows CPU usage around 50%, and CPU core temps briefly hit the low 60s C but average low to mid 50s.
For folks getting DSD512 to work successfully on the poly filters: what are you using for processors and MB, and are you changing any specific settings in BIOS to better tweak the system? This is driving me crazy, as this setup should easily handle DSD512 with all filters.
I don’t think it’s a given that your system can easily handle DSD512. 512 is pretty serious. If you are not having trouble with the -2s filter variants, or with the same settings at 256, then I suspect your CPU is not up to the task.
Rather than upgrade the CPU or motherboard, you might look into a dedicated number-cruncher, i.e. an NVidia graphics card to handle CUDA offload.
Having read more forum pages than I care to admit, it looks like you may be right. The 4790K can play all 2s filters and poly sinc ext on my system just fine, with CPU usage in the mid 20s to mid 40s. When I try to play the polysinc (non-2s) filters I get 20-45 seconds of silence while the filter is loading, then the music stutters, but as stated above, average CPU usage (Core Temp 1.3 monitor) is in the mid 30s to upper 40s. Further examination of each core's usage shows 1-2 cores hitting 75-90% while the others run at 15-25%. I have hyper threading turned on in BIOS and Pipeline SDM on (checked) in HQP. Jussi has stated in some forums that if CPU usage goes above 75% there can be stuttering. If there was a way to equalize CPU usage across all cores I would be in the 45-55% range and it would/should work.
There’s not much you can do from your end to equalize how HQPlayer divides workload between processor cores, I wouldn’t imagine.
You might make some headway disabling hyperthreading, however. Not Pipeline SDM, just hyperthreading. HQP doesn’t make use of virtual cores, so if this is your machine’s primary function you might get more balanced performance without hyperthreading.
I actually tried that last night. Polysinc shrt MP at DSD256 with hyper threading enabled: music paused for the first 30 seconds, then played, and Task Manager showed 35% CPU usage. I then repeated the same test with hyper threading off: at first music paused, then played fine, but now Task Manager showed CPU usage at 63%. I turned hyper threading back on and repeated, and Task Manager reported 33% usage.
I seem to remember on my other 4790K pc with only a H81 motherboard when I first went into bios and disabled a lot of stuff and then checked TM usage on all cores the load was very balanced, which surprised, but pleased me. Then I had other issues and changed it (bios) and stupid me didn’t write down what I had done that evened out the usage.
If you have a quad-core CPU, make sure you have “Pipeline SDM” enabled - this maximizes the core usage (on a dual-core it just increases overhead and thus has only a negative impact). Using “Auto rate family” helps keep the load low as long as the DAC supports DSD also at 48-base rates.
That is correct and as it should, because task manager shows 100% when all cores are at 100%. So if you have HyperThreading enabled on a quad-core CPU and have all the four cores loaded to 100%, but none of the virtual cores loaded, you have exactly 50% load - which practically means HyperThreaded CPU being fully loaded. If you have HyperThreading disabled, then you don’t have those virtual cores, and having all the existing four cores loaded to 100% will also show task manager at 100%.
With HyperThreading enabled it is practically impossible to get the Task Manager load figure to 100%, because you have only four execution units but eight virtual cores. Every two virtual cores share a single execution unit, with only the register set duplicated, to ensure the CPU stays busy in cases where you have many more highly loaded processes than you have cores (for example because RAM wait states may sometimes put one “thread” on wait because data has not arrived yet from RAM). IOW, the CPU can itself “task switch” between the two virtual cores for the same execution unit if the other of the two needs to wait. This doesn’t really benefit HQPlayer, because it really needs execution units for doing real work and doesn’t have so many highly active threads. It benefits the OS itself more. The OS scheduler understands this and distributes the work to actual cores.
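As a rough sketch of the arithmetic above (a simplified model, assuming the overall figure is a plain average over logical cores and that each busy physical core shows up as one fully loaded logical core; the function name is made up for illustration):

```python
def task_manager_load(physical_cores, threads_per_core, busy_physical_cores):
    """Overall load in percent, averaged over all logical (virtual) cores.

    Simplified model: each busy physical core appears as one logical core
    at 100%, and its HyperThreading sibling (if any) appears idle.
    """
    logical_cores = physical_cores * threads_per_core
    return 100.0 * busy_physical_cores / logical_cores


# Quad-core with HyperThreading on: all four real cores saturated.
print(task_manager_load(4, 2, 4))  # 50.0

# Same CPU with HyperThreading off: no virtual siblings to dilute the average.
print(task_manager_load(4, 1, 4))  # 100.0
```

So a reading near 50% with HyperThreading on can already mean the four real cores are close to saturated.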
So when you check loads, open Resource Monitor from the Task Manager and switch to the CPU tab. Then look at the per-core (virtual!) load graphs. If any of the cores maxes out, you likely get a drop-out. Remember that every doubling of the sampling rate at least doubles the CPU load! So if, for example, at DSD256 the per-core graphs already hover constantly above 50%, DSD512 is not going to work with the same settings.
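The rule of thumb above can be written down as a quick check. This is only a lower-bound estimate under the stated "at least doubles" assumption; the per-core readings are hypothetical numbers, not measurements, and the helper names are invented for this sketch:

```python
def projected_loads(per_core_percent, rate_doublings=1):
    """Lower-bound projection: each doubling of the output rate
    at least doubles the load on every core."""
    factor = 2 ** rate_doublings
    return [load * factor for load in per_core_percent]


def likely_dropout(per_core_percent, limit=100.0):
    """A drop-out is likely as soon as ANY single core reaches the limit;
    the average across cores does not matter."""
    return any(load >= limit for load in per_core_percent)


# Hypothetical per-core readings at DSD256 (percent).
dsd256 = [55, 30, 52, 20]

dsd512 = projected_loads(dsd256)
print(dsd512)                  # [110, 60, 104, 40]
print(likely_dropout(dsd256))  # False
print(likely_dropout(dsd512))  # True
```

Note that the average of the DSD256 readings is under 40%, yet two cores would already be over budget at DSD512, which is why the per-core graphs are the ones to watch.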
One can try to help these things with an nVidia graphics card too… If things work fine with the -2s filter variants, adding a suitable graphics card can help bring things to a working level with the rest of the filter settings. Depending on the case, a GTX 1060, 1070 or 1080 may be suitable.
Thanks Jussi. Yes, I have Pipeline SDM enabled in all cases and I use Roon and Tidal. When I looked at the individual core performance, usually 2 cores had usage in the 75-90% range (this when trying DSD512 with my 4790K) while others were as low as 15-20%. So I am disappointed that the OS and hyper threading, with Pipeline SDM all enabled, did not manage the load more evenly among the cores. Hence I get stuttering on the poly filters at 512, but 2s, minringFir and, for some strange reason, poly sinc ext work fine at 512.
Well, your CPU has only four cores… As I said earlier the other four virtual cores provided by HyperThreading are “fake” and cannot do actual work simultaneously with their siblings. And the OS knows this too (but Task Manager doesn’t, so it will incorrectly show total load 50% lower than it really is). With Pipeline SDM you get work distributed fairly evenly to four cores in this case.
When two of the cores choke on the amount of work, the rest will stall because they cannot proceed alone.
Some of the algorithms just cannot be distributed to more cores due to mathematical relationships.
Thank you Jussi, and thank you for such a valuable and excellent sounding program. I enjoy it very much. So in your opinion and experience, is the 6700K processor capable of handling all HQP filters at DSD512 upsampling with Roon and Tidal integration? Or should one wait for the 7700K, or even try a 6-core (perhaps a 6800K with a video card just for video, no CUDA)?
I’m not going to promise anything about any hardware spec… Devil is in the details. I know how things work on my particular machines. But I don’t have any 6700K machine at the moment.
Waiting for computer stuff is kind of eternal loop, there is always something better coming just around the corner. At some point you need to decide that you need something and settle with something that is actually available.
I’m curious why you don’t want to use CUDA if you’d already have a suitable GPU for graphics? Of course you could go with AMD graphics card and as a consequence not have CUDA capability (or if you have Intel integrated GPU).
I don’t currently have a graphics card. I have researched CUDA a bit more, so is the 1060 family of video cards enough to do decent CUDA processing? The 1080 you use is too pricey at almost $700, but a 1060 at slightly under $300 is possible. Would that help the 4790K perhaps handle the poly filters, since that is the chip I currently use? I know nothing is guaranteed in PC land. The poly 2s filters run in the low 30% range in TM (Roon and Tidal).
I don’t remember right now which DAC you were using, but if it supports 48-base DSD rates (T+A DAC8 DSD does), then likely yes, if you enable the “Auto rate family” setting (did you already have it enabled?). Because doing 48k -> 22.5792M is probably not going to work.
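To put numbers on the two cases (plain arithmetic only; the constant and function names are illustrative, and how much extra load the cross-family conversion actually causes in HQPlayer is not something this sketch can tell you):

```python
from fractions import Fraction

DSD512_44K1 = 44_100 * 512  # 22,579,200 Hz, the "22.5792M" rate
DSD512_48K = 48_000 * 512   # 24,576,000 Hz


def conversion_ratio(source_hz, target_hz):
    """Exact upsampling ratio as a reduced fraction."""
    return Fraction(target_hz, source_hz)


# Staying inside one rate family gives a clean integer ratio.
print(conversion_ratio(44_100, DSD512_44K1))  # 512
print(conversion_ratio(48_000, DSD512_48K))   # 512

# Crossing families (48k source to the 44.1k-based DSD512 rate) does not.
print(conversion_ratio(48_000, DSD512_44K1))  # 2352/5
```

The cross-family ratio works out to 470.4, which is presumably part of why that conversion is so much more demanding than the same-family one.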
I am new to this world of HQPlayer and DSD. Last week I bought a T+A DAC 8 DSD and it sounds good at PCM. I want to upsample everything to DSD512. In HQPlayer I can get as far as DSD128. I have a Kaby Lake i5 7600 processor, 8 GB of DDR4 2666 MHz RAM and no graphics card yet.
Will adding a GTX 1060 for CUDA offload be enough to solve my problem?
What happens above DSD128, stuttering? What filters are you using? With the i5 I would stick strictly to the 2s versions of filters. Even the i5 should be able to do DSD512, I would think. The CUDA card helps, but if you can’t get above 128 even with 2s filters, then I would not spend the $ on this card and would instead start looking at the i7-6700K through 8700K processor family. Even with the 8700K, though, I doubt you could run the whole HQP filter family, especially the XTR non-2s filters; those are really heavy CPU loads.
The 7700K I built for a friend could not do the non 2s XTR, or the closed form filters. All others worked. 2S filters are very good sonically.
In fact, when I choose DSD5, DSD5v2, ASDM5, DSD7, ASDM7 I get DSD output to the DAC. Either DSD64 or DSD128. This is visible on the dac’s display.
When I choose DSD5v2 256+fs, DSD7 256+fs or ASDM7 512+fs and press the play button, nothing happens. The play button becomes available again after 1 second.
No stuttering, no conversion, no music, and no locking of the dac to a signal.
I experimented with all kinds of settings and filters (also 2S) and nothing changes.
This morning I installed HQPlayer on my other PC: Kaby Lake i5 7600K, 16 GB of DDR4 RAM and a GTX 1080 + graphics card. With CUDA offload enabled, still nothing happened.
I don’t know what DSD5v2 256+fs, DSD7 256+fs or ASDM7 512+fs should do; I thought they would give me DSD256 or DSD512.
I am using software version 3.19 (trial version to test if everything works)
Those are the modulators. Think of them a bit like a crossover (1st order, 2nd order, 4th order etc.): they determine to some degree how ultra-high frequencies are rolled off (very simple explanation). They do nothing for upsampling. Upsampling is determined by the bit rate box. Here’s a photo of my settings at DSD512 (44.1x512)
With settings like this, and with the fourth box (rightmost, under the HQP volume knob on the main HQP screen) set to SDM, HQP will upsample the source file to DSD512 (44.1x512 or 22M6), using the poly-sinc-xtr-mp-2s algorithm and the DSD5v2 modulator. Note that in the lower part of the screen I greyboxed Multicore DSP. This is an auto setting for hyper threading, so make sure you have it greyboxed (auto) or checked (on). If you are using your PC with the 1080 card you can check CUDA offload; if it is the PC with no GPU, leave it unchecked. Do not check Auto rate family. Set Vol Max to -3 or -4; Vol Min can be any number, and most set it close to the Vol Max number.
Then under HQP settings click the DSDiFF settings, right under the main settings selection that opened the screenshot above, and in the dialog that opens make sure you uncheck Direct SDM. If it is left checked, the volume setting you made in the screenshot will not take effect and will default to 0 dB, and digital clipping is likely to occur.
Remember to stick with the 2s filter family (Oversampling), as that puts a much easier load on the CPU. Make sure the backend (top part of the settings screenshot) is ASIO, and the Device will be PDP3000HV ASIO 1.0x for the T+A DAC that you have. Then you should be good.
Thanks a lot! I put in your exact settings, and now everything is converted to DSD512 and the DAC locks.
That’s a big step forward.
Only this time I get stutters with everything I play. Hardware monitor shows that during conversion the CPU load is 100%. As soon as the music starts, CPU load is around 50% for all cores. I tried increasing the buffer time to 250 ms but that does not help.
Tomorrow I will connect my DAC to the other PC, with the GTX 1080, and see if CUDA offload works.
When you first hit play, the filter initializes and the CPU can certainly go to 100%. Once music starts it does drop down. In my experience, if any core goes above 70% utilization there is a chance of stuttering. An i5 is pushing it for DSD512; 256 would probably play just fine (48x256).
One thing I forgot: for bit rate, since you have the T+A, set it to 48x512 and then check the “Auto rate family” box near the bottom left. This will upsample 44.1K-based files to 44.1x512 and 48K-based files to 48x512. Auto rate family will lessen the CPU load a bit this way. Also try another 2s filter; those XTRs are CPU hogs.
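A small sketch of the behaviour described above, i.e. picking the output rate from the source file's own rate family. This only models what the post describes; it is not HQPlayer's actual logic, and the function names are hypothetical:

```python
def rate_family(source_hz):
    """Return the base rate (44100 or 48000) the source rate is a multiple of."""
    for base in (44_100, 48_000):
        if source_hz % base == 0:
            return base
    raise ValueError(f"{source_hz} Hz is not a 44.1k- or 48k-based rate")


def auto_rate_output(source_hz, multiplier=512):
    """Output DSD rate when the rate follows the source's family."""
    return rate_family(source_hz) * multiplier


for src in (44_100, 88_200, 48_000, 96_000, 192_000):
    print(src, "->", auto_rate_output(src))
# 44100 -> 22579200
# 88200 -> 22579200
# 48000 -> 24576000
# 96000 -> 24576000
# 192000 -> 24576000
```

With this kind of selection every conversion stays inside one rate family, which matches the point that it lessens the CPU load a bit.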