Poor DSD512 upsampling performance on NUC10I7

My guess was correct: it uses 2. This sort of parallelisation isn’t easy to do!