AVRs tend to have lower-end SDM-based DACs (and lots of them to support all of those surround channels and extra zones). I would try targeting their highest supported rate first.
44.1 kHz multiples only at the moment.
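For reference, here is a quick sketch of what the 44.1 kHz family works out to at the usual DSD multiples. It assumes the rates in question are DSD rates; the function name `dsd_rate_hz` is made up for illustration and is not part of any tool discussed here.

```python
BASE_HZ = 44_100  # CD sample rate, the base of the 44.1 kHz family


def dsd_rate_hz(multiple: int) -> int:
    """Bit rate in Hz of DSD<multiple> in the 44.1 kHz family (hypothetical helper)."""
    return BASE_HZ * multiple


# Common DSD multiples; check which of these your AVR actually accepts.
for m in (64, 128, 256, 512):
    print(f"DSD{m}: {dsd_rate_hz(m) / 1e6:.4f} MHz")
```

So DSD64 is 2.8224 MHz, DSD128 is 5.6448 MHz, and so on, doubling each step.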
7th order is more processor intensive than 5th order in practice, since it does more arithmetic operations per sample. It’s technically more accurate and has a lower noise floor.
That said, most of the CPU load involved in DSD upsampling is not in the SDM; it’s in the PCM-domain upsampling stage that comes before it. There is no CPU usage difference between min phase and linear phase. The best way to reduce CPU usage is to target a lower sample rate.