Just for reference, here is what MQA does:
First, the hi-res file is passed through a shallow filter that starts to take effect at about 20 kHz and is about 10 dB down at 50 kHz. Since that is above 20 kHz, its effect is inaudible. Now have a look at figure 7 for the triangle of essential information, which came from examining thousands of recordings. That triangle is what needs to be reproduced - not everything in the hi-res recording. But the result still needs to be hi-res to reduce time smear.

To do this, the hi-res recording is resampled to 96k using a triangle function. That acts as the filter I mentioned before and produces a tiny time smear - about 20 µs. But any frequencies above 48 kHz are ‘reflected’ back into the 0-48 kHz region that a 96k recording reproduces. This is called aliasing. However, as can be seen, what is up there is virtually nothing but noise, so who cares, except that the recording is now noisier in the lower bits. And because of the shallow filter above 20 kHz, the reflected content ends up below the noise floor of the recording, so it causes no issues. MQA eventually chops off the bottom 8 bits where it resides, so it is not even present.

It then applies a filter at 24 kHz and subtracts the filtered result from the 96k music stream, giving two streams at 48k - one covering 0-24 kHz and the other 24-48 kHz. One way of doing this is to add adjacent samples for one stream and subtract them for the other. It then compresses the 24-48 kHz stream and puts the compressed data into the bottom 8 bits of the 0-24 kHz stream. That’s how the bottom 8 bits are ‘chopped off’. Note this constitutes subtractive dither and increases the effective resolution to 20 bits instead of 16 bits. I will let you investigate dithering and how it works.

When Roon, or whatever you use for playback, does the first unfold, it decompresses the 24-48 kHz stream from the bottom 8 bits and adds it to the 0-24 kHz stream to get the full 0-48 kHz band, i.e. the 96k stream, back. How do you increase it back to whatever sample rate you started with? Easy.
Since it is all noise up there, it doesn’t matter what value you use as long as it is about the same as what was there. So you simply linearly interpolate. This is a simple version of MQA - in practice, more sophisticated functions than triangles and linear interpolation are used - they are called splines. However, the principle is the same. That way, all the benefits of a high-resolution recording are restored as far as the time smear goes.

Note this is a lossy process. But the audible 0-20 kHz information is not changed. That part is lossless. This is where the lossless controversy comes from. It is lossless in the 0-20 kHz region, the region you can hear. But otherwise, it is lossy. Each side of the debate is using ‘lossless’ in a different sense. Like all debates about semantics, it is both annoying and pointless. But some get their jollies off on it. To each their own, I suppose. Which sounds better? As I said, some prefer MQA, others straight PCM. There is no right or wrong - just what floats your boat.

Now we come to the vexed question of how the upsampling from 96k is done. The Bessel functions used are what are called minimum-phase filters. What the M-Scaler uses is a highly accurate linear-phase sinc filter. Sinc filters ring like crazy when fed a short impulse - minimum-phase filters do not. So you may think using the M-Scaler to upscale is a bad thing. Not so fast - short impulses are never fed into the M-Scaler from MQA - only real-world material that is already band-limited for the 96k sample rate. This causes minor or no issues, so it is not a concern. It may, however, account for why some prefer straight PCM with the M-Scaler. The ringing may not measurably be of much concern for real-world recordings, but as we all know, audio is funny, and to some it may not sound so good. Besides, Rob claims, correctly, that by Shannon’s sampling theorem the ringing sinc filter exactly reconstructs the band-limited signal, so the ringing is no concern anyway.
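
For anyone who wants to play with the two simple tricks described above (adding and subtracting adjacent samples to split a stream in two, and linear interpolation to go back up in sample rate), here is a rough Python sketch. It is only a toy under my own assumptions: the function names are made up for illustration, the samples are plain integers, and the compression of the upper band and the packing into the bottom 8 bits are left out entirely. It is not MQA's actual encoder or decoder.

```python
# Toy illustration only - NOT MQA's real algorithm.
# All function names here are hypothetical, chosen for this sketch.

def split_bands(x):
    """Split a stream into two half-rate streams.

    Adding adjacent samples gives a crude low band,
    subtracting them gives a crude high band.
    Assumes integer samples and an even number of them.
    """
    if len(x) % 2:
        raise ValueError("expects an even number of samples")
    low = [x[i] + x[i + 1] for i in range(0, len(x), 2)]
    high = [x[i] - x[i + 1] for i in range(0, len(x), 2)]
    return low, high


def merge_bands(low, high):
    """Undo split_bands: recover the original full-rate samples.

    (sum + diff) is twice the first sample of each pair and
    (sum - diff) is twice the second, so integer division by 2 is exact.
    """
    out = []
    for s, d in zip(low, high):
        out.append((s + d) // 2)
        out.append((s - d) // 2)
    return out


def upsample_linear(x):
    """Double the sample rate by putting the midpoint between neighbours."""
    out = []
    for a, b in zip(x, x[1:]):
        out.append(a)
        out.append((a + b) / 2)
    out.append(x[-1])
    return out


if __name__ == "__main__":
    samples = [3, 1, -4, 2, 0, 6]
    low, high = split_bands(samples)
    print(low, high)
    print(merge_bands(low, high) == samples)
    print(upsample_linear([0, 4, 2]))
```

The split and merge pair is exactly reversible on its own, which mirrors the first unfold getting the 96k stream back. The linear interpolation step is the part that only approximates what was there, which is the sense in which the process above 96k is lossy.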