Sorry for reopening an old thread.
Here is the skinny. MQA is lossy. I will repeat: MQA is lossy. But it is not lossy in the audible range of 0 to 20 kHz. Most recordings these days are captured in a one-bit format like DSD; these types of recorders are accurate and surprisingly cheap. To make a 96k recording, you put the DSD stream through a 96k filter computed at, say, 32 bits (vastly more than you need, but it makes sure rounding errors in the computations do not matter once the result is reduced to 24 bits). That is how they produce a 96k recording these days. They usually use what is called a sharp cutoff sinc filter. If you feed a short spike into such a filter, the output has ripples on either side of the spike, technically called a sinc function. These ripples cause what is called time smear. The conjectured reason why increasing the sampling frequency sounds better is that the higher the sampling frequency, the less time smear is audible. So a standard 96k recording has the time smear of a 96k filter: better than 48k, but still not optimal.
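For anyone who wants to see those ripples for themselves, here is a minimal Python sketch (NumPy assumed). It is not the filter any mastering tool or MQA actually uses; the cutoff, the tap counts, the Hamming window, the 0.1% threshold and the helper names are all my own illustrative choices. It builds a long (steep) and a short (shallow) windowed-sinc low-pass filter and measures how many samples each one rings for.

```python
import numpy as np

def windowed_sinc_lowpass(cutoff, num_taps):
    """Low-pass FIR: the ideal sinc response cut short by a Hamming window.

    cutoff is a fraction of the sampling rate (0 < cutoff < 0.5).
    More taps give a steeper cutoff but a longer impulse response.
    """
    n = np.arange(num_taps) - (num_taps - 1) / 2
    h = 2 * cutoff * np.sinc(2 * cutoff * n)   # np.sinc(x) = sin(pi*x)/(pi*x)
    h = h * np.hamming(num_taps)
    return h / h.sum()                         # unity gain at 0 Hz

def ringing_span(h, threshold=0.001):
    """Samples between the first and last tap larger than threshold * peak.

    A crude stand-in for 'time smear'; the threshold is an assumption.
    """
    big = np.nonzero(np.abs(h) > threshold * np.abs(h).max())[0]
    return int(big[-1] - big[0] + 1)

sharp = windowed_sinc_lowpass(cutoff=0.25, num_taps=255)   # steep, long filter
gentle = windowed_sinc_lowpass(cutoff=0.25, num_taps=15)   # shallow, short filter

# Feed a single spike through the sharp filter: the output is the filter's
# own taps, with ripples before and after the main peak.
spike = np.zeros(511)
spike[255] = 1.0
response = np.convolve(spike, sharp)

print("sharp filter rings for", ringing_span(sharp), "samples")
print("gentle filter rings for", ringing_span(gentle), "samples")
```

The long filter rings over many more samples than the short one, which is the trade-off in question: a steeper cutoff costs you a longer spread in time.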
Recognising this, MQA does it a different way instead of using a sinc filter. First of all, analysis of recordings shows that much less than 1% of recordings have musical information above 50 kHz that is greater than the noise. So 96k sampling is fine for conveying the musical information. The amount of musical detail tends to drop approximately linearly with frequency. So the first thing MQA does is find the noise floor in the recording up to 48 kHz. It then wants to reduce the information at 48 kHz so that it is below the noise floor. MQA applies a very shallow filter that is virtually flat to 20 kHz and down about 8 dB at 48 kHz. The filter varies from recording to recording; MQA analyses the recording to determine the best one.
Now imagine you have a 192k recording. If you chuck every second sample away, you get a 96k recording. What you find is that the information above 48 kHz is reflected back into the region below 48 kHz. Because of the filter, it is reflected into the noise of the recording, so it is inaudible; indeed, if you get rid of the noise later, the reflected information goes with it. Of course, you can do the same to 384k to reach 192k, 768k to get 384k, and so on up to the many MHz of a DSD recording. With a bit of math, it is possible to do all this with a single filter. So what you do is apply this single filter to the DSD stream instead of the sinc filter.
Sure, you have a bit of droop in the frequency response above 20 kHz, but that is thought to be inaudible (whether it is or not is another matter). What you have gained by using a shallow filter instead of a sharp one is an almost negligible time smear: it is claimed to be about 50 µs, whereas for 96k it will be 20 times greater, at 1000 µs or 1 ms. The claim is that this difference is audible. So what you have done is reduce the time smear at the cost of a droop of 8 dB (or thereabouts, depending on the filter used) at 48 kHz. It then gets trickier. Twenty-four bits have a resolution below the thermal noise limit.
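Going back to the step of chucking every second sample away, here is a small Python sketch (NumPy assumed) of the reflection I described. The 60 kHz test tone, the 0.1 s length and the helper name peak_frequency are my own choices for illustration, and no filter is applied at all, so this shows only the folding itself, not what MQA does.

```python
import numpy as np

FS_HIGH = 192_000      # original sampling rate in Hz
TONE_HZ = 60_000       # test tone above the 48 kHz limit of a 96k recording
N = 19_200             # 0.1 s of signal, so spectrum bins are 10 Hz apart

t = np.arange(N) / FS_HIGH
tone = np.sin(2 * np.pi * TONE_HZ * t)

def peak_frequency(signal, fs):
    """Frequency in Hz of the strongest bin in the signal's spectrum."""
    spectrum = np.abs(np.fft.rfft(signal))
    return float(np.argmax(spectrum) * fs / len(signal))

# Keep every second sample: 192k becomes 96k, with no filtering first.
decimated = tone[::2]

print("before:", peak_frequency(tone, FS_HIGH), "Hz")
print("after: ", peak_frequency(decimated, FS_HIGH // 2), "Hz")

# 'Down about 8 dB' expressed as an amplitude factor, about 0.4.
droop = 10 ** (-8 / 20)
```

The 60 kHz tone comes out at 96 - 60 = 36 kHz, mirrored around 48 kHz. That is why the level of anything up there has to be pushed below the noise floor before the samples are dropped.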
So you do not need 24 bits to record the audible information: 16 is usually adequate, and 18 certainly is, especially if you use a dithering process. MQA can put those extra bits to use. What you can do is add and subtract each sample and the sample next to it. Put the sums in, say, the first 16 bits at 48k. Then, since the information carried by the differences is above 24 kHz, and the amplitude decreases with frequency, the differences need only a small number of bits and can easily be compressed into the last 8 bits. If you play the file on standard equipment, those last 8 bits will sound like noise, which you would have had anyway. It is easy to recover the 96k by uncompressing, then adding and subtracting. In practice, they do not use adding and subtracting but a quadrature filter. That in no way affects the principles.
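The adding and subtracting idea can be sketched in a few lines of Python (NumPy assumed). To be clear, this is only the simplified principle, not MQA's actual quadrature filter, and it leaves out the compression and bit-packing steps entirely. The function names and the test signal are made up for illustration.

```python
import numpy as np

def split_pairs(samples):
    """Split an even-length integer signal into pairwise sums and differences."""
    x = np.asarray(samples, dtype=np.int64)
    even, odd = x[0::2], x[1::2]
    return even + odd, even - odd      # half-rate low band, half-rate high band

def merge_pairs(sums, diffs):
    """Undo split_pairs exactly."""
    even = (sums + diffs) // 2         # (a + b) + (a - b) = 2a, always even
    odd = (sums - diffs) // 2          # (a + b) - (a - b) = 2b, always even
    out = np.empty(2 * len(sums), dtype=np.int64)
    out[0::2], out[1::2] = even, odd
    return out

# A slowly varying integer signal, standing in for typical audio in which
# most of the energy sits at low frequencies.
n = np.arange(64)
signal = np.round(20000 * np.sin(2 * np.pi * n / 64)).astype(np.int64)

sums, diffs = split_pairs(signal)
restored = merge_pairs(sums, diffs)

print("largest sum:       ", int(np.abs(sums).max()))
print("largest difference:", int(np.abs(diffs).max()))
print("exact round trip:  ", bool(np.array_equal(restored, signal)))
```

The differences come out much smaller than the sums, so they need far fewer bits, and the round trip is exact. Note that the raw sums need one more bit than the input, so a real scheme would have to scale them; I have ignored that here.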
Putting this all together: MQA produces a file that is lossless up to 20 kHz, which is the audible range. It has a slight frequency droop up to 48 kHz, which makes it lossy. Whether that is audible or not is another matter. Anything above 48 kHz, except in much less than 1% of recordings, is noise. However, time smearing is drastically reduced, and that is supposed to be audible. It is the claimed reason high sampling rate recordings sound better. MQA has a reduced time smear similar to what a recording at many MHz would have.
Since the filter used to downsample to 96k is known, MQA can use an ‘inverse’ filter to upsample to anything you like. It will be just a guess at the information lost (nearly always just noise), but hopefully it is better than just any filter. I am not sure of that claim, but that is another story.
So MQA is lossy, but with less time smear. Which sounds better? Comparisons have been mixed. I have participated in a few myself. Some like it, some do not. Me, I like it. A typical reaction is that the original sounds fuller and more harmonically rich and textured, but with a slight haze or blurriness. MQA is leaner, starker, more transparent, as if a layer of fog has been removed. There is no right or wrong here. Just what you prefer.
I think MQA is only partly on the right track. A better way to do it IMHO is to use a new compression technique to transmit at 176k:
Do the same as before, but since the frequency bands with little or no information do not add much to the size of the file, one can accommodate the small number of recordings with info above 48k by simply transmitting at 176k with a simple standard slow roll-off filter and 16 bits. Also, truncating in the frequency domain is far less audible than truncating in the time domain. So in the unlikely event that 16 bits is not enough, the difference will almost certainly be inaudible. Just a personal view. Of course, it would not be backward compatible like standard MQA. Further discussion of the details would require a separate thread.