That is a pretty good interview actually.
“how many f’ing speakers do you need for a movie?”
Gold. I actually did have a 13.2.4 setup so I can have a laugh at myself now
And MQA becoming the dominant format would make DSP harder and more expensive. The faster computational audio rises, the better for everyone (as I'm sure Roon users who own Audeze headphones can vouch), so let's not rally behind something that holds it back.
I’d say there’s a pretty good point to be made that for anyone with enough bandwidth to stream Netflix, the bandwidth savings are already irrelevant. Given environmental noise, I don’t really see where, in the mobile space, even with custom, super-isolating IEMs, you’d have practical use for anything beyond RedBook, let alone in a car; if anything, streaming MQA on the move is a waste of bandwidth. How much streaming companies could save by using MQA vs 96/24 is another dimension, of course, but my understanding of the business model is that bandwidth costs are far lower than licensing costs, so that’s probably marginal.
Because the format is lossy, the M is moot (ha!) by design, so all that’s left is the A, which is… where the crypto is.
Just to be pedantic, AC-3 is open:
As I’ve shown earlier, you can have equivalent resolution in a completely standard FLAC that actually consumes less bandwidth than MQA.
Because MQA is put into a normal FLAC container and its “authenticated folding” generates random-looking noise, FLAC compression performance suffers. It’s the same as trying to compress encrypted data with ZIP: it won’t compress at all, practically speaking. If you compress the data first and then encrypt it, you get the full compression benefit, because the data doesn’t look random to the compression algorithm. This is what GPG and SSH do, for example: they compress first and then encrypt. MQA does it the other way around and suffers for it.
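The effect is easy to demonstrate. A minimal sketch using Python’s zlib: a repetitive byte pattern (standing in for correlated audio samples) compresses well, while random bytes (standing in for encrypted or noise-like folded data) barely compress at all.

```python
import os
import zlib

# Repetitive data, standing in for correlated audio samples: compresses well.
structured = bytes(range(256)) * 4096          # 1 MiB repeating pattern
# Random data, standing in for encrypted or noise-like content: barely compresses.
random_like = os.urandom(len(structured))

structured_ratio = len(zlib.compress(structured)) / len(structured)
random_ratio = len(zlib.compress(random_like)) / len(random_like)

print(f"structured: {structured_ratio:.3f}")   # well under 1.0
print(f"random:     {random_ratio:.3f}")       # ~1.0 (can even exceed 1.0 slightly)
```

The same asymmetry applies to FLAC: the noise-like folded band in an MQA file defeats the linear prediction that FLAC relies on.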
Wow @jussi_laako , can you point us to that experiment? It’s not surprising but I didn’t know anyone did the tests.
I made a blog post on Computer Audiophile and then later another one with different tools in one thread there.
As a side note, the optimal approach would be to adjust the sampling rate based on the content’s properties. For example, in one of my tests I encoded 2L recordings as 120 kHz, 18-bit FLAC, based on bandwidth and SNR analysis of the recordings. Still smaller than the MQA version, but with better resolution. (HQPlayer can play any source sampling rate at any output sampling rate, and I assume other players with built-in rate converters can do the same.)
Jussi noted in January 2016 that 18 bit, 96 kHz FLAC files were smaller than MQA:
This is Jussi’s blog post on CA with the tests that @danny was enquiring about:
And this is a later post on CA but not sure if it is the one Jussi is referencing:
So essentially it is 17-bit after decoding, which is what they begin with. After a bit of “mastering analysis” of the 2L content, I ended up selecting a 120 kHz sampling rate and 18-bit resolution to preserve everything (all frequency harmonics and all dynamics). Encoding that as standard FLAC results in a file smaller than the MQA file. Using the more typical sampling rate of 176.4 kHz (which leaves ~30 kHz of unused bandwidth) with 18-bit TPDF-dithered samples in a zero-padded 24-bit container results in a completely typical FLAC that is a very tiny bit larger than the MQA file (16.7 MB vs 17.0 MB).
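For a rough sense of the raw data rates behind these trade-offs (uncompressed stereo PCM, before any FLAC compression and ignoring zero-padding), a quick back-of-the-envelope calculation:

```python
def raw_pcm_mbps(sample_rate_hz, bits, channels=2):
    """Uncompressed PCM bitrate in Mbps (10^6 bits/s)."""
    return sample_rate_hz * bits * channels / 1e6

print(raw_pcm_mbps(120_000, 18))   # 4.32   (the 120 kHz / 18-bit choice above)
print(raw_pcm_mbps(176_400, 18))   # 6.3504 (18 bits at 176.4 kHz, before padding)
print(raw_pcm_mbps(96_000, 24))    # 4.608  (for comparison: 96/24)
```

FLAC then compresses these raw rates further, typically by around half for music.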
In some thread I also posted results with iZotope noise shaped dithering to 96 kHz 16-bit with some other material.
I’m also planning to try out this for comparison: https://www.xivero.com/xifeo/
All these different approaches produce completely standard FLAC files.
This depends on how you define “resolution”.
I’d like to add one other thought to this conversation. To me, Stuart is clearly right about one thing: regardless of bandwidth and low data-storage-and-transmission costs, the ratio of file size to actual information is approaching absurdity. If you want to fly from Des Moines to San Francisco, you may have the time and money to fly via LaGuardia, but that doesn’t make it a good idea. PCM that makes no attempt to discriminate between music-correlated information and noise is silly. It makes little sense to retain all that noise.
It is a combination of sampling rate and word length (the effective bit rate). MQA uses 2x the base (container) rate, with word length varying depending on content; I’ve seen MQA content going below 16 bits. The number of bits needed to encode the upper octave depends on how much content there is. Classical music is usually quite easy, with electronic and pop/rock being more challenging.
DSD is much more efficient, and doesn’t have any “blur” problems to begin with, because even with DSD64 the “Nyquist” frequency is at 1.4 MHz. So you never have to deal with sharp filters, at either the ADC or the DAC.
Interesting juxtaposition of quotes, considering that DSD64 has just one single bit of dynamic range. Both MQA and DSD use noise-shaping to shift the noise to higher frequencies. You’re right, though, about the time-domain stuff, although I’m not sure how well that timing precision is preserved in converting back and forth to/from DXD. Also, how compressible is DSD? Serious question; I don’t know the answer.
The DSD compression stats that I have seen for DST on SACD have run around 50 percent or 2:1 – similar to that of FLAC on Redbook Audio. What is unclear, though, is the multichannel factor, whether any of that DST compression ratio on SACD derives from multichannel correlation that does not apply to the same degree to stereo.
An interesting, off-topic note: audio systems for multichannel video (and many others) define the data format and the decoding algorithm, but not the encoding algorithm. You can experiment with different encoding algorithms, and a skillful engineer will tune the encoding algorithm and parameters to the content.
But how do you do that for a real-time broadcast? (Who would want real-time multichannel? A basketball game, for example.) There’s no opportunity for tuning. I read about one system that takes a 1024-sample buffer, about 20 milliseconds, of all the channels and passes it off to a whole bunch of computers. Each of them applies its own algorithm to the content. After maybe 15 milliseconds, the master computer picks up the compressed results from any algorithm that has finished (the more clever algorithms don’t operate in deterministic time; those that didn’t finish are out of the running for this buffer). It decodes the results, compares each algorithm’s output with the original buffer using psychoacoustics (clarity vs. directionality, etc.), and chooses one. Then on to the next buffer.
And you may tweak the audio bit budget based on the demands of the video stream.
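A minimal sketch of that deadline-based encoder race, with hypothetical stand-in encoders and a toy squared-error metric in place of a real psychoacoustic comparison:

```python
import concurrent.futures
import random
import time

# Hypothetical stand-in "encoders"; a real system would run actual codecs.
def encoder_a(buf):                 # fast but coarse quantization
    return [round(x, 1) for x in buf]

def encoder_b(buf):                 # slower but finer quantization
    time.sleep(0.01)
    return [round(x, 3) for x in buf]

def encoder_c(buf):                 # too slow to meet the deadline
    time.sleep(0.5)
    return list(buf)

def score(original, decoded):
    # Toy metric: squared error. A real system would score psychoacoustically.
    return sum((a - b) ** 2 for a, b in zip(original, decoded))

def race(buf, deadline_s=0.2):
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = {pool.submit(enc, buf): enc.__name__
                   for enc in (encoder_a, encoder_b, encoder_c)}
        done, _ = concurrent.futures.wait(futures, timeout=deadline_s)
        # Encoders that missed the deadline are out of the running for this buffer.
        results = {futures[f]: f.result() for f in done}
    return min(results, key=lambda name: score(buf, results[name]))

buf = [random.uniform(-1, 1) for _ in range(1024)]
print(race(buf))  # encoder_b: finishes within the deadline and scores best
```

The key design point is that slow encoders are simply dropped per buffer rather than stalling the stream, so nondeterministic algorithms can still compete.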
That’s pretty clever.
A comment on DRM. Setting aside the terminology question of whether it is DRM, and operationalizing it instead, we must decide what to do about the known facts. Some argue that MQA has the technical capability to do DRM, even if they do not exercise it today. And there was the patent-quote kerfuffle again.
But I think this is tinfoil-hat paranoia. Not because it is wrong, but because by that standard, you can’t do much today. You certainly can’t get a Windows computer, or a Mac or iPhone, because they do auto-updates that are in practice mandatory for security reasons, and they could restrict your behavior in many ways. You can’t get a car, because cars are software-based and get updates; if you financed the car, they could raise the interest rate if you drive off-road, or on a race track, or in New York City. Or they could cooperate with the insurance company and raise the rates if you “abuse” the product. (They can call it “discounts for responsible drivers”.) If you’re a farmer, you can’t buy a tractor, because tractors are software-based, and seeds are licensed, not bought, so Caterpillar could cooperate with Monsanto to prevent planting seeds in violation of the license agreement.
Crazy examples? Maybe, but my point is, if you refuse a modern product because of the technical capability for abuse, you will have to stay in bed.
From what I’ve calculated, DXD timing precision matches DSD64 pretty much 1:1. Thus, for editing DSD64 material it is a pretty good choice.
It largely depends on the modulator: the better the modulator, the less compressible the output. So from a streaming perspective it is better to assume the worst case of incompressible data, meaning 5.4 Mbps for stereo DSD64 (M = 1048576).
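That 5.4 Mbps figure follows directly from the raw DSD64 rate: 64 × 44.1 kHz, 1 bit per sample, two channels, divided by binary megabits (M = 2^20). A quick check:

```python
def dsd_mbps(dsd_multiple=64, channels=2, mega=2**20):
    """Raw DSD bitrate: 1-bit samples at a multiple of 44.1 kHz, in binary Mbps."""
    return 44_100 * dsd_multiple * channels / mega

print(round(dsd_mbps(), 1))   # 5.4 for stereo DSD64 with M = 1048576
```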
Later on I will do some tests on WavPack DSD compression to see how much practical compression it can achieve, but I would assume at least 10 - 20% for most traditional material.
Some of the modulators used for SACD are also designed with DST compression in mind, to gain better compression ratios while still preserving good quality.
Hmmh, maybe that is possible in the US, but not here. The only bases for insurance fees here are your age, your personal accident history, and make/model statistics, and the system is transparent. Prices are the same for everyone, and the bonus percentage is publicly specified in the insurance terms document. For interest rates, not even that: they advertise the interest before you even walk into the shop. Quite regularly you can get 0% interest, and in the worst case it’s still below 5% fixed.
How would they know where you drive and how you use the car?
What you describe is something I’d never agree to.
The same goes for audio. Not all MP3 encoders produce the same encoded data, yet all decoders can decode data from all encoders. The same goes for AAC encoders. And there are clear audible differences between the encoders; there have been blind listening tests comparing different implementations.
For example, the current open-source LAME MP3 encoder has much better subjective sound quality than the original Fraunhofer reference code.
The models that predict the audibility of different components (IOW, which parts can be discarded) and that decide bandwidth allocation differ between encoders. A lot of development effort in that area has improved performance significantly.
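As a toy illustration of the bandwidth-allocation idea: give more bits to bands whose signal rises further above an assumed masking threshold. All numbers here are illustrative; real psychoacoustic models are far more sophisticated.

```python
def allocate_bits(signal_db, mask_db, bit_budget):
    """Toy perceptual bit allocation across frequency bands."""
    # Signal-to-mask ratio per band; bands below the mask need no bits at all.
    smr = [max(0.0, s - m) for s, m in zip(signal_db, mask_db)]
    total = sum(smr) or 1.0
    # Split the budget proportionally to how audible each band's error would be.
    return [round(bit_budget * r / total) for r in smr]

signal = [60, 45, 70, 30]   # per-band signal level, dB (illustrative)
mask   = [40, 44, 50, 35]   # per-band masking threshold, dB (illustrative)
print(allocate_bits(signal, mask, 128))   # [62, 3, 62, 0]
```

The fourth band sits below its masking threshold, so it gets zero bits: that is exactly the “what can be discarded” decision where encoder implementations differ.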
My point was that the technical capability for abuse is there.
The fact that the financing and insurance industries are regulated doesn’t change that.
Can you imagine some other abuse that the vendors could come up with?
Or do you think there is some more devious person who could come up with one?
My point was that identifying a technical capability for abuse is not a reason to stay away.