Is there any way to tweak how Roon decodes lossy file formats? It recently struck me that what it currently does (decode to 16 bits per sample) likely involves quantisation and dithering steps that could be avoided if Roon could output at 24+ bits per sample. Since switching to Roon, I’ve had this nagging feeling that the lossy files in my music collection have all sounded slightly worse than they should…
I’m no expert, but I don’t think it would help in this scenario - information is already lost when creating the ‘lossy’ files, and it cannot be recovered, so I don’t see how upsampling to 24bit for playback is of any benefit.
Happy to be corrected!
There isn’t a way today, but it’s an interesting idea.
There is some complexity for us because we don’t ship MP3/AAC codecs (because of patent licensing complexities)–so we are to some extent stuck with what the OS provides for us. I’m not sure offhand whether each of those codecs has this capability or not.
For AAC, I believe I’ve seen decoder implementations that can decode to a higher bit depth. I’ve not come across that for MP3 but that doesn’t mean it doesn’t exist.
Do you know of another player that is doing this? Meaning–actually producing 24bits that isn’t just zero-padded 16bit output from lossy files? I’d like to poke around.
@Nick_Stamp: With lossy audio compression formats, there isn’t really an inherent bit depth associated with the content of a compressed AAC or MP3 file. You can think of these files as a mathematical description of what the compression algorithm considered important information (overly simplified: “Here’s a bunch of sine waves that sound a lot like the original source when you stack them on top of each other”).

Most decoders will calculate a decoded waveform in a high-precision intermediate format and then will quantize to 16 bits and apply dither if asked for 16-bit samples as the final output. Regardless of the sample size of the original source, undithered 24 or 32 bit output of a decoder will be closer to the waveform described in the compressed data than the quantized and dithered 16-bit output. If the original source was 16 bits per sample (i.e. ripped from CD), it will have been quantized and dithered twice* (once during mastering, a second time during decoding) before it reaches your ears.

With the advent of initiatives like Mastered for iTunes, it’s now more common to be able to buy compressed music that was generated from undithered 24-bit sources. If your decoder can generate 24+ bit output, you can also play it back without it ever having been bounced down to 16 bits per sample.
*or even three times, if it happens to also go through a software mixer set to output 16 bits per sample.
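To get a feel for what the second 16-bit stage costs, here is a minimal Python sketch. It is not how any particular decoder works: the lossy codec itself is assumed to be perfectly transparent, the test signal is a plain 1 kHz sine, and the function names (tpdf_dither, to_16bit_dithered, simulate) are invented for this illustration.

```python
import math
import random

FULL_SCALE_16 = 32767  # largest positive 16-bit sample value


def tpdf_dither(rng):
    """Triangular dither spanning +/-1 LSB, built from two uniform draws."""
    return rng.random() - rng.random()


def to_16bit_dithered(x, rng):
    """Quantize a float in [-1.0, 1.0] to the 16-bit grid with TPDF dither."""
    q = round(x * FULL_SCALE_16 + tpdf_dither(rng))
    q = max(-FULL_SCALE_16, min(FULL_SCALE_16, q))
    return q / FULL_SCALE_16


def rms_error_db(reference, signal):
    """Mean squared difference between two signals, in dB re full scale."""
    mean_sq = sum((a - b) ** 2 for a, b in zip(reference, signal)) / len(reference)
    return 10 * math.log10(mean_sq)


def simulate(num_samples=50_000, seed=1):
    """Return (error after one 16-bit stage, error after two), both in dB."""
    rng = random.Random(seed)
    # Stand-in for the pre-mastering waveform: a 1 kHz tone at 44.1 kHz.
    original = [0.5 * math.sin(2 * math.pi * 1000 * n / 44100)
                for n in range(num_samples)]
    # Stage 1: mastering down to 16 bits (assumed to use TPDF dither).
    once = [to_16bit_dithered(x, rng) for x in original]
    # Stage 2: a decoder asked for 16-bit output dithers again.
    # Assumption: the lossy codec in between changes nothing.
    twice = [to_16bit_dithered(x, rng) for x in once]
    return rms_error_db(original, once), rms_error_db(original, twice)


if __name__ == "__main__":
    once_db, twice_db = simulate()
    print(f"one 16-bit stage : {once_db:.1f} dB")
    print(f"two 16-bit stages: {twice_db:.1f} dB")
```

Under these assumptions the second figure should come out roughly 3 dB above the first, since each dithered 16-bit stage adds about the same amount of noise power. Decoding straight to 24 bits would avoid most of that second contribution.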
Well, last time I had to write code that decoded MP3s (15+ years ago… yeesh), I remember using libmad to decode mp3s to 24-bit PCM. I just played around with the “madplay” command line tool (available in the ubuntu apt repo) and it appears to produce unpadded 24-bit output when asked:
$ madplay -b 24 input.mp3 -o madout.wav
MPEG Audio Decoder 0.15.2 (beta) - Copyright (C) 2000-2004 Robert Leslie et al.
Title: On Hold
Artist: The xx
Orchestra: The xx
Album: I See You
Track: 8
Year: 2017
8582 frames decoded (0:03:44.1), +0.2 dB peak amplitude, 68 clipped samples

$ flac madout.wav
flac 1.3.1, Copyright (C) 2000-2009 Josh Coalson, 2011-2014 Xiph.Org Foundation
flac comes with ABSOLUTELY NO WARRANTY. This is free software, and you are
welcome to redistribute it under certain conditions. Type `flac' for details.
madout.wav: WARNING: legacy WAVE file has format type 1 but bits-per-sample=24
madout.wav: wrote 43215207 bytes, ratio=0.728

$ flac -ac madout.flac |grep wasted
<... snip ...>
subframe=1 wasted_bits=0 type=LPC order=8 qlp_coeff_precision=15 quantization_level=13 residual_type=RICE partition_order=0
subframe=0 wasted_bits=0 type=LPC order=8 qlp_coeff_precision=15 quantization_level=13 residual_type=RICE partition_order=1
subframe=1 wasted_bits=0 type=LPC order=8 qlp_coeff_precision=15 quantization_level=13 residual_type=RICE partition_order=0
subframe=0 wasted_bits=0 type=LPC order=8 qlp_coeff_precision=15 quantization_level=13 residual_type=RICE partition_order=1
<... etc etc etc ...>
Kind of a roundabout way to detect zero-padding by using flac, but I don’t know of any other command-line tools that do it. Similar playing around with ffmpeg (and looking at the mpeg audio decoder source) seems to suggest that its MP3 decoder is fixed at 16 bits per sample, regardless of requested output format. The ffmpeg AAC decoder, however, decodes internally to “fltp” 32-bit floating point and will happily output to true 24/32-bit wav:
$ /usr/bin/ffmpeg -i input.m4a -acodec pcm_s24le aacout.wav
ffmpeg version 2.8.11-0ubuntu0.16.04.1 Copyright (c) 2000-2017 the FFmpeg developers
  built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.4) 20160609
  configuration: --prefix=/usr --extra-version=0ubuntu0.16.04.1 --build-suffix=-ffmpeg --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --cc=cc --cxx=g++ --enable-gpl --enable-shared --disable-stripping --disable-decoder=libopenjpeg --disable-decoder=libschroedinger --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmodplug --enable-libmp3lame --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librtmp --enable-libschroedinger --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxvid --enable-libzvbi --enable-openal --enable-opengl --enable-x11grab --enable-libdc1394 --enable-libiec61883 --enable-libzmq --enable-frei0r --enable-libx264 --enable-libopencv
  libavutil 54. 31.100 / 54. 31.100
  libavcodec 56. 60.100 / 56. 60.100
  libavformat 56. 40.101 / 56. 40.101
  libavdevice 56. 4.100 / 56. 4.100
  libavfilter 5. 40.101 / 5. 40.101
  libavresample 2. 1. 0 / 2. 1. 0
  libswscale 3. 1.101 / 3. 1.101
  libswresample 1. 2.101 / 1. 2.101
  libpostproc 53. 3.100 / 53. 3.100
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x10e3400] stream 0, timescale not set
[mjpeg @ 0x10e6a80] Changeing bps to 8
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.m4a':
  Metadata:
    <... snip ...>
  Duration: 00:05:19.14, start: 0.000000, bitrate: 278 kb/s
    Stream #0:0(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 263 kb/s (default)
    Metadata:
      creation_time : 1999-05-13 10:58:17
    Stream #0:1: Video: mjpeg, yuvj444p(pc, bt470bg/unknown/unknown), 572x600 [SAR 300:300 DAR 143:150], 90k tbr, 90k tbn, 90k tbc
Output #0, wav, to 'aacout.wav':
  Metadata:
    <... snip ...>
    ISFT : Lavf56.40.101
    Stream #0:0(eng): Audio: pcm_s24le ( / 0x0001), 44100 Hz, stereo, s32, 2116 kb/s (default)
    Metadata:
      creation_time : 1999-05-13 10:58:17
      encoder : Lavc56.60.100 pcm_s24le
Stream mapping:
  Stream #0:0 -> #0:0 (aac (native) -> pcm_s24le (native))
Press [q] to stop, [?] for help
size= 82452kB time=00:05:19.13 bitrate=2116.5kbits/s
video:0kB audio:82452kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000294%

$ flac aacout.wav
flac 1.3.1, Copyright (C) 2000-2009 Josh Coalson, 2011-2014 Xiph.Org Foundation
flac comes with ABSOLUTELY NO WARRANTY. This is free software, and you are
welcome to redistribute it under certain conditions. Type `flac' for details.
aacout.wav: WARNING: skipping unknown chunk 'LIST' (use --keep-foreign-metadata to keep)
aacout.wav: wrote 63672261 bytes, ratio=0.754

$ flac -ac aacout.flac |grep wasted
<...>
subframe=0 wasted_bits=0 type=LPC order=8 qlp_coeff_precision=15 quantization_level=11 residual_type=RICE partition_order=3
subframe=1 wasted_bits=0 type=LPC order=8 qlp_coeff_precision=15 quantization_level=12 residual_type=RICE partition_order=3
subframe=0 wasted_bits=0 type=LPC order=8 qlp_coeff_precision=15 quantization_level=11 residual_type=RICE partition_order=3
subframe=1 wasted_bits=0 type=LPC order=8 qlp_coeff_precision=15 quantization_level=11 residual_type=RICE partition_order=3
subframe=0 wasted_bits=0 type=FIXED order=2 residual_type=RICE partition_order=0
subframe=1 wasted_bits=0 type=LPC order=7 qlp_coeff_precision=15 quantization_level=11 residual_type=RICE partition_order=0
<...>
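The flac detour can also be approximated directly. What follows is a rough Python sketch using only the standard-library wave module. The function names and the chunk size are illustrative choices, not part of any tool mentioned here. Unlike flac's per-subframe wasted_bits figure, it reports a single number for the whole file, and it assumes plain little-endian integer PCM (16, 24 or 32 bits per sample).

```python
import wave


def wasted_bits(pcm_bytes, sample_width):
    """Count low-order bits that are zero in every little-endian PCM sample.

    Returns sample_width * 8 if every sample is zero (pure digital silence).
    """
    total_bits = sample_width * 8
    combined = 0
    # OR all samples together; a bit stays 0 only if it is 0 in every sample.
    for i in range(0, len(pcm_bytes) - sample_width + 1, sample_width):
        combined |= int.from_bytes(pcm_bytes[i:i + sample_width], "little")
    if combined == 0:
        return total_bits
    # Index of the lowest set bit across all samples.
    return (combined & -combined).bit_length() - 1


def wasted_bits_in_wav(path, chunk_frames=65536):
    """Scan a PCM WAV file chunk by chunk and report its wasted bits."""
    with wave.open(path, "rb") as wav:
        width = wav.getsampwidth()
        result = width * 8
        while True:
            frames = wav.readframes(chunk_frames)
            if not frames:
                break
            result = min(result, wasted_bits(frames, width))
        return result


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python wasted.py madout.wav
    print(wasted_bits_in_wav(sys.argv[1]))
```

A result of 8 for a 24-bit file would suggest zero-padded 16-bit audio, while 0 corresponds to the wasted_bits=0 lines in the transcripts above. It is pure Python, so expect it to be slow on full-length tracks.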
@brian: It looks like the current version of iTunes on OS X High Sierra decodes MP3s and AAC files to 24-bit if the system output supports it. Using the USB input and 100% mixer volume, my PS Audio DirectStream shows 24-bit input on the front panel when playing MP3/AAC files in iTunes, but 16-bit input when playing back 16-bit ALAC files (but it switches to 24-bit on the fly when the mixer starts mixing other sources into the output). I don’t think this was always the case, but I couldn’t tell you when it changed. Here’s a short video of the front panel as I switch between a 16-bit ALAC and an MP3 file.
@brian, did you get a chance to look into this? I was out of town last week and thought I’d check in on this request now that I’m back.
We’ll look at it next time we have fingers in the lossy codecs and if it’s straightforward to pull off it will happen. Not sure when that will be.
I thought I’d put this here, as it seems to be related… Roon seems to be playing MP3 at 24 bit. I noticed as a little relay in my Marantz DAC clicks when there’s a change of resolution, and I looked at the audio path to see this:
Is this Roon-generated, or has something changed on my system ( Windows 10 PC running Roon Core) to cause this? @brian ?
Yes, we delivered this feature request earlier in the week:
Thanks. I’ve made too many changes recently to my set-up to pin down if this has made a difference to SQ - and lossy files only make up a small part of my listening - but they sound great. You might expect that lifting the system to a higher “resolution” would show these lesser files up - but instead they seem to get the same lift as everything else. Remarkable.
I admit I’m a little ??? about this.
The original source for the lossy encoding may be 16 bit, and in any case I think the lossy formats are 16 bits.
So what do you do to create 24 bits? Interpolate in some way, or decode the lossy file in a way that creates 24 bit data?
Wouldn’t this only give me exquisite clarity of the compression artifacts?
(Pure curiosity, I don’t have any MP3 files.)
Lossy formats have no inherent bit depth. They are a compact set of instructions for reconstructing the audio waveform. The instructions are usually based on a frequency-domain view of the audio signal computed using a discrete cosine transform (the modified DCT, in the case of MP3 and AAC).
This is inherently a computation on real numbers, not integers, so inside of the decoder, you’ll most often see either 32bit floating point or a fixed point representation used. Typically this has about 24 significant bits of resolution.
So at that point, we could just play that 24bit(ish) signal, or we could truncate it to 16 bits, introduce some dither (noise) to remove truncation artifacts, and play that instead.
Why mess with it? Dither at the 16th bit is potentially audible. It is the right thing to do if you must truncate to a 16bit signal for compatibility purposes…but if the output device can take 24bits, it’s better to leave it untouched.
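As a small illustration of the "about 24 significant bits" point, the Python sketch below checks which integer sample values survive a trip through 32-bit floating point, and shows what a plain 16-bit conversion leaves out. The helper names are hypothetical, the conversion is an undithered round-to-nearest, and none of this is taken from Roon or from any specific decoder.

```python
import struct


def through_float32(value):
    """Round-trip a Python float through IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def survives_float32(int_sample, bits):
    """True if a signed `bits`-bit sample, scaled to [-1, 1), is exact in float32."""
    scale = 1 << (bits - 1)
    return through_float32(int_sample / scale) * scale == int_sample


def float32_to_pcm(sample, bits):
    """Scale a float sample in [-1.0, 1.0) to a signed integer (no dither)."""
    scale = 1 << (bits - 1)
    value = round(sample * scale)
    return max(-scale, min(scale - 1, value))


if __name__ == "__main__":
    # Every 24-bit value fits in float32's 24-bit significand...
    print(survives_float32(0x7FFFFF, 24), survives_float32(-0x800000, 24))
    # ...but a 32-bit value that needs more significant bits does not.
    print(survives_float32((1 << 30) + 1, 32))

    # An arbitrary decoder-style float sample, converted both ways.
    s = through_float32(0.1234567)
    pcm24 = float32_to_pcm(s, 24)
    pcm16_padded = float32_to_pcm(s, 16) << 8  # 16-bit result, zero-padded
    print(pcm24, pcm16_padded, pcm24 - pcm16_padded)
```

The difference printed on the last line is the detail that a 16-bit output path has to discard (and then cover with dither), and that a 24-bit path can pass through as is.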
You might find this article interesting. It discusses Mastered for iTunes and some of the technical practices that go along with that mark. Among other requirements, Mastered for iTunes tracks must be encoded to AAC from 24bit masters.