WAV = FLAC = ALAC = AIFF in terms of resulting PCM…they are all bit-identical once decoded. No side is (or should be) debating that. This is a non-issue for Roon endpoints, which only see PCM. Use whatever format you like on the Roon Core (FLAC preferable for metadata).
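If you want to check the bit-identical claim on your own files rather than take my word for it, here is a minimal sketch using only the Python standard library. It hashes the raw PCM frames of a WAV file, so two files with the same audio give the same digest regardless of header differences. The names `pcm_digest` and `make_wav` are just my own helpers for illustration, and the sketch assumes you have already decoded the FLAC/ALAC/AIFF copy to WAV with a tool of your choice (for example `flac -d`), since the standard library cannot read FLAC directly.

```python
import hashlib
import io
import struct
import wave


def pcm_digest(wav_file):
    """Return ((channels, sample_width, rate), sha256 hex) for the PCM in a WAV.

    `wav_file` may be a path or a binary file-like object.
    """
    with wave.open(wav_file, "rb") as w:
        fmt = (w.getnchannels(), w.getsampwidth(), w.getframerate())
        frames = w.readframes(w.getnframes())
    return fmt, hashlib.sha256(frames).hexdigest()


def make_wav(samples, rate=44100):
    """Demo helper: build a 16-bit mono WAV in memory from a list of ints."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    # Two independently written files holding the same samples should match.
    a = pcm_digest(make_wav([0, 1000, -1000, 32767]))
    b = pcm_digest(make_wav([0, 1000, -1000, 32767]))
    print("identical PCM:", a == b)
```

In practice you would call `pcm_digest()` on the original WAV rip and on the WAV decoded back from the FLAC, and compare the two results. Matching digests only tell you the data is the same; they say nothing about what the decoding hardware does while playing it.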
With devices where the audio device is doing the decoding itself, you can always compare for yourself. Whether you hear any difference or not, ALWAYS ALWAYS IMO rip or download your master library in FLAC. For metadata, FLAC, ALAC, and AIFF are better than WAV. Between ALAC, FLAC, and AIFF, I’ve come across some devices where certain fields such as “album name” or “album artist” were incorrectly rendered or missing with AIFF and ALAC. I’ve found FLAC metadata to be the most widely compatible and least troublesome. Opt for the highest compression level to save space, whether for the master library or for Roon Core and Roon endpoint schemes; FLAC is lossless at every level, so this only costs encoding time.
If you think WAV sounds better, simply derive a secondary library by batch transcoding the master FLAC library. Always keep a master FLAC library for Roon and/or backups. In systems where the decoding happens in the endpoint, use the WAV library.
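For anyone who wants to script the secondary library, here is a rough sketch of the batch transcode. It mirrors the folder tree of the master FLAC library into a separate WAV tree and leaves the master untouched. It assumes the reference `flac` command-line tool is installed and on your PATH; the function names and any paths you pass in are my own placeholders, not anything Roon or Bryston provides.

```python
import subprocess
from pathlib import Path


def plan_transcode(master_root, wav_root):
    """Map every .flac under master_root to a mirrored .wav path under wav_root."""
    master_root, wav_root = Path(master_root), Path(wav_root)
    jobs = []
    for src in sorted(master_root.rglob("*.flac")):
        dst = wav_root / src.relative_to(master_root).with_suffix(".wav")
        jobs.append((src, dst))
    return jobs


def transcode(jobs):
    """Decode each FLAC to WAV with the `flac` CLI (assumed to be on PATH)."""
    for src, dst in jobs:
        if dst.exists():
            continue  # already converted on a previous run
        dst.parent.mkdir(parents=True, exist_ok=True)
        # -d = decode, -s = silent, -o = output file name
        subprocess.run(["flac", "-d", "-s", "-o", str(dst), str(src)], check=True)
```

You would run something like `transcode(plan_transcode("/path/to/master_flac", "/path/to/wav_library"))`, with your own paths substituted. Keep in mind that most of the tags will not survive in the WAV copies, which is one more reason to keep the FLAC library as the master.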
As for evidence, I’ve read comments by Linn engineers and in Naim’s own manual saying that they have measured differences in noise floor between FLAC and WAV decoding. Each has a unique noise floor pattern. With FLAC, it’s more broadband and random, whereas with WAV it’s periodic with well-defined peaks (which is more noticeable or harmful?)…Linn recognizes this and uses extensive filtering to render it inaudible. Naim, on the other hand, simply recommends the use of WAV for best SQ.
Nonetheless, mechanisms and evidence do exist. The decoding of these two formats causes different patterns and levels of noise. Linn even had measurement graphs of this. The more interesting question is at what level these differences become inaudible. I’ve read Rob Watts speculate, based on his internal testing, that the threshold may be much lower than commonly accepted values, perhaps near -180 dB.
When I use the BDP-1 (which is over 10 years old and limited in CPU power…so who knows how the CPUs in recent streamers compare?) as a Roon endpoint, I can use FLAC or WAV on the Core and the sound is the same. However, when I use the BDP-1’s internal MPD decoding, the differences show up.
Not only are there differences between FLAC and WAV, but there are also differences depending on whether the files are sourced from a NAS (Ethernet) or an attached USB drive.
There are many users who do not hear any differences between formats and/or USB/NAS with their BDPs and various DACs in their systems. I totally understand, believe, and respect that.
Among the remaining users who DO hear some differences, what I find very interesting is that we usually hear similar differences and can describe the same thing in our systems involving BDPs. HOWEVER, we end up with different interpretations of which side of the dichotomy sounds better and/or is the more accurate rendition.
Similar discussions involve the Jitterbug. Lots of agreement on what happens to the sound with or without the Jitterbug in terms of objective attributes, but differences in interpretation of whether it sounds more like real life, more accurate, or more enjoyable.
If someone wants to suggest that it’s all in our heads, you have a 100% right to do so…and who knows, it really may even be the truth. However, if you stick for a second with the gut feeling of what we are hearing, I’m more interested in understanding how so many people can hear roughly the same sonic changes, yet come up with different interpretations. I wonder how much age (FR audiograms), internal ideologies of what better sound is (and other biases?), and reference to previous and current systems, i.e. experience, factor into whether you like A or B.
FWIW, between my BDP-1 and various gear, I tend to group FLAC, USB, and no Jitterbug together on one side and WAV, NAS, and Jitterbug on the other. Obviously, you can mix and match these for a number of combinations, but the dichotomy stands.
The FLAC, USB, and no Jitterbug sound signature is more energetic, more upfront, narrower, and more together. In quick blind testing, one can easily prefer this sound as it’s more noticeable and appears to be more detailed. However, in both my rigs, I tend to get more fatigued by it as the listening session goes on.
The WAV, NAS, and Jitterbug group gives a sonic signature that initially appears more diffuse and doesn’t jump out at you. In fact, it can sound less detailed and flat at first. However, as you listen more, rather than the sound jumping out at you, you can hear deeper into the sound. It sounds more relaxed. In long-term listening, it doesn’t bother me as much and I stop noticing the sonics and just kind of zone out. This is especially important when watching movies.
At this point, I’d love to see measurements that explain these differences in SQ and show which group yields the more accurate sound, so I can find out whether I prefer the more accurate or the less accurate sound. I’m always up for some ear training.
(I hear similar trends in balanced interconnects as well: Grimm TPR, Mogami 2549, Mogami 3173, Ghost cable. I prefer the more boring cables that give the depth and don’t appear overly detailed at first. Well shielded and twisted with good symmetry and geometry. Keep the noise floor as low as possible.)