This is my post on A/B testing from 3 weeks ago (a different thread). I never received an answer from any objectivist. Maybe I’ll get lucky today.
A/B testing has its uses, but rarely delivers a clear verdict.
1) practical problems
money and time – who has the financial resources and/or the time to set up an A/B test for lots of different gear?
Doing such a test at a dealer’s is hardly conclusive, as you will listen to equipment that is not yours, in a different listening space, under time constraints, … So a test that might yield useful results would have to be done in your own listening environment, with your own equipment. Here the money factor comes in: not every dealer will graciously lend you equipment to run your test.
2) major differences in SQ – I will grant you that a test done at a dealer’s may be conclusive if the differences in SQ are major and immediately obvious (they hit you right in the face or, better, the ears).
3) minor/subtle differences – this is the case that interests me most.
A/B testing for minor differences is rarely conclusive. It’s an artificial way of listening to music (and, as I’ve argued in a previous post, ruins the state of mind that is needed to be at all receptive to minor differences). And the differences may only reveal themselves over time, or with certain recordings rather than others.
4) noise and information overload
For an A/B test to have any scientific validity, it must be run multiple times (a series of 8-10, with a minimum of 10 samples each) and you must consistently score in the 80% range. (And I won’t even mention double-blind, for who can readily put together a test that conforms to rigorous scientific testing protocols?)
The very scientific exigencies of the test set-up, paradoxically, also make it highly unlikely that such a test will yield any meaningful results. Your brain will go into auditory shut-down while being assaulted by all those series and samples. You won’t be able to hear anything meaningful any more, let alone subtle differences in SQ.
It’s like uttering the same word multiple times in quick succession. That word will soon have lost all meaning (information will be replaced by noise).
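To put a rough number on the scoring threshold from point 4, here is a minimal sketch of the arithmetic. It assumes that each sample is an independent 50/50 guess when no difference is audible, which is my simplification and not a full test protocol. The function name chance_of_at_least is made up for illustration.

```python
from math import comb


def chance_of_at_least(correct: int, trials: int) -> float:
    """Probability of scoring `correct` or more out of `trials` by pure guessing.

    Assumes every trial is an independent coin flip (p = 0.5), i.e. the
    listener hears no difference at all and is simply guessing.
    """
    # Count the ways to get exactly k right, for every k from `correct` up to `trials`.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    # Each of the 2**trials answer sequences is equally likely under guessing.
    return favourable / 2 ** trials


if __name__ == "__main__":
    # One series of 10 samples, as in point 4.
    for correct in (7, 8, 9, 10):
        p = chance_of_at_least(correct, 10)
        print(f"{correct}/10 or better by guessing: {p:.4f}")
```

Under those assumptions, 8 or more out of 10 comes up by pure guessing about 5.5% of the time (56 out of 1024 possible answer sequences), and 9 or more about 1.1% (11 out of 1024). So one series of 10 at the 80% mark sits right around the usual 5% cut-off, which is presumably why the series has to be repeated.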
So what’s the conclusion I draw from all this?
A/B testing may work for major differences. For minor differences, well … see above.
Please, objectivists, have a go and address the arguments I’ve put forward. I’m here to learn (I insist: this is not meant sarcastically – I’m really here to learn).