ABX tests - required or rejected?

sbo6

VIP/Donor
May 18, 2014
1,660
594
480
Round Rock, TX
As I peruse social media sites and audio forums the science crowd often disputes any sonic benefits without evidence of ABX testing and rests on specifications as the bible for sonics. Thoughts?

My key findings over 30 years in this hobby are:

- While specs / measurements are important and often required (I worked in biomedical and defense system electronics), for audio they don't capture all sonic differences that can be heard.
- Not all systems have the ability to resolve at a high enough level to yield deltas / benefits of components, cables, tweaks.
- Not all people have the ability to resolve at a high enough level to hear deltas / benefits of components, cables, tweaks.
- Inherent bias often overshadows open-mindedness (e.g., "cables are cables", "snake oil!!", etc.)
- Many (most?) listening environments are far from acoustically optimal, often diminishing any sonic benefits.

All that said, who has experienced ABX tests? Results? I've done a few informal ones with interesting results. Thanks in advance for any comments. :)
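(For anyone curious how an informal ABX session holds up statistically, here is a quick sketch in plain Python, my own illustration rather than any official protocol. Under the null hypothesis each trial is a 50/50 guess, so the binomial tail gives the chance of scoring at least that well by luck.)

```python
from math import comb

def abx_p_value(correct, trials):
    """Chance of getting at least `correct` answers right out of
    `trials` ABX trials by pure 50/50 guessing (binomial tail)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# e.g. 9 right out of 12 trials:
print(round(abx_p_value(9, 12), 3))  # 0.073 - suggestive, not conclusive
```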
 

Empirical Audio

Industry Expert
Oct 12, 2017
1,169
207
150
Great Pacific Northwest
www.empiricalaudio.com
As I peruse social media sites and audio forums the science crowd often disputes any sonic benefits without evidence of ABX testing and rests on specifications as the bible for sonics. Thoughts?

My key findings over 30 years in this hobby are:

- While specs / measurements are important and often required (I worked in biomedical and defense system electronics), for audio they don't capture all sonic differences that can be heard.
- Not all systems have the ability to resolve at a high enough level to yield deltas / benefits of components, cables, tweaks.
- Not all people have the ability to resolve at a high enough level to hear deltas / benefits of components, cables, tweaks.
- Inherent bias often overshadows open-mindedness (e.g., "cables are cables", "snake oil!!", etc.)
- Many (most?) listening environments are far from acoustically optimal, often diminishing any sonic benefits.

All that said, who has experienced ABX tests? Results? I've done a few informal ones with interesting results. Thanks in advance for any comments. :)

I have experienced ABX. I agree with everything you have said, and would add one more factor: the quality and diversity of the tracks being used. Very important.

ABX doesn't make sense in most cases, particularly in academic experiments, because so many of the things on your list are not optimized. Historically, these experiments have concluded things like "humans cannot hear any jitter below 1 nsec." Right...

As for measurements, the audio industry and academia have adopted woefully inadequate test signals and measurement techniques. You would think they would get more recent phenomena like jitter right, but no: they are still quoting single jitter numbers. They should be making direct jitter measurements and looking at the distributions, not just spectral information. I have found that the real-time measurements correlate better with sound quality.

The measurement that has been lacking for decades is a good transient-response/dynamics measurement. The current measurements are much too simplistic to characterize the dynamic performance of components, and this is the primary thing that sets high-performance components apart from mid-fi gear. Anyone can achieve low distortion and good S/N ratios with a continuous signal. It's not enough.

If better measurements and techniques were adopted, it would be much easier for consumers to make buying decisions and to understand differences in quality, and hopefully the corresponding differences in price.
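(To illustrate the distribution-versus-single-number point with a toy example of my own, not Empirical Audio's actual method: given timestamps of measured clock edges, the per-period jitter distribution, with its tails and peak-to-peak spread, tells you far more than one RMS figure.)

```python
import random
import statistics

def jitter_stats(edge_times, nominal_period):
    """Direct per-period jitter: deviation of each measured period
    from nominal. Report the distribution, not just a single number."""
    periods = [b - a for a, b in zip(edge_times, edge_times[1:])]
    jitter = [p - nominal_period for p in periods]
    j = sorted(jitter)
    pct = lambda q: j[int(q * (len(j) - 1))]
    return {
        "rms": statistics.pstdev(jitter),
        "pk_pk": max(jitter) - min(jitter),
        "p01": pct(0.01),
        "p99": pct(0.99),
    }

# Simulated 44.1 kHz word clock with ~200 ps of Gaussian edge jitter
# (hypothetical numbers, purely for illustration):
random.seed(0)
T = 1 / 44100
edges = [n * T + random.gauss(0, 200e-12) for n in range(10_000)]
print(jitter_stats(edges, T))
```

Two clocks can share the same RMS while one has much fatter tails; only the distribution view shows that.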

Steve N.
Empirical Audio
 

sbo6

VIP/Donor
May 18, 2014
1,660
594
480
Round Rock, TX
Agreed, Steve. Also, good point wrt tracks.

WRT power, I recall Shunyata (Caelin?) reported developing a tool to measure a power cable's instantaneous current draw, if I'm not mistaken. It would be interesting to see various cables measured, followed by listening.
 

JackD201

WBF Founding Member
Apr 20, 2010
12,308
1,425
1,820
Manila, Philippines
I think that in general it should be left to experts to run the simulations and, later, actual listening tests. It is too easy for laymen to screw up an ABX test and therefore come up with unreliable results. This is true for any ABX test, not just audio. Controls need to be very stringent.
 

Ron Resnick

Site Co-Owner, Administrator
Jan 24, 2015
16,017
13,347
2,665
Beverly Hills, CA
These are excellent questions sbo6. Peter A and I were struggling with these very questions earlier today!

For me the question is not A/B/X versus measurements (I am a subjectivist so I reject the measurements-only view); for me the question is A/B/X, or at least A/B, versus long-term listening.

Both methodologies have their partisans, and both methodologies have significant, obvious flaws and biases and issues. My personal view is that I think there are more issues with the long-term listening approach than with the A/B/X approach. My personal view is that if one cannot reliably detect a difference using the A/B/X approach or at least the A/B comparison, then either there is no difference or the difference is too small to matter.

See http://www.whatsbestforum.com/showthread.php?22972-Comparative-Listening-Tests
 

sbo6

VIP/Donor
May 18, 2014
1,660
594
480
Round Rock, TX
These are excellent questions sbo6. Peter A and I were struggling with these very questions earlier today!

For me the question is not A/B/X versus measurements (I am a subjectivist so I reject the measurements-only view); for me the question is A/B/X, or at least A/B, versus long-term listening.

Both methodologies have their partisans, and both methodologies have significant, obvious flaws and biases and issues. My personal view is that I think there are more issues with the long-term listening approach than with the A/B/X approach. My personal view is that if one cannot reliably detect a difference using the A/B/X approach or at least the A/B comparison, then either there is no difference or the difference is too small to matter.

Ron, I would tend to agree. There are those who believe long-term listening is required to fully comprehend any changes. While I do agree that long-term listening, especially of familiar tracks, can double-confirm changes (if any), I have always experienced changes immediately. Maybe it's my amateur musician's ear, but that's my experience. This also concurs with the belief that humans have very short-term auditory memory.
 

sbo6

VIP/Donor
May 18, 2014
1,660
594
480
Round Rock, TX
I think that in general it should be left to experts to run the simulations and, later, actual listening tests. It is too easy for laymen to screw up an ABX test and therefore come up with unreliable results. This is true for any ABX test, not just audio. Controls need to be very stringent.

Thanks JackD201. I don't disagree that complexities complicate the accuracy and authenticity of ABX testing. Unfortunately that keeps the science-versus-art camps forever segregated. There's got to be some amicable middle ground; I'd love to find it... :)
 

JackD201

WBF Founding Member
Apr 20, 2010
12,308
1,425
1,820
Manila, Philippines
Thanks JackD201. I don't disagree that complexities complicate the accuracy and authenticity of ABX testing. Unfortunately that keeps the science-versus-art camps forever segregated. There's got to be some amicable middle ground; I'd love to find it... :)

I believe many manufacturers ABX during the prototyping stage. I've been told by most, if not all, of those I've spoken to that they do at least ABA. They say it is extremely useful, especially when designing to the lower price points.

Personally I think it is fine to try it at home. It can be fun. I just think that, like anything that deals with probabilities, we should all be careful not to overstate certainties, something scientists are careful to avoid in their published work. They always leave room for the possibility of outliers or methodological lapses by clearly stating qualifiers in their conclusions. :)
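(Jack's point about not overstating certainties can be made concrete. A rough sketch, my own and not from any standard: for a given number of trials, how many correct answers does it take before chance guessing becomes an implausible explanation at the usual 5% level?)

```python
from math import comb

def min_correct(trials, alpha=0.05):
    """Smallest score out of `trials` whose probability under pure
    50/50 guessing (binomial tail) falls below `alpha`."""
    for k in range(trials + 1):
        tail = sum(comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
        if tail < alpha:
            return k

for n in (10, 16, 20):
    print(n, min_correct(n))  # 10 -> 9, 16 -> 12, 20 -> 15
```

Notice how high the bar sits: even 8 correct out of 10 is still consistent with guessing (tail probability about 0.055).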
 

the sound of Tao

Well-Known Member
Jul 18, 2014
3,620
4,838
940
I feel that objective assessment is useful to navigate us to the right postcode, but long-term subjective assessment can actually get us to a more exact destination. What jumps out as better in the short term can actually just be the more obvious elements of the sound. Long-term musical satisfaction is not necessarily immediately obvious.

At a time when a good mate had both Verity Parsifals and his Tune Animas, going from the Animas to the Veritys all the strengths of the Parsifals jumped out... ah yes, that confidence and control, and a different spectrum of sonic strengths on offer. It wasn't till you played them for quite some time that you realised, for all their sonic prowess, that they just didn't do music the same way and that something less tangible was missing.

It's way too easy to get caught up in the impressive and obvious parts when doing ABX and not then also assess the musical whole, which is easier to appreciate when you aren't focussing and analysing. Good whole-picture assessment isn't just about the obvious, I figure.
 

bonzo75

Member Sponsor
Feb 26, 2014
22,443
13,473
2,710
London
Short-term or long-term listening, I have always preferred, and will continue to prefer, the Anima to the Verity. Btw, I owned the Verity Leonore, which is much better than the older Parsifal Ovation but not the newer Parsifal Anniversary, so I'm not sure which Parsifal you are referring to.

Now, there is a situation where the Verity can be made to sound better than the Anima. If you take the bigger Veritys to a room like Mike's or Marty's and carefully cross them over to JL Gothams like Marty has, it could. In such rooms the linearity helps, and you can see more of the recording, while the Anima will continue to sound like the Anima. The Anima has a certain strong signature to its sound which would be a constraint in taking it further in a system like Mike's. But in other rooms it should trounce the Verity imo.
 

jeromelang

Well-Known Member
Dec 26, 2011
433
66
935
Any repeated testing that involves using vinyl as source is....
 

Gregadd

WBF Founding Member
Apr 20, 2010
10,517
1,774
1,850
Metro DC
Well, this site has certainly been the location of flame wars regarding ABX testing. Now that the dust has settled, I think this.
The necessity to account for personal bias and expectation bias certainly is required for achieving reliable and repeatable results in detecting small differences in audio components.
However, the ABX challenge is not a valid scientific endeavor. In order to be a valid scientific test you must follow the true protocol. Arguments such as "it is too cumbersome" or "proof is trivial" don't cut it.
The test must be designed and conducted by a neutral tester. The testee must be someone with no stake in the outcome.
ABX is but one form of blind testing. It can be a useful tool in a properly designed scientific test, but it is not an end unto itself.
 

microstrip

VIP/Donor
May 30, 2010
20,806
4,698
2,790
Portugal
I think that in general it should be left to experts to run the simulations and, later, actual listening tests. It is too easy for laymen to screw up an ABX test and therefore come up with unreliable results. This is true for any ABX test, not just audio. Controls need to be very stringent.

Exactly. It is all written in the well-known, publicly and easily accessible document: Recommendation ITU-R BS.1116-1, "Methods for the Subjective Assessment of Small Impairments in Audio Systems Including Multichannel Sound Systems".

It explains why the informal tests often described in audio groups and forums are flawed, notably by the absence of positive controls.
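(For readers who don't want to wade through the Recommendation, the positive-control idea is simple enough to sketch in a few lines. This is my own toy illustration of the concept, not BS.1116's actual procedure: seed the session with trials where X is a known-audible difference, so a listener or setup that misses those is flagged as insensitive.)

```python
import random

def make_session(n_trials, n_controls, seed=None):
    """Randomized ABX session plan. Most trials compare the two devices
    under test; `n_controls` are positive controls where X is a
    known-audible alteration (e.g. a 1 dB level-shifted copy)."""
    rng = random.Random(seed)
    kinds = ["test"] * (n_trials - n_controls) + ["control"] * n_controls
    rng.shuffle(kinds)
    # X is secretly assigned to A or B on every trial.
    return [{"kind": k, "x_is": rng.choice("AB")} for k in kinds]

plan = make_session(16, 4, seed=1)
print(sum(t["kind"] == "control" for t in plan))  # prints 4
```

If the listener fails the controls, the "no difference" result on the real trials says nothing about the devices under test.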
 

microstrip

VIP/Donor
May 30, 2010
20,806
4,698
2,790
Portugal
(...) My personal view is that if one cannot reliably detect a difference using the A/B/X approach or at least the A/B comparison, then either there is no difference or the difference is too small to matter. (...)

Did you ever carry out a proper A/B/X test?
 

Gregadd

WBF Founding Member
Apr 20, 2010
10,517
1,774
1,850
Metro DC
Agreed.
 

PeterA

Well-Known Member
Dec 6, 2011
12,522
10,688
3,515
USA
Well, this site has certainly been the location of flame wars regarding ABX testing. Now that the dust has settled, I think this.
The necessity to account for personal bias and expectation bias certainly is required for achieving reliable and repeatable results in detecting small differences in audio components.
However, the ABX challenge is not a valid scientific endeavor. In order to be a valid scientific test you must follow the true protocol. Arguments such as "it is too cumbersome" or "proof is trivial" don't cut it.
The test must be designed and conducted by a neutral tester. The testee must be someone with no stake in the outcome.
ABX is but one form of blind testing. It can be a useful tool in a properly designed scientific test, but it is not an end unto itself.

This all sounds correct, but how to reconcile the part in bold with the guy trying to determine which amplifier sounds best to him before he makes a purchase decision? Does he just ask the testee who has nothing at stake to make the decision for him?

Also, if this protocol is followed, how could a manufacturer do proper testing of prototypes unless it brought in a panel of listeners with no stake? I guess Harman did that, but I can't imagine a company like Pass Labs or Magico doing it without the designers also taking part in the listening.

Gregadd, are you aware of any audiophiles or manufacturers who follow the protocol that you describe for assessing differences between components?
 

Ron Resnick

Site Co-Owner, Administrator
Jan 24, 2015
16,017
13,347
2,665
Beverly Hills, CA
I think that formal scientific protocol is not remotely plausible for our hobbyist A/B comparison testing. We are stuck just trying to be as honest with ourselves and our comparisons as is reasonably practicable.

So the answer is not that A/B comparisons are worthless because they don't satisfy formal scientific protocol. The question is whether our compromised A/B comparisons are better than (in my opinion) even more compromised and flawed long-term listening "tests" with no direct, contemporaneous comparisons.
 

microstrip

VIP/Donor
May 30, 2010
20,806
4,698
2,790
Portugal
I think that formal scientific protocol is not remotely plausible for our hobbyist A/B comparison testing. We are stuck just trying to be as honest with ourselves and our comparisons as is reasonably practicable.

So the answer is not that A/B comparisons are worthless because they don't satisfy formal scientific protocol. The question is whether our compromised A/B comparisons are better than (in my opinion) even more compromised and flawed long-term listening "tests" with no direct, contemporaneous comparisons.

You are mixing A/B and A/B/X (the OP subject), and even now addressing compromised A/B comparisons. Are you mainly addressing the short term / long term listening subject?
 

Gregadd

WBF Founding Member
Apr 20, 2010
10,517
1,774
1,850
Metro DC
This all sounds correct, but how to reconcile the part in bold with the guy trying to determine which amplifier sounds best to him before he makes a purchase decision? Does he just ask the testee who has nothing at stake to make the decision for him?

Also, if this protocol is followed, how could a manufacturer do proper testing of prototypes unless it brought in a panel of listeners with no stake? I guess Harman did that, but I can't imagine a company like Pass Labs or Magico doing it without the designers also taking part in the listening.

Gregadd, are you aware of any audiophiles or manufacturers who follow the protocol that you describe for assessing differences between components?
Thank you for your insightful comments.
The fact that the protocol is not followed is the point. "It is what it is." The results you get are only as good as the methodology used. I definitely believe in subjective evaluation. I use long-term listening, as close to my home listening situation as is practical, when evaluating differences.
I hope that answers your question.
 

sbo6

VIP/Donor
May 18, 2014
1,660
594
480
Round Rock, TX
I think that formal scientific protocol is not remotely plausible for our hobbyist A/B comparison testing. We are stuck just trying to be as honest with ourselves and our comparisons as is reasonably practicable.

So the answer is not that A/B comparisons are worthless because they don't satisfy formal scientific protocol. The question is whether our compromised A/B comparisons are better than (in my opinion) even more compromised and flawed long-term listening "tests" with no direct, contemporaneous comparisons.

Getting toward middle ground, exactly, thanks Ron.

My point is this: why do we as audiophiles and avid music lovers have to conform to ABX or AB testing? To convince whom?
Wouldn't it be great if there was a WBF consensus (plus possibly other consumer audio groups) to follow a derivative of an AB test that could be used to help reduce bias and provide prospective buyers / users more useful data? It won't conform to Rec. ITU-R BS.1116-1 and maybe it doesn't have to.
 
