Do blind tests really prove small differences don't exist?

Status
Not open for further replies.
Perhaps you have not read my prior posting. When the differences are small but perhaps still important, it is all too easy to organize a blind test as you describe: you get a result where the listener can only identify the DUT 5 or 6 times out of 8, which is not statistically significant; and you declare that there is no difference. But in all honesty, you have proved no such thing. You merely played a meaningless game, even if the null result reinforces what you already believe.
I keep reading this observation from those who insist on invalidating the scientific method. It is truly regrettable since it does not further the discussion.

First is this notion that there IS a difference that is small but important. This is called bootstrapping. How do we know the difference is indeed small but important? How do we know there is any difference at all? Isn't this rather presumptuous? It may be true that there is a difference, but it also may be true there is not. It may be true that the difference, if any, is important; then again, it may not. So this all begs the question: how do we go about answering these questions? And for a question which begs and screams for intellectual honesty, why is sighted testing more reliable in this context?

Second is this incessant MIScharacterization of what the null result means. No one who believes in the validity of blind testing and understands blind testing claims that the failure to pass = no difference. A blind test doesn't *prove* anything. So in the pursuit of intellectual honesty, particularly in a discussion forum such as ours here at WBF, it sure would be nice for those who do not subscribe to the validity of blind testing to stop this mischaracterization.
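For anyone who wants to check the "5 or 6 times out of 8" figure quoted above, here is a minimal sketch of the arithmetic. It assumes a forced-choice test in which a pure guesser is right half the time, and it uses a one-sided binomial tail; the function name `p_value_at_least` is made up for illustration and does not come from any ABX tool.

```python
from math import comb

def p_value_at_least(correct: int, trials: int, chance: float = 0.5) -> float:
    """Probability of getting at least `correct` right out of `trials` by guessing.

    Assumption: each trial is independent and a guesser is right with
    probability `chance` (0.5 for a two-alternative forced choice).
    """
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )

if __name__ == "__main__":
    # Scores of 5, 6 and 7 correct in an 8-trial session.
    for hits in (5, 6, 7):
        print(f"{hits}/8 correct: p = {p_value_at_least(hits, 8):.4f}")
```

Under those assumptions, 5/8 and 6/8 are both well above the usual 0.05 cut-off, and only 7/8 or better falls below it. That says the score is consistent with guessing; by itself it says nothing about whether a small difference exists.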
 
No one who believes in the validity of blind testing and understands blind testing claims that the failure to pass = no difference. A blind test doesn't *prove* anything. So in the pursuit of intellectual honesty, particularly in a discussion forum such as ours here at WBF, it sure would be nice for those who do not subscribe to the validity of blind testing to stop this mischaracterization.

I am sorry to disappoint you, Ron, but this is hardly a "mischaracterization." Read Hydrogen Audio, even read Phelonious Ponk's postings where the inherent background to his statements is that there is no difference, even if he doesn't come out and say it. Read the editorial leader in the January 1987 issue of Stereo Review where it was widely proclaimed that the magazine's test results "proved" there was no difference between amplifiers operated below clipping. Read Tom Nousaine in The Audio Critic.

Yes, someone who understands the Scientific Method, you and me, for example, would not make such a claim. But these people are not scientists; instead they are true believers in "scientism" with an axe to grind about those of us who are professionally involved in audio.

John Atkinson
Editor, Stereophile
 
Read Hydrogen Audio, even read Phelonious Ponk's postings where the inherent background to his statements is that there is no difference, even if he doesn't come out and say it.

The "inherent background" to my statements is that there is no difference between what? You paint with a broad brush, and on the inside of my head, no less. Do you always resort to personal attacks when you run out of substantive arguments, John?

Tim
 
I am sorry to disappoint you, Ron, but this is hardly a "mischaracterization." Read Hydrogen Audio, even read Phelonious Ponk's postings where the inherent background to his statements is that there is no difference, even if he doesn't come out and say it. Read the editorial leader in the January 1987 issue of Stereo Review where it was widely proclaimed that the magazine's test results "proved" there was no difference between amplifiers operated below clipping. Read Tom Nousaine in The Audio Critic.

Yes, someone who understands the Scientific Method, you and me, for example, would not make such a claim. But these people are not scientists; instead they are true believers in "scientism" with an axe to grind about those of us who are professionally involved in audio.

John Atkinson
Editor, Stereophile
I agree that there are (plenty of) those who believe in the validity of blind testing who mischaracterize the results of blind testing; that is why I prefaced my statement by saying those who *understand* blind testing do not equate a null with no difference.

Having stated this, did you not equate a null result with a "meaningless game"? Exactly how does that further the discussion? Exactly how is that not a mischaracterization? [Sorry for the double negative.] Moreover, the questions from my last post remain unanswered.
 
Play nice, guys. This is an interesting debate and obviously tempers can flare.

Let's all challenge the post and not the poster

The post is what I challenged, Steve. Thank you.

Tim
 
I keep reading this observation from those who insist on invalidating the scientific method. It is truly regrettable since it does not further the discussion.

First is this notion that there IS a difference that is small but important. This is called bootstrapping. How do we know the difference is indeed small but important? How do we know there is any difference at all? Isn't this rather presumptuous? It may be true that there is a difference, but it also may be true there is not. It may be true that the difference, if any, is important; then again, it may not. So this all begs the question: how do we go about answering these questions? And for a question which begs and screams for intellectual honesty, why is sighted testing more reliable in this context?

Second is this incessant MIScharacterization of what the null result means. No one who believes in the validity of blind testing and understands blind testing claims that the failure to pass = no difference. A blind test doesn't *prove* anything. So in the pursuit of intellectual honesty, particularly in a discussion forum such as ours here at WBF, it sure would be nice for those who do not subscribe to the validity of blind testing to stop this mischaracterization.

I do not think any of us is insisting on invalidating blind testing, but one has to decide on the focus and limitations of any test, which leads on to what we actually get from certain tests.
Your point about the notion of a small difference is spot on, but then we have to ask: what blind test process would be ideal to identify and prove whether small differences exist?
In your opinion, Ron, which blind test process would help validate this while also managing the other variables that might skew results?

On a blind test not proving anything and the mischaracterisation of null results: some people do not read too much into certain blind results, but others do. They take AB/X as proof that a failure to pass = no difference (in cases where absolute matching is the test), without using other processes to validate the result by controlling bias and behavioural heuristics in a different way.
This can be seen with many posters on several other forums, where AB/X has become the de facto basis for discussion, often without awareness of other studies that rely on statistical significance using subjective scales, multiple test processes, or matching done with a different approach.
As I said, no one can prove whether AB/X suffers from certain factors, and anything would be speculation, but many use it as proof that subtle differences do not exist.
For me, it would be ideal if a further, slightly different test were done with the same listeners, or even if the cognitive behaviour and heuristic pattern (if there is one) of each participant were captured.
But it is fair to say that many other valid blind tests do capture the bias and heuristic pattern of a participant, even in DBT.
This does not invalidate such tests but adds to them. Examples are the Harman 4 speaker test with multiple positions, or the study I mentioned where they captured AB order bias and showed that it had a slight effect on the participant but could be weighted to still provide meaningful data.

Cheers
Orb
 
When you say units, is that inclusive of or exclusive of speakers?

As long as we restrict ourselves to electronics, speaker wires and interconnects, yes.

As soon as you start changing speakers, all bets are off.

Ok, thanks for being direct. As you guessed I was excluding the speakers from the swap.

Can I ask you to nominate a few high quality classical recordings available at Amazon, including some symphonic music, that you feel are adequate to be used in this imaginary test?
 
References in this and other threads have been made to hydrogenaudio. Here is the hydrogenaudio thread entitled "What is a blind ABX test?" Now while the subject of this thread that Amir started was far broader than ABX testing, I think it would be interesting at this point for our members to read the hydrogenaudio definition, so to speak, of a blind ABX test. Of particular significance to the recent discussion in our thread is the following:

Rule 1 : It is impossible to prove that something doesn't exists.

Putting aside the grammar issue, it should put this issue to bed, to-wit: failure to pass a blind test does not equate to no difference.
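To put a rough number on why a failed test does not equate to no difference, here is a small sketch of statistical power for a short session. Everything in it is an assumption made for illustration: the 70% true hit rate stands in for a hypothetical listener who really does hear a small difference, the 0.05 threshold is just the conventional one, and the function names are invented here.

```python
from math import comb

def tail(k_min: int, n: int, p: float) -> float:
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

def min_hits_for_significance(n: int, alpha: float = 0.05, chance: float = 0.5):
    """Smallest score whose guessing probability is <= alpha, or None if no score qualifies."""
    for k in range(n + 1):
        if tail(k, n, chance) <= alpha:
            return k
    return None

def power(n: int, true_rate: float, alpha: float = 0.05) -> float:
    """Chance that a listener with hit rate `true_rate` reaches a significant score."""
    k = min_hits_for_significance(n, alpha)
    return tail(k, n, true_rate) if k is not None else 0.0

if __name__ == "__main__":
    # Hypothetical listener who is right 70% of the time, at two session lengths.
    for trials in (8, 16):
        print(trials, min_hits_for_significance(trials), round(power(trials, 0.7), 3))
```

Under these assumptions, a listener who genuinely hears the difference 70% of the time would still fail an 8-trial test roughly three times out of four. Longer sessions raise the odds of a pass, which is the practical point of Rule 1: a short null result mostly reflects the length of the test.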
 
Just for the sake of balance I think it needs to be said that in the audiophile community, the ranks of those who read too much into blind testing are eclipsed by the ranks of those who read far too little.

Tim
 
The mere fact that it is being recognized that there are people that read too much AND too little, to me is a very good start.
 
The mere fact that it is being recognized that there are people that read too much AND too little, to me is a very good start.

I'll go along with that.

Tim
 
You Sir, are a gentleman. :)
 
The mere fact that it is being recognized that there are people that read too much AND too little, to me is a very good start.

Yes, people are applauding the fact that after only 35 years of the use of DBTs in audio, people are happy that they are making a good start with them. Seems like a rather sad commentary on human ability to absorb change.
 
Keep on pushin' Arny.
 
References in this and other threads have been made to hydrogenaudio. Here is the hydrogenaudio thread entitled "What is a blind ABX test?" Now while the subject of this thread that Amir started was far broader than ABX testing, I think it would be interesting at this point for our members to read the hydrogenaudio definition, so to speak, of a blind ABX test. Of particular significance to the recent discussion in our thread is the following:

Rule 1 : It is impossible to prove that something doesn't exist.

It should put this issue to bed, to-wit: failure to pass a blind test does not equate to no difference.

Right, it basically says that the title of this thread is thought to be a straw man argument by well-informed DBT advocates. The question "Do blind tests really prove small differences don't exist?" always has the answer "No" because proof that something does not exist is difficult or impossible. It's the opposite of a truism, which I guess makes it a falsism. ;-)

Another way of criticizing the title of this thread is the well-known adage:

"The absence of proof is not proof of absence".

That all said, it is curious that so many earth-shaking *improvements* in audio completely disappear when you start examining them based on just what they do to sound. You hear a strong difference, and by simply cleaning up your test, you make it go away.

Many read stereo magazines and online conferences, knowing or at least suspecting that a vast percentage of what people talk about and desire simply goes away when you match levels, synchronize the music, and try to address sighted bias.
 
Here is what I think about this whole drawn-out thread: You can’t beat people over the head with DBTs and demand they “prove” any audiophile claims they make for how their gear sounds (well, you can; it just won’t get you anywhere). People believe what they want to believe even if there doesn’t appear to be anything rational about their thought process. It’s no different than politics or religion. You just end up with a bunch of people talking past each other with everyone smug in their own beliefs. After a very short while, it all just sounds like noise to me and I tune it out.

I have had some of my long-standing beliefs turned on their head lately and I didn’t need a DBT to make it happen. All I needed was to trust my ears to tell me the truth even if the truth was hard to swallow for several reasons. My eyes aren’t smart enough to convince my ears that they can’t hear the truth. You just have to be honest with yourself, don’t try and have the fastest ears in the West, and listen to components in your system over a period of time. Take notes. Swap the components in and out a few more times. Take notes. After a while, you may come to the conclusion that your lying eyes were leading you astray.
 
Right, it basically says that the title of this thread is thought to be a straw man argument by well-informed DBT advocates. The question "Do blind tests really prove small differences don't exist?" always has the answer "No" because proof that something does not exist is difficult or impossible. It's the opposite of a truism, which I guess makes it a falsism. ;-)

Another way of criticizing the title of this thread is the well-known adage:

"The absence of proof is not proof of absence".
Your post and Ron's are confusing to me. If someone said the following:

"Modern gear can be so close to being perfect that two devices with excellent but vastly different technical performance will still sound the same.
[...]
The scientific facts tell a completely different story than the post I'm replying to. Once audio performance reaches a certain level, there can still be differences but nobody anywhere will ever hear the differences."


Would you say these statements are based on blind studies or something else?

Or would you say none of that is really proven?
 
Your post and Ron's are confusing to me. If someone said the following:

"Modern gear can be so close to being perfect that two devices with excellent but vastly different technical performance will still sound the same.
[...]
The scientific facts tell a completely different story than the post I'm replying to. Once audio performance reaches a certain level, there can still be differences but nobody anywhere will ever hear the differences."

That is me talking over at AVS, and they are basically the same statement twice. How can that possibly be confusing?

Would you say these statements are based on blind studies or something else?

There are tons of DBTs that support those statements, but so does the rest of science.

Unless you believe that what is known about thresholds of hearing and masking is completely bogus, and that audible differences among good electronic equipment still abound, then the above should be like motherhood and apple pie. It's completely believable and not confusing at all!

Or would you say none of that is really proven?

What does "really proven" mean as you ask it?

Philosophically, absence of proof, which is all that we have, is not the proof of absence.

OTOH, it is very common to say as a practical matter that this or that is proven.

For example, expert studies of thermodynamics and solid state physics prove that LED light bulbs will never exceed 40% efficiency. But is it really proven? Might there not be some quantum leap type finding about either thermodynamics or solid state physics down the road that changes everything?

Of course everything can change in a heartbeat. But how much will you bet on that happening tomorrow, next week or in your lifetime? I'm sitting here trying to second-guess the improving price/performance of LED luminaires to determine whether to recommend significant costly improvements in electrical infrastructure to support the use of incandescent lighting. If there are going to be $20.00 8,000 lumen narrow spot PAR56 bulbs on the market in volume by the end of the year, not so much!
 
Unless you believe (...) that audible differences among good electronic equipment still abound (...)

Yes, I believe they exist. And one of the reasons I am interested in WBF debates is learning about what causes them. I already know about expectation bias, but I am also trying to learn about the other possibilities.

BTW, regarding the DBTs, rather than citing tons of them, could you point to one or two specific ones you consider highly relevant?
 