The Misinformed Misleading the Uninformed -- A Bit About Blind Listening Tests

rsbeck · Jun 1, 2010

JackD201 said:
Just for kicks.

What if just one person out of 10,000 respondents gets it perfectly all the time?

That'd be something like the ability to hear higher frequencies. 14 year old girls tend to score best. So, we know it is possible, but in many cases not very probable -- except on the internet where anything is possible.
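To put a rough number on the "1 in 10,000" question: a minimal sketch, assuming every respondent is purely guessing on a two-choice test. The trial counts (10 and 20) and the function names are illustrative assumptions, not anything from the thread.

```python
def p_perfect_by_guessing(trials: int, p_guess: float = 0.5) -> float:
    """Chance that one pure guesser gets every trial right."""
    return p_guess ** trials


def p_at_least_one_perfect(respondents: int, trials: int, p_guess: float = 0.5) -> float:
    """Chance that at least one of `respondents` guessers scores perfectly."""
    p_single = p_perfect_by_guessing(trials, p_guess)
    return 1.0 - (1.0 - p_single) ** respondents


if __name__ == "__main__":
    for trials in (10, 20):
        chance = p_at_least_one_perfect(10_000, trials)
        print(f"{trials} trials each: P(at least one lucky perfect score) = {chance:.4f}")
```

With short tests, one perfect scorer in 10,000 is close to guaranteed by luck alone; only with longer tests (or a retest of that one person) does a perfect score start to mean something.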

JackD201 · Jun 1, 2010

Then it really is a matter of perspective.

If that 1 in 10,000 was a respondent picking out the taste of a pinch of saffron in a soup, he would not be significant to the product testers. For someone out there out to prove an unmeasurable thing exists, he becomes totally significant.

rsbeck · Jun 1, 2010

JackD201 said:
If that 1 in 10,000 was a respondent picking out the taste of a pinch of saffron in a soup, he would not be significant to the product testers.

The issue with the saffron would be -- how would you prove it is detectable?

Only one way to prove it -- DBT.

rsbeck · Jun 1, 2010

Or, if you want perspective, you just say, "I believe something that isn't provable."

Personally, I see nothing wrong with that.

Drop the need for proof and audiophile life becomes a lot easier.

I can't prove most of what I believe about my system and it doesn't bother me in the least.

I would not volunteer for a DBT in most cases because I am fairly certain I would fail.

In most cases it also is just not practical.

Haven't lost a moment of sleep over it.

ggendel · Jun 2, 2010

rsbeck said:
Or, if you want perspective, you just say, "I believe something that isn't provable."

Personally, I see nothing wrong with that.

Drop the need for proof and audiophile life becomes a lot easier.

I can't prove most of what I believe about my system and it doesn't bother me in the least.

I would not volunteer for a DBT in most cases because I am fairly certain I would fail.

In most cases it also is just not practical.

Haven't lost a moment of sleep over it.

My take is that it is very valuable data. Many years ago I took part in a DBT comparing wildly different systems. Interestingly, the bottom line was that a particular B&O setup sounded very close in quality to a very premium setup but cost 20x less. This put a quality system within reach of a college student. It also demonstrated that you can get 90% of the way there without breaking the bank. Of course, that was 35 years ago, so it's a whole new ballgame now.

Since then I have tried to build a speaker that replicates the B&O design principle of flat-phase fill speakers, with very minimal success, so there was something I was missing. I stayed with my time-aligned design for many years until I built my hybrid electrostatic-TL speakers, which opened up a new dimension.

audioguy · Jun 2, 2010

rsbeck said:
I can't prove most of what I believe about my system and it doesn't bother me in the least.

I would not volunteer for a DBT in most cases because I am fairly certain I would fail.

In most cases it also is just not practical.

Haven't lost a moment of sleep over it.

Amen!!

FrantzM · Jun 2, 2010

rsbeck said:
Or, if you want perspective, you just say, "I believe something that isn't provable."

Personally, I see nothing wrong with that.

Drop the need for proof and audiophile life becomes a lot easier.

I can't prove most of what I believe about my system and it doesn't bother me in the least.

I would not volunteer for a DBT in most cases because I am fairly certain I would fail.

In most cases it also is just not practical.

Haven't lost a moment of sleep over it.

Amen (squared)

JackD201 · Jun 2, 2010

rsbeck said:
The issue with the saffron would be -- how would you prove it is detectable?

Only one way to prove it -- DBT.

Well there ya go. Amen (cubed)

It is never about proving there is a difference. Those administering the test know for a fact that there are physical differences anyhow.

Jeff Fritz · Jun 7, 2010

I participated in a blind test at Paradigm last year. There were two speaker setups behind the curtain, one a Signature S8 (about $6000/pair), and the other a pair of their much less expensive minimonitors and a sub (probably under a grand for the whole thing). I was given a score card to rate the various sound characteristics and a remote to change back and forth between them. I scored the Sigs higher across the board. There really wasn't anything tough about it, and I came away impressed at blind testing as one design tool. I'd imagine anyone here would have had the same results. To me it's one tool to use, particularly for designers, nothing more or less.

FrantzM · Jun 7, 2010

Jeff Fritz said:
I participated in a blind test at Paradigm last year. There were two speaker setups behind the curtain, one a Signature S8 (about $6000/pair), and the other a pair of their much less expensive minimonitors and a sub (probably under a grand for the whole thing). I was given a score card to rate the various sound characteristics and a remote to change back and forth between them. I scored the Sigs higher across the board. There really wasn't anything tough about it, and I came away impressed at blind testing as one design tool. I'd imagine anyone here would have had the same results. To me it's one tool to use, particularly for designers, nothing more or less.

I am running out of Amen (s) ! Well .. There ! Amen ^3 !

Frantz

Gregadd · Jun 7, 2010

I am running out of Amen (s) ! Well .. There ! Amen ^3 !

When you preach to the choir they have to say Amen.

FrantzM · Jun 8, 2010

Gregadd said:
<snip>

When you preach to the choir they have to say Amen.

Greg

It doesn't have to be an "either/or". To build a good system you must listen carefully and measure, even if it is only the distance from your speakers to the wall. Measure you must, and you do, for your tube bias or your wall AC power (more important than most think). The designers measured, and they measured some more, to provide you with the gear you are enjoying, sighted or blinded. The dichotomy is artificial... Now comes blind testing. It has its use when used properly; I think it has gotten its bad name or reputation because of its thoroughly humbling, sometimes humiliating, effects. One can bend around as much as one wants, but there is no doubt that BT does reveal some falsities in our audio. Senses can be fooled. We can see what is not there as well as hear what is not. Remember mep's post on how some SS amp sounded good to him after a teeth problem...
What the designer, and maybe the hobbyist, the audiophile, wants is to replicate some results, and a good way to do that is to run the tests with the subjects kept ignorant of what is being played, aka blind. This way some, not all, of our prejudices or conditioning are removed, or their influence diminished. It remains difficult to remove them, and THAT has an effect on the final result: it is almost impossible to remove the emotional response that blind testing might elicit. So yeah, blind testing is not a silver bullet; it is flawed, but it is useful when applied properly and judiciously.
As a personal experience, I have scored very well on speakers, well on electronics (especially if I am familiar with the electronics), and very poorly on cables under the same conditions. (I am being charitable with myself here, allow me that... I have failed that part completely.) Thus my (new) position on cables and my respect for the blind testing methodology.
We have come to see things in a very polarized fashion: Objectivists are those who believe in DBT and measurements; Subjectivists are those who believe in their ears and their ears only. I am willing to propose that those distinctions are false. Most audiophiles are on a continuum: they believe in some measurements or testing and some listening. The proportions might change, but most audiophiles believe in and practice both measurements AND listening... It would only help to listen sometimes... blind.

Frantz

hifitommy · Jun 10, 2010

dbt is good for a manufacturer

it can help them make qualitative judgments with confidence. they can make decisions on what they find. or not. thats where the judgments come in.

it can be tiring and MUST be conducted properly with enough subjects to make it valid, which isnt always that easy.

long term listening is another tool that is probably more useful. used together is the most effective method for coming to a meaningful conclusion. and using experienced/trained listeners will yield the best results. one must know what real music sounds like both amplified and unamplified.

reviewing is a lot different from manufacturing though some of the skills need to be shared. finding reviewers you can trust and believe in takes quite a while to accomplish but once you do, it matters not whether they include measurements.

apanton1952 · Jun 11, 2010

There is room for both approaches. Agreed with Frantz. Probably the best thing is to use people who are actually blind for the blind testing, especially blind musicians. AND, make sure they are female. I have found that taking an available female along when choosing components, especially speakers, makes a positive impact and I haven't gone wrong yet. Perhaps, though, this reflects on my ear!

2 cases in point
1. 20+ years ago, upgrading and thinking some ARs with good specs would do the trick, I took my mother to listen. She liked them but, as a pianist herself, pointed out that the piano bits on Van Morrison's "Redwood Tree" were completely inaudible! Strange but true. The sales person was dumbstruck as other speakers brought the piano back. We never did get to the bottom of that one. Neither did I buy the speakers.
2. A lady companion never liked my ML's but this was ultimately due to the Bryston amp - sounded ringy erhmm! Admittedly I was edgy with the sound too from time to time and various alternatives never resolved the issue. When we hooked the current Technics in, what a change. Couldn't get her (or me) away from the system.

Spec comparisons didn't do it here - the ears have it!

keithyates · Jun 11, 2010

Just saw this thread on DBT. A lot of my fellow 2-channel music-loving colleagues believe it's tougher to get a 2-channel system to lock in and make magic, whereas a 5.1 or 7.1 HT system is a lower hurdle to clear. To a large extent they've got it backwards, and a big part of the reason involves the "blind" part that's the subject of the thread.

A year or two ago I walked into a reviewer's purpose-built 2-channel sound room and saw a cover-story pair of >$100k metal-carcass speakers pulled out maybe 8 to 10 feet from the front wall, and toed in to align with taped locations on the floor. The gear upstream was exotic and expensive, glowing in vaguely mysterious ways on custom stands. The spkr cables were on cable lifters, the power conditioners were on TipToes; you get the picture. Here was a setup that quickened my pulse and breathing. Whatever the outer trappings, I couldn't help but believe I was in a temple, and in for something special.

When a customer goes into a theater of ours, there is almost never any evidence of equipment at all, other than the video image surface. The projector is encased in a hidden enclosure that is optically, thermally, acoustically, vibrationally, electrically and electrostatically conditioned. The electronic components -- all of 'em, down to the power amps -- are in their own conditioned room. The speakers and subs are behind transparent fabric, etc. There's no visible evidence to tell the first-time visitor that, behind the beautiful fabric there's a middle-6-figures array of big hulking LCRs (Wilsons or YGs or whatever you like), 4 to 10 full-range surrounds, a truckload of Real-Man's subwoofers, and 30 different types of engineered acoustic treatments covering 90% of the wall and ceiling surface area. You take your seat, roll the soundtrack, leave the projector off if you like. Either you get goosebumps the size of grapefruits, or you don't. No visual cues to guide or shape your sonic experience.

Because we can't rely on the gear's visual presentation to communicate this or that quality to the audience, we have to satisfy the client the old-fashioned way: Make it sound like he or she is THERE. It's sink or swim. You find that the gear that used to do it for you in your retail days doesn't actually cut it when the listening experience is truly blind. You find that other things -- things the familiar manufacturers don't actually sell -- do make a difference, and to access them you don't head to a trade show, you need to head to graduate school.

Speaking personally, the "blind" element creates a helluva high hurdle for system designers, pro and amateur alike. Beginning 20 years ago, the practice of making our HT rooms "blind" caused me, personally, to see things in a sharper focus than ever before in my 18 years in the industry. Changed my practice. Changed the way I design 2-channel rooms. Deepened my appreciation for The Good Stuff, and sharpened the way I separated it from The Other Stuff.

BTW, those >$100k speakers in the purpose-built 2-channel room -- sounded crummy.

tonmeister2008 · Jun 28, 2010

MylesBAstor said:
DBT DO NOT WORK for audio. Plain and simple. Trying to apply a method to all situations without understanding the test's downsides, limitations and problems is just plain bad science. Just because DBT works for drug testing does not mean it will work for audio!!!

1. Short term memory is notoriously unreliable. Short term memory has a serial processing bottleneck that affects perception. Thus limited "disc" storage space.
2. CNS arousal aka inverse U effect-eg. one wants least arousal and most perception for test. This can't be met.
3. Interindividual hearing differences swamp out statistical tests.
4. DBT is designed for a null effect.
5. No internal controls.
6. Unfamiliarity of equipment, room, sources in most cases.
7. How can you prove adding another piece of gear in the signal path doesn't affect the sound? As John Curl said, every connection affects the sound.
8. Oh, and these individuals are just a little biased? They're hardly unbiased sources. Why don't you find some things from the other side of the argument? My PhD advisor would label your argument as parochial: only taking the data that supports your point of view and ignoring the data that doesn't support your view. Life and science unfortunately are not black and white but shades of gray, something that engineers only find out after they graduate (at least the good ones).

Need I add more?

"Double-blind do not work for audio. Plain and simple?" Wow! That sounds like a pretty blanket statement without much qualification, and a little misinformed. Several of your arguments are not universally true (you choose to selectively describe only poorly designed DBT which is very misleading), or are completely irrelevant to your central argument.

1. Memory: So how do you explain listeners who can identify a speaker in a DBT, days after they heard it in a previous DBT test? If they can remember the speaker after several days, then you would think they could remember it after 3-4 seconds? Also it is quite easy to deal with short-term echoic memory (3-4 seconds ) effects by quickly switching components. Oh, I forgot: John Curl who designs amplifiers says you can't use a switcher (see counterargument #7).

2. Stress and arousal depends on the complexity of the task and the expertise/training of the listener. If the task is easy for the listener (through good test design or via training) then peak performance can be attained. The fact that I get very sensitive and reliable response data from my listeners suggests that arousal is not a factor in my tests.

3. If the listeners are carefully selected and trained, the individual differences may well be quite a small effect. Also, you can do statistical analysis of individual listener data to see if there are differences in discrimination, preference, etc.

4. So what? Most scientific experiments are designed to test a null hypothesis. Do you have a problem with the scientific method? You formulate a null/alternative hypothesis (e.g. there are no differences in preference among speakers / there are significant preferences among the speakers), you run a listening test, statistically analyze the results, and you draw a conclusion based on the evidence. I tend to focus on perceptual effects that are well above the threshold of audibility, and produce a positive result. So tell me how that proves DBT tests do not work?
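The null-hypothesis step described here can be sketched for the simplest case. This is a minimal illustration, assuming a two-choice identification test in which a guessing listener is right half the time; the 12-of-16 score and the function name are assumptions for the example, not data from any test mentioned in the thread.

```python
from math import comb


def binomial_p_value(correct: int, trials: int, p_guess: float = 0.5) -> float:
    """One-sided exact p-value: chance of scoring `correct` or better by guessing."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


if __name__ == "__main__":
    # Hypothetical score: 12 correct out of 16 trials.
    p = binomial_p_value(12, 16)
    print(f"p = {p:.4f}")
    print("reject 'just guessing' at the 5% level" if p < 0.05 else "cannot reject 'just guessing'")
```

A small p-value lets you reject "the listener was guessing". A large one only means this particular test did not show a difference, which is a weaker statement than "no difference exists".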

5. Can you explain what you mean by internal controls?

6. That's an easy one to solve. You use trained listeners or let the naives get used to the room, program and the task. Let them listen as long as it takes: hours, weeks, years. Scientific evidence suggests we quickly adapt to the room acoustics when comparing audio components - up to an extent.

7. You can repeat the test with and without a switcher. If you get the same results, then you could reasonably conclude the switcher had little or no effect. Moreover, the effect of the switcher is a constant factor for the independent variables being tested; as long as it has no biasing influence, you can dismiss its influence on the results. You can also measure the electrical properties of the switcher/cable/whatever and determine whether there is an effect. Without scientific evidence, I would not accept what John Curl says as truth, any more than I would accept what Bill O'Reilly says on Fox News as truth. Curl is not a scientist, and I doubt that he can prove that "every connection can affect the sound".

8. To make a blanket statement that all DBTs in audio do not work based on half-baked truths, and then accuse DBT practitioners of being " biased " , sounds like the pot calling the kettle black. Take your PhD advisor's advice, and read the audio scientific literature where you will find many examples of DBT tests that do in fact work in audio -- plain and simple.

MylesBAstor · Jun 28, 2010

Well it's clear that we're never going to settle this argument and it's just going to evolve into another wreck audio high end argument. It just seems that you don't want to discuss issues. But just to set the record straight:

In regards to (1), what you're referring to basically has nothing to do with short-term memory but with the ability to transfer memory into long-term memory (I think we all develop the ability to tell apart different sounds in our environment, whether it be audio or different cats meowing). And then one needs to search for those variables that probably stay the same so the ear/brain can identify and choose between speaker brands. There is a lot of information that gets lost and is never transferred into long-term memory, as well as a lot of subconscious information processing.

In regards to (5), internal controls are a standard scientific procedure used to detect whether the methodology being used, whether in biochemistry or audio, works. In other words, one puts a predetermined amount of a chemical (or distortion) into the sample and sees if one gets the expected answer; if not, then there's something wrong. That same internal control can also be used to calibrate the sensitivity of the methodology: for instance, to find the level below which a certain distortion might not be detectable, while at the same time showing that the methodology employed can pick up the aberration when it is present.
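One way to picture such an internal control in a listening test: run the same blind procedure on samples with a known, deliberately added distortion and check that the procedure flags it. This is only a sketch under stated assumptions: the control labels, the scores, and the function names are hypothetical, and detection is judged with a simple exact binomial test against guessing.

```python
from math import comb


def p_value(correct: int, trials: int, p_guess: float = 0.5) -> float:
    """One-sided exact binomial p-value against pure guessing."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


def controls_detected(results: dict[str, tuple[int, int]], alpha: float = 0.05) -> dict[str, bool]:
    """Map each control condition to whether listeners beat chance on it."""
    return {label: p_value(correct, trials) < alpha for label, (correct, trials) in results.items()}


if __name__ == "__main__":
    # Hypothetical scores (correct, trials) for two deliberately distorted control samples.
    hypothetical_controls = {
        "large known distortion": (15, 16),
        "tiny known distortion": (9, 16),
    }
    for label, detected in controls_detected(hypothetical_controls).items():
        print(f"{label}: {'detected' if detected else 'not detected'}")
```

If the procedure misses a control that is known to be audible, the method is suspect; if it catches the large control but not the tiny one, that brackets the sensitivity of the test.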

And in regards to (8) another word my thesis advisor taught me was parochial, in other words just referring to information that supports your hypothesis and ignoring everything else.

amirm · Jun 28, 2010

tonmeister2008 said:
7. You can repeat the test with and without a switcher. If you get the same results, then you could reasonably conclude the switcher had little or no effect. Moreover, the effect of the switcher is a constant for the independent variables being tested. You can also measure the electrical properties of the switcher/cable/whatever and determine whether there is an effect.

I think this answer is a bit unfair Sean.

Let's say, arbitrarily, that the switch kills off all the high frequencies above 12 kHz. In that case, if I used it, I could not test two sources whose only difference was that one went to 20 kHz and the other only to 14 kHz. So the fact that the switch and its filtering are invariant doesn't help make the case that it doesn't disturb the experiment.
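The masking argument can be shown with a toy model. This sketch assumes idealized brick-wall behavior (nothing passes above the narrowest stage in the chain), which real switches and sources do not have; the function name is made up for the illustration.

```python
def heard_bandwidth_hz(source_hz: float, switch_hz: float) -> float:
    """Idealized model: the chain passes nothing above its narrowest stage."""
    return min(source_hz, switch_hz)


if __name__ == "__main__":
    for switch_hz in (12_000, 20_000):
        a = heard_bandwidth_hz(20_000, switch_hz)  # source that reaches 20 kHz
        b = heard_bandwidth_hz(14_000, switch_hz)  # source that reaches 14 kHz
        verdict = "indistinguishable" if a == b else "still different"
        print(f"switch limit {switch_hz} Hz: {a} Hz vs {b} Hz -> {verdict}")
```

The switch is the same for both sources, yet with the 12 kHz limit it erases exactly the difference under test, which is why being constant is not the same as being harmless.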

Your answer to above is that we could measure its effect. Indeed, in doing so, you would have found the filtering above. But what if we don't know what to measure because we don't know the effect?

Let's take the concept of jitter. If one did not know that such an effect existed, and that "bits were bits," wouldn't two sources that differed in this respect but not otherwise, look identical when in reality they are not?

By the same token, the notion that a switch is fine to insert in such A/B tests requires that one believe it cannot make a difference. To believe that, in turn, requires believing that cables don't make a difference either. After all, both would show a similar effect in measurement. Using this understanding, then, to test the very hypothesis of audiophile concepts such as cable differences is circular in my opinion. You can't assume the outcome (i.e. cables/switches make no difference) to construct your experiment to show the same! At least not if you want to have any hope of convincing the other side of anything.

Steve Williams · Jun 28, 2010

Let's say, arbitrarily, that the switch kills off all the high frequencies above 12 kHz. In that case, if I used it, I could not test two sources whose only difference was that one went to 20 kHz and the other only to 14 kHz. So the fact that the switch and its filtering are invariant doesn't help make the case that it doesn't disturb the experiment.

But you could use an appropriate switcher that works up to 20 kHz.

tonmeister2008 · Jun 28, 2010

amirm said:
I think this answer is a bit unfair Sean.

Let's say, arbitrarily, that the switch kills off all the high frequencies above 12 kHz. In that case, if I used it, I could not test two sources whose only difference was that one went to 20 kHz and the other only to 14 kHz. So the fact that the switch and its filtering are invariant doesn't help make the case that it doesn't disturb the experiment.

Your answer to above is that we could measure its effect. Indeed, in doing so, you would have found the filtering above. But what if we don't know what to measure because we don't know the effect?

Let's take the concept of jitter. If one did not know that such an effect existed, and that "bits were bits," wouldn't two sources that differed in this respect but not otherwise, look identical when in reality they are not?

By the same token, the notion that a switch is fine to insert in such A/B tests requires that one believe it cannot make a difference. To believe that, in turn, requires believing that cables don't make a difference either. After all, both would show a similar effect in measurement. Using this understanding, then, to test the very hypothesis of audiophile concepts such as cable differences is circular in my opinion. You can't assume the outcome (i.e. cables/switches make no difference) to construct your experiment to show the same! At least not if you want to have any hope of convincing the other side of anything.

I also said that if you get the same test results with and without the switcher, then you can conclude that it has little effect on the test outcome. Of course, you are right that the switcher and electronics in the test need to be as accurate, neutral and transparent as possible. I believe that can be determined using electrical measurements. But you could also design a listening test to check whether they have an audible effect.
