JPLAY Responds: An Open Letter

A fair point, so drop the 'bias' part and just leave 'expectation effects'? Or even 'due to expectation'?

We can try!

It's important to note (but almost universally not noted, just as subjectivists rarely acknowledge expectation in sighted tests) that expectation effects apply just as much to experimental design as to sighted listening. So objectivists pretty consistently design experiments according to their own expectation - that of no audible difference - and it's not at all surprising that most experiments do indeed return this result.

Sure - but any experiment is open to critique and improvement, and always carries an "under these conditions" disclaimer. It's much harder to deal with somebody who gives a subjective opinion but denies that any expectation effects might be at play.
 
It's an interesting question. I can only speculate, but perhaps they don't acknowledge the placebo effect, or imagine that while it exists, they are in a privileged position and it therefore only applies to other people? It's quite normal for people to imagine they are somehow 'privileged' compared to others...

...I'd say that listening shows that many feedback amplifiers don't give a satisfying sound.

Does this mean you are privileged in that the placebo only affects others? :)
 
It's important to note (but almost universally not noted, just as subjectivists rarely acknowledge expectation in sighted tests) that expectation effects apply just as much to experimental design as to sighted listening. So objectivists pretty consistently design experiments according to their own expectation - that of no audible difference - and it's not at all surprising that most experiments do indeed return this result.

I don't disagree with this, and I think it entirely possible that merely knowing you are taking part in a test of any sort could render you incapable of discerning differences that you might discern in other circumstances - even if you can do the test at home, taking as long as you want with the testing regime of your choice.

I am sceptical that it is easy to create tests that isolate the real difference between two systems when no one actually knows why such a difference might exist (e.g. "we think the pebbles absorb harmful energy fields", or "we don't know why small buffers sound better; they just do!").

In short, these are the reasons why listening tests of any sort are simply not trustworthy. Nor are generalised measurements at the output. Instead it's got to be a combination of rigorous design, plus knowledge of specific parameters like power rails, and homing in on measurements of specific potential weak points using appropriate signals. Of course this assumes that we know what the objectives of the system should be, and I think we do: boringly flat frequency response, stability, linearity - even if this results in people with certain expectations declaring that it sounds unsatisfying or unmusical; supplementary 'musicality' effects can be added deliberately, in a controlled way, with extra circuitry or software.
 
I have no problem with dropping the term "real." You guys are right. It defines the opposite as "unreal," "fantasy," "delusion," etc. Very negative and not helpful. I prefer audible. If you can consistently (if I remember my statistics studies, it takes somewhere in the neighborhood of 80% or better) identify a difference between the PracticalDac One and the SuperUltra Platinum Supreme Conversion MagiComm, with all of the common biases for price, brand, looks, reputation, etc. removed - i.e., you don't know which one is playing when - I call that audible. Better? That's a judgement call.
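For a rough sense of what that figure means statistically - a minimal Python sketch, illustrative numbers only, not from any particular study - the usual benchmark is the probability of scoring that well out of a given number of forced-choice trials by pure guessing:

import math

# Chance of getting k or more of n forced-choice trials right by pure
# guessing (p = 0.5 per trial).
def p_value(k, n):
    return sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n

print(p_value(12, 16))  # 75% correct: p ~ 0.038, clears the usual 5% bar
print(p_value(13, 16))  # ~81% correct: p ~ 0.011, comfortably significant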

So there goes "real." Substitute the real word: audible.

Bias? It's a scientific term with a specific meaning that applies here; "effect" doesn't mean the same thing. You're offended by the fact that you might have biases? You may as well be offended by the fact that you may be imperfect. Oh, and this...

It's important to note (but almost universally not noted, just as subjectivists rarely acknowledge expectation in sighted tests) that expectation effects apply just as much to experimental design as to sighted listening. So objectivists pretty consistently design experiments according to their own expectation - that of no audible difference - and it's not at all surprising that most experiments do indeed return this result.

I don't doubt this is true when the methodology is sloppy and sub-standard. It's probably pretty consistently true of casual blind listening, and of the kind of "testing" you see reported on Hydrogen Audio. And it is a much greater potential issue when using blind listening to determine preference. But the difficulty of designing a decent test to determine if a difference is audible has been blown grossly out of proportion by people who want desperately to dismiss the results. The simplest DBT removes the listener's ability, and the moderator's (this is the "double" part), to know which component/codec/player, etc. is playing. The listener can't be influenced by look, size, price, brand, reputation... or even the subtlest leading gestures of the moderator, because the moderator doesn't know either.
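To make the "double" part concrete, here is a minimal sketch - hypothetical names, not a description of any actual test rig - in which a script, rather than any person, draws the assignment, so neither listener nor moderator knows which unit is playing:

import random

# Sealed random A/B assignment, revealed only after all answers are in,
# so neither the listener nor the moderator can leak it during the test.
def make_key(n_trials):
    return [random.choice(["A", "B"]) for _ in range(n_trials)]

# Compare the listener's blind calls against the sealed key.
def score(answers, key):
    return sum(a == k for a, k in zip(answers, key))

key = make_key(16)
guesses = [random.choice(["A", "B"]) for _ in range(16)]  # a pure guesser
print(score(guesses, key), "of 16 correct")  # about 8 expected by chance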

It works. It works in the development of products and applications much more critical than listening to music for pleasure. Denying it is denial, plain and simple. And there's a ton of it in this hobby.

Oh, and I know someone will be along soon to say that music is different, that it takes listening over a time span of days, even weeks, to hear these magnificent improvements. I don't know how to put this gently: that's wrong. Decades of audiological research says it is wrong. Small audio differences are best identified through quick switching between them. Nobody but audiophiles denies this.

Now, getting your head wrapped around what those differences are, which you prefer, how the differences manifest themselves in your soundstage, etc., etc.? Sure. Take your time. But if you want to know if there is an audible difference between A and B, switch rapidly between them. Or deny science; join the flat earth club.

Tim
 
You may as well be offended by the fact that you may be imperfect.

A lot of people would be offended if you told them they might be imperfect - just as they would be offended if you told them they are below-average drivers.

There is a dirty secret most people have not realized: almost half the world's population is of below-average intelligence.

Or deny science; join the flat earth club.

Unfortunately it is a rather popular club.
 
But the difficulty of designing a decent test to determine if a difference is audible has been blown grossly out of proportion by people who want desperately to dismiss the results. The simplest DBT removes the listener's ability, and the moderator's (this is the "double" part), to know which component/codec/player, etc. is playing. The listener can't be influenced by look, size, price, brand, reputation... or even the subtlest leading gestures of the moderator, because the moderator doesn't know either.

It works. It works in the development of products and applications much more critical than listening to music for pleasure. Denying it is denial, plain and simple. And there's a ton of it in this hobby.

Oh, and I know someone will be along soon to say that music is different, that it takes listening over a time span of days, even weeks, to hear these magnificent improvements. I don't know how to put this gently: that's wrong. Decades of audiological research says it is wrong. Small audio differences are best identified through quick switching between them. Nobody but audiophiles denies this.

Now, getting your head wrapped around what those differences are, which you prefer, how the differences manifest themselves in your soundstage, etc., etc.? Sure. Take your time. But if you want to know if there is an audible difference between A and B, switch rapidly between them. Or deny science; join the flat earth club.

There may be no audible difference between two setups, but one may still be better than the other. Distortion of 0.1% in one amplifier may be inaudible, but distortion of 0.01% in the other is better. DBT will probably show that all amplifiers 'sound the same', and all CD players (and obviously all cables), yet there may be measurable differences that are worth pursuing. Portraying DBT listening tests as the gold standard diminishes the value of measurement and design.
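For scale - a back-of-envelope illustration, not the poster's own numbers - both distortion figures can be restated in dB below the fundamental, where the measurable gap is plain even if neither level is audible:

import math

# THD restated as a level relative to the fundamental: 20 * log10(ratio).
for thd_percent in (0.1, 0.01):
    level_db = 20 * math.log10(thd_percent / 100)
    print(f"{thd_percent}% THD = {level_db:.0f} dB")  # -60 dB and -80 dB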

DBT listening tests of speakers may be even sillier, if not impossible: how far from the wall? What toe-in? Which room treatments are best for each speaker? Which choice of music? Here, measurements and fundamental design 'best practice' must surely be the key, rather than simplistic listening tests. If the aim of the test is to show which speaker is 'better', then the rapid-switchover approach, or a test carried out over one evening, really may not be all that useful: I can certainly conceive of characteristics that sound great for a short while but grate on the ears in the long term - and as soon as the tester takes the speaker home, the DB, or even SB, aspect is gone.
 
People who claim to love DBTs act as if they are as simple as spreading butter on a piece of toast, and as if everybody can do them at any time. The devil is always in the details, as they say, and setting up a DBT that would pass muster in the scientific community is neither cheap nor easy in terms of time or talent.
 
I don't disagree with this, and I think it entirely possible that merely knowing you are taking part in a test of any sort could render you incapable of discerning differences that you might discern in other circumstances - even if you can do the test at home, taking as long as you want with the testing regime of your choice.

That has indeed been my experience - there's 'test anxiety' even when I'm not testing for the benefit of anyone else, just myself. I can only speculate about why this might be, but it seems to me that testing encourages me to listen with my mind rather than with my heart. Music's primarily about connecting with my emotions, not an intellectual exercise.

I am sceptical that it is easy to create tests that isolate the real difference between two systems when no one actually knows why such a difference might exist (e.g. "we think the pebbles absorb harmful energy fields", or "we don't know why small buffers sound better; they just do!").

I agree. But the problem is not just 'testing' but also this focus on 'differences'. Donald Hoffman's research into visual perception shows we all suffer from 'change blindness', and he has pictures on his website which demonstrate how hard it is to spot the difference between two images that are side by side. How much harder to spot differences in audio, when we can't bring the two stimuli up 'side by side', so to speak.

Incidentally I can't understand the question you posed just prior to this about placebo. Would you like to re-phrase it?
 
.....But the difficulty of designing a decent test to determine if a difference is audible has been blown grossly out of proportion by people who want desperately to dismiss the results. .....
Tim
Tim,
I think you should read about MUSHRA (MUltiple Stimuli with Hidden Reference and Anchor, ITU-R BS.1534) and ITU-R BS.1116-1, "Methods for the subjective assessment of small impairments in audio systems including multichannel sound systems", to learn about designing a "decent test" regime.
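For readers unfamiliar with the method: a MUSHRA trial presents the systems under test alongside a hidden copy of the reference and a deliberately degraded "anchor", in randomized order, each rated on a 0-100 scale. A minimal structural sketch, with hypothetical names:

import random

# One MUSHRA-style trial: mix the systems under test with a hidden
# reference and a low-pass anchor, then shuffle so nothing is labeled.
def make_trial(systems):
    stimuli = systems + ["hidden_reference", "lowpass_anchor"]
    random.shuffle(stimuli)
    return stimuli

trial = make_trial(["codec_A", "codec_B"])
ratings = {s: None for s in trial}  # listener assigns 0-100 to each stimulus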
 
There may be no audible difference between two setups, but one may be better than the other. Distortion of 0.1% in one amplifier may be inaudible, but distortion of 0.01% in the other is better. DBT will probably show all amplifiers 'sound the same', and all CD players (and obviously all cables), but yet there may be measurable differences that are worth pursuing. By portraying DBT listening tests as the gold standard, it diminishes the value of measurement and design.

I suppose that further reducing distortion which is already well below the threshold of audibility is an admirable act of engineering. But you'll understand if I choose a less expensive alternative with audibly identical performance?

DBT listening tests of speakers may be even sillier, if not impossible: how far from the wall? What toe-in? Which room treatments are best for each speaker? Which choice of music? Here, measurements and fundamental design 'best practice' must surely be the key, rather than simplistic listening tests. If the aim of the test is to show which speaker is 'better', then the rapid-switchover approach, or a test carried out over one evening, really may not be all that useful: I can certainly conceive of characteristics that sound great for a short while but grate on the ears in the long term - and as soon as the tester takes the speaker home, the DB, or even SB, aspect is gone.

Speakers do present some difficult challenges. They are also the place where pretty obvious coloration is, thus far, unavoidable. That makes them a good place to choose your color, and to spend most of your money, IMO. Best to choose efficiency and neutrality up to that point. YMMV.

Tim
 
Tim,
I think you should read about MUSHRA (MUltiple Stimuli with Hidden Reference and Anchor, ITU-R BS.1534) and ITU-R BS.1116-1, "Methods for the subjective assessment of small impairments in audio systems including multichannel sound systems", to learn about designing a "decent test" regime.

I just scanned it, John. Have you read it yourself? There are system recommendations in there that audiophiles would reject out of hand. There are others that are specific to the point of the ridiculous. And then there is just a lot of pretty sound methodology - appropriate listening samples, randomization, careful pre-screening of subjects, sufficient repetition, that sort of thing. This - sound research and statistical methodology - is what I'm talking about when I say "decent test." What I'm not talking about is specific system recommendations, because any specific system is just an opportunity to dismiss the results. If you really wanted to convince the "high-end" of anything that is not already a part of their belief system, you would have to conduct tests with a pretty broad variety of high-end systems, and many subject groups including audiophiles, engineers, civilians, trained and untrained listeners, etc. You'd have to use a broad variety of recorded material, both "audiophile" and mainstream. You'd have to run many multiples of the tests, in different rooms, under different conditions, with different listeners, for many months, to numbers beyond the statistical requirements, and get essentially the same results over and over again. That might convince a few people to change their minds.
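To make "beyond the statistical requirements" concrete - an illustrative sketch, not a figure from any of the posts - the minimum score that clears a 5% significance bar under pure guessing grows with the trial count:

import math

# Smallest k such that k-or-more correct out of n, by guessing alone,
# has probability at or below alpha.
def min_correct(n, alpha=0.05):
    for k in range(n + 1):
        tail = sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n
        if tail <= alpha:
            return k

for n in (10, 16, 25):
    print(n, "trials -> need", min_correct(n))  # 9/10, 12/16, 18/25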

But it has been done, more than once, and failed to change very many minds.

My blind listening experiments at home? There's nothing statistically valid about them; they prove nothing. I've said that many times here. What they tell me, and only me, is that many of the things we fuss over are so insignificant that in casual, but very concentrated listening, they disappear. Maybe if I used MUSHRA's recommendations I would hear them. But if I didn't, the audiophile community would reject the results anyway. There are plenty of points in that paper to object to if the results are not to your liking.

Tim
 

My blind listening experiments at home? There's nothing statistically valid about them; they prove nothing. I've said that many times here. What they tell me, and only me, is that many of the things we fuss over are so insignificant that in casual, but very concentrated listening, they disappear. Maybe if I used MUSHRA's recommendations I would hear them. But if I didn't, the audiophile community would reject the results anyway. There are plenty of points in that paper to object to if the results are not to your liking.

Tim

A very good point. And I want to add a comparison from my own profession, professional violin playing: in the business of providing violins ("master" violins) to the aspiring young fiddler, violin dealers nowadays rarely let you take out an instrument, partly because the high value is difficult to insure for a few days (or so they say), and partly to keep control over the evaluating process.
There has been a lot of Stradivarius bashing of late, which will not confuse any true specialist: a Strad is a fantastic style of instrument, among instruments from many other fabulous makers of comparable quality, old or new.
But to get to know it, and to be able to handle it, you usually need an extremely light hand and a free mind, which is unlikely in a "TEST" situation.

Why am I writing this? Because you will never hear what it (be it a violin or an audio system...) can do within a split second. I am a firm believer in extended assessments, where the results hit you, just like Tim says, without any proof ;-)
egidius
 
^

I'm with this guy.
 
So Tim
You still think this? "...But the difficulty of designing a decent test to determine if a difference is audible has been blown grossly out of proportion by people who want desperately to dismiss the results. ....."
 
So Tim
You still think this? "...But the difficulty of designing a decent test to determine if a difference is audible has been blown grossly out of proportion by people who want desperately to dismiss the results. ....."

Of course. I'll go you one better, John. I believe that designing an effective audibility test whose results disagree with their existing beliefs has been deemed impossible by those who want desperately to dismiss the results...

Tim
 
Happily. Otherwise high-end audio would have died out long ago. But people were knowledgeable enough to understand that most of these tests were flawed, and ignored them.

...I rest my case. Oh, and make that "all of these tests that challenged their core beliefs were flawed." They usually fail to find fault with the tests that agree with them.

Oh, and this description of an exhaustive process?

If you really wanted to convince the "high-end" of anything that is not already a part of their belief system, you would have to conduct tests with a pretty broad variety of high-end systems, and many subject groups including audiophiles, engineers, civilians, trained and untrained listeners, etc. You'd have to use a broad variety of recorded material, both "audiophile" and mainstream. You'd have to run many multiples of the tests, in different rooms, under different conditions, with different listeners, for many months, to numbers beyond the statistical requirements, and get essentially the same results over and over again.

I didn't make it up. Any guesses as to what study I'm talking about? Hint: it was broadly and immediately rejected by the high-end community. And bear in mind, that's a description of what I believe it would take to possibly change a few audiophile minds, not what it would take to make a statistically valid test. That's a lot easier.

Tim
 
Perhaps I should just post blanks to threads like this.

Well, whatever we post won't change the minds of the True Believers. But at least this is just audio - there are other areas where confirmation bias causes much more serious problems for all of us.
 
