Objectivist or Subjectivist? Give Me a Break

The source is here: http://www.itu.int/rec/R-REC-BS.1116-1-199710-I/e
The MUSHRA methodology is recommended for assessing "intermediate audio quality"; for very small audio impairments, Recommendation ITU-R BS.1116-1 (ABC/HR) is recommended instead.

In part, the anchor is randomly distributed through the listening tests and a statistical evaluation is done on its results, which can then be used to evaluate the sensitivity & validity of the whole test, including listeners, material & procedures.
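(As a rough illustration of what that statistical evaluation can look like: if listeners reliably grade the hidden low-quality anchor worst, the panel and playback chain are demonstrably sensitive to a known impairment. A minimal sketch, with purely illustrative trial counts and chance rate, not figures from any actual test:)

```python
# Minimal sketch: did listeners reliably grade a hidden low-quality anchor worst?
# All numbers here are illustrative, not taken from any actual test.
from scipy.stats import binomtest

n_trials = 20               # trials that included the hidden anchor
anchor_graded_worst = 18    # times the anchor was graded lowest
chance_rate = 1.0 / 4       # e.g. four stimuli per trial, so 1-in-4 by guessing

result = binomtest(anchor_graded_worst, n_trials, chance_rate, alternative="greater")
print(f"p-value = {result.pvalue:.4g}")
# A small p-value says the panel and playback chain exposed a known impairment,
# which is exactly what a positive control is meant to establish.
```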

No, you haven't & the test is invalid. Is it the only difference? I didn't design the test, so why ask me? That doesn't stop me asking questions about its validity, though. Tests are either valid or not - if it's not possible to use appropriate controls then the test is not valid, is it?

Tim

Because you're the guy insisting that the test is worthless if it can't follow this particular set of recommendations. Are you saying there are no other recommendations, no other testing methodologies? Are you saying that no test for small audio differences can ever be valid if it can't or doesn't use these particular recommendations, this particular control, that we can test nothing and gather no evidence worth considering if this particular control cannot be used?

If your answer to the questions above is yes, then I guess I'll wonder how you came to such a sweeping and unlikely conclusion, what incredible depth of testing and successful repetition this methodology has gone through since the 90s to become universally considered the only valid test for audibility by the entirety of the research, audiology and EE communities. And of course I'm going to wonder how it got so broadly used and accepted, given that if something actually is inaudible, this control is irrelevant.

I'm sure you have a link for that as well?

And yes, I know you didn't say all of that; you just said that Meyer and Moran, in spite of all of its depth and care, was rendered completely invalid for its failure to use that control. Which is close enough.

Tim
 
Just frequency response, THD, IMD and signal to noise ratio figures.

All four of those today typically measure so 'good' in amps, preamps, and digital sources, as to not matter. If you mean manufacturer's reported specs, those were always best taken with a grain of salt.

I accept all the "fooling" theory, but prefer the risk of analyzing and scrutinizing others' opinions, weighting them with my experience, since the "measurement camp" does not present anything that we can use to evaluate sound quality of sources and electronics - many of them just say they should all sound the same if designed competently.

There are typically more caveats than that, e.g., operated within their non-distorting limits, outputs level-matched to within 0.2 dB, not using tubes, not being an electromechanical source (e.g., turntable/cart system).
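(For what it's worth, the 0.2 dB level-matching caveat is easy to check if you can capture both outputs playing the same material; a minimal sketch, with placeholder file names and assuming the python-soundfile package is available:)

```python
# Minimal sketch: check two captures are level-matched to within 0.2 dB.
# File names are placeholders; both captures are assumed to be of the same material.
import numpy as np
import soundfile as sf

a, _ = sf.read("chain_a_capture.wav")
b, _ = sf.read("chain_b_capture.wav")

def rms(x):
    return np.sqrt(np.mean(np.square(x.astype(np.float64))))

diff_db = 20.0 * np.log10(rms(a) / rms(b))
print(f"level difference: {diff_db:+.3f} dB")
print("within 0.2 dB" if abs(diff_db) <= 0.2 else "NOT level-matched")
```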


The question when that's taken into account then becomes, why *shouldn't* they sound the same? Why do you 'prefer the risk' -- a rather high risk -- that what you are analyzing and scrutinizing are just artifacts of a flawed and misleading evaluation technique?

There are other significant areas of home audio where real, measurable, non-magically audible differences come into play, after all.
 
J-J, if you would actually read what I wrote closely: it is the LANGUAGE I object to. Conclusions must be qualified. Every scientist in every field does this, and that is the way it should be. Unfortunately, what we do see in forums is people who say that everything there is to be known is known, dismissing observations outright. Sure, many or even most of these may be imagined, but there is always the probability that some are not. The significance of these statistical outliers is another story. These may or may not be relevant to a particular topic, but that is not the point. My point is simple: unless you can account for all variables, the conclusion must acknowledge its scope and limitations.


This is most amusing because if anyone is qualifying their conclusions, it's the 'objectivists' -- they're the ones specifying conditions under which things are LIKELY to sound the same. It's the subjectivist/audiophile crowd that seems positively averse to allowing even the *possibility* (awful as it may be) that what they are hearing isn't real. They are the ones not 'accounting for all the variables' on a regular basis. Indeed they almost appear to deny the existence, and certainly the importance, of certain 'variables'.
 
Not quite the reaction I expected to what I said, but OK.

Personally, I think both sides of the debate could do with a bit of a reality check. The subjective side should remember that not everything is audible. The objective side forgets that in the world of buying things, human nature will tend to trump even the most ineluctable brute fact if it contradicts prima facie findings.

The objective side is of course very well aware of that. So, what are you suggesting 'objectivists' should do when a claim of audible difference seems based more on 'human nature', than on a careful assessment of the brute facts?
 
Because you're the guy insisting that the test is worthless if it can't follow this particular set of recommendations. Are you saying there are no other recommendations, no other testing methodologies? Are you saying that no test for small audio differences can ever be valid if it can't or doesn't use these particular recommendations, this particular control, that we can test nothing and gather no evidence worth considering if this particular control cannot be used?
Tim, as you already said - you know what controls are for. So if the controls aren't sufficient to ensure that the test is sensitive enough, you end up with an invalid test. It really is as simple as that.

If your answer to the questions above is yes, then I guess I'll wonder how you came to such a sweeping and unlikely conclusion, what incredible depth of testing and successful repetition this methodology has gone through since the 90s to become universally considered the only valid test for audibility by the entirety of the research, audiology and EE communities. And of course I'm going to wonder how it got so broadly used and accepted, given that if something actually is inaudible, this control is irrelevant.

I'm sure you have a link for that as well?

And yes, I know you didn't say all of that; you just said that Meyer and Moran, in spite of all of its depth and care, was rendered completely invalid for its failure to use that control. Which is close enough.


Tim
Well, if you looked at the heading of the paper or into the link I provided, you would see that it is a recommendation from the ITU Radiocommunication Sector (ITU-R), which is one of the three sectors (divisions or units) of the International Telecommunication Union (ITU) and is responsible for radio communication.

MUSHRA, used to evaluate the subjective audio quality of codecs, is also a recommendation from ITU-R & widely used, AFAIK.

I'm sure there are refinements & other methodologies but all will have internal controls embedded in them as per these methodologies to ensure their robustness, so I don't see what your objections are to this.
 
Meyer and Moran's study never demonstrated that listeners could hear the difference between any sources.

That's not true. During the course of the tests they actually replaced one source because it was audibly different.

http://www.bostonaudiosociety.org/explanation.htm
"The first of these trials was done with the Pioneer player, and the fadeup of the room tone at the beginning of the Hartke disc revealed a slight but audible nonlinearity in its left channel decoder. We did some tests with the Sony, which sounded clean at any gain setting, and then switched to the Yamaha DVD-S1500, which was used for the remainder of the tests at this site.

And they also reported that differences between formats could be audible when low-level signal of a high-quality digital recording was played back at well above 'standard' (i.e., non-deafening) gain settings -- something entirely predictable from 'measurements'.
 
This is most amusing because if anyone is qualifying their conclusions, it's the 'objectivists' -- they're the ones specifying conditions under which things are LIKELY to sound the same. It's the subjectivist/audiophile crowd that seems positively averse to allowing even the *possibility* (awful as it may be) that what they are hearing isn't real. They are the ones not 'accounting for all the variables' on a regular basis. Indeed they almost appear to deny the existence, and certainly the importance, of certain 'variables'.

An offender is an offender. It doesn't matter what he is. It's what he does.
 
You mean the MUSHRA listening tests that the codec people use? Yes all for that (note to Tim, it includes positive & negative controls). But I thought you were suggesting measurements? Again you talk about measurements & calculations - what specific measurements exactly? Can you be a bit more specific, please - it seems all rather aspirational with "should" & "could" lavishly sprinkled throughout.

If you are this skeptical of DBT studies that fail to include positive and negative controls, you should be far more skeptical of plain old sighted listening evaluations -- typical reviews, both published and online - which have that flaw PLUS zero controls for bias.
 
The question remains: "Why do our perfect measurements yield such imperfect results? Something is missing. From a measurement perspective, what is it and where did it go?"
We can't hear it but we can measure it.
We can hear it but we can't measure it using our current technology.
"Those who seek the chaperone of science..."try to reconcile those two statements. (Assuming you find them at odds.)


It's not necessary to reconcile them, because you left out one really, really important (and commonly true) possibility. Which is what audiophiles often do.
They start from the premise that if they hear it, it's real, and the problem is for science to 'explain' it.
 
Why do we go on debating the old Meyer and Moran tests when the raw data and the exact conditions of these tests are not available? Please do not tell me that the list of brands and recordings that was later published is really enough to please our more scientific members :)).

The only reason I find for this lasting debate on Meyer and Moran is economics - although AES has a High Resolution Technical Committee http://www.aes.org/technical/hra/ and we can find several more recent interesting papers and presentations on the subject, reading them will cost many tens of dollars each. So we go on debating the free, outdated and controversial relics from the past, repeating the arguments that both sides have written unchanged thousands of times.
 
If you are this skeptical of DBT studies that fail to include positive and negative controls, you should be far more skeptical of plain old sighted listening evaluations -- typical reviews, both published and online - which have that flaw PLUS zero controls for bias.
So are you asking which I am more skeptical of - flawed listening tests (including DBTs) or flawed measurements?
 
If you are this skeptical of DBT studies that fail to include positive and negative controls, you should be far more skeptical of plain old sighted listening evaluations -- typical reviews, both published and online - which have that flaw PLUS zero controls for bias.
If you reject the concept of positive & negative controls in DBTs then we are done!
 
That's not true. During the course of the tests they actually replaced one source because it was audibly different.

http://www.bostonaudiosociety.org/explanation.htm


And they also reported that differences between formats could be audible when low-level signal of a high-quality digital recording was played back at well above 'standard' (i.e., non-deafening) gain settings -- something entirely predictable from 'measurements'.

Sorry, but using extreme cases such as you refer to is not a proper control test!
 
Well, no. Meyer and Moran simply demonstrated that under the conditions of their test listeners could not consistently hear a difference between SACD and that SACD run through a filter making it 16/44.1 resolution. Really that's all it says, and while it isn't nothing it also isn't very much. Also, as mentioned many times previously, there was no attempt to quantify statistically the chance that M and M missed a real difference within the setting of their study (even apart from having no positive control).

Are you saying they applied no statistical tests to their data (they did), or that you don't think they applied the right ones or applied them incorrectly?

I have already informed you that small but real differences *were* perceived during the test, proving the system was sufficiently 'resolving' for at least that, which would be one purpose of the positive control. The other objection, that the listeners weren't 'vetted' by proving they could differentiate a positive control, has an academic value, but in the context of the audiophile assumption -- namely, that DSD is *obviously* audibly superior to Redbook -- it's kinda moot.

Btw if subjects were vetted beforehand with a hearing test, then that *is* a positive control for the method, so long as conditions were not otherwise significantly different in the subsequent tests.

Why does a DSD vs resampled PCM test say 'not much' to you? You can perform virtually the same test they did (using high rez PCM instead of DSD) on yourself, if you have the right hardware and software (foobar2000 with its ABX comparator plugin, and a soundcard that can 'do' hi rez PCM and does not resample output).
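(For anyone who tries that: the probability the ABX comparator reports is just the binomial tail, i.e. the chance of scoring at least that well by guessing. A minimal sketch of the same arithmetic, with example trial counts of my own choosing:)

```python
# Minimal sketch: chance probability of an ABX score (what ABX tools report).
# Trial counts below are only an example.
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Probability of at least `correct` right answers out of `trials` by guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

print(f"p = {abx_p_value(13, 16):.4f}")   # ~0.011: unlikely to be pure guessing
```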
 
Sorry but using extreme cases such as you refer is not a proper control test!

That's funny, because subjects themselves requested higher than standard playback levels, so they could "listen for more details in softer passages". And M&M gave it to them.


This 'gotcha'-type harping on positive controls -- which, properly, would be inclusion of a difference pre-tested to be likely to be audible to most of the populace, or to a trained subset -- as a supposedly irredeemable flaw in M&M, is really quite disingenuous in the face of routine audiophile claims made for the obviousness of the hi rez/DSD vs Redbook difference. One would think that if the benefits of hi rez/DSD were as stark as claimed, under the many conditions claimed, positive and negative controls would be almost redundant.

But it's hardly the first time I've seen audiophiles demand the highest scientific rigor...when it suits them.
 
If you reject the concept of positive & negative controls in DBTs then we are done!


Hardly. I don't reject them, I recommend them, along with listener training. But lack of them, in this case, did not render the results nearly as useless as audiophiles desperately wish them to be. For that to be true you have to posit one or more of these to be true too:


- that DSD vs Redbook is a rather subtle difference that M&M's subjects lacked the discriminative skills to detect (except that they could, if the playback was boosted)
- that M&M's playback chain was flawed in a way that 'masked' real differences (though not enough to mask the differences that in fact *were* detected)
- that M&M's choices of music did not adequately reflect the 'real' audible differences between DSD and REdbook, e.g., the SACDs were sourced from PCM or analog masters (this despite allowing at least some users to 'roll their own' in terms of what discs to listen to, and despite common and widespread audiophile report of the superiority of SACDs to Redbook versions, regardless of provenance of the recording)
- that M&M's statistical analysis was flawed, and likely detection of real differences *in their own dataset* was missed.
 
If you reject the concept of positive & negative controls in DBTs then we are done!

More so, internal controls!
 
Perhaps a good place to remind ourselves of two axioms of current scientific thought:

1) We can't know everything.

2) Some things we "know" are wrong.

These are particularly true of biological systems.

It really matters what 'things' are being referenced here.
 
