Conclusive "Proof" that higher resolution audio sounds different

amirm · Aug 29, 2014

arnyk said:
The crux of the issue seems to be related to the question: Is there is such a thing as ultrasound?

I am pretty sure there is. At least that is how we looked at our then unborn child

.

Testing for the audibility of ultrasound seems to be a contradiction in terms.

We did hear something when they turned on the speaker on the ultrasound machine which sounded like heartbeat.

How does one test for the audibility of ultrasound?

Take your female partner to the doctor at the appropriate time?

amirm · Aug 29, 2014

arnyk said:
amirm said:

Is it your opinion that they have complied with this part of the list to generate "reliable" and "sensitive" results?

arny said:

Ten (10) Requirements For Sensitive and Reliable Listening Tests

(1) Program material must include critical passages that enable audible differences to be most easily heard.

Click to expand...

This was the thesis of the test, so it could not be complied with. It was the independent variable.

arny said:

(2) Listeners must be sensitized to a audible differences, so that if an audible difference is generated by the equipment, the listener will notice it and have a useful reaction to it.

Click to expand...

Since the difference is entirely in the ultrasonic range, our best science says that it is not an audible difference. Again it is part and parcel of the thesis of the experiment.

arny said:

(3) Listeners must be trained to listen systematically so that audible problems are heard.

Click to expand...

Again the difference is entirely in the ultrasonic range, our best science says that it is not an audible difference. Again it is part and parcel of the thesis of the experiment.

Is it your opinion that they have complied with this part of the list to generate "reliable" and "sensitive" results?

Click to expand...

The question above is the question that the experiment attempted to develop relevant evidence to perhaps be used to address it. Therefore trying to comply with these guidelines is itself a conundrum or impossible mission.

Click to expand...

It is not uncommon for people who are hypercritical and don't really understand experimental design or the problem at hand to unknowingly add impossible requirements to a difficult experiment.

Is it you Arny that typed this message or has someone taken over your account here? It is hard to believe you typed those words.

Addressing the last two parts, that is textbook definition of circular reasoning and biased experimentation. Are you with a straight face using your assumptions about the outcome of a test, i.e. "ultrasonics are not audible" to create an experiment to see if "ultrasonics are audible?" We run experiments because we are not convinced of theory. If we are not convinced of the theory then you can't use the theory to create the experiment. By your logic no test should have been run at all.

I have always said that I fear bias in the person creating an experiment far more than the bias in a listener. And you just proved why that is so dangerous. This is why people are flippant about many of these so called "DBTs." The person taking the test may have been blind but the person cooking the test certainly was not.

Now, if this were any other time, maybe it could be excused. But now? On the heels of a number of people passing your impossible test of the same "ultrasounds," you go on to declare that M&M's test was supposed to show negative outcome? Your test was supposed to do the same, right? You said for 14 years that was the case. All of a sudden it did not, once enough critical energy was put on it.

Likewise, you are continuing to make paper assumptions and applying them to real products. Real products don't perform perfectly. The purpose of such tests is to determine if the imperfections in them are audible. I would think you should not come within a mile of such tests if you are going to assume one outcome and proceed to cook he the test that way by taking away controls, selection of revealing tracks, listener training, etc.

Your top-10 list seems to be a farce as was your statement that you only put limited value on Meyer and Moran. You sang the praises of the test and are now defending it tooth and nail to the point of throwing out the most fundamentals of experimental bias: Thou shalt not be biased....

The hole is certainly getting deeper and deeper.

thedudeabides · Aug 29, 2014

Given the fact (I think) that this is a "factually based, objective" discourse and with all due respect gentlemen, who "won" or is that still undetermined?

arnyk · Aug 29, 2014

thedudeabides said:
Given the fact (I think) that this is a "factually based, objective" discourse and with all due respect gentlemen, who "won" or is that still undetermined?

You are IME correct in that it appears that a search for truth was never the goal of many of the WBF participants. The problem with winning as a goal is that winning is not a distinct goal all by itself. The question becomes: Winning what?

JackD201 · Aug 29, 2014

amirm said:
Is it you Arny that typed this message or has someone taken over your account here? It is hard to believe you typed those words.

Addressing the last two parts, that is textbook definition of circular reasoning and biased experimentation. Are you with a straight face using your assumptions about the outcome of a test, i.e. "ultrasonics are not audible" to create an experiment to see if "ultrasonics are audible?" We run experiments because we are not convinced of theory. If we are not convinced of the theory then you can't use the theory to create the experiment. By your logic no test should have been run at all.

I have always said that I fear bias in the person creating an experiment far more than the bias in a listener. And you just proved why that is so dangerous. This is why people are flippant about many of these so called "DBTs." The person taking the test may have been blind but the person cooking the test certainly was not.

Now, if this were any other time, maybe it could be excused. But now? On the heels of a number of people passing your impossible test of the same "ultrasounds," you go on to declare that M&M's test was supposed to show negative outcome? Your test was supposed to do the same, right? You said for 14 years that was the case. All of a sudden it did not, once enough critical energy was put on it.

Likewise, you are continuing to make paper assumptions and applying them to real products. Real products don't perform perfectly. The purpose of such tests is to determine if the imperfections in them are audible. I would think you should not come within a mile of such tests if you are going to assume one outcome and proceed to cook he the test that way by taking away controls, selection of revealing tracks, listener training, etc.

Your top-10 list seems to be a farce as was your statement that you only put limited value on Meyer and Moran. You sang the praises of the test and are now defending it tooth and nail to the point of throwing out the most fundamentals of experimental bias: Thou shalt not be biased....

The hole is certainly getting deeper and deeper.

Test cooking or "gaming" for me is what gives testing a bad name from a test subjects point of view.

thedudeabides · Aug 29, 2014

arnyk said:
You are IME correct in that it appears that a search for truth was never the goal of many of the WBF participants. The problem with winning as a goal is that winning is not a distinct goal all by itself. The question becomes: Winning what?

That would be, from your "O" perspective, for you to decide.

As previously stated, this whole discussion IMHO, has been (from an "S" perspective) totally irrelevant and meaningless as to why I, and others, listen to and enjoy music.

amirm · Aug 29, 2014

arnyk said:
You are IME correct in that it appears that a search for truth was never the goal of many of the WBF participants.

Once more Arny, you and others created these tests. No WBF participant created the tests. You created one of them. You said in 14 years no one could pass it. Now people left and right are passing it.

As participants we were challenged to come up with positive results and claim made that such an outcome would not be possible. We proved otherwise. This is a major advancement in these discussions. Never in the history of these arguments had anyone taken an impossible test by Arny and got positive results and not just once, but through a number of iterations.

The problem with winning as a goal is that winning is not a distinct goal all by itself. The question becomes: Winning what?

Winning an argument. Your argument was that even 32 Khz sampling is transparent. You created a test with the single goal of proving that. Well, it backfired. Passing 32 Khz was easy and some went on to pass it at 44 Khz. Yourself included at the end!

So we have advanced the discussions. No longer can you claim that nobody has ever a) run such tests and b) created positive outcomes.

Orb · Aug 30, 2014

arnyk said:
AFAIK the first whistle blower was David Greisinger who wrote this paper:

"Perception of mid frequency and high frequency intermodulation distortion inloudspeakers, and its relationship to high-definition audio." and first presented it in 2003 at the AES 24th International Conference, Banff, Alberta, Canada June 26-28. http://www.davidgriesinger.com/intermod.ppt. I will take the liberty of putting he and I in the same general camp of audio. Clearly an objectivist.

I am chuffed as I have been posting about this guy a good 50 pages back or so and twice provided links to that paper.
His background is/was Chief Scientist for Harman Speciality Group Lexicon, and has a great background in science and engineering.
So you accept his following conclusion, which ties in with what JA was saying and his results as the two products he tested that had -50db IMD related products were in fact clipping with a 0dbfs signal - hence lowering the signal removed notable IMD:

Griesinger said:
When the ultrasonic signals only were played at high levels, intermodulation products from the input signals were easily heard
- at levels consistent with amplifier distortion
....

Amplifier distortion can produce distortion products below 20kHz that are audible (with difficulty) in the absence of other signals below 20kHz.
But with a high quality amplifier these distortion products are not audible in the presence of even extraordinary ultrasonic sources such as rattling keys.
Unless the amplifier is driven into clipping

Furthermore Griesinger mentions these were only heard IF one removes the sub 20khz source content, in other words nothing there to mask those products.
While some may pick up on his use of "high quality amplifier", worth noting that JA tested products that would be the worst case scenario design (integrated functionality small to very small form factor DAC-PRE-PWR-Headphone/line out) and outside of clipping all gave good results.
This fits in with Tony Lauck view (apologies oversimplifying position) that if a commercial product fails on such parameters, one should buy something else if looking to do testing and/or hirez-critical listening.
Thanks
Orb

Orb · Aug 30, 2014

esldude said:
I made that mistake about this thread too Arny. This thread was never about ultrasound, or even high rez. It was an attempt to critique how forum ABX tests are done and that they have some deficiencies when used as "proof". I agree with that premise, but regret taking part in what seems a cultivated misdirection to introduce one idea pretending to talk about another.

Ah Esldude sorry you feel that way.
This thread to me is about several different facets.
- Trained listeners and assumptions; To begin with and narrative from some is that no-one has passed these tests to show audible difference, this was initially "proved over at AVSF where everyone taking the test failed.
Until Amir passed and provided his results several times, same for a couple here. Another change was how those passing also changed their approach/listening methodology in doing the ABX (so in reality could be deemed casually trained via forum as guided in this approach by some members)

- Now there has been successful passes, assumption by some it must be IMD creating cues; this was followed up with actual specific measurements and some doing further testing with additional hardware to ensure this can be removed as a consideration, in practical terms IMD is not an issue unless hardware is pushed into clipping and tbh this breaks ABX framework anyway.
A side debate may be why use jangling keys (Griesinger and also Boyke versions would be more ideal as they have near comparable sub 15khz peaks but that is still high frequency) as their nature compounds conclusions; as seen by the arguments.
However this is also negated by hirez music that has a different nature in where peaks reside.

- Hearing ultrasound; lets be honest, no-one in this thread has even suggested that is what is being picked up.
Suggestions have been around either how source digital is "handled" (this can be applicable to both studio and home), or while not specific to this test the actual benefit of native hirez is flexibility with regards to interpolation filter used by DAC and specifically slow rolloff/minimum phase no longer dropping from 16khz onwards (I appreciate these depends upon the coefficients and can be before sin(x)/x or closer to 20khz depending upon design) as it would normally for CD related sampling rates.
Also it has been mentioned that again while not relevant to those in this thread, IMD consideration for a digital product would be for stopband/alias rejection strength-weakness; specifically as extreme example NOS DACs that suffer high IMD.

- Managing clipping; comes back to some extent with IMD that involved creating IMD products in the audioband, tbh as I mention to Arny this is something of an expectation-management that should had been set with those doing this test over at AVSF, especially when provided tones to test for IMD where nearly all who heard IM related distortion mention their system has IMD problems even though they must turn the volume up loud (not understanding clipping-stress testing hardware that breaks ABX framework).
That said several of those do understand this and managed their test, and went further by including using multiple hardware solutions.

- Framework and implementation of the ABX test; most of these tests are unfortunately hobbyist in approach (include M&M in that as well), this is further compounded that scope-focus-conclusion are usually overextended by those creating a narrative.

Think that sums up the primary facets that are of interest in this thread.
Thanks
Orb

arnyk · Aug 30, 2014

amirm said:
Once more Arny, you and others created these tests. No WBF participant created the tests. You created one of them. You said in 14 years no one could pass it. Now people left and right are passing it.

Really?

I see no clear trend. I don't see dozens of people who claim to have obtained statistically significant results that they think is valid. Are you the only one, Amir? I don't know.

There are a number of people who post only on AVS who have said that they obtained a statistically significant result (including myself) but they admit that they gamed the test.

I see no evidence that anybody who posts solely on WBF has given any of them a serious try. A very conservative bunch, some of whom make themselves amusing by pontificating on ABX tests having never deigned to actually try one, even though the bar for do so is very low. The resulting false claims about ABX provide additional levity or consternation for those who are more knowledgeable.

It seems reasonable that more people have tried and failed but kept their failures on the QT. A number of people who post on AVS seem to have tried and failed and admitted it publicly.

I'm not keeping score because the picture is so confused.

All of the hyper concern over the do-it-yourself IM testing sends a strong message that some people really badly need these tests to have a positive result. They may fear anything that raises questions. That is not exactly a condition that heightens objectivity.

Listening tests related to hearing the removal of ultrasonic sounds seem to have additional difficulties that many other kinds of listening tests lack. It is hard to contrive a positive training sequence for things that science says absolutely posmatively can't be heard. In the case of things like IM, Jitter, and the like it is relatively easy to jack the artifacts up to the point where they are heard by everybody and then back them down to actual levels in small steps. When the artifact is generally considered to be inaudible at any practical SPL level, that is not so easy.

amirm · Aug 30, 2014

arnyk said:
Really?

I see no clear trend. I don't see dozens of people who claim to have obtained statistically significant results that they think is valid. Are you the only one, Amir? I don't know.

Oh, there is a trend alright. Or should I say a bit of a snowball. You can miss it if you are not following the real story (see Orb's excellent summary).

Until now, our camp had done such a successful job of biasing all listeners into thinking these tests are unbeatable. This is a masterful version of it as I quoted in this first post of this thread:

arnyk said:
Yes. Take the best audio system you can find. Take the best recordings you can find - recordings that sound great and also have significant content > 20 KHz, even > 35 KHz. Switch a 16 KHz brick wall filter in and out of the signal path. Nobody notices nuttin'

People say: "But I can hear pure tones at 21 KHz". Probably true. But that is without content at other frequencies masking it. Music is composed of many tones at many different frequencies. Masking in the upward direction frequency-wise is very strong.

"Nobody notices nuttin'." Love that southern charm

. The trick was to get past that experimenter bias. Unlock the door so that others have a chance to get through. So I did a listen, found the difference pretty easily and post the results. So I didn't have to "hang it up." What a relief that was

.

The second sentence is quite ironic seeing how to determine if "IM distortion" was behind the positive outcome, you created two ultrasonic tones with absolutely nothing below 20 Khz to mask the distortions (again as Orb mentioned). Masking? Who needs masking if you are trying to invalidate positive results. Not only that, but let's boost the levels of ultrasonics beyond any recorded music.

There are a number of people who post only on AVS who have said that they obtained a statistically significant result (including myself) but they admit that they gamed the test.

Again, the real story is not understood. None of these people would have tried to pass these tests had they not seen the positive outcome from my testing. Once that was there, they were determined to find the differences. Immediately they started on the path of becoming critical listeners!

It is not like with those cheats it is easy to find differences. It took you days to find the differences you did and I don't remember you saying you had cheated. You said you heard the difference due to IM distortion which we know doesn't hold water if we consider your masking comment.

If you are now saying there is no real audible difference and you managed to game the system, let's hear that. I asked this question repeatedly but you would not answer on AVS. It would be good to clearly get it on the record.

This is why I put "proof" in quotation mark. We didn't prove ultrasonics are audible. We proved that this "dare tests" can be passed. In your case and others by cheating. In my case, immediately listening to the files blind with no regard to which file was which. My training told me that there was a chance I could detect differences and I was not swayed by the experimenter bias of "Nobody notices nuttin'"

This is not to brag. Your doctor is not bragging when he uses his medical training to figure out what is wrong with you when you don't know. As I have explained, my hearing ability and familiarity with these tests comes from it being part of my job.

I see no evidence that anybody who posts solely on WBF has given any of them a serious try. A very conservative bunch, some of whom make themselves amusing by pontificating on ABX tests having never deigned to actually try one, even though the bar for do so is very low. The resulting false claims about ABX provide additional levity or consternation for those who are more knowledgeable.

I gave it more than a serious try. Heck of a more serious try than you gave yourself to it! You still have not run and reported your results on Scott's test. If you are so good at cheating in these tests, how come you have not provided those? How about your own "jitter" tests? We didn't see the outcome from your testing on those either (with the exception of a silly one).

You should have post the results of your test before you asked anyone else. You champion these tests yet you are the first one to be shy and afraid to post the results. And then proceed to put others in bad light for not running your test?

This is not an argument with the subjecitivists Arny. They don't claim to pass these tests or have interest in them. The tests are for *us*. The "objectivists." It is for us to determine if our assumptions are rooted in experimentation or stuff that makes sense to our belly. Now we have that data and in front of the subjectivists no less, throw out such things as masking, create artificial tests with near clipping ultrasonics, look for any excuse to dismiss the results of the test, etc. You don't think these things will come back to haunt us? I assure you they will and they have already.

It seems reasonable that more people have tried and failed but kept their failures on the QT.

Which is what you have done on other tests not yet reported despite numerous requests for you to do so.

amirm · Aug 30, 2014

arnyk said:
I'm not keeping score because the picture is so confused.

The picture is abundantly clear. Again, please read Orb's summary.

All of the hyper concern over the do-it-yourself IM testing sends a strong message that some people really badly need these tests to have a positive result. They may fear anything that raises questions. That is not exactly a condition that heightens objectivity.

No fear whatsoever. You put the test forward and I ran it and reported on the results. Anyone else would have not given you a second chance seeing how you were wrong on the first one that the key jingling sounded the same down to 32 Khz.

Here is how that interchange went:

arnyk said:
No there is a fourth possibility that seems to have somehow slipped through the cracks.

The above results are not the least bit remarkable if the monitoring system is less than audibly linear over the frequency range of the program material that is involved.

A listening test was developed to test any monitoring system for linearity in the range from 20 KHz to 48 Khz more or less. It was made available in this thread yesterday morning.

Only one person has tried this test, and analysis of his results are being withheld from publication pending other relevant people attempting to use it. This person's name is not Amir. Amir has some unfinished homework!

Any listening test results using an untested monitoring system are potentially bogus.

So confident you were that yet again I would fail that test. Well, I did not. As I reported, not only did my system not have "IM distortion" but I also pointed out a flaw in what you said should be audible: http://www.avsforum.com/forum/91-au...cott-s-hi-res-audio-test-77.html#post25974394.

Listening tests related to hearing the removal of ultrasonic sounds seem to have additional difficulties that many other kinds of listening tests lack. It is hard to contrive a positive training sequence for things that science says absolutely posmatively can't be heard. In the case of things like IM, Jitter, and the like it is relatively easy to jack the artifacts up to the point where they are heard by everybody and then back them down to actual levels in small steps. When the artifact is generally considered to be inaudible at any practical SPL level, that is not so easy.

Yes, they do have "additional difficulties." And you discovered that now, after how many years Arny? Decades?

As to the rest of that comment, in a lossy codec the system uses psychoacoustics model of the hearing system to remove things that you are not supposed to hear. Yet as I reported, and expert listeners elsewhere attest, you can learn to hear those artifacts that "science says absolutely posmatively can't be heard."

So what went wrong? There is no science that says "absolutely posmatively can't be heard." Real systems are not like the ones on paper. Ultrasonics don't just vanish by waving your hand. Remove the extra bits and you immediately create distortion. Add dither and you convert that to increased noise. These things can become audible based on what we know about hearing acuity. The real science and not stuff we say on forums.

This is what I post from one of your mentors in the debate thread:

AES paper: A Digital-Domain Listening Test for High-Resolution
John Vanderkooy
Department of Physics and Astronomy, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1

At the outset, let me state my bias that CD-quality audio
(44.1 kHz, 16 bit) is essentially transparent. Under
pristine conditions it may just be possible to hear
residual channel noise. Listening tests [1] have agreed
with this position, but there is much anecdotal evidence
that higher sampling rates and longer wordlengths have
significantly superior performance

See? It doesn't say it is impossible to hear chanel noise in 44.1/16. "Essentially transparent" doesn't say "transparent." It means for vast majority of people this is so. But some people should be able to hear the channel noise.

So let's not invent voodoo audio science to explain away these results. Just because we declare something using layman assumptions of ideal systems, etc. doesn't mean it is

amirm · Aug 31, 2014

Scott did an interview with Jon Iveson of stereophile on this topic and others. It is a bit long so I watched it at 1.5x speed:

arnyk · Aug 31, 2014

amirm said:
This is what I post from one of your mentors in the debate thread:

AES paper: A Digital-Domain Listening Test for High-Resolution
John Vanderkooy
Department of Physics and Astronomy, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1

At the outset, let me state my bias that CD-quality audio
(44.1 kHz, 16 bit) is essentially transparent. Under
pristine conditions it may just be possible to hear
residual channel noise. Listening tests [1] have agreed
with this position, but there is much anecdotal evidence
that higher sampling rates and longer wordlengths have
significantly superior performance

See? It doesn't say it is impossible to hear channel noise in 44.1/16.

Absence of evidence is't evidence of absence. I've read the paper cited above and it pretty well describes the tests many of us have been doing all along. I see people's inability to read the paper being exploited to create false impressions.

arnyk · Aug 31, 2014

When you listen to these files, remember that "Blind Test" generally means "Single Blind Test" and that a single blind test is just a sighted evaluation with the some of the appearance of a DBT , but not the real thing.

It is true that DBTs and a SBTs are both blind tests, but that does not mean that they are any way equivalent even though they may look that way to a casual viewer of the test.

A DBT experiment is an experiment in which any information about the test that might lead to bias in the results is concealed from both the tester(s), and the subject(s). There is nobody ho knows the correct identity of the unknowns that has any visibility at all to anybody who can possibly affect the outcome of the test.

A SBT experiment is an experiment in which any information about the test that might lead to bias in the results is concealed from the subject(s) but not the tester(s) or other persons who are present during the test. There is now someone who knows the correct identity of the unknowns who has some kind of visibility to somebody who can possibly affect the outcome of the test.

For many years SBTs were considered to be adequate until the well-known case of "Clever Hans the Talking Horse" in the early 1800s. Clever Hans was a horse who amazingly enough tapped out correct answers to difficult questions with his hooves. The tests were SBTs in that someone who knew the correct answers was visible to the horse and communicated with the horse whether intentionally or unintentionally via his body language. When the tests were upgraded to DBTs by removing anybody who knew the correct answers from the presence of the horse, the horse suddenly lost his question-answering abilities.

A listening test can't be done with the listener acting totally in a vacuum. The listener needs to know which trial is the current trial, for example. The easiest way to do a double blind test is to prepare a script of the trials and correct identities of the unknowns and give it to someone who controls the technical side of the test, and for example does the switching. This person is completely concealed from everybody else who is involved. A second person calls out the trial numbers, which the concealed person uses to do the switching, and the listeners use to record their results.

The first innovation of ABX was to build a machine that was a mechanical test coordinator that controlled the unknowns and kept the listeners updated as to trial numbers, etc. The second innovation was the ABX comparison method itself which allows sighted evaluations for the purpose of learning what to listen for, and blind tests for the purpose of determining the outcome of the test.

Orb · Sep 1, 2014

arnyk said:
Snip...
A SBT experiment is an experiment in which any information about the test that might lead to bias in the results is concealed from the subject(s) but not the tester(s) or other persons who are present during the test. There is now someone who knows the correct identity of the unknowns who has some kind of visibility to somebody who can possibly affect the outcome of the test.

For many years SBTs were considered to be adequate until the well-known case of "Clever Hans the Talking Horse" in the early 1800s. Clever Hans was a horse who amazingly enough tapped out correct answers to difficult questions with his hooves. The tests were SBTs in that someone who knew the correct answers was visible to the horse and communicated with the horse whether intentionally or unintentionally via his body language. When the tests were upgraded to DBTs by removing anybody who knew the correct answers from the presence of the horse, the horse suddenly lost his question-answering abilities.

Just to expand on this as fits nicely, you can do screened SBT though.
There has been modern examples of the Clever Hans, usually dogs and a university studying this use screened SBT; the challenge is that one ideally still needs the owner to engage with the animal.
No animals passed the screened SBT.
So this can be applied to other tests including audio where one is a controller, although I appreciate not possible or ideal with all test scenarios-environments.

Funny example of training animals to pick up on body language/facial cues; a zoo in conjunction with a university trained one of their eagles to identify when a person lied.
It was freakishly good and the way the eagle reacted to when the lie started or the person was about to lie. also amusing the human reaction being studied by the eagle when they realised they could not get a lie through

Was quite a few years ago, great fun test though.

Cheers
Orb

amirm · Sep 1, 2014

arnyk said:
When you listen to these files, remember that "Blind Test" generally means "Single Blind Test" and that a single blind test is just a sighted evaluation with the some of the appearance of a DBT , but not the real thing.

It is true that DBTs and a SBTs are both blind tests, but that does not mean that they are any way equivalent even though they may look that way to a casual viewer of the test.

Of course single blind test can be the "real thing." What makes it the "real thing" is keeping experimenter bias out of the equation. Double blind testing takes care of some of that but leaves gaping holes otherwise.

As an example, I can create an MP3 file that will sound identical to the source at 128 kbps to a hundred people with your listening abilities Arny. You can run that double blind all you want and the results are still corrupt. The corruption got introduced in how I selected the files and the listeners.

What makes double blind tests far worse in this regard is that those three letters, DBT, give false sense of security to people reading the results. This is not a problem in the industry/research but in online forums, people equate those letters with "science." They automatically believe the outcome above and beyond any other type of test.

By the same token, the slant that Single blind test is "not the real" thing and somehow inferior to dbt is absolutely wrong. Take the above MP3 test. This time I select the MPEG codec test clips, and trained listeners. I will run it single blind with me switching files behind a curtain on instructions from the tester. I will be absolutely quiet and follow the instructions as given. The outcome will be that testers will hear the difference. We have run such tests double blind so we know that outcome is correct either way.

The talk about Clever Hans and such is just that: talk. You have to look past SBT/DBT and read and understand the test. Neither three-letter acronym helps you in figuring out if the test is valid or not. I will take a properly done SBT over a thousand poorly done, DIY DBTs any day of the week and twice on Sunday.

I challenge you Arny to show how you can corrupt the outcome of a single blind test like the one I have outlined above. And how the DBT results of the one before it is valid.

esldude · Sep 1, 2014

amirm said:
snippage......

As an example, I can create an MP3 file that will sound identical to the source at 128 kbps to a hundred people with your listening abilities Arny. You can run that double blind all you want and the results are still corrupt. The corruption got introduced in how I selected the files and the listeners.

snippage..................

I challenge you Arny to show how you can corrupt the outcome of a single blind test like the one I have outlined above. And how the DBT results of the one before it is valid.

The first DBT is valid, and not corrupt if you are honest about selected listeners with a certain lower than average level of hearing acuity, and that you selected files with a particular set of criteria. Further if the claim for the results were evidence certain hearing limitations and sound types 128 kbps is transparent to those listeners for those purposes.

amirm · Sep 1, 2014

esldude said:
The first DBT is valid, and not corrupt if you are honest about selected listeners with a certain lower than average level of hearing acuity, and that you selected files with a particular set of criteria.

And if I didn't? The test would still remain double blind yet generate an outcome which is wrong. This is the point I was making.

Further if the claim for the results were evidence certain hearing limitations and sound types 128 kbps is transparent to those listeners for those purposes.

What if those claims were universal? The test is still double blind yet generates an outcome which is wrong. This is the point I was making.

As you are doing, what makes the outcome valid or not are such details. The DBT moniker did you no good to make any of these assurances.

Now imagine everything you said should be there above existed but the test was single blind. Then compare that to a double blind test where none were there. Which one would you take? I know I will opt for the single blind one that is correct otherwise. And in no way did the single blind degrade to a sighted test as Arny implied.

arnyk · Sep 2, 2014

amirm said:
Of course single blind test can be the "real thing."

If wishes were fishes...

What makes it the "real thing" is keeping experimenter bias out of the equation.

How do you do that when the experimenter knows the right answers and can affect the outcome of the test? Ever since Clever Hans the answer for thoughtful people has been: "The human ability to control its personal bias is questionable at best".

Double blind testing takes care of some of that but leaves gaping holes otherwise.

Typical of the many golden eared audiophile attempts to stigmatize DBTs for problems that they share with any any subjective test or more general than that, any audio test.

Just to clarify an obvious misapprehension, DBT is not a complete experimental design, it is a critical part of one.

Anybody who reads BS 1116 should be able to see that. BS1116 says that the test should be a DBT and then it specifies a whole lot of other test conditions to come up with a semblence of a good experimental design.

Conclusive "Proof" that higher resolution audio sounds different

Banned

Banned

Well-Known Member

New Member

WBF Founding Member

Well-Known Member

Banned

New Member

New Member

New Member

Banned

Banned

Banned

New Member

New Member

New Member

Banned

New Member

Banned

New Member

Similar threads