Conclusive "Proof" that higher resolution audio sounds different

It seems that more effort is being expended on finding ways to deny the relevance of the Foobar ABX test than on finding a better way to administer it. I've mentioned it before, but would it not be a good idea to find ways of putting some positive & negative controls into the audio files, to act as a pre-test before the actual test? Has anybody ever followed this sort of procedure: a pre-test which verifies & qualifies (or calibrates) the listener & his equipment/environment?

These controls should be enough to address the sort of objections that I've seen raised here & elsewhere. So let's see what ways a future test could avoid these objections, or are all future positive Foobar ABX results going to be objected to:
- IMD in the equipment
- level differences of 0.2dB (is this really audible? Why not include one as a control & see?)
- timing differences (what level is audible?)
- dither differences (again, test for its audibility)
- proctoring (now this really is a killer objection; I thought the computer was the proctor in this test?)
- include known differences that should be audible to normal listeners (what might these be?)

I'm sure there are others that I'm missing - just think of all the objections & devise a mini-test which removes each one as an objection once the test is run. Carefully designed controls could also answer how expert a listener needs to be to pass; a sketch of one such control is below. Any other suggestions?
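To make the idea concrete, here is a minimal sketch of building the 0.2dB level-difference control mentioned in the list above. It assumes Python with numpy and soundfile installed; the file names are placeholders, not files from this thread:

```python
# Hypothetical sketch: derive a control pair from one source file so
# listeners face a known, documented 0.2 dB level offset as a
# calibration trial before the real test.
import numpy as np
import soundfile as sf

data, rate = sf.read("reference.wav")            # placeholder source clip
gain = 10 ** (-0.2 / 20)                         # -0.2 dB as a linear factor (~0.977)
sf.write("control_A.wav", data, rate)            # unchanged copy
sf.write("control_B.wav", data * gain, rate)     # same audio, 0.2 dB quieter
```

If a listener/system can't tell this pair apart, that in itself calibrates what small differences the setup can resolve, which is exactly the point of running positive & negative controls.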

ArnyK, given your many objections, how would you now run the test?
 
Well, listen and see if you find areas that sound slightly different to you, then narrow down and listen intensively to short segments. Even if it involves an entire song of a few minutes, you don't listen to whole songs; you find a segment of a handful of seconds and compare those. And your files were a good example. Even at only 12 seconds I couldn't ABX them whole, but sliding around, listening to parts here and there, the 2.9-second segment I settled on was discernibly different.
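For anyone who wants to try this workflow, a minimal sketch of carving out a short segment to loop, assuming Python with soundfile installed (the file name and times are purely illustrative):

```python
# Hypothetical sketch: extract a short window from a longer clip so the
# same few seconds can be looped and compared in an ABX tool.
import soundfile as sf

data, rate = sf.read("test_clip.wav")            # placeholder file name
start, length = 7.5, 2.9                         # seconds; illustrative values
segment = data[int(start * rate): int((start + length) * rate)]
sf.write("segment.wav", segment, rate)
```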

And it's the way I have been suggesting for a long time as well. Just to expand on your post, it's worth emphasising that this does not mean switching between A-B-X sequentially while the audio is playing; as you say, you focus on a specific "event" over a very small time frame/window, keep replaying that event, and then switch for comparison (with reiteration again).
Switching sequentially while the audio is still playing means the event (anchor point) is gone. However, I appreciate this is rather oversimplifying, because the anomaly/trait/behaviour may appear multiple times in the segment, and it comes down to knowing what to listen for/isolate and being consistent in tracking that anomaly/trait.
Everything else on this has been mentioned already, so I'll leave it at that, but I thought the above added nicely to esldude's post.

Edit:
The above is in the context of this thread and audibility in terms of a specific anomaly/trait, rather than looking for a broader behaviour/mechanism such as FR/loudness/time sync offset, where of course sequentially switching continuously playing music can help.
Cheers
Orb
 

Yes. I, for instance, listen to A, B, A, B, A, B and then X. Amir does this a bit differently. But each time I was listening to a couple of seconds; A to B meant it played A and then played B using the same segment.
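As a rough illustration of that presentation order, here is a toy sketch of a single trial with X assigned at random. The play() function is a placeholder for whatever player/segment looper is actually used; this is not any existing tool's logic:

```python
# Toy sketch of one ABX trial in the A, B, A, B, A, B, then X style
# described above. play() is a stand-in for real audio playback.
import random

def play(label, clip):
    print(f"playing {label}")                    # placeholder for actual playback

def abx_trial(clip_a, clip_b):
    x_is_a = random.random() < 0.5               # X must be re-assigned every trial
    for _ in range(3):                           # the A, B, A, B, A, B passes
        play("A", clip_a)
        play("B", clip_b)
    play("X", clip_a if x_is_a else clip_b)
    guess = input("X is (A/B)? ").strip().upper()
    return guess == ("A" if x_is_a else "B")
```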
 
.......

Post #2 was about the AIX music files, which subsequently have been found to have a level difference and what looks to be a sub-sample timing shift.....
Just to add, Amir pointed out that the last set of AIX music files did not have a level difference, only the sub-sample time shift; I think this is mentioned just a few pages back by Amir in response to one of my posts.

Cheers
Orb
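As an aside, both of those properties are easy to check for yourself. A minimal sketch, assuming Python with numpy and soundfile and placeholder file names; np.correlate is O(n^2), so trim to a short section first:

```python
# Hypothetical sketch: estimate the level difference and the
# (sub-sample) time shift between two versions of the same clip.
import numpy as np
import soundfile as sf

a, rate = sf.read("version_A.wav")               # placeholder file names
b, _ = sf.read("version_B.wav")
if a.ndim > 1:                                   # keep just the left channel
    a, b = a[:, 0], b[:, 0]
n = min(len(a), len(b))
a, b = a[:n], b[:n]

# Broadband RMS level difference in dB
level_db = 20 * np.log10(np.sqrt(np.mean(a ** 2)) / np.sqrt(np.mean(b ** 2)))

# Integer-sample lag from the cross-correlation peak, refined to
# sub-sample precision by parabolic interpolation (assumes the peak
# does not sit at the very edge of the correlation)
corr = np.correlate(a, b, mode="full")           # slow: use short clips
peak = int(np.argmax(corr))
y0, y1, y2 = corr[peak - 1], corr[peak], corr[peak + 1]
frac = 0.5 * (y0 - y2) / (y0 - 2 * y1 + y2)
shift = (peak - (n - 1)) + frac
print(f"level difference: {level_db:+.3f} dB, shift: {shift:.3f} samples")
```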
 
Great, Tim, it's that sort of stress that can lead to a null result - did you do either :)?

Did them both, and then some. It's amazing how much you can get done when a job isn't taking up most of your time.

Tim
 
...expert listeners are a requirement for any listening test that is searching for small differences.

The above has been generally understood for decades, if not half a century or more. The controversy has been over what constitutes such a being, and how such beings might come into existence.

Historically, there has been a striking lack of objective means for establishing whether or not a certain individual was a trained listener, and trained for what. Most so-called trained listeners were self-appointed and allegedly proved their mettle by means of sighted evaluations.

But up to now, it was assumed that there is no such class of listeners.

?????

Therefore we could take the outcome of any blind tests, regardless of the skill of listeners, and apply it to the audiophile population at large.

??????

I don't remember reading about any of the DIY ABX tests that had expert listeners.

That would seem to speak to a less-than-encyclopedic knowledge of the art of listening over the past 40 or more years.

It was thought until now that such ABX tests could not generate positive results.

Ditto. A key requirement for positive results in ABX tests is the existence of an actual audible difference. Since the previous gold standard for listening evaluations was the totally-flawed sighted evaluation method, there has been a lot of confusion about what constitutes an audible difference. Scientists who had been doing reliable listening tests for the better part of a century knew, but audiophiles and audio practitioners had been largely kept in the dark.

This is the most important lesson here. It should lead to more careful tests and more reliable results.

It did, and has for decades.
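For readers following along, a "positive result" here has a concrete statistical meaning: the score must be unlikely under pure guessing. A minimal sketch of the usual one-sided binomial calculation (the 13/16 figure is my own example, not a score from this thread):

```python
# Chance probability of scoring at least `correct` out of `trials` in a
# forced-choice ABX test where guessing is right half the time.
from math import comb

def abx_p_value(correct, trials):
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

print(abx_p_value(13, 16))    # ~0.0106: 13/16 would rarely happen by luck
```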
 
ArnyK, given your many objections, how would you now run the test?

If I had a free choice in the matter a lot more tests would be run and reported, and there would be more careful proctoring.

I suspect that far more tests are being run than reported, but the unreported tests mostly had negative outcomes. Rather than be labelled a tin ear, people just keep their negative outcomes to themselves.

While the test tools themselves could have more self-checking, there's nothing to keep someone from totally gaming the system and providing utterly fraudulent results.

For example, falsifying the test logs is trivial. Falsifying the identity of the files being compared is trivial. Test gear could be attached to the monitoring system to reveal the desired answers. While an attempt was made to put some self-testing of the monitoring system into the procedures, the self-tests themselves can be misapplied and misinterpreted. It goes on...
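The log-falsification point at least has a partial technical answer. Here is a sketch of tamper-evident logging, entirely my own illustration rather than a feature of any existing ABX tool, and it only helps if the key genuinely stays with the proctor:

```python
# Hypothetical sketch: sign each log line with a key the listener never
# sees, so after-the-fact edits are detectable by whoever holds the key.
import hashlib
import hmac

SECRET = b"held-by-the-proctor"                  # hypothetical proctor-only key

def signed_line(text: str) -> str:
    tag = hmac.new(SECRET, text.encode(), hashlib.sha256).hexdigest()
    return f"{text}\t{tag}"

def verify_line(line: str) -> bool:
    text, tag = line.rsplit("\t", 1)
    good = hmac.new(SECRET, text.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(tag, good)

entry = signed_line("trial 07: X identified as B (correct)")
print(verify_line(entry))                        # True
print(verify_line(entry.replace("B", "A")))      # False: edit detected
```

It does not answer the deeper objection that a motivated listener controls the whole machine, which is presumably why proctoring keeps coming up.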
 
Well, listen and see if you find areas that sound slightly different to you, then narrow down and listen intensively to short segments. Even if it involves an entire song of a few minutes, you don't listen to whole songs; you find a segment of a handful of seconds and compare those. And your files were a good example. Even at only 12 seconds I couldn't ABX them whole, but sliding around, listening to parts here and there, the 2.9-second segment I settled on was discernibly different.

That is good practice but it is hardly a recent innovation. Virtually every musical selection has portions that don't elicit the most positive results for differences, and others that do. Finding them is of the essence.

For experienced listeners, the lack of discussion of this issue over the decades by golden ears, and all that heavy breathing about long-term listening, was a big tell: these people were probably not actually hearing differences. Instead they were just reporting the false positives that are part and parcel of sighted evaluations.
 
This would also be true in cases where no difference was heard, thus invalidating the null results.

While that is true, negatives in general don't seem to get nearly as much publicity, and for obvious reasons.

In which case, what did you hope to achieve by putting the files up for testing?

See what happens. ;-)

It is like throwing food into a pond that is full of hungry fish.
 
No one here, you included, I am quite sure, actually hears the >20kHz content of the 'high rez' files. *Possibly* the bit-depth difference is audible, depending on how the conversion was done and how abnormal the listening was (the fact that two variables -- SR and bit depth -- were changed is one of the issues making these tests problematic). Beyond that, we are left with either artifacts (from conversion, from the software, from the hardware) introduced into the audible band... or phenomena unknown to science -- and extraordinary claims require far more extraordinary evidence than has been provided thus far.

Well said.
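For anyone who wants to see how much really sits up there, a rough sketch that totals the spectral energy above 20kHz in a file, assuming Python with numpy and soundfile (the file name is a placeholder):

```python
# Hypothetical sketch: fraction of a clip's energy above 20 kHz,
# via a single FFT over the whole (mono-reduced) file.
import numpy as np
import soundfile as sf

data, rate = sf.read("hirez_clip.wav")           # placeholder file name
if data.ndim > 1:
    data = data[:, 0]                            # keep just the left channel
spectrum = np.abs(np.fft.rfft(data)) ** 2
freqs = np.fft.rfftfreq(len(data), d=1 / rate)
ultrasonic = spectrum[freqs > 20_000].sum() / spectrum.sum()
print(f"fraction of energy above 20 kHz: {ultrasonic:.2e}")
```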
 
If I had a free choice in the matter a lot more tests would be run and reported, and there would be more careful proctoring.

I suspect that far more tests are being run than reported, but the unreported tests mostly had negative outcomes. Rather than be labelled a tin ear, people just keep their negative outcomes to themselves.

While the test tools themselves could have more self-checking, there's nothing to keep someone from totally gaming the system and providing utterly fraudulent results.

For example, falsifying the test logs is trivial. Falsifying the identity of the files being compared is trivial. Test gear could be attached to the monitoring system to reveal the desired answers. While an attempt was made to put some self-testing of the monitoring system into the procedures, the self-tests themselves can be misapplied and misinterpreted. It goes on...
So, if I read you correctly, your only solution for a valid test is to have it proctored by someone like J_J?
I take it then that we should, according to this view, discount all unproctored ABX results (positive & null) as suspicious & discount any conclusions/indications/suggestions arising from them?
 
Yes. I, for instance, listen to A, B, A, B, A, B and then X. Amir does this a bit differently. But each time I was listening to a couple of seconds; A to B meant it played A and then played B using the same segment.

Yeah, my approach (a long time ago) was similar to Amir's: use A-B to identify/isolate/target the "event", and then when it comes to the test use only A or B, never both, against X.
Listening to all of A-B-X can conflict with the anchoring (anchoring in this situation being the "event" and the anomaly/trait), and with that confusion the "A" select bias can be triggered.
This bias is well known: when a person is not sure, they are likely to follow a cognitive decision bias, which for many usually means defaulting to a specific value/button/etc.
Quite a long time ago I provided a number of credible biases/JND/perception-testing research papers on this subject that also go beyond the usual scope of the AES; that was in another veeery lengthy thread on biases or ABX, or both, and it is well lost in all those pages :)

Cheers
Orb
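A toy simulation (my own construction, not drawn from the papers Orb mentions) shows why per-trial randomisation of X matters here: a listener who hears nothing and always defaults to answering "A" still scores about 50%, so the default-choice bias cannot manufacture a positive result on its own:

```python
# Toy simulation: a guessing listener with a hard "default to A" bias
# against an X that is randomly A or B on every trial.
import random

trials, hits = 10_000, 0
for _ in range(trials):
    x = random.choice("AB")       # per-trial randomised X assignment
    hits += ("A" == x)            # listener always answers "A"
print(hits / trials)              # ~0.5: randomisation absorbs the bias
```

What the bias can still do is swamp a genuinely small difference with noise, which matches Orb's point about losing the anchor.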
 
Yep, Orb, tricky business eliminating ALL biases, but that area of discussion got bogged down in this thread not so long ago (& it seems the other thread suffered a similar fate?), which is a pity, as there is a lot more learning to be done in that subject.
If you have links to those papers handy & it's not too much bother, I would be interested in reading them, please?
 
Sorry to say, they would be a right pain in the backside to find, as all the papers are from universities or research labs. I spent many hours on this last time putting the right papers together, and they are not all digital, so I had to find the paper copy and then do an internet search so others here could read them :(
If I find a few I will PM you though.
Cheers
Orb
 
No problem, Orb, I thought they might be tricky to find.
 
So, if I read you correctly, your only solution for a valid test is to have it proctored by someone like J_J?
I take it then that we should, according to this view, discount all unproctored ABX results (positive & null) as suspicious & discount any conclusions/indications/suggestions arising from them?

I'm not Arny, but no, I don't think you read correctly. He's saying that he would prefer better monitoring and control to prevent possible errors and cheating that could invalidate the results. It sounds like you're back on your personal definition of valid again, the one that makes the absence of perfect (by your definition) controls and protocols the same as none (sighted long-term listening). This is a misuse of "valid" in the service of rationalization...

val·id

(of an argument or point) having a sound basis in logic or fact; reasonable or cogent.
"a valid criticism"

...and it invalidates your argument. :)

Tim
 
Tim, you seem to be back to your circular argument again. When you say "could invalidate the results", do you mean that you choose which results are valid? On what basis do you decide? From what I can make of his statement, all ABX results which don't have careful proctoring therefore "could be invalidated". My understanding is that this "could invalidate" is sufficient grounds for dismissal of the test.

Misuse of "valid" - semantics, perhaps, Tim? Rigorous, robust, I really don't care what you call it - if I can't trust a test result, I call it invalid, you might have other terms for it.

And what ArnyK was saying is that he doesn't trust these results. This applies to both positive & null results.

Now, I suggested some controls that might introduce more rigour, robustness, validity to the test & hoped that it might be a seed for other suggestions. ArnyK's answer was along the lines that he wanted more tests & reported results along with more careful proctoring. As regards controls, he indicated that there were too many to cover - it ended with the sentence "It goes on......."

So my conclusion, based on his logic, is that all past & future ABX results should be discounted unless his criteria are met. Do you have a problem with this statement/logic, just an issue with the use of the word "valid", or both?

P.S. I suspect that what you are trying to say is along the lines of "science isn't infallible, but that is not a good reason to throw out its conclusions as invalid".
 
I'm not stuck in the circular argument, John. You are. I'm just sticking my head in from time to time to counter that erroneous argument with a bit of reality. Validity, in the sense that you have thrown it around here, is not a question of semantics. You have very plainly stated that if JJ's list of controls and protocols is not met completely, any test is invalid and no better than no test at all (i.e. long-term casual listening). You are wrong, and your error is rooted in a misunderstanding or misuse of the term "valid." You keep misusing it, so you are the one who is circling. It's pretty simple, really.

Tim
 

+1 here.

What I find most interesting is how long-term listening, according to that view, would be the superior alternative to the "flawed" forum ABX. Well, we know where everybody stands. You can have the last word, John ;)
 
