Conclusive "Proof" that higher resolution audio sounds different

I was JJ's (senior) boss.

Absolutely meaningless.

In this specific regard, it should have no weight at all. No one in the journal is vouching for the ethics of the tests conducted by the authors.

Misdirection. The AES review board are not the same as the journal editorial staff.

http://www.aes.org/journal/authors/guidelines/

"Manuscripts are reviewed anonymously by members of the review board. After the reviewers' analysis and recommendation to the editors, the author is advised of either acceptance or rejection. On the basis of the reviewers' comments, the editor may request that the author make certain revisions which will allow the paper to be accepted for publication."

They have simply reviewed a paper and thought it rose above the minimum standard for publication.

False claim. The review board is not part of the journal editorial staff, and the reviewers' identities are concealed from the authors and just about everybody else.

The test could have been completely fake and no one at the Journal would have caught it.

Misdirection. The Journal editorial staff are not supposed to catch such things. The Review Board are.

This kind of myth seems never to die. People keep thinking someone from the journal showed up and looked over people's shoulders to make sure they were testing things correctly and audited the process, when in reality no technical peer review is ever conducted that way.

It is possible that something like an on-site inspection might happen, or the reviewers might inspect the details of the experiment by other means. The designated members of the Review Board do, for sure, interact directly with the authors. Presumably the interactions are designed to conceal the identity of the review board, but I can tell you for sure that their comments are not intentionally veiled. If you recognize someone's identity from their writing style, you might even be correct.

I have explained this so many times, but the myth keeps getting repeated, as above. I know you have seen my explanation, Arny, in various threads.

As demonstrated above, I know for sure that your statements are not always unbiased or even a little bit accurate. I have it on good authority that you are human, and that carries with it the possibility, sometimes even the certainty, of bias and errors.

Why do you keep propagating the myth?

Because there is no myth in what I say. I have intimate knowledge of some of these matters.

And remember, these are the credentials of the authors:


[image: the authors' credentials]


They seem like pretty nice people but fall completely short of the standards by which you judged JJ to be qualified.

The myth here is that all valid qualifications are identical.

They lack any prior experience in this field whatsoever.

That would be an example of making a claim without the evidence to back it. Please explain how you know Meyer and Moran better than I do.

I should be clear that I trust everything they have written as not being fraudulent. Due to the lack of controls, however, I don't trust the results of their work.

I agree with some of your comments and disagree with others. I wouldn't have done the test the way they did, but I'm not the only judge of such things.

There is nothing trivial about it with respect to forum arguments.

Forum arguments are informal communications, not in the same league as papers in refereed professional journals.

There has not been such a development or you wouldn't be here and on AVS arguing so hard to create doubt about it.

I'm hardly arguing at all. I've absented myself from this forum for weeks and months at a time.

The evidence is super strong in two areas:

1. Trained/expert listeners have far better abilities than the general public or even "audiophiles." Failing to use them goes against the best practices of the industry/research community and hence seriously undermines any test that does without them.

Agreed.



2. A test created by Arny himself and positioned as impossible to pass, going as far as saying that 32 kHz sampling is transparent, was falsified. According to your own posts, for some 14 (?) years no one had managed to pass such tests. But now multiple people have (to varying degrees).

Agreed. However, the evidence that has been obtained to date is minuscule and may have serious doubts associated with it.


#1 is 100% supported by industry/research practices. It is a new concept for many on forums but not in the real world. People are now getting educated and hopefully won't go around saying the results of one set of blind tests apply to everyone else.

I've been saying such things on forums for decades. I agree that many audiophiles have no clue about any of this and that even you, Amir, have provided the results of sighted evaluations as earnest proof of your magical audio beliefs in the past several years.

#2 speaks for itself. No longer can you say, Arny, that this or that test shows our hearing is that dull. You created a test for that and we passed it. It doesn't get better than this, as the commercial goes :).

I would seriously hope that it gets far better than this because I have seen far better than this many, many times.
 
Egregious false claim. Many of the tests I was involved with (I did not solely run the test mentioned above) were documented in various places such as Audio Magazine and Stereo Review.
Oh great. Can you please list them and explain what your involvement was in each one?

I don't think that making false claims about other people's work enhances the credibility of one's own work, which is a lesson that some seem to need to learn. In fact, it seems to seriously detract from it. Based on this false claim, which has been repeated in kind several times on AVS, I see good reason to discount every claim about listening test outcomes that some have made. If someone is going to make false claims like this, where will they stop?
It is not a false claim when over days and weeks I asked you for any documented double-blind tests with your name on them, online or offline, and you would only provide the one in print which I posted earlier. And even that one listing took a lot of effort.

But sure, no harm is done. Please provide the information so that we can analyze them.
 
Incorrect. A test proctored by the right people would go a long way to support the test's credibility. I'm not going to be bullied into saying that there is one and only one way that a test can be credible. That would be an excluded middle argument and I try very hard to avoid making them.
So who decides if it's credible - you? You are demurring from saying what controls are necessary for a credible test & yet you are calling the credibility of these positive results into question? I think you forget that tests are meant to be repeatable, yet you are refusing to state what those criteria are, preferring, as you have done for these positive results, to introduce criteria after the test has been performed & the results are not to your liking. You may try playing to the crowd & call it bullying, but this is testing 101, Arny.

Looks to me like yet another excluded middle bully-job. Assigning a value of less than utter certainty and sufficiency to the outcome of a minuscule number of tests is not the same as discounting them. A lot of the ABX tests I've been associated with involved dozens of listeners and a committee of experienced test organizers and proctors.

Would I rate the 1984 "Some amplifiers do sound different" tests far higher? Yes. For the time being I'll leave common sense explanations of why to others, in the hope that there are still people posting here who have some common sense left.
Again, I'm afraid I sense you playing to the crowd & attempting to play the "everyone knows" logical fallacy. Unfortunately, it goes back to my statement about those who have internalised a belief without thinking about it logically.

Everything in the real world has a weight which is somewhere between infinitesimal and huge but still possible to enumerate.

A minuscule number of positive results among a tiny population of mixed results, obtained by a vanishingly small number of unsupervised individuals from various backgrounds, is not very good evidence.
Nobody is saying that the case is closed, by any means, or that a definitive answer has been decided based on these results - all that's being discussed here is the test itself & how it might be improved for a more credible result. As the originator of the test files & the person who proffered it as a challenge, it seems odd that you now don't want to discuss it & how it might be improved. You offered none of the stipulations about proctoring or committees or IMD or cheating, etc. before laying down the challenge to take the test. It seems rather disingenuous to now state that these are credibility issues around the test. These credibility issues apply to both positive & null results so you have wasted everyone's time, apparently.

It grieves me to say stuff like this to adults because they should already know it well. Rush to judgement, anybody?
Oh dear!
 
Why is there still discussion of cheating? Does anyone seriously think that people are deliberately cheating? If so, then man up and make specific accusations. Cheating in the kinds of tests we are discussing is akin to cheating at solitaire.

It would be helpful if there were suitable test tools. I don't plan to do any more ABX tests until the "click" problem with PC ABX is fixed or until other software is available that avoids this problem. I am surprised that the PC ABX developers did not detect and/or fix this bug as its existence renders the software useless for its intended purpose. It took me less than five minutes to discover this bug after playing with the stop and start feature of the user interface.

It would also be helpful to investigate and characterize bit depth and audio bandwidth (sample rate). Testing one of these at a time makes sense. There are many alternatives to bit depth conversion and there are many alternatives to filtering associated with sample rate conversion. Current tools such as iZotope RX provide many choices of algorithms and/or parameters for each of these.
 
Absolutely meaningless.
Subjective opinion stated on a forum and hence not worth anything. :D

Misdirection. The AES review board are not the same as the journal editorial staff.

http://www.aes.org/journal/authors/guidelines/

"Manuscripts are reviewed anonymously by members of the review board. After the reviewers' analysis and recommendation to the editors, the author is advised of either acceptance or rejection. On the basis of the reviewers' comments, the editor may request that the author make certain revisions which will allow the paper to be accepted for publication."
??? The link you provided says this: Journal Author Guidelines. And this is what I said: "No one in the journal is vouching for the ethics of the tests conducted by the authors." My terminology is exactly the same as theirs whereas yours with "AES review board" and such is not.

That aside, the link completely torpedoes your position that publication by the journal provides assurances against fraud. As you see, they simply review a paper. They don't make site visits. They don't stand behind the people taking the tests. They don't interview witnesses. They don't audit the records. Nothing other than reviewing the paper which is what I said.

As I said, it is a complete myth that journal review provides the assurances you claim. Your own reply here confirms the same. It is a problem of not being in the industry/research community and not knowing what it means to review papers for a journal. Lay assumptions are made as if this were the FDA accepting drug research or some such thing.
 
Oh great. Can you please list them and explain what your involvement was in each one?

Not worth the trouble, given how some have twisted and mangled what little I have revealed, such as the above.

It is not a false claim when over days and weeks I asked you for any documented double-blind tests with your name on them, online or offline, and you would only provide the one in print which I posted earlier. And even that one listing took a lot of effort.

But sure, no harm is done. Please provide the information so that we can analyze them.

It appears that some are not familiar with this truism: "Absence of evidence is not evidence of absence". Educating allegedly intelligent adults about the meaning of that truism is not my job, especially given how many times I've already tried and failed. Many of my past DBT exploits are documented in the Usenet archives, which appear to be online and freely accessible.
 
I don't. This is getting very tiring. I guess I need to point out a few things about statistics 101. Statistics are based on more than a tiny number of samples. The samples don't have to all go the same way, but in the end a lot of them do have to go the same way.
You made a statement that not all results are going the same way as if this was some basis for doubting the ones that do show positive results. As per your stats 101 it is only necessary for a statistically significant number of positive results to be confirmed & the greater the sample size becomes, the stronger that conclusion becomes.
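
For readers keeping score on the statistics, here is a minimal sketch of the arithmetic behind "statistically significant" in this context, assuming a forced-choice ABX test where pure guessing yields 50% per trial. The function name and the example scores are illustrative only, not results from any test discussed in this thread.

    # Chance that a listener scores `correct` or more out of `trials`
    # ABX trials by guessing alone (one-sided binomial test, p = 0.5).
    from math import comb

    def abx_p_value(correct, trials):
        return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2**trials

    print(abx_p_value(14, 16))  # ~0.002: very unlikely to be pure guessing
    print(abx_p_value(10, 16))  # ~0.227: entirely consistent with guessing

The smaller that probability, the harder it is to attribute a positive result to luck; and as more independent trials or listeners go the same way, the combined probability of luck shrinks rapidly.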



I gave a real world example which would seem to suffice, perchance someone had no other background in statistics.
You mean the Meyer & Moran study? I think Amir has dealt with that enough times, and in enough detail, for it to be seen for what it is - a test run without sufficient controls & therefore lacking credibility.

Here's a memory test: Please give the names of the two experimenters who are associated with a previous experiment related to high resolution audio that was published in the JAES. At this point, it seems like this salient fact has been forgotten and I need to do a sanity check...
Bios of those two have been presented, Arny & your credibility criteria fail again on both those counts.
 
Why is there still discussion of cheating?

Good question. I'd be happy to drop it.

Does anyone seriously think that people are deliberately cheating?

Let me introduce you to what appears to be an unfamiliar concept. It's called not knowing for sure. What is sure is that the tools are susceptible to an endless number of different cheats and errors.




If so, then man up and make specific accusations.

There's not enough evidence either way to justify specific accusations. However the level of emotion and bullying that have been seen raise some legitimate concerns.

Cheating in the kinds of tests we are discussing is akin to cheating at solitaire.

It appears that quite a bit of ego and emotion is present here that would not usually be present in an idle solitaire game.

It would be helpful if there were suitable test tools.

The tools are imperfect but they are not unsuitable for reasonable people.


I don't plan to do any more ABX tests until the "click" problem with PC ABX is fixed or until other software is available that avoids this problem.

Good. We will probably survive without any contributions from many people.

I am surprised that the PC ABX developers did not detect and/or fix this bug as its existence renders the software useless for its intended purpose.

That would be your opinion and one that is not universally held.


It took me less than five minutes to discover this bug after playing with the stop and start feature of the user interface.

I've explained it and explained the trade-offs associated with it. Feel free to pay someone to write the ABX Comparator of your dreams, or write it yourself.

It appears that the Foobar2000 ABX Comparator is far from the poorest or most flawed example of an ABX Comparator, and yet many seem to be using even more flawed ones. Go figure.

It would also be helpful to investigate and characterize bit depth and audio bandwidth (sample rate). Testing one of these at a time makes sense.

Fact is that almost nobody wants to spend the time it takes to perform a 12-16 trial test. Some complain about the comparator's minor inconveniences and others appear to be just too busy. I would make relevant test files if I had any assurance that someone would actually listen to them. Obviously, you've checked yourself off the list of candidates. That makes it easy for me to do nothing.

There are many alternatives to bit depth conversion and there are many alternatives to filtering associated with sample rate conversion. Current tools such as iZotope RX provide many choices of algorithms and/or parameters for each of these.


Just because the options exist doesn't mean that anybody is interested in testing them, it seems.
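
Since the thread keeps circling back to how such test files might be generated, here is a minimal sketch of one approach among the many alternatives mentioned above, assuming Python with numpy, scipy and soundfile. The file names are placeholders, and the choice of polyphase resampling plus TPDF dither is just one reasonable option, not a reconstruction of anyone's actual files.

    # Generate the two conditions separately, so that bandwidth and
    # bit depth are tested one at a time.
    import numpy as np
    import soundfile as sf
    from scipy.signal import resample_poly

    audio, rate = sf.read("source_96k_24bit.wav")   # placeholder input

    # Condition A: 96 kHz -> 48 kHz (tests bandwidth/filtering only).
    downsampled = resample_poly(audio, up=1, down=2, axis=0)
    sf.write("test_48k_24bit.wav", downsampled, rate // 2, subtype="PCM_24")

    # Condition B: 24-bit -> 16-bit with TPDF dither (tests bit depth only).
    lsb = 1.0 / 2**15                # one 16-bit LSB at +/-1.0 full scale
    tpdf = (np.random.rand(*audio.shape) - np.random.rand(*audio.shape)) * lsb
    sf.write("test_96k_16bit.wav", np.clip(audio + tpdf, -1.0, 1.0),
             rate, subtype="PCM_16")

Different resampling filters and different dither/noise-shaping choices would yield different test files, which is exactly why the conversion chain needs to be documented alongside any results.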
 
You made a statement that not all results are going the same way as if this was some basis for doubting the ones that do show positive results. As per your stats 101 it is only necessary for a statistically significant number of positive results to be confirmed & the greater the sample size becomes, the stronger that conclusion becomes.

I said more than just what was mentioned above, and its apparently intentional exclusion is not exactly a sign of good will or careful work.

Earlier today I had to read 90 posts to correct someone's careless error attacking my work, which was then dismissed.

I understand the concept of accountability and why some people like to make false claims and then avoid taking responsibility for their statements.
 
I said more than just what was mentioned above, and its apparently intentional exclusion is not exactly a sign of good will or careful work.
In reply to your "Right now the amount of evidence we have is trivial, and it doesn't all go the same way." I replied "& why would you expect all results to go the same way"- you then proceeded to try to maintain that I knew little about statistics with your "statistics 101" comment. Your original statement was the one that needed a statistical correction & I was pointing out your basic error.

Your suggestion of lack of good will in those who don't agree with you & questioning the veracity of those who return results that you don't like, is becoming thematic, ArnyK (but then a greater sample size is needed to establish this as fact although we do have evidence of the same from AVS & other fora).
 
It seems to me that some people posting on this thread are looking to do two things.

1. To try to foster the notion that all past blind-tests were worthless based on some results from some other, non-independently verified/observed blind-tests.

2. To try to foster the notion that all future blind-tests will be worthless unless they follow an exact, specific procedure defined by them.

I'm sorry but to me this is simply commercially driven, disingenuous nonsense.
 
Why is there still discussion of cheating? Does anyone seriously think that people are deliberately cheating? If so, then man up and make specific accusations. Cheating in the kinds of tests we are discussing is akin to cheating at solitaire.

It would be helpful if there were suitable test tools. I don't plan to do any more ABX tests until the "click" problem with PC ABX is fixed or until other software is available that avoids this problem. I am surprised that the PC ABX developers did not detect and/or fix this bug as its existence renders the software useless for its intended purpose. It took me less than five minutes to discover this bug after playing with the stop and start feature of the user interface.

It would also be helpful to investigate and characterize bit depth and audio bandwidth (sample rate). Testing one of these at a time makes sense. There are many alternatives to bit depth conversion and there are many alternatives to filtering associated with sample rate conversion. Current tools such as iZotope RX provide many choices of algorithms and/or parameters for each of these.

Well, no need to throw the baby out with the bathwater. You could still use the ABX software as long as you make sure it has time-aligned samples to start with. A bother, but not that big of a bother.

I wholeheartedly agree about testing bit depth and sample rate separately. At least on the jangling-keys file, Amir is the only one I know of who detected it with modern resampling. So I wish he would do a 96/24 vs 96/16 test and a 96/24 vs 48/24 test, but I am making no demands. That would be foolish.
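
For anyone who wants to verify the time alignment mentioned above before loading files into an ABX comparator, here is a rough sketch, assuming Python with numpy, scipy and soundfile; the file names are placeholders. It finds the lag that maximizes the cross-correlation between the two files, where a lag of zero means they are sample-aligned.

    # Estimate the sample offset between two nominally identical files.
    import numpy as np
    import soundfile as sf
    from scipy.signal import correlate, correlation_lags

    a, rate_a = sf.read("fileA.wav")   # placeholder names
    b, rate_b = sf.read("fileB.wav")
    assert rate_a == rate_b, "resample to a common rate before comparing"

    # First channel only, and a 10-second excerpt to keep it fast.
    mono = lambda x: x[:, 0] if x.ndim > 1 else x
    a, b = mono(a)[:10 * rate_a], mono(b)[:10 * rate_b]

    xcorr = correlate(a, b, mode="full")
    lag = correlation_lags(len(a), len(b), mode="full")[np.argmax(xcorr)]
    print(f"offset: {lag} samples ({1000 * lag / rate_a:.2f} ms)")

A nonzero offset can then be trimmed away before the comparison, removing one obvious source of spurious "differences".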
 
It seems to me that some people posting on this thread are looking to do two things.

1. To try to foster the notion that all past blind-tests were worthless based on some results from some other, non-independently verified/observed blind-tests.
We are now specifically talking about ArnyK's test files & results, Max - don't try to re-ignite a discussion from 20 pages ago.

2. To try to foster the notion that all future blind-tests will be worthless unless they follow an exact, specific procedure defined by them.
The correct procedures for blind testing have been laid out in various documents to which you have been directed. I didn't write them, so it can't be said that I define these criteria. Have you informed yourself about these documents, read any of them?

I'm sorry but to me this is simply commercially driven, disingenuous nonsense.
This suggestion of "commercially driven" is not worth answering but is interesting in what it says about mindset in all of this.
 
I said more than just what was mentioned above, and its apparently intentional exclusion is not exactly a sign of good will or careful work.

Earlier today I had to read 90 posts to correct someone's careless error attacking my work, which was then dismissed.

I understand the concept of accountability and why some people like to make false claims and then avoid taking responsibility for their statements.

Consider the source, then don't assume it was careless or erroneous. You are caught in a vortex Arny, where denial, moving goal posts, re-writing of your own positions and creative translations of simple English will make nonsense in the next post of something that was perfectly clear in the last. I'd tell you to run, but...well...better you than me. :)

Tim
 
We are now specifically talking about ArnyK's test files & results, Max

I'm aware of that, John.

don't try to re-ignite a discussion from 20 pages ago.

Please see above, and please refrain from giving me orders.

The correct procedures for blind testing have been laid out in various documents to which you have been directed

Correct as defined by whom?

I didn't write them, so it can't be said that I define these criteria.

You are endeavouring to declare and sell the notion of their perfection, and thus the imperfection of any tests that are not identical to them.

Have you informed yourself about these documents, read any of them?

Yes. Where did you read that these test protocols are universally accepted as perfect, and are the only protocols that can be considered valid?

This suggestion of "commercially driven" is not worth answering but is interesting in what it says about mindset in all of this.

It's evident in your case, John, by virtue of your insistence that, contrary to all logic, sighted listening is as valid as any blind-testing that doesn't conform to your self-declared perfect model for blind-testing, in terms of identifying audible differences.
 
Max, it's obvious by your questions that you haven't even looked at the title page of any of the documents so continuing with your line of discussion is fruitless & a waste of time.
 
Max, it's obvious by your questions that you haven't even looked at the title page of any of the documents so continuing with your line of discussion is fruitless & a waste of time.
John, I've read JJ's list, the one that you state is the only one that can ever be considered valid.

I've read too that, according to you, not adhering to JJ's protocols does not invalidate a given test from which positive results are obtained.

I've seen you write that long-term listening can allow identification of differences that tests which adhere to the standardised, double-blind ABX testing methodology cannot, and that these tests are therefore somehow valid, even though they do not mirror JJ's testing methodology.

So many contradictions and declarations of what is valid and what isn't based on personal illogical beliefs.

Why bother reading any scientific papers if you are so prone to allowing an illogical faith-based belief-system to supersede their importance at any given time?
 
It seems to me that some people posting on this thread are looking to do two things.

1. To try to foster the notion that all past blind-tests were worthless based on some results from some other, non-independently verified/observed blind-tests.
I am not sure if you are addressing this to me but just in case :), I have repeatedly asked Arny and others to present one test, just one test, which complies with best practices of the industry/research community. None has been presented.

2. To try to foster the notion that all future blind-tests will be worthless unless they follow an exact, specific procedure defined by them.
I don't know which exact test you mean, but we have standards for such work, namely ITU-R BS.1116. Even Arny has a "top 10" list that is a subset of it. Yet he and others routinely reference tests that don't even come close to complying with these best practices.

Do I throw out the results out of hand if they don't comply? No. But I read through them, and if I find failing after failing, then I will not reference their data as being valid. The results may well be valid, but we cannot know it because of the test. Imagine someone doing lab work with dirty hands and finding bacteria in the sample. Would you trust the data? I assume not. The bacteria may indeed be in the sample, but violating the simple principle of not contaminating the sample means we have to throw out the work and start over.

I'm sorry but to me this is simply commercially driven, disingenuous nonsense.
Commercially driven? How so? And do we get to accept any test even if that were so?
 
amirm,

Since this type of thread (objective vs subjective) seems to dominate the "General Discussion" category way too often (and often leads to the "thread closed" outcome), and its relevance to "listening / enjoying music" is at best obtuse if not totally irrelevant to most folks, would you and the other mods consider creating an independent section that focuses on this topic, versus allowing the ad nauseam / circular discussions that so often occur in the GD section?

Suggestion. ABX and other "controlled" listening tests.

Tim can be the moderator. :)

Just a thought.
 
John, I've read JJ's list, the one that you state is the only one that can ever be considered valid.

JJ's list isn't just JJ's list; it's JJ's statement of a set of goals that many of us hold in common. Here's my version of the same basic ideas, written in my own words, but I believe first posted on the web some years earlier (year 2000):

Ten (10) Requirements For Sensitive and Reliable Listening Tests

(1) Program material must include critical passages that enable audible differences to be most easily heard.

(2) Listeners must be sensitized to audible differences, so that if an audible difference is generated by the equipment, the listener will notice it and have a useful reaction to it.

(3) Listeners must be trained to listen systematically so that audible problems are heard.

(4) Procedures should be "open" to detecting problems that aren't necessarily technically well-understood or even expected, at this time. A classic problem with measurements and some listening tests is that each one focuses on one or only a few problems, allowing others to escape notice.

(5) We must have confidence that the Unit Under Test (UUT) is representative of the kind of equipment it represents. In other words the UUT must not be broken, it must not be appreciably modified in some secret way, and must not be the wrong make or model, among other things.

(6) A suitable listening environment must be provided. It can't be too dull, too bright, too noisy, too reverberant, or too harsh. The speakers and other components have to be sufficiently free from distortion, the room must be noise-free, etc.

(7) Listeners need to be in a good mood for listening, in good physical condition (no blocked-up ears!), and well-trained at hearing deficiencies in the reproduced sound.

(8) Sample volume levels need to be matched to each other, or else the listeners will perceive differences that are simply due to volume differences. (A minimal level-matching sketch follows this list.)

(9) Non-audible influences need to be controlled so that the listener reaches his conclusions due to "Just listening".

(10) Listeners should control as many of the aspects of the listening test as possible. Self-controlled tests usually facilitate this. Most importantly, they should be able to switch among the alternatives at times of their choosing. The switchover should be as instantaneous and non-disruptive as possible.

I believe that this list includes some items that others may not have mentioned.
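
As a concrete illustration of requirement 8, here is a minimal level-matching sketch, assuming Python with numpy and soundfile. The file names are placeholders, and matching broadband RMS is just one reasonable choice of level metric (weighted or band-limited measures are also used).

    # Match the RMS level of sample B to sample A with a single gain factor.
    import numpy as np
    import soundfile as sf

    a, rate = sf.read("sampleA.wav")   # placeholder names
    b, _ = sf.read("sampleB.wav")

    rms = lambda x: np.sqrt(np.mean(np.square(x)))
    gain = rms(a) / rms(b)
    print(f"gain applied to B: {20 * np.log10(gain):+.2f} dB")

    sf.write("sampleB_matched.wav", b * gain, rate)

In practice, levels are usually matched to within about 0.1 dB, since level differences much larger than that are themselves audible and will dominate the comparison.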
 
