The Upgrade Company

MarinJim · Apr 2, 2012

Very true. But what if you see blue, and most tell you it is red? Follow your heart, and if you feel something to be better or improved, so be it. It is your dime, I have no right to tell you how to spend it. But, I can tell you how I spend it.

Gregadd · Apr 2, 2012

I agree with that but, if the test is blind, why should it matter?

If one does not want to hear a difference, how do we account for it? E.g., if I have already decided brand X tastes the same as Coca-Cola, how can a blind test eliminate that bias? In a tube vs. solid state test, the "gross" distortions of tubes should be readily apparent? The results are almost always consistent with guessing.

Not at all. The test failed to prove a difference. It did not prove there is none. So, we are back to square one unless the anticipated statistical analysis informs us otherwise.

I think we used some double negatives. "Being back to square one" and "being a wash" are synonymous to me. Failing to prove the world is round is not proof that it's flat.
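
To put a number on "consistent with guessing", here is a minimal sketch of the usual exact binomial calculation for an ABX run. The 16-trial run length and the scores are made-up illustrations, not figures from this session, and `abx_p_value` is a hypothetical helper name.

```python
from math import comb

def abx_p_value(correct: int, trials: int, chance: float = 0.5) -> float:
    """Probability of scoring at least `correct` out of `trials` by guessing alone."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )

# Hypothetical scores for an assumed 16-trial run.
for score in (9, 12, 14):
    print(f"{score}/16 correct: p = {abx_p_value(score, 16):.4f}")
```

A large p-value only says the score is what guessing would often produce. It does not show that no difference exists, which is the point made above.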

MarinJim · Apr 2, 2012

"In a tube vs. solid state test, the "gross" distortions of tubes should be readily apparent".

I think you meant to say the "gross sterilization of ss should be readily apparent?"

Kal Rubinson · Apr 2, 2012

Gregadd said:
If one does not want to hear a difference, how do we account for it? E.g., if I have already decided brand X tastes the same as Coca-Cola, how can a blind test eliminate that bias? In a tube vs. solid state test, the "gross" distortions of tubes should be readily apparent? The results are almost always consistent with guessing.

Right. That is why one needs a sufficiently large number of different subjects.

MylesBAstor · Apr 2, 2012

Kal Rubinson said:
Right. That is why one needs a sufficiently large number of different subjects.

The problem is also that the ABXers assume that 100% of the people will hear the difference. Say that's not so and only 30% of the listeners can hear a difference between DUT#1 and DUT#2. Then the number of participants needed to show a significant difference would swell far beyond the ability to test. That's why, as I'm sure you're well aware, drug trials are inter-institutional: to be able to accumulate patients and adequate stats.
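
As a rough illustration of that sample-size argument, here is a sketch of an exact binomial power calculation. Only the 30% share comes from the post above. Everything else is an assumption: each trial comes from a different, randomly drawn listener, non-hearers guess at 50%, hearers answer correctly at a hypothetical rate `p_hearer`, and the 0.05 significance level and 0.8 power target are just conventional choices. All function names are made up for the example.

```python
from math import comb

def tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def pooled_rate(share_hearers: float, p_hearer: float) -> float:
    """Expected correct rate when only some listeners hear the difference."""
    return share_hearers * p_hearer + (1 - share_hearers) * 0.5

def power(n: int, p_true: float, alpha: float = 0.05) -> float:
    """Chance that n pooled trials reject 'everyone is guessing' at level alpha."""
    # Smallest score whose guessing-only tail probability is at most alpha.
    # Starting at n // 2 is safe because that tail is at least 0.5 (alpha < 0.5).
    critical = next(k for k in range(n // 2, n + 2) if tail(n, k, 0.5) <= alpha)
    return tail(n, critical, p_true)

def trials_needed(p_true: float, target: float = 0.8, limit: int = 500):
    """Smallest trial count reaching the target power, or None if above limit."""
    for n in range(1, limit + 1):
        if power(n, p_true) >= target:
            return n
    return None

# 30% hearers (from the post); hearer accuracy is a hypothetical parameter.
for p_hearer in (1.0, 0.75):
    p = pooled_rate(0.30, p_hearer)
    print(f"hearer accuracy {p_hearer:.2f}: pooled rate {p:.3f}, "
          f"trials needed {trials_needed(p)}")
```

The required count is very sensitive to how reliably the hearers actually identify X, so the output should be read as a feel for the scaling rather than a prediction for any real test.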

Kal Rubinson · Apr 2, 2012

MylesBAstor said:
The problem is also that the ABXers assume that 100% of the people will hear the difference. Say that's not so and only 30% of the listeners can hear a difference between DUT#1 and DUT#2. Then the number of participants needed to show a significant difference would swell far beyond the ability to test. That's why, as I'm sure you're well aware, drug trials are inter-institutional: to be able to accumulate patients and adequate stats.

Indeed. That is one reason I was disappointed with the tests. Only 3 participants. Even if they repeated it ad nauseam, the results would not completely satisfy.

naturephoto1 · Apr 2, 2012

I participated in the listening session at Dennis Markley's home along with Jeff (Pepar). I had agreed to bring along my upgraded Onkyo 5508 and my upgraded Nuforce Edition Oppo 83SE. I was never told that we would be using Kal's ABX switching box until I arrived. I know that in private conversations with Ken (RUR) he suggested that I should not participate in a test that I felt uncomfortable doing. But I had arrived with the equipment. I do not like ABX testing and, to say the least, I was nervous about the testing, particularly under these conditions.

When we listened, we compared the stock versus the upgraded Onkyo 5508. From what I heard, and what I believe the others heard, in the "on the fly" listening portion of the testing to get accustomed to the session, there were some differences that in many cases may be described as subtle. Regardless of Jeff's comments and suggestion that these were sighted tests, from where I was sitting I could not see or tell which machine was being used. It was generally thought that the upgraded unit made the sound of the music smoother, with less harshness/hash, as if a veil had been lifted and more information was allowed through. The differences in the sound for many of the recordings were smaller than I believe they actually are, and I attribute this at least in part to the switching box that was used. I know that others claim that the box should have no effect or should affect both units equally.

As to the actual ABX listening and testing, I admit, I did dismally particularly for the X aspect in which we tried to identify which unit was being played after we had listened to portions of pieces of music that we identified as to which machine performance we preferred. I did substantially better, but not great at identifying my preference for the sound of the upgraded unit. The same was true of Dennis. We all did better with the stereo than with the multichannel listening sessions.

Jeff seems to be taking great glee in the fact that we had difficulty in recognizing which machine was which in our sessions. But, again, I believe that if the tests were conducted differently and with wires running directly into the amps, etc. that there would have been more noticeable differences in the sound. I definitely do not like ABX testing, question their validity and may decide not to participate in this kind of testing again in the future. It puts undue pressure on the listener and I also believe that some subtleties may well be lost in this testing method. Additionally, I do not like the usage of a switching box and believe that it may well "level the playing field" to the point that it removes much of the advantage of a superior product. In addition, it may take substantially more time than only about 1 minute to recognize the performance of 1 product over another.

After Jeff had left, Dennis and I continued some listening and we added the upgraded NuForce Edition Oppo 83SE to the system for listening. Both Dennis and I thought that there was a cumulative effect of both machines working together, though we only listened and did not perform any ABX testing. Additionally, Dennis and I swapped out one of the new TimePortal Reference Power cords (which we also used for both Onkyo 5508 units) for a stock cord, and we found that the TimePortal power cord did add to the performance of the Oppo, which was particularly apparent in the bass output.

It is possible that Dennis may be given an opportunity to test an upgraded Onkyo 5508 for some period of time and perhaps another listening session will be arranged, but as I have indicated, I want a much different type of listening test if this is to take place.

Rich

fas42 · Apr 3, 2012

naturephoto1 said:
Jeff seems to be taking great glee in the fact that we had difficulty in recognizing which machine was which in our sessions. But, again, I believe that if the tests were conducted differently and with wires running directly into the amps, etc. that there would have been more noticeable differences in the sound. I definitely do not like ABX testing, question their validity and may decide not to participate in this kind of testing again in the future. It puts undue pressure on the listener and I also believe that some subtleties may well be lost in this testing method. Additionally, I do not like the usage of a switching box and believe that it may well "level the playing field" to the point that it removes much of the advantage of a superior product. In addition, it may take substantially more time than only about 1 minute to recognize the performance of 1 product over another.

Rich, I sympathise with you: the people who are gung-ho about ABX tests never seem to get that the discernible differences are subtle enough, and complex enough in their causes, that the actual procedure for doing the ABX can often swamp or dilute them enough that the testing yields poor results.

And if you point this out to them, they angrily stamp the ground, and demand, insist that this can't be the case: ABX is "scientifically valid" and therefore must reveal the truth. Scientists, of course, know that a lot more has to be taken into account if you want to be sure that a test is truly measuring what you think it's measuring ...

Frank

pepar · Apr 3, 2012

In science new data causes a theory to be questioned. In religion - and faith-based audio - new data causes the data to be questioned.

I maintain that there is NO TEST that could be devised and performed whose results would be accepted by Rich - and apparently some others here - if the modded unit was not a clear winner. There would always be something else at fault because the benefit of the mods is unquestioned by him. He KNOWS that they have improved his system. If testing doesn't bear that belief out, then the test was flawed.

Jeff

audioguy · Apr 3, 2012

Summary of this test: differences could be heard sighted but not blind. I've participated in blind tests (not ABX), failed miserably and realized I had foolishly spent about $6000 on equipment that I could not hear improvement in.

I am familiar with all of the arguments against all forms of blind testing. BUT FOR ME, if it is THAT hard to easily hear differences that cost real money, I spend my money elsewhere.

YMMV!

Gregadd · Apr 3, 2012

By that logic the pseudo science of ABX would be open to challenge?

OK let's get back on Topic.

naturephoto1 · Apr 3, 2012

audioguy said:
Summary of this test: differences could be heard sighted but not blind. I've participated in blind tests (not ABX), failed miserably and realized I had foolishly spent about $6000 on equipment that I could not hear improvement in.

I am familiar with all of the arguments against all forms of blind testing. BUT FOR ME, if it is THAT hard to easily hear differences that cost real money, I spend my money elsewhere.

YMMV!

Actually that is not the case. The differences seem to have been heard both "sighted" (as I say I could not see them around the corner on the floor) as well as in the AB portion of the testing (where we had a tendency to pick one of the two units for the sound preference) and there may have been a preference noted without statistically high enough levels. I in particular was under extreme stress and was unable to distinguish between the units in the X portion of the session.

Rich

fas42 · Apr 3, 2012

pepar said:
I maintain that there is NO TEST that could be devised and performed whose results would be accepted by Rich - and apparently some others here - if the modded unit was not a clear winner. There would always be something else at fault because the benefit of the mods is unquestioned by him. He KNOWS that they have improved his system. If testing doesn't bear that belief out, then the test was flawed.

Jeff

A bit silly. A simple, though expensive way: buy two unmodded units, confirm they are close enough in quality as is to make them equivalent. Have one modified, with no external indication of a change. Then slot one at random into your system until the setup stabilises, over a number of days if necessary; assess it. Then do this a number of times: should be good enough to get a clear result, or non-result ...

Frank

audioguy · Apr 3, 2012

naturephoto1 said:
I in particular was under extreme stress and was unable to distinguish between the units in the X portion of the session.

Rich

Maybe because you had a vested interest in the outcome - i.e. you wanted to not have wasted your $. As long as you are happy with your modded equipment, then the results of this test (scientific or otherwise) should not make any difference.

In the past, I spent HUGE amounts of $ on stuff because I wanted better sound, and after I installed the equipment sat in harmonious bliss listening to real (or imagined) improvements. It was only after I tried (with a friend volunteer) to validate what I (thought) I was hearing blind that I tried to re-direct the way I spent my audio dollars - in many cases, I simply could not consistently and reliably hear what I paid for. Crappy ears? Untrained listening skills? A system that was not of high enough resolution? Makes no difference. I could not hear the difference -- much less improvement.

BUT that's just me. (And I have saved a bunch of money since I made that choice).

Kal Rubinson · Apr 3, 2012

fas42 said:
Have one modified, with no external indication of a change.

This particular mod shop puts labels and seals on the modified product.

pepar · Apr 3, 2012

naturephoto1 said:
Actually that is not the case. The differences seem to have been heard both "sighted" (as I say I could not see them around the corner on the floor) as well as in the AB portion of the testing (where we had a tendency to pick one of the two units for the sound preference) and there may have been a preference noted without statistically high enough levels. I in particular was under extreme stress and was unable to distinguish between the units in the X portion of the session.

Honestly, Rich, the only pressure on you was what you placed on yourself for fear that you might not be able to tell the difference in a fair test, i.e., the blind portion.

Otherwise, you participated in: selecting the recordings to be used in the blind test, selecting which passages in those songs were the best for revealing the differences, and deciding which recordings we could not hear a difference on. Of course, this was all done sighted with everyone having full knowledge which unit was being heard. It further involved us, the listeners, deciding when to switch ... many times back and forth with only a second or two spent per unit. In the sighted portion, we ALL heard the difference on the songs we had selected because we COULD hear a difference. Well, duh! And we ALL preferred the modded unit. Well .. double duh! We sat there and one of us would comment on what we just heard and the others would nod in agreement. Now it's "DOH!" and not "duh."

This sighted song selection was not hurried in any respect and came after you and Dennis had done some listening before I arrived. Then came lunch. For the blind portion, I went first and you stood, nay paced, around behind me and listened. Several times I had asked for a repeat of the A/B/X, so this "I was under pressure" is part of your subconscious rationalization on why the test was at fault.

Jeff

MylesBAstor · Apr 3, 2012

pepar said:
In science new data causes a theory to be questioned. In religion - and faith-based audio - new data causes the data to be questioned.

I maintain that there is NO TEST that could be devised and performed whose results would be accepted by Rich - and apparently some others here - if the modded unit was not a clear winner. There would always be something else at fault because the benefit of the mods is unquestioned by him. He KNOWS that they have improved his system. If testing doesn't bear that belief out, then the test was flawed.

Jeff

That's because of your parochial engineering viewpoint, ignoring the vast literature on the biology of hearing, testing and responses and statistics. Ever hear of the inverted U in relationship to perception? If not, then you should take a look at it. Ever read up on biological adaptation and our body's reaction to stressors? Ever read the literature on how our hearing varies so widely from person to person as to make interaural testing methods worthless? Most of all, have you studied how long and short term memory works? If you had, you would find essentially all that ABX testing does is show that our short term memory has a limited capacity. Big whoop-de-do. That's been known for years.

Taking DBT testing methodology used, say, in the pharma industry, and applying it willy-nilly to all situations doesn't work. And that holds true for many scientific tests and methodologies. This subject has been discussed ad nauseam on here many times.

naturephoto1 · Apr 3, 2012

MylesBAstor said:
That's because of your parochial engineering viewpoint, ignoring the vast literature on the biology of hearing, testing and responses and statistics. Ever hear of the inverted U in relationship to perception? If not, then you should take a look at it. Ever read up on biological adaptation and our body's reaction to stressors? Ever read the literature on how our hearing varies so widely from person to person as to make interaural testing methods worthless? Most of all, have you studied how long and short term memory works? If you had, you would find essentially all that ABX testing does is show that our short term memory has a limited capacity. Big whoop-de-do. That's been known for years.

Taking DBT testing methodology used, say, in the pharma industry, and applying it willy-nilly to all situations doesn't work. And that holds true for many scientific tests and methodologies. This subject has been discussed ad nauseam on here many times.

Thank you Myles.

At this point though as I have just posted in the AVS thread, I am tempted to just stop participating entirely in all audio forums.

Rich

amirm · Apr 3, 2012

This is a general comment to everyone: let's not ratchet up the tone. I suspect you enjoy having this conversation with your peers and would prefer to stay and keep talking. In that sense, let's consider each other friends in this hobby, collectively trying to better our understanding of audio. And remember that no animals are ever hurt in any such tests.

MylesBAstor said:
That's because of your parochial engineering viewpoint, ignoring the vast literature on the biology of hearing, testing and responses and statistics.

I am curious about something Myles. If there were gross differences, would you expect them to be heard in the type of test that was run?

pepar · Apr 3, 2012

MylesBAstor said:
That's because of your parochial engineering viewpoint, ignoring the vast literature on the biology of hearing, testing and responses and statistics. Ever hear of the inverted U in relationship to perception? If not, then you should take a look at it. Ever read up on biological adaptation and our body's reaction to stressors? Ever read the literature on how our hearing varies so widely from person to person as to make interaural testing methods worthless? Most of all, have you studied how long and short term memory works? If you had, you would find essentially all that ABX testing does is show that our short term memory has a limited capacity. Big whoop-de-do. That's been known for years.

Taking DBT testing methodology used, say, in the pharma industry, and applying it willy-nilly to all situations doesn't work. And that holds true for many scientific tests and methodologies. This subject has been discussed ad nauseam on here many times.

I find it interesting that all you need to do is cast doubt on the test without any real data that invalidates it, or data that invalidates A/B/X testing for audio gear in general. So your bar is very low, while the one for us testers is impossibly high.
