Measurements & the stereo illusion

jkeny

Industry Expert, Member Sponsor
Feb 9, 2012
3,374
42
383
Ireland
As our perception of the location of a sound is based on the time & level differences of the sound reaching each ear, the sound-stage illusion is created for us primarily through these factors in the signal (room reflections, etc. are also at play, but let's ignore them for now).

As sound stage is one of the improvements most often reported when listening to better-quality reproduction, I wondered why there isn't a test which measures this inter-channel timing & level. Another factor often mentioned with better reproduction systems is that they expose more subtle detail in the sound - is this one reason why they produce a more realistic sound stage, i.e. they resolve down to a lower signal level, so the result is more like what we hear in nature & the illusion is more convincing? As regards timing, I don't know of any tests that have examined the subtle inter-channel timing differences that are also part of the picture.

Maybe there is such a test - I asked on another thread but got no answer; perhaps my question was lost in the noise, or maybe there is no test :) So I decided to open a thread to ask this question explicitly & to discuss the implications of having (or not having) such a test. If such a test exists, I would like to see some results. If no such test exists, are we not missing a crucial primary test - one that is essential to evaluating a stereo audio system & that goes some way toward connecting measurements to perception?
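To make concrete what I mean by "measuring inter-channel timing & level", here's a minimal sketch of how one might estimate the two broadband quantities from a stereo capture of the system under test. Python with numpy/soundfile is just my choice of tools, and "capture.wav" is a placeholder filename - a sketch, not a standard measurement procedure.

```python
# Sketch: broadband inter-channel level difference (ICLD) and time
# difference (ICTD) from a hypothetical stereo capture "capture.wav".
import numpy as np
import soundfile as sf

x, fs = sf.read("capture.wav")                 # expects shape (samples, 2)
left, right = x[:, 0], x[:, 1]

def rms(s):
    return np.sqrt(np.mean(s ** 2))

# ICLD: ratio of channel RMS levels, in dB
icld_db = 20 * np.log10(rms(left) / rms(right))

# ICTD: lag of the cross-correlation peak (for long files,
# scipy.signal.correlate(..., method="fft") is much faster)
corr = np.correlate(left, right, mode="full")
lag = int(np.argmax(corr)) - (len(right) - 1)  # sign convention worth checking on a known signal
ictd_ms = 1000.0 * lag / fs

print(f"ICLD = {icld_db:+.2f} dB, ICTD = {ictd_ms:+.3f} ms")
```

Obviously a single broadband number per quantity isn't the whole story - presumably it would have to be done per frequency band & over short time frames before it says anything about the stability of the illusion.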
 

Phelonious Ponk

New Member
Jun 30, 2010
8,677
23
0
Define "inter-channel subtle timing differences."

Tim
 

jkeny

Industry Expert, Member Sponsor
Feb 9, 2012
3,374
42
383
Ireland
Define "inter-channel subtle timing differences."

Tim

Good question. I guess one needs to go to psychoacoustics to find this out. If we are hearing a point source, we judge its distance & location by the differences in timing & level of the soundwave impinging on each ear - the Head-Related Transfer Function (HRTF). Three cues seem to be responsible for how we localise & perceive sound in the horizontal plane - ILD (Interaural Level Difference), ITD (Interaural Time Difference) & IC (Interaural Coherence). The spatial impression (width) of the audio event is related to IC.

In a stereo system the corresponding cues are ICTD, ICLD & IC, where 'IC' now stands for inter-channel. Fast changes in ICTD & ICLD can be detected but seem to be interpreted as changes in width rather than as fast movements. Also, the perceived spatial attributes of a signal seem to be dominated by the first few milliseconds of the signal onset, with the remainder of the signal largely ignored.

The best answer to your request for a definition is probably given in the literature that deals with spatial audio encoding - MP3, etc.
This AES paper gives the explanations/definitions you are looking for & is a good example of the type of testing suited to audio-reproduction-system differences: "Improved Time Delay Analysis/Synthesis for Parametric Stereo Audio Coding"

Edit: Maybe this gives a better handle on the topic: http://asmp.eurasipjournals.com/content/2008/1/618104 "Estimation of Interchannel Time Difference in Frequency Subbands Based on Nonuniform Discrete Fourier Transform"
Generic BCC (Binaural Cue Coding) scheme estimates ICTD in frequency subbands partitioned according to psychoacoustic critical bands [9]. When DFT is used to implement time-to-frequency transform, the subband bandwidth in the range of low frequency is much narrower than that in the high frequency range due to the uniform sampling. However, to account for human auditory perception, spatial cues contained in low-frequency subbands are more important than those in high-frequency subbands.
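As a rough illustration of the per-band idea in that quote (not the paper's actual algorithm - just a sketch with a few arbitrary band edges standing in for critical bands, Python/scipy assumed, and "capture.wav" a placeholder):

```python
# Sketch: per-band ICTD estimate, loosely in the spirit of the BCC idea
# quoted above. Band edges are crude stand-ins for critical bands.
import numpy as np
import soundfile as sf
from scipy.signal import butter, sosfiltfilt, correlate

x, fs = sf.read("capture.wav")
left, right = x[:, 0], x[:, 1]

edges = [100, 200, 400, 800, 1600, 3200]       # Hz, arbitrary

for lo, hi in zip(edges[:-1], edges[1:]):
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    l_b, r_b = sosfiltfilt(sos, left), sosfiltfilt(sos, right)

    corr = correlate(l_b, r_b, mode="full", method="fft")
    lag = int(np.argmax(corr)) - (len(r_b) - 1)
    print(f"{lo:5d}-{hi:5d} Hz: ICTD ~ {1000.0 * lag / fs:+.3f} ms")
```

For a real test you would also work in short time frames rather than over the whole file, since it is the stability of these per-band values over time that I'm really asking about.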
 
Last edited:

JackD201

WBF Founding Member
Apr 20, 2010
12,316
1,426
1,820
Manila, Philippines
Well, J-J just joined the forum. Hopefully he'll chime in.
 

Whatmore

Well-Known Member
Jun 2, 2011
1,011
2
438
Melbourne, Australia
Good question. I guess one needs to go to psychoacoustics to find this out. If we are hearing a point source, we judge its distance & location by the differences in timing & level of the soundwave impinging on each ear - the Head-Related Transfer Function (HRTF). Three cues seem to be responsible for how we localise & perceive sound in the horizontal plane - ILD (Interaural Level Difference), ITD (Interaural Time Difference) & IC (Interaural Coherence). The spatial impression (width) of the audio event is related to IC.

In a stereo system the corresponding cues are ICTD, ICLD & IC, where 'IC' now stands for inter-channel. Fast changes in ICTD & ICLD can be detected but seem to be interpreted as changes in width rather than as fast movements. Also, the perceived spatial attributes of a signal seem to be dominated by the first few milliseconds of the signal onset, with the remainder of the signal largely ignored.

The best answer to your request for a definition is probably given in the literature that deals with spatial audio encoding - MP3, etc.
This AES paper gives the explanations/definitions you are looking for & is a good example of the type of testing suited to audio-reproduction-system differences: "Improved Time Delay Analysis/Synthesis for Parametric Stereo Audio Coding"

Edit: Maybe this gives a better handle on the topic: http://asmp.eurasipjournals.com/content/2008/1/618104 "Estimation of Interchannel Time Difference in Frequency Subbands Based on Nonuniform Discrete Fourier Transform"

I may be wrong as I haven't read them, but don't your links/definitions contain the answer to your original question?
 

jeromelang

Well-Known Member
Dec 26, 2011
438
66
935
As soon as the audio signal enters the acoustical domain, the results are anything but predictable...
 

jkeny

Industry Expert, Member Sponsor
Feb 9, 2012
3,374
42
383
Ireland
Well, J-J just joined the forum. Hopefully he'll chime in.
Yeah, it's a question right up his street!! What I'm mainly interested in is whether these tests have been run to reveal differences in stereo reproduction hardware, rather than in audio encoding methodologies.
 

jkeny

Industry Expert, Member Sponsor
Feb 9, 2012
3,374
42
383
Ireland
I may be wrong as I haven't read them, but don't your links/definitions contain the answer to your original question?

These links all have to do with encoding schemes (MP3, binaural coding, etc.) - I haven't seen any such tests run against ordinary playback hardware to reveal possible differences between units.
 

jkeny

Industry Expert, Member Sponsor
Feb 9, 2012
3,374
42
383
Ireland
Some other data points, from here: http://en.wikipedia.org/wiki/Precedence_effect
The precedence effect or law of the first wavefront is a binaural psychoacoustic effect. When a sound is followed by another sound separated by a sufficiently short time delay (below the listener's echo threshold), listeners perceive a single fused auditory image; its perceived spatial location is dominated by the location of the first-arriving sound (the first wave front). The lagging sound also affects the perceived location. However, its effect is suppressed by the first-arriving sound.

The precedence effect appears if the subsequent wave fronts arrive between 2 ms and about 50 ms later than the first wave front. This range is signal dependent. For speech the precedence effect disappears for delays above 50 ms, but for music the precedence effect can also appear for delays of some 100 ms.

In two-click lead–lag experiments, localization effects include aspects of summing localization, localization dominance, and lag discrimination suppression. The last two are generally considered to be aspects of the precedence effect:[8]
Summing localization: for time delays below 2 ms, listeners only perceive one sound; its direction is between the locations of the lead and lag sounds. An application for summing localization is the intensity stereophony, where two loudspeakers emit the same signal with different levels, resulting in the localized sound direction between both loudspeakers. The localized direction depends on the level difference between the loudspeakers.
Localization dominance: for delays between 2 and 5 ms, listeners also perceive one sound; its location is determined by the location of the leading sound.
Lag discrimination suppression: for short time delays, listeners are less capable of discriminating the location of the lagging sound.
For time delays above 50 ms (for speech) or some 100 ms (for music) the delayed sound is perceived as an echo of the first-arriving sound. Both sound directions are localized correctly. The time delay for perceiving echoes depends on the signal characteristics. For signals with impulse characteristics echoes are perceived for delays above 50 ms. For signals with a nearly constant amplitude the echo threshold can be enhanced up to time differences of 1 to 2 seconds.
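To hear those regimes for yourself, here's a rough sketch that writes stereo click pairs at a few of the delays mentioned above (Python with numpy/soundfile assumed - purely illustrative, not taken from the article):

```python
# Sketch: lead-lag click pairs at a few inter-channel delays, to illustrate
# summing localization (<2 ms), localization dominance (2-5 ms), and the
# emergence of a discrete echo at long delays.
import numpy as np
import soundfile as sf

fs = 48000
click = np.zeros(int(0.05 * fs))
click[:int(0.001 * fs)] = 1.0              # 1 ms rectangular click

for delay_ms in (0.5, 3.0, 20.0, 80.0):
    d = int(round(delay_ms * 1e-3 * fs))
    n = len(click) + d
    left, right = np.zeros(n), np.zeros(n)
    left[:len(click)] += click             # leading click in the left channel
    right[d:d + len(click)] += click       # lagging copy in the right channel
    stereo = 0.5 * np.stack([left, right], axis=1)
    sf.write(f"lead_lag_{delay_ms:g}ms.wav", stereo, fs)
```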
 

jkeny

Industry Expert, Member Sponsor
Feb 9, 2012
3,374
42
383
Ireland
It's amazing how, once you ask a question, you begin to find the answer yourself :)
When I was learning programming there was a phenomenon where, in the short time it took the instructor to reach your computer desk, you had usually come up with the solution yourself - it led to the joke of putting cardboard cutouts of the instructor behind each desk :)

Anyway, here's an interesting demo of the trade-off between ILD (level) & ITD (timing): http://auditoryneuroscience.com/topics/time-intensity-trading
It demonstrates how level & time are interrelated in sound localisation.

In it we can hear how even 0.22 ms shifts the localisation of the sound somewhat - so maybe this begins to address Tim's request for a definition of "inter-channel subtle timing differences" - if by that he also meant a value range.
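For anyone who wants to play with this, here's a rough sketch along the same lines as that demo: a fixed ~0.22 ms ITD pulling the image one way while an increasing ILD pulls it back the other way. Python/numpy/soundfile is assumed, and the dB steps are arbitrary - a toy, not a calibrated psychoacoustic stimulus.

```python
# Sketch: crude time-intensity trading stimulus. The left channel is delayed
# (the ITD pulls the image toward the right) while its level is raised in
# steps (the ILD pulls the image back toward the left).
import numpy as np
import soundfile as sf

fs = 48000
t = np.arange(int(0.3 * fs)) / fs
burst = np.sin(2 * np.pi * 500 * t) * np.hanning(len(t))

shift = int(round(0.22e-3 * fs))               # ~0.22 ms, rounded to whole samples
delayed = np.concatenate([np.zeros(shift), burst])[:len(burst)]

for ild_db in (0, 2, 4, 6):
    left = delayed * 10 ** (ild_db / 20.0)     # delayed AND boosted
    right = burst
    stereo = 0.3 * np.stack([left, right], axis=1)
    sf.write(f"trading_itd0.22ms_ild{ild_db}dB.wav", stereo, fs)
```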
 
Last edited:

Phelonious Ponk

New Member
Jun 30, 2010
8,677
23
0
I'm sure the relative timing and level of the two stereo channels can be measured. Is it? I don't know, but in my years in the audiophile community, I can't recall it ever having been presented as much of an issue. And I can say that in the studio, you can grab each instrument and move them where you want them in the horizontal plane. When you've got good recordings and good monitoring and the signal is dry (time-based effects, natural or electronic, tend to blur imaging), that movement is so substantial it is almost visual. In fact, when horizontal imaging is really precise - active monitors in well-treated mixing rooms (over-treated for home listening), headphones - it is almost too much. A nice illusion that helps replace the visual reference for some, but not really natural.

I think solutions to this may not have been discussed or explored in audio hardware because it's not a problem in audio hardware. At least that's what my ears tell me.

Tim
 

jkeny

Industry Expert, Member Sponsor
Feb 9, 2012
3,374
42
383
Ireland
I'm sure the relative timing and level of the two stereo channels can be measured. Is it? I don't know, but in my years in the audiophile community, I can't recall it ever having been presented as much of an issue. And I can say that in the studio, you can grab each instrument and move them where you want them in the horizontal plane. When you've got good recordings and good monitoring and the signal is dry (time-based effects, natural or electronic, tend to blur imaging), that movement is so substantial it is almost visual. In fact, when horizontal imaging is really precise - active monitors in well-treated mixing rooms (over-treated for home listening), headphones - it is almost too much. A nice illusion that helps replace the visual reference for some, but not really natural.

I think solutions to this may not have been discussed or explored in audio hardware because it's not a problem in audio hardware. At least that's what my ears tell me.

Tim
Well, I have a tenuous theory that the differences heard/noted in good replay systems - solidity of sound stage & reproduction of lower-level signals - may well be related to improvements in the stability of these cues: inter-channel level & timing (ICLD & ICTD).

I don't accept your statement that "it's not a problem in audio hardware" - have you tried the link to the demo I gave? Do you hear the shift in sound from left to right resulting from changes in ICLD &/or ICTD? Imagine what the result would be if ICLD & ICTD were not as stable as they are in real-life soundscapes - we would perceive it as an unnatural sound. A more stable reproduction of these cues would give a more natural sound. How can you say it's not a problem, if this is the case & we have no measurements for it?
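To illustrate what I mean by cue stability, here's a crude sketch that imposes a slowly wandering ICLD on a mono file, so you can listen for the image wandering. Python/numpy/soundfile assumed, "mono.wav" is a placeholder, and a wandering ICTD is left out because it would need a time-varying fractional delay.

```python
# Sketch: impose a slowly wandering inter-channel level difference (ICLD)
# on a mono source to hear how cue instability smears the phantom image.
import numpy as np
import soundfile as sf

x, fs = sf.read("mono.wav")
if x.ndim > 1:
    x = x.mean(axis=1)                          # fold to mono if needed

t = np.arange(len(x)) / fs
wander_db = 1.0 * np.sin(2 * np.pi * 0.5 * t)   # +/-1 dB ICLD wander at 0.5 Hz
gain = 10 ** (wander_db / 40.0)                 # split the dB wander between L & R

stereo = 0.9 * np.stack([x * gain, x / gain], axis=1)
sf.write("mono_with_wandering_icld.wav", stereo, fs)
```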
 

jkeny

Industry Expert, Member Sponsor
Feb 9, 2012
3,374
42
383
Ireland
Here's a test to see how sensitive you are to ICTDs (some people aren't): http://auditoryneuroscience.com/topics/binaural-cue-demos
Individuals tend to vary somewhat in how sensitive they are to ITDs, and if your sensitivity to ITDs is very low then you may not hear any movement in the first example, and not much difference in the 2nd to 4th examples. (However, it is also possible that the sound card or sound software on your computer is cutting corners and not reproducing ITDs accurately, as seems to be the case for example on my Acer Aspire 1810 laptop). In contrast, if you are very sensitive to ITDs, you may find that the first example gives more compelling movement than the 2nd. Your individual sensitivity to ITDs will affect how much movement you hear in the 4th example, if any, and in which direction.

Running directly from my laptop's headphone output, I can hear the location changes in all 4 examples (without looking at the screen for prompting :))
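That note about sound cards "cutting corners" suggests a crude sanity check one could script for any single software stage in the chain - impose a known ITD, run the signal through the stage, and measure what survives. Here's a sketch using a sample-rate converter purely as an example stage (Python/scipy assumed; a real playback chain obviously has many more stages than this):

```python
# Sketch: does a resampling stage preserve a small, known ITD?
# A fractional delay is imposed on one channel of a noise burst, the pair is
# resampled 44.1 kHz -> 48 kHz, and the delay is re-measured by
# cross-correlation on an 8x oversampled copy (for sub-sample resolution).
import numpy as np
from scipy.signal import resample_poly, correlate

fs = 44100
n = fs // 2
noise = np.random.default_rng(0).standard_normal(n)

itd_s = 0.22e-3
freqs = np.fft.rfftfreq(n, d=1.0 / fs)
# fractional delay applied via a frequency-domain phase shift
delayed = np.fft.irfft(np.fft.rfft(noise) * np.exp(-2j * np.pi * freqs * itd_s), n)

left, right = noise, delayed
# the stage under test: a 44.1 kHz -> 48 kHz polyphase resampler
left48 = resample_poly(left, 160, 147)
right48 = resample_poly(right, 160, 147)

up = 8
l_up, r_up = resample_poly(left48, up, 1), resample_poly(right48, up, 1)
corr = correlate(l_up, r_up, mode="full", method="fft")
lag = int(np.argmax(corr)) - (len(r_up) - 1)
print(f"imposed ITD = {itd_s * 1e3:.3f} ms, measured ~ {-1000.0 * lag / (48000 * up):.3f} ms")
```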
 
Last edited:

Kal Rubinson

Well-Known Member
May 4, 2010
2,361
702
1,700
NYC
www.stereophile.com
Anyway, here's an interesting demo of the trade-off between ILD (level) & ITD (timing): http://auditoryneuroscience.com/topics/time-intensity-trading
It demonstrates how level & time are interrelated in sound localisation.
This is definitely related to the two discrete brain mechanisms that distinguish interaural differences for localization.

Above 2 kHz, intensity differences are created by the shadow of the head. One mechanism opposes L/R amplitude signals in a mutually inhibitory circuit to create a (different) difference on each side of the brain. Input from one side excites the ipsilateral LSO and, via the MNTB, inhibits the contralateral LSO. Both LSOs are needed for the full range of horizontal source positions.
[Attached image: Picture2.jpg - diagram of the LSO level-difference circuit]

The other mechanism uses arrays of neurons on each side of the brain, each of which receives input from both sides via pathways that introduce subtle timing delays. When the introduced latency equals the actual difference at the two ears, that neuron signals the specific laterality. MSO neurons respond best to simultaneous contralateral and ipsilateral input (coincidence detection). This mechanism is most useful below 3 kHz.
[Attached image: Picture1.jpg - diagram of the MSO delay-line/coincidence circuit]
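For the programmers in the thread, that coincidence-detection scheme maps almost directly onto a bank of delay lines feeding multiply-and-sum "neurons". A toy sketch (Python/numpy assumed; purely illustrative, not a physiological model):

```python
# Toy Jeffress-style coincidence-detector array: each "neuron" applies a
# different internal delay to one ear's signal and correlates it with the
# other ear's; the neuron whose internal delay cancels the actual ITD
# responds most strongly.
import numpy as np

fs = 48000
t = np.arange(int(0.1 * fs)) / fs
tone = np.sin(2 * np.pi * 500 * t)

true_itd = 0.25e-3                               # right ear lags by 0.25 ms
shift = int(round(true_itd * fs))
left = tone
right = np.concatenate([np.zeros(shift), tone])[:len(tone)]

internal_delays = np.arange(-20, 21)             # in samples (~ +/-0.4 ms)
responses = [np.sum(left * np.roll(right, -d)) for d in internal_delays]

best = internal_delays[int(np.argmax(responses))]
print(f"best internal delay = {1000.0 * best / fs:+.3f} ms (true ITD = {true_itd * 1e3:.3f} ms)")
```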
 

jkeny

Industry Expert, Member Sponsor
Feb 9, 2012
3,374
42
383
Ireland
Thanks Kal,
Yes, I saw a nice animation of the mechanism shown in the last diagram of your post - the Jeffress model, which is one theory of how it works:
http://auditoryneuroscience.com/topics/jeffress-model-animation

It also states this "Whether the Jeffress model, such as it is shown here, is really a good description of the operation of the mammalian MSO is becoming increasingly controversial."
 

Phelonious Ponk

New Member
Jun 30, 2010
8,677
23
0
Well, I have a tenuous theory that the differences heard/noted in good replay systems - solidity of sound stage & reproduction of lower-level signals - may well be related to improvements in the stability of these cues: inter-channel level & timing (ICLD & ICTD).

I don't accept your statement that "it's not a problem in audio hardware" - have you tried the link to the demo I gave? Do you hear the shift in sound from left to right resulting from changes in ICLD &/or ICTD? Imagine what the result would be if ICLD & ICTD were not as stable as they are in real-life soundscapes - we would perceive it as an unnatural sound. A more stable reproduction of these cues would give a more natural sound. How can you say it's not a problem, if this is the case & we have no measurements for it?

Let's deal with level, as it can very easily be understood by the non-technical (me). I have no doubt that the aural shift occurs when the Inter Channel Levels are changed in these experiments. That is exactly what happens to individual instruments when you turn the pan control on a mixing board or to the finished mix when you turn the balance knob on your preamp: You change the relative levels of the channels. The question is do inconsistencies like the ones that have been created for the experiment occur in quality playback equipment? I don't know the answer, but my ears tell me no, because good monitoring systems/rooms and headphones can achieve horizontal imaging that is actually more stable than what we encounter in real-life soundscapes, and that imaging relaxes, and becomes more natural, when it is smeared a bit by natural room acoustics.

Can it be measured? Yes I'm sure you could measure the relative level of a signal in the channels of a stereo playback system. Feed a mono signal to both channels. Measure the outputs. It will tell you nothing about the system's ultimate ability to image in the horizontal plane that couldn't be altered by the mix, the room, or the balance control on your preamp, but I can't think of a single good reason why you couldn't measure it.
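Something like this would do it, I imagine - a minimal sketch, assuming the mono feed has been captured back as a two-channel file (Python/numpy/soundfile; the filename is a placeholder):

```python
# Sketch of the mono-feed check: feed the same mono signal to both channels,
# capture the system's L/R outputs as a stereo file, report the difference.
import numpy as np
import soundfile as sf

x, fs = sf.read("mono_loopback.wav")            # expects shape (samples, 2)
rms = np.sqrt(np.mean(x ** 2, axis=0))          # per-channel RMS
print(f"L-R level difference = {20 * np.log10(rms[0] / rms[1]):+.2f} dB")
```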

Maybe I'm missing something...

Tim
 

jkeny

Industry Expert, Member Sponsor
Feb 9, 2012
3,374
42
383
Ireland
Let's deal with level, as it can very easily be understood by the non-technical (me). I have no doubt that the aural shift occurs when the Inter Channel Levels are changed in these experiments.
Can you also hear shifts in location when the timing is changed? I'm interested in your reply, as it relates directly to the last question in your post, "Maybe I'm missing something...". You have steadfastly maintained that you hear no differences between systems where others have consistently reported hearing them, & I'm interested in why.
That is exactly what happens to individual instruments when you turn the pan control on a mixing board or to the finished mix when you turn the balance knob on your preamp: You change the relative levels of the channels. The question is do inconsistencies like the ones that have been created for the experiment occur in quality playback equipment? I don't know the answer, but my ears tell me no, because good monitoring systems/rooms and headphones can achieve horizontal imaging that is actually more stable than what we encounter in real-life soundscapes, and that imaging relaxes, and becomes more natural, when it is smeared a bit by natural room acoustics.
In relation to level, I also doubt that differences of 3 dB would happen between channels without noticeable degradation in the sound elsewhere.
Edit: However, I wonder what effect a fluctuating signal level might have on the sound stage. In other words, how small does a fluctuation need to be before we stop sensing its effect on the image? Could it be below 1 dB? Who knows unless tests/measurements are done? Maybe they have been?

Can it be measured? Yes I'm sure you could measure the relative level of a signal in the channels of a stereo playback system. Feed a mono signal to both channels. Measure the outputs. It will tell you nothing about the system's ultimate ability to image in the horizontal plane that couldn't be altered by the mix, the room, or the balance control on your preamp, but I can't think of a single good reason why you couldn't measure it.

Maybe I'm missing something...

Tim
Edit: You may be missing the idea of fluctuation & how much more sensitive we are to relative changes than to absolute ones, i.e. we can tell if a tone alters but not necessarily identify the absolute pitch of a tone. Yes, I think you may also be missing the time domain, Tim. Did you notice a shift in localisation in that demo when the ICTD (timing) was altered? A timing difference of 0.22 ms caused a shift in localisation according to my hearing. Is this timing difference possible in playback hardware/software? Has it been measured? At what value does this timing difference become inaudible - 0.1 ms, lower?
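On the "at what value is it inaudible" question: one sample at 44.1 kHz is already about 23 microseconds, so any measurement would need sub-sample resolution. A sketch of the standard parabolic-interpolation trick on the cross-correlation peak (Python/scipy assumed; "capture.wav" is a placeholder for a stereo capture of the system under test):

```python
# Sketch: sub-sample ICTD estimate by parabolic interpolation of the
# cross-correlation peak. One sample at 44.1 kHz is ~22.7 us, while reported
# ITD thresholds are on the order of 10 us, so whole-sample lags are too coarse.
import numpy as np
import soundfile as sf
from scipy.signal import correlate

x, fs = sf.read("capture.wav")
left, right = x[:, 0], x[:, 1]

corr = correlate(left, right, mode="full", method="fft")
k = int(np.argmax(corr))                        # assumes the peak is not at an edge
y0, y1, y2 = corr[k - 1], corr[k], corr[k + 1]
frac = 0.5 * (y0 - y2) / (y0 - 2 * y1 + y2)     # parabolic peak interpolation
lag = (k - (len(right) - 1)) + frac
print(f"ICTD ~ {1e6 * lag / fs:+.2f} microseconds (sign convention to be verified)")
```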
 
Last edited:

Robh3606

Well-Known Member
Aug 24, 2010
1,477
467
1,155
Destiny
In relation to level, I also doubt that differences of 3 dB would happen between channels without noticeable degradation in the sound elsewhere.

Well, look at the frequency response windows that many manufacturers use for speakers. +/-3 dB is a good example and a very common window spec. Unless you pair-match your speakers, your imaging can certainly suffer. Using those numbers you could have a worst-case 6 dB difference between the two speakers in certain bands. Depending on where those bands are and how wide they are, that could shift a mono noise signal left or right as you sweep through the frequency range.
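As a rough illustration, you could compare a measured pair band by band - a sketch assuming two-column text exports (frequency in Hz, level in dB) from whatever measurement program you use; the filenames and 1/3-octave averaging are my own choices:

```python
# Sketch: per-band level mismatch between two measured speaker responses,
# to see where an unmatched pair could pull a phantom image sideways.
import numpy as np

fl, dl = np.loadtxt("left_spl.txt", unpack=True)        # hypothetical exports
fr, dr = np.loadtxt("right_spl.txt", unpack=True)

centres = 1000.0 * 2.0 ** (np.arange(-10, 11) / 3.0)    # 1/3-octave centres, ~100 Hz-10 kHz
for fc in centres:
    lo, hi = fc / 2 ** (1 / 6), fc * 2 ** (1 / 6)
    band_l = dl[(fl >= lo) & (fl < hi)]
    band_r = dr[(fr >= lo) & (fr < hi)]
    if band_l.size and band_r.size:
        # crude dB averaging within the band, good enough for a comparison
        print(f"{fc:7.0f} Hz: L-R = {band_l.mean() - band_r.mean():+5.1f} dB")
```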

Yes, I think you may also be missing the time domain,

Why would you be worried about timing errors between a pair of speakers? I would assume they are symmetrically placed in the room. They don't move, and any timing errors between drivers should be the same for each speaker and not change with the source.

Rob
 

jkeny

Industry Expert, Member Sponsor
Feb 9, 2012
3,374
42
383
Ireland
Well, look at the frequency response windows that many manufacturers use for speakers. +/-3 dB is a good example and a very common window spec. Unless you pair-match your speakers, your imaging can certainly suffer. Using those numbers you could have a worst-case 6 dB difference between the two speakers in certain bands. Depending on where those bands are and how wide they are, that could shift a mono noise signal left or right as you sweep through the frequency range.
As I said, it may not be the absolute level differences that matter but rather any fluctuation between channels. I'm not suggesting that the speakers themselves fluctuate (although group delay differences between different parts of the frequency spectrum won't help project a realistic image either), but rather that any such fluctuation would come from the reproduction system upstream of the speakers.

Why would you be worried about timing errors between a pair of speakers? I would assume they are symmetrically placed in the room. They don't move, and any timing errors between drivers should be the same for each speaker and not change with the source.

Rob
Again, it's not the speakers in isolation but the full reproduction system that I'm talking about.
 
