Griesinger's teachings show up in Klippel, Linkwitz, Toole, and Geddes

Duke LeJeune

[Industry Expert]/Member Sponsor
Jul 22, 2013
Princeton, Texas
Hi Klaus,

Thank you for another well thought-out reply.

When a room sounds bad, it is generally reverb time which is too high. Speech intelligibility is a good indicator.

Well at least we agree that reflections done wrong can make a room sound bad!

I don't dispute that high reverb times are detrimental. Toole's book (3rd edition page 81) says that "typical reverberation times for domestic listening rooms and studio control rooms are 0.2 - 0.4 seconds." On the next page is a graph showing that the IEC target is between 0.25 and 0.6 seconds, and on the following page he writes "it is clear that a well-furnished domestic room can be a good basis for a listening environment, without any additional acoustical devices." In other words, reverb times are not normally an issue in typical domestic rooms.
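As a rough sanity check of those numbers, here is a quick Sabine estimate for a hypothetical well-furnished room; the dimensions and absorption coefficients below are illustrative assumptions, not measurements of any particular room:

```python
# Rough Sabine reverberation-time estimate for a hypothetical
# well-furnished domestic room. All dimensions and absorption
# coefficients are illustrative assumptions, not measurements.

def sabine_rt60(volume_m3, surfaces):
    """RT60 = 0.161 * V / A, with A the total absorption in m^2-sabins."""
    total_absorption = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / total_absorption

# 5 m x 4 m x 2.5 m room: floor and ceiling 20 m^2 each, walls 45 m^2.
surfaces = [
    (20.0, 0.30),  # carpeted floor
    (20.0, 0.10),  # painted ceiling
    (45.0, 0.15),  # walls with some furnishings and drapes
]
rt60 = sabine_rt60(5 * 4 * 2.5, surfaces)
print(f"Estimated RT60: {rt60:.2f} s")
```

With these assumed furnishings the estimate lands around 0.55 seconds, inside the IEC window quoted above; swap in harder surfaces and it climbs quickly.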

And of course reverb times being potentially detrimental does not preclude early reflections also being potentially detrimental.

Well, I’m siding with Toole, not because it’s Toole, but because the relevant literature supports his point of view. That’s why I would like to see the literature Geddes is basing his differing opinion upon. Based on what I’ve read so far I disagree with him.

That's fine.

If you happen to have good relation with Earl, maybe you could ask him.

I would not be comfortable asking Earl to dig up references for me for the sake of an internet disagreement.

* * * *

Klaus, by any chance, do you ever attend any US audio shows?

The reason I ask is, our display system is switchable between a "Conventional" mode and a "Two-Streams Paradigm" mode, though most of the time we don't make switching back and forth part of our demo because it's a bit unwieldy. But I would like to demonstrate it for you, if the opportunity arises, and it would be good to meet you (regardless of what you think of our demo).
 

KlausR.

Well-Known Member
Dec 13, 2010
Hello Duke,
Well at least we agree that reflections done wrong can make a room sound bad!
And of course reverb times being potentially detrimental does not preclude early reflections also being potentially detrimental.

Fully agree with both statements, so we are finally in agreement: reflections have the potential to do bad things. That's what I've always said, and the literature supports it: reflections may do harm under certain conditions, but they don't do harm under all conditions.

I would not be comfortable asking Earl to dig up references for me for the sake of an internet disagreement.

That would be more of a basic question than something related to our discussion here. Maybe I'll give it a try via LinkedIn.

Klaus, by any chance, do you ever attend any US audio shows?
I’m living in good ole Europe, so no, I won’t attend any US show any time soon.

The reason I ask is, our display system is switchable between a "Conventional" mode and a "Two-Streams Paradigm" mode, though most of the time we don't make switching back and forth part of our demo because it's a bit unwieldy. But I would like to demonstrate it for you, if the opportunity arises, and it would be good to meet you (regardless of what you think of our demo).

Yeah, it would be nice to meet you in person, and to listen to those two-stream speakers, but I'm afraid that'll have to be in another life.

Klaus
 

jkeny

Industry Expert, Member Sponsor
Feb 10, 2012
Ireland
Coming late to this thread, but I first came across the late ceiling splash speakers from Jim Romeyn when I supplied my DACs to him & Duke for the invitation they received to set up a room in the 2017 RMAF innovation area. I believe they showcased the Azel speaker at that time & it was very well received. Duke, correct me if I'm wrong on any of this.

When Jim told me about some of the concepts behind the late ceiling splash speakers, my first thoughts were of Griesinger's research on hall acoustics, envelopment, clarity, reverberation, LOC, etc., & of the wider research into auditory scene analysis (ASA). (My initial email reply is below.)

Maybe some way of reconciling the divergent views about this late-reflections concept comes from a study of ASA, which really has at its core an understanding that perceptions are the brain's best-fit analysis of the nerve impulses being received - the best fit being to the internal models/rules/knowledge/behaviour built up over time/experience from the natural world of sounds/sights. Why is it only a "best fit" & not a 100% accurate analysis? Because there is not enough information in the nerve impulses being received by the brain to derive a unique solution in the time required (real time).

Given this as my core understanding (others may disagree) of how our perceptions work, I derive a number of aspects, some of which I believe are relevant to this discussion:
- a best-fit analysis does not imply 100% accuracy, so things like the scanning rate of TV screens are fine for almost all people & the image is considered normal
- the whole area of illusions in perception uncovers this "best fit" approach & shows that we use more than the signals themselves - we use knowledge about the behaviour of the sight/sound in the surrounding context. So take away the context & present the image/sound in isolation, & the illusion disappears.

My take on the late-reverberation idea is that even though the generated "late ceiling splash" may not be an accurate recreation of the recorded late reverberations, it may work fine in lots of circumstances for our auditory perceptual engine, in that it can present a set of late signals which are grouped together (this is what ASA does) & categorised as the late reflections of the direct sound. I believe that these late signals need to have certain characteristics in order to be categorised as belonging to the direct signal - Harman, I believe, demonstrated that speakers are preferred which have an off-axis reflection spectral makeup reasonably close to the direct, on-axis spectral makeup.

I believe the late splash itself can be modified in terms of timing & spectral content in order to tweak it to various rooms/conditions. Again, Duke can correct anything I have wrong here.
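To make the "spectral match" idea concrete, here is a toy sketch of one metric that could be used to compare an off-axis response against the on-axis response; all of the response shapes below are invented for illustration, not measurements of any real speaker:

```python
import numpy as np

# Toy comparison of a flat on-axis response against two invented
# off-axis responses: a smooth, gently tilted one (well-behaved
# directivity) and one with a narrow midrange suckout (poor
# directivity). All curves are illustrative, not measurements.

freqs = np.logspace(np.log10(200), np.log10(10000), 200)   # Hz
on_axis = np.zeros_like(freqs)                             # flat, 0 dB

# Smooth copy: gentle tilt reaching -3 dB at 10 kHz.
smooth_off = -3.0 * np.log10(freqs / 200) / np.log10(50)
# Same tilt plus an 8 dB Gaussian notch centred on 2 kHz.
notched_off = smooth_off - 8.0 * np.exp(
    -((np.log10(freqs) - np.log10(2000)) / 0.1) ** 2)

def spectral_deviation(a_db, b_db):
    """RMS deviation between two responses, ignoring overall level."""
    d = a_db - b_db
    d = d - d.mean()
    return float(np.sqrt(np.mean(d ** 2)))

dev_smooth = spectral_deviation(on_axis, smooth_off)
dev_notched = spectral_deviation(on_axis, notched_off)
print(f"smooth off-axis deviation:  {dev_smooth:.2f} dB")
print(f"notched off-axis deviation: {dev_notched:.2f} dB")
```

The lower the deviation, the more closely the reflected spectrum tracks the direct sound - which, as I understand it, is the property the Harman work is said to reward.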

My initial response to Jim's emails in 2017 was:
Here's my take:
- First off, I find the Toole/Geddes phrase "second look" misleading in its simplicity - no offense meant, but I think it hides some salient issues about auditory processing
- remember I said that once we understand that what we perceive is 90% processing & 10% signal, it changes our whole perspective on this hobby?
- ASA is the study of this processing & here's a bit of a summary of its findings:
  • the data we get from our eardrum (nerve signals on the auditory nerve) are what linguists call "poverty of the stimulus" - in other words, there isn't enough info in the signals to uniquely solve the problem of auditory perception - creating an auditory scene in our brain that matches the outside physical world of sound objects.
  • In other words, we need to analyse which signals belong to which objects - grouping them together & keeping track of this grouping as a solid auditory object through space & time.
  • There are many techniques ASA uses to do this, but I believe what we are seeing with your technique is a perfect example of this grouping in operation.
  • ASA relies heavily on pattern matching & learns these patterns as babies & throughout life by experiencing the behaviour of sound objects in the real world. So we have embedded this as the template by which all sounds are interpreted.
  • In the real world, such as a large hall, we hear the direct sound & we hear its reflections more than 10 ms later. We group these two sets of sounds together, even though they are separated in time, because their characteristics best match one another (timing, spectral fingerprint, harmonics, etc.).
  • So it's not that we get a "second look" at this - it's that we get a better match to what we intrinsically know happens in the real world - a reflection which is spectrally & temporally close to the main sound.
  • When we use just a stereo pair of speakers, we are relying on their off-axis response to create these reflections, & most speakers are not producing anything like an accurate match to the main sound in amplitude & spectrum.
  • So your clever bit of lateral thinking is not to try to improve the off-axis response of the main speakers but to swamp their off-axis behaviour with a controllable, better match to the main sound.
  • ASA will naturally group this with the main sound, as it is the best match to what it knows & expects from real-world experience - it essentially discards the off-axis stuff in favour of the better match.
  • That's why listening is more relaxed - our processing system is not straining to resolve the mismatches it's finding between the off-axis reflections & the main sound - this processing task tires us.
Hope all that makes sense?
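A minimal sketch of the kind of signal those bullets describe - a copy of the direct sound that is delayed by more than 10 ms, attenuated, and gently low-passed. The delay, level, and filter values here are assumptions for illustration, not the actual LCS tuning:

```python
import numpy as np

# Minimal sketch of a "late, spectrally related" signal: a delayed,
# attenuated, gently low-passed copy of the direct sound. The delay,
# level, and filter values are assumptions for illustration, not the
# actual LCS tuning.

fs = 48000                        # sample rate, Hz
t = np.arange(fs // 10) / fs      # 100 ms of signal
direct = np.sin(2 * np.pi * 440 * t) * np.exp(-30 * t)  # decaying tone burst

delay_ms = 12.0                   # > 10 ms, as in the large-hall example
delay_samples = int(fs * delay_ms / 1000)
level = 10 ** (-6 / 20)           # reflection roughly 6 dB down

# One-pole low-pass to mimic high-frequency loss at reflecting surfaces.
alpha = 0.25
lowpassed = np.zeros_like(direct)
acc = 0.0
for i, x in enumerate(direct):
    acc += alpha * (x - acc)
    lowpassed[i] = acc

reflection = np.zeros_like(direct)
reflection[delay_samples:] = level * lowpassed[:-delay_samples]
at_ear = direct + reflection      # what arrives at the listener
print(f"reflection delayed by {delay_samples} samples ({delay_ms} ms)")
```

Because the late copy shares the direct sound's timing and (filtered) spectrum, the grouping described in the bullets has something plausible to latch onto.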
 

Duke LeJeune

Coming late to this thread, but I first came across the late ceiling splash speakers from Jim Romeyn when I supplied my DACs to him & Duke for the invitation they received to set up a room in the 2017 RMAF innovation area. I believe they showcased the Azel speaker at that time & it was very well received. Duke, correct me if I'm wrong on any of this:

Hello Jkeny, thank you very much for your in-depth and illuminating commentary!

Dude, you just shifted my paradigm. I think your analysis is "the best fit"... including your insight about perception being 90% analysis and 10% signal. Thank you for bringing your informed perspective as an expert in the field of signal processing to this thread... and THANK YOU for letting us use your Ciunas DAC and your power supply (for the Ultra Rendu) at that 2017 RMAF show!

For those wondering what he's referring to, at RMAF 2017 we showed a floorstanding speaker which incorporated an upwards-and-backwards-firing coaxial array in the back of the cabinet, down close to the floor. This rear-firing array's job was to contribute some spectrally-correct, relatively late-onset reverberant energy. This energy was aimed to bounce from wall to ceiling to listening area, to maximize how long it took to arrive without needing as much distance from the wall as a bipole or dipole speaker. Using a remote we could toggle the rear array on and off.

The rear-firing array was not very loud relative to the front-firing array, and when it was on, its calculated impact on the SPL at the seated position was an increase of less than half a decibel. I'd ask people whether they were hearing "more of the recording" or "more of the hotel room" when the rear-firing array was engaged, and nobody ever said "more of the hotel room".
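The "less than half a decibel" figure is easy to check with a back-of-envelope power sum; the -10 dB relative level and the path lengths below are assumptions for illustration, not the actual design values:

```python
import math

# Back-of-envelope check: adding an uncorrelated source some dB below
# the main one raises the total SPL only slightly. The -10 dB relative
# level and the path lengths are assumptions for illustration, not the
# actual design values.

def spl_increase(relative_db):
    """dB rise when an uncorrelated source `relative_db` below the
    main source is added (power summation)."""
    return 10 * math.log10(1 + 10 ** (relative_db / 10))

for rel in (-6, -10, -15):
    print(f"rear array at {rel} dB -> total SPL rises {spl_increase(rel):.2f} dB")

# Extra arrival delay from the longer wall-to-ceiling bounce path:
speed_of_sound = 343.0                # m/s
direct_path, bounce_path = 3.0, 7.5   # metres, illustrative
delay_ms = (bounce_path - direct_path) / speed_of_sound * 1000
print(f"reflection arrives {delay_ms:.1f} ms after the direct sound")
```

At an assumed -10 dB relative level the rise is about 0.41 dB, consistent with the under-half-a-decibel figure, and a few extra metres of bounce path is enough to push the arrival past the 10 ms mark discussed earlier in the thread.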

Having read your analysis, Jkeny, here is what I think was going on (and please feel free to correct me!): With the rear-firing arrays engaged, the ears' "best fit" as to what it was hearing was tipped towards the "soundscape on the recording", and correspondingly away from "hotel room".

DaveC stopped by the room and included some commentary in the opening post of this thread: https://www.whatsbestforum.com/threads/davecs-best-of-rmaf-2017.24101/

Maybe some way of reconciling the divergent views about this late-reflections concept comes from a study of ASA, which really has at its core an understanding that perceptions are the brain's best-fit analysis of the nerve impulses being received - the best fit being to the internal models/rules/knowledge/behaviour built up over time/experience from the natural world of sounds/sights. Why is it only a "best fit" & not a 100% accurate analysis? Because there is not enough information in the nerve impulses being received by the brain to derive a unique solution in the time required (real time).

Given this as my core understanding (others may disagree) of how our perceptions work, I derive a number of aspects, some of which I believe are relevant to this discussion:
- a best-fit analysis does not imply 100% accuracy, so things like the scanning rate of TV screens are fine for almost all people & the image is considered normal
- the whole area of illusions in perception uncovers this "best fit" approach & shows that we use more than the signals themselves - we use knowledge about the behaviour of the sight/sound in the surrounding context. So take away the context & present the image/sound in isolation, & the illusion disappears.

My take on the late-reverberation idea is that even though the generated "late ceiling splash" may not be an accurate recreation of the recorded late reverberations, it may work fine in lots of circumstances for our auditory perceptual engine, in that it can present a set of late signals which are grouped together (this is what ASA does) & categorised as the late reflections of the direct sound. I believe that these late signals need to have certain characteristics in order to be categorised as belonging to the direct signal - Harman, I believe, demonstrated that speakers are preferred which have an off-axis reflection spectral makeup reasonably close to the direct, on-axis spectral makeup.

The above is all such great stuff I'm quoting it here again for inclusion, though I don't have anything to add to it.

Finally, I hope you don't mind if I cut-n-paste from your e-mail to Jim Romeyn, just in case anyone skipped over it. This is paradigm-shifting stuff, and again I thank you for bringing your advanced insights and analysis into this conversation:

- Once we understand that what we perceive is 90% processing & 10% signal, it changes our whole perspective on this hobby.

- Auditory Scene Analysis (ASA) is the study of this processing & here's a bit of a summary of its findings:
  • the data we get from our eardrum (nerve signals on the auditory nerve) are what linguists call "poverty of the stimulus" - in other words, there isn't enough info in the signals to uniquely solve the problem of auditory perception - creating an auditory scene in our brain that matches the outside physical world of sound objects.
  • In other words, we need to analyse which signals belong to which objects - grouping them together & keeping track of this grouping as a solid auditory object through space & time.
  • There are many techniques ASA uses to do this, but I believe what we are seeing with your technique is a perfect example of this grouping in operation.
  • ASA relies heavily on pattern matching & learns these patterns as babies & throughout life by experiencing the behaviour of sound objects in the real world. So we have embedded this as the template by which all sounds are interpreted.
  • In the real world, such as a large hall, we hear the direct sound & we hear its reflections more than 10 ms later. We group these two sets of sounds together, even though they are separated in time, because their characteristics best match one another (timing, spectral fingerprint, harmonics, etc.).
  • So it's not that we get a "second look" at this - it's that we get a better match to what we intrinsically know happens in the real world - a reflection which is spectrally & temporally close to the main sound.
  • When we use just a stereo pair of speakers, we are relying on their off-axis response to create these reflections, & most speakers are not producing anything like an accurate match to the main sound in amplitude & spectrum.
  • So your clever bit of lateral thinking is not to try to improve the off-axis response of the main speakers but to swamp their off-axis behaviour with a controllable, better match to the main sound.
  • ASA will naturally group this with the main sound, as it is the best match to what it knows & expects from real-world experience - it essentially discards the off-axis stuff in favour of the better match.
  • That's why listening is more relaxed - our processing system is not straining to resolve the mismatches it's finding between the off-axis reflections & the main sound - this processing task tires us.
[That last point you make has been my perception as well... that the net result is quite non-fatiguing. It is great to have an explanation from an expert in the field of auditory scene analysis.]
 

jkeny

Hello Jkeny, thank you very much for your in-depth and illuminating commentary!

Dude, you just shifted my paradigm. I think your analysis is "the best fit"... including your insight about perception being 90% analysis and 10% signal. Thank you for bringing your informed perspective as an expert in the field of signal processing to this thread... and THANK YOU for letting us use your Ciunas DAC and your power supply (for the Ultra Rendu) at that 2017 RMAF show!
Thanks, Duke, but I'm no expert in the field of signal processing or ASA, just an interested researcher, so I'm open to correction at any stage - I consider these ideas my working hypothesis until I find, or am shown, evidence or logic which changes it.
The 90/10% split is a simplification for the sake of emphasising the main point - that all our perceptions are actually the brain's processing of nerve impulses, creating a useful internal representation of the world of objects in which we exist. It is far more complicated than the 90/10 simplification, but by not getting into the weeds it helps to communicate the concept.

Once that concept is absorbed lots of interesting thoughts/ideas come from it which have a bearing on our hobby of music playback (in the main).

For those wondering what he's referring to, at RMAF 2017 we showed a floorstanding speaker which incorporated an upwards-and-backwards-firing coaxial array in the back of the cabinet, down close to the floor. This rear-firing array's job was to contribute some spectrally-correct, relatively late-onset reverberant energy. This energy was aimed to bounce from wall to ceiling to listening area, to maximize how long it took to arrive without needing as much distance from the wall as a bipole or dipole speaker. Using a remote we could toggle the rear array on and off.

The rear-firing array was not very loud relative to the front-firing array, and when it was on, its calculated impact on the SPL at the seated position was an increase of less than half a decibel. I'd ask people whether they were hearing "more of the recording" or "more of the hotel room" when the rear-firing array was engaged, and nobody ever said "more of the hotel room".

Having read your analysis, Jkeny, here is what I think was going on (and please feel free to correct me!): With the rear-firing arrays engaged, the ears' "best fit" as to what it was hearing was tipped towards the "soundscape on the recording", and correspondingly away from "hotel room".

DaveC stopped by the room and included some commentary in the opening post of this thread: https://www.whatsbestforum.com/threads/davecs-best-of-rmaf-2017.24101/
Yes, the bolded text is exactly what I was trying to convey - you say it more succinctly than I do, however.

.....
  • That's why listening is more relaxed - our processing system is not straining trying to solve the mismatches it's finding between the off-axis reflections & the main sound - this processing task tires us.
[That last point you make has been my perception as well... that the net result is quite non-fatiguing. It is great to have an explanation from an expert in the field of auditory scene analysis.]
Again, I don't claim any expertise, as the readings & research in this field of psychoacoustics are vastly greater than I could ever hope to absorb or even read. Not only that, but crossing disciplines & understanding how it might relate to music playback & audio electronics is another body of knowledge that requires study & absorption - again, I'm not an expert in any of this, but I try to let my intuition, arising from my own experiments in audio playback & the reports of others, direct me in improving a little the area of audio playback I focus on.
 

jkeny

....
Having read your analysis, Jkeny, here is what I think was going on (and please feel free to correct me!): With the rear-firing arrays engaged, the ears' "best fit" as to what it was hearing was tipped towards the "soundscape on the recording", and correspondingly away from "hotel room".
.....
Having re-read this, I wish to correct my earlier comment on it - I was too hasty. It was a late-night reply, & I'm no expert, as I already said, so my antennae don't automatically react; I have to think things through rather than being in command of the field as an expert would be - as a result, I'll make mistakes along the way.

I actually don't think the Late Ceiling Splash (LCS) is tipping our auditory perception "towards the "soundscape on the recording", and correspondingly away from "hotel room"" - what I believe is that the LCS is overlaying the room reflections which naturally result from a combination of the speakers' on-axis & off-axis waveforms reflecting off boundaries - it's overlaying them with the rear-firing version of the direct sound, which has been delayed & changed spectrally.

The result of this is far more complicated - does this LCS waveform perceptually dominate the other room reflections? If it does then my guess is that it will tip our perceptions towards hearing a bigger & possibly more natural illusion.

Remember that what is happening in 2-channel stereo playback is the creation of a believable illusion that real instruments & real voices are performing in a believable space, whether manufactured in the studio or recorded as a live event. The whole production process of miking, recording, mixing, etc. is an art form which results in a recording that we play back to appreciate. We probably need to treat it more like we do a film - it's a realistic illusion for our auditory perception, just as watching a movie is for our visual perception. The illusions created should be good enough not to draw attention to themselves, & then they allow our higher-level immersion in the visual or auditory scene. If there are aspects that perceptually jar, then our immersion is interrupted - or, more correctly, we find it more difficult to be immersed, as our perceptual engine is finding it more difficult to find the best fit, or it has to keep adjusting to different best fits as more signals arrive.

So, for instance, we notice immediately if there is a delay between sound & video in a film, particularly when it comes to voice. Why voice particularly? Because our internal perceptual models know that in the real world there is no perceptible delay between the movement of the lips & the sound of the words. OK, a dubbed film can be OK for many if we don't focus too much on the lips & allow the other visual aspects to dominate, but it also illustrates how malleable our auditory perception is - the "best fit" is not a binary choice; it has varying degrees. So in a film a delay in sound other than voice may be correct - it may represent a reverberant space - it's context dependent.

The same applies in stereo audio playback - we are playing a recording through two sources & our auditory perception is applying the 'best fit' analysis to ALL the received pressure waves, building, moment to moment, an internal auditory model from the resultant stream of nerve impulses. Now if this 'best fit' analysis doesn't encounter any difficult-to-match signals (i.e. streams of nerve impulses), it will do its job without undue fatigue & the result will be more relaxing, pleasing & immersive. If, however, the analysis struggles with anomalous signals, we will expend more brain processing on solving this & have less available for the higher-level functions which allow relaxation & immersion in the soundscape.

So, my long-winded explanation is that I believe LCS is doing something along these lines, allowing our auditory perception to latch onto a 'better fit' version of a reverberation signal than if the LCS were not in play.

The comments you heard at RMAF 2017 ("I'd ask people whether they were hearing 'more of the recording' or 'more of the hotel room' when the rear-firing array was engaged, and nobody ever said 'more of the hotel room'") are, I believe, them reporting that they perceive the room size you have dialled into the LCS signal, rather than the existing room reverberation - but I don't think it means they are necessarily hearing the room ambience (if any) in the recording.

Hope this makes sense - it's better explained in my email bullet points:

- Auditory Scene Analysis (ASA) is the study of this processing & here's a bit of a summary of its findings:
  • the data we get from our eardrum (nerve signals on the auditory nerve) are what linguists call "poverty of the stimulus" - in other words, there isn't enough info in the signals to uniquely solve the problem of auditory perception - creating an auditory scene in our brain that matches the outside physical world of sound objects.
  • In other words, we need to analyse which signals belong to which objects - grouping them together & keeping track of this grouping as a solid auditory object through space & time.
  • There are many techniques ASA uses to do this, but I believe what we are seeing with your technique is a perfect example of this grouping in operation.
  • ASA relies heavily on pattern matching & learns these patterns as babies & throughout life by experiencing the behaviour of sound objects in the real world. So we have embedded this as the template by which all sounds are interpreted.
  • In the real world, such as a large hall, we hear the direct sound & we hear its reflections more than 10 ms later. We group these two sets of sounds together, even though they are separated in time, because their characteristics best match one another (timing, spectral fingerprint, harmonics, etc.).
  • So it's not that we get a "second look" at this - it's that we get a better match to what we intrinsically know happens in the real world - a reflection which is spectrally & temporally close to the main sound.
  • When we use just a stereo pair of speakers, we are relying on their off-axis response to create these reflections, & most speakers are not producing anything like an accurate match to the main sound in amplitude & spectrum.
  • So your clever bit of lateral thinking is not to try to improve the off-axis response of the main speakers but to swamp their off-axis behaviour with a controllable, better match to the main sound.
  • ASA will naturally group this with the main sound, as it is the best match to what it knows & expects from real-world experience - it essentially discards the off-axis stuff in favour of the better match.
  • That's why listening is more relaxed - our processing system is not straining to resolve the mismatches it's finding between the off-axis reflections & the main sound - this processing task tires us.
 

Duke LeJeune

Thank you again, John!

The comments you heard at RMAF 2017 ("I'd ask people whether they were hearing 'more of the recording' or 'more of the hotel room' when the rear-firing array was engaged, and nobody ever said 'more of the hotel room'") are, I believe, them reporting that they perceive the room size you have dialled into the LCS signal, rather than the existing room reverberation - but I don't think it means they are necessarily hearing the room ambience (if any) in the recording.

The reason why I think the additional spectrally-correct, late-onset reverberant energy is conveying room ambience from the recording is this: The spatial "feel" seems to change noticeably and often significantly from one recording to the next, and even from one track to another on the same album. At least that has been my observation, and I think Jim Romeyn would agree, but let me cite some commentary from presumably competent listeners who don't have a dog in the fight.

At Axpona 2015 a reviewer for The Absolute Sound, Andrew Quint, stopped by our room. I gave him my little spiel, and in response he pulled out a thumb drive and declared, "I'd like to challenge that." On the thumb drive was a classical piece recorded in a concert hall Andrew is familiar with, the Concertgebouw in Amsterdam.

We played the track on his thumb drive and when it was over I asked him, "Well, how did we do?"

He replied, "It passed. It's not a gimmick; it works. I could clearly hear the Concertgebouw hall."

Andrew later posted this on my AudioCircle forum:

"The multichannel program on the RCO Live SACDs (there are dozens) get this last aspect ["truly a sense of sound being present in the air around you"] right; so did the Bienville Suite, nearly to the same degree, despite the presence of only two channels. My concern when Duke told me about the rear-firing drivers was that this would impart some generic, Bose-like spaciousness to the recording, but that wasn't the case—what I heard was the unique acoustic signature of the Concertgebouw."

Here's the link: https://www.audiocircle.com/index.php?topic=142537.msg1521087#msg1521087

Let me quote some imo relevant commentary from the reviewer team of Tyson and Jason (aka Pez) from the 2013 Rocky Mountain Audio Fest. First, a bit of background on what these guys were doing, as they were the most efficient reviewer team I have ever seen in action (one of the major print magazines saw their work and wanted to hire them, but they declined).

Before the show they picked out rooms to visit and five clips to listen to. The first three clips were Tyson's picks and he sat in the sweet spot, then they traded places for the last two, which were Jason's clips. One would type while the other was in the sweet spot, then after the last clip they'd quickly wrap it up and move on to the next room. They did this pretty much all day for three days, fifteen minutes per room as I recall, and visited our room late on the third day.

So the first three tracks mentioned here have Tyson's commentary, and the last two have Jason's. Tyson describes himself as a "tonality freak", while Jason is "more a soundstage and detail freak":

"Saint-Saens Violin & Piano – Maybe, no definitely the most beautiful presentation of this track at the entire show."

"Handel Aria – Ditto, just beautiful. I really expected the imaging to be washy in this room, due to the LCS technology. But the image is rock solid center. And a lot of ambiance stretching WAAAAYYYYY back, way past the physical boundaries of the room. Side to side imaging is not as impressive, but still very good."

"Mahler Symphony – Really nice, big spacious open sound. Again, it actually expands beyond the physical confines of the room. I don’t think ANY other system at the show has been able to pull that off."

"Ok outside of the sweet spot these sound good, but not mind blowing. In the sweet spot these are amongst the best I have heard in the show. I traded with Tyson for my tracks starting with the Civil Wars and there is a soooooo much more in the sweet spot. Great dynamics anywhere in the room, but imaging and soundstage that is quite breath taking and emotionality that is pretty heart stopping."

"Tom Waits ohhhhhh wow. Just amazing. This feels like a real performance. I haven't heard such a focused soundstage at this show period. Absolutely phenomenal. Easily a contender for best in show."

Here's the link; scroll down about halfway:
https://www.audiocircle.com/index.php?topic=120504.msg1267150#msg1267150

Now these are all anecdotes and anecdotes don't add up to data. So I cannot claim to have direct supporting data.

Anyway my point is, rather than resulting in the same LCS-enhanced larger-room spatial signature being grafted onto each recording (which we might expect if the playback room's reflections dominated spatial perception), there seems to be a variation in the ear's spatial analysis from one recording to the next, with the known hall signature on Andrew Quint's recording apparently being recognizable. And I think that is most likely coming from the recording itself.

What are your thoughts?
 

jkeny

Industry Expert, Member Sponsor
Feb 10, 2012
3,374
40
483
Ireland
Great examples, Duke - & you know what they say: real-world experience beats theory. We had a PM in Ireland who once said "that works in practice; now let's see if it works in theory" :)

I'm sure you can see the logic in my thinking above, so I need to think some more & see if I can reconcile the reported perceptions with psychoacoustic principles.

My initial thoughts are that we are listening to an illusion, an art form, not an accurate recreation of what we would hear if we were at the live event (presuming it's a recording of a live event). So even the stereo recording in the Concertgebouw concert hall will not be the same as listening to that same concert live in that hall - it depends on the mics & miking configuration used, any recording adjustments made afterwards, etc. We are only capturing a small portion of the actual soundscape. But the illusion can be wholly satisfying, & in a concert hall or room we know how the reflections will behave - we have an internal model for this which defines the delay time, HF attenuation, change in envelope shape, change in timbre, etc. Each & every one of these, if correctly presented, will add to the believability of the illusion, i.e. how natural it sounds. I suspect that the LCS reflections present a more accurate version of the reflected sound of the recording venue than the reflections of the direct-sound speakers & the room itself. It has always been a conundrum how playback in a room very different from the recording venue doesn't just end up sounding confused. It goes to show that our auditory perception works on 'best fit' principles (within a model constrained by our experience of how sound behaves in the world - this is the guiding principle) rather than on accuracy.

To elucidate further, normal non-LCS playback produces some reflections which don't always match the direct sound spectrally, so they aren't perceptually merged with it & are at times perceived somewhat apart from it - in other words, we notice the sound of the room because of these reflections which are slightly wrong at times. I don't know the off-axis spectral plots of your main speakers that deliver the direct sound, but I suspect the LCS method is a much more controlled way of ensuring the late reflections are more correct.

When our auditory perception finds this more correct reflected sound, ASA principles suggest that the direct sound & reflections are grouped as elements of the one sound & not as a room sound - the LCS signals become the principal sounds determining the background sound of the soundscape.

So, in essence, the room sound reflections are being swamped/replaced by the 'more correct' LCS reflections, or to be more precise, our auditory perception finds that the LCS reflections better match our internal model of natural reflections, & these form an integral part of the sound rather than any other reflections.
 

KlausR.

Well-Known Member
Dec 13, 2010
291
28
433
To elucidate further, normal non-LCS playback produces some reflections which don't always match the direct sound spectrally, so they aren't perceptually merged with it & are at times perceived somewhat apart from it - in other words, we notice the sound of the room because of these reflections which are slightly wrong at times. I don't know the off-axis spectral plots of your main speakers that deliver the direct sound, but I suspect the LCS method is a much more controlled way of ensuring the late reflections are more correct.

The way Duke describes the process, i.e. a bounce from wall to ceiling to listening area, this "LCS" looks like a somewhat delayed ceiling reflection, which would still arrive well within the time window of the precedence effect for speech and music. In my own listening room this reflection would result in a level increase of about 1 dB (assuming the sources are coherent; audible) and 0.3 dB (assuming the sources are incoherent; probably audible), respectively.
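The coherent/incoherent distinction here comes down to pressure versus power summation. A minimal sketch of the standard formulas (the -12 dB reflection level below is an arbitrary placeholder, not a measurement of any actual room):

```python
import math

def level_increase_db(r, coherent=True):
    """SPL increase when a single reflection with amplitude ratio r
    (reflection/direct) is added to the direct sound. Coherent
    sources sum in pressure; incoherent sources sum in power."""
    if coherent:
        return 20 * math.log10(1 + r)
    return 10 * math.log10(1 + r * r)

# Placeholder: a reflection 12 dB below the direct sound
r = 10 ** (-12 / 20)
print(f"coherent:   +{level_increase_db(r, True):.2f} dB")   # ~ +1.94
print(f"incoherent: +{level_increase_db(r, False):.2f} dB")  # ~ +0.27
```

For the same reflection level, the coherent case raises the combined SPL far more than the incoherent one, which is why the coherence assumption matters so much to audibility estimates.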

Non-LCS speakers may or may not have well-behaving off-axis response, but since very few manufacturers provide measurements, you simply don’t know. This wall-ceiling bounce is on-axis, hence spectrally more correct, but it travels via two boundaries so there might be some change in spectrum due to absorption.

In my room this delayed ceiling reflection would have a delay of 16 ms. Griesinger’s clip direct + first, which Griesinger uses to demonstrate the detrimental effects of early reflections, and which is the one Duke refers to to support his POV, includes a 17 ms reflection: “The soundfield data in figure 16 [soundfield data for seat DD 11] shows a strong reflection off the right side wall at 17ms, and the spectrum of this reflection shows that it is a double, a first order reflection from the side wall, and a second order reflection from the balcony underside and the side wall.” (quote from Griesinger, ”Laboratory reproduction of binaural concert hall measurements through individual headphone equalization at the eardrum”, AES paper 9691, which paper very much looks like the basis for Griesinger’s lecture).

Geddes draws a line at 10 ms, calling these the very early reflections VER, Duke draws the same line calling the respective reflections early (before) and late (after). Griesinger states that reflections before 10 ms can corrupt transients. His clip has a 17 ms reflection but sounds bad. In Duke’s concept this reflection would be called late and should sound good. I am confused.

Klaus
 

jkeny

Maybe this will just add to your confusion or maybe not - it's a complex field & I wish I were the expert Duke called me earlier, but unfortunately I'm not. This is the latest Griesinger video presentation - an honorary Heyser lecture given in March this year at AES in Dublin, which I unfortunately missed.

His algorithm for the perception of 'Proximity' or 'Clarity' in concert halls is captured in a measurement he calls LOC, which is roughly a plot of the direct sound pressure & the reflected sound pressure build-up over time for a listening position - "LOC, our measure that predicts from a binaural impulse response whether a particular seat will provide “Proximity”. The experiments shown in the paper for that talk showed that reflections that arrive up to 5ms after the direct sound augment the direct sound, and increase the likelihood of hearing proximity. But reflections that arrive after 7ms decrease proximity." This is somewhat different from the 10 ms for "spaciousness" perception. But it's not just about the timing of reflections, it's also about their total & relative sound energy compared to the direct sound energy, as the LOC measurement shows.

Now all this relates to concert halls, but there are universal characteristics of perception which presumably will apply to small rooms such as our playback rooms. Taken from his first slide:
I believe acoustic research has overlooked three critical concepts:​
1. Pitch: Why are humans able to distinguish pitch to an accuracy of six parts in 10,000?​
(Because pitch (periodicity) enables us to separate pitched signals from noise.)​
2. Phase: It is commonly thought that phase is inaudible above 1500Hz.​
(False! The phases of the upper harmonics of tones are critical to proximity and source separation.)​
3. Attention: Acoustic quality is measured with intelligibility. Attention is more important.​
(Sounds that are proximate involuntarily attract attention.)​

I wonder how a LOC measurement of our playback listening rooms would look?

So to try to answer your specific questions
The way Duke describes the process, i.e. a bounce from wall to ceiling to listening area, this "LCS" looks like a somewhat delayed ceiling reflection, which would still arrive well within the time window of the precedence effect for speech and music. In my own listening room this reflection would result in a level increase of about 1 dB (assuming the sources are coherent; audible) and 0.3 dB (assuming the sources are incoherent; probably audible), respectively.
OK, are there absorptive surfaces? I think the back-firing sound from an LCS speaker may be adjustable by the user in time delay & level within a range, but I may be wrong - Duke will correct me, I'm sure.

Non-LCS speakers may or may not have well-behaving off-axis response, but since very few manufacturers provide measurements, you simply don’t know. This wall-ceiling bounce is on-axis, hence spectrally more correct, but it travels via two boundaries so there might be some change in spectrum due to absorption.
Sure, but a direct sound reflecting off multiple surfaces is likely far better spectrally than an off-axis signal bouncing off similar surfaces? Also, I think there's a perceptual difference between vertical & lateral reflections.

In my room this delayed ceiling reflection would have a delay of 16 ms. Griesinger’s clip direct + first, which Griesinger uses to demonstrate the detrimental effects of early reflections, and which is the one Duke refers to to support his POV, includes a 17 ms reflection: “The soundfield data in figure 16 [soundfield data for seat DD 11] shows a strong reflection off the right side wall at 17ms, and the spectrum of this reflection shows that it is a double, a first order reflection from the side wall, and a second order reflection from the balcony underside and the side wall.” (quote from Griesinger, ”Laboratory reproduction of binaural concert hall measurements through individual headphone equalization at the eardrum”, AES paper 9691, which paper very much looks like the basis for Griesinger’s lecture).

Geddes draws a line at 10 ms, calling these the very early reflections VER, Duke draws the same line calling the respective reflections early (before) and late (after). Griesinger states that reflections before 10 ms can corrupt transients. His clip has a 17 ms reflection but sounds bad. In Duke’s concept this reflection would be called late and should sound good. I am confused.

Klaus
Do you have the link the above Griesinger paper with figure 16 ?
 

Duke LeJeune

I suspect that the LCS reflections present a more accurate version of the reflected sound of the recording venue than the reflections of the direct-sound speakers & the room itself. It has always been a conundrum how playback in a room very different from the recording venue doesn't just end up sounding confused. It goes to show that our auditory perception works on 'best fit' principles (within a model constrained by our experience of how sound behaves in the world - this is the guiding principle) rather than on accuracy.

John, thank you once again for sharing your enlightening perspectives.

I apologize for being so slow to reply... mine is a one-man show and when there's a glitch, my schedule gets re-arranged and internet time is often the first thing to go.

So, in essence, the room sound reflections are being swamped/replaced by the 'more correct' LCS reflections or to be more precise our auditory perception is finding that the LCS reflections better match our internal model of natural reflections & these form an integral part of the sound rather than any other reflections.

Great stuff!

I wonder if you could shed some light on something I'm unclear about:

In the presence of “competing” reflection packages – such as the LCS package vs the speaker's inherent off-axis package – does the ear tend to pick the best package and ignore the other, or does it tend to perceive a “weighted average” of the two? I presume the answer is some variation of “it depends”, but if you have anything to share on the subject, I'm all ears!
 

Duke LeJeune

Geddes draws a line at 10 ms, calling these the very early reflections VER, Duke draws the same line calling the respective reflections early (before) and late (after). Griesinger states that reflections before 10 ms can corrupt transients. His clip has a 17 ms reflection but sounds bad. In Duke’s concept this reflection would be called late and should sound good. I am confused.

Excellent question, Klaus! Thank you for asking it!

If I understand correctly the question is something like this: If reflections arriving before 10 milliseconds are detrimental in home audio but reflections arriving later are beneficial, how can a reflection arriving at 17 milliseconds in a concert hall be detrimental?

Here is what I think is going on:

Griesinger's 17 millisecond double reflection (sidewall + balcony) is louder, relative to the direct sound, than the 16 millisecond ceiling bounce would be in your room. This is largely because the direct sound path is much shorter in your room than at Griesinger's location in the concert hall.

Our in-house testing indicates that there is a beneficial loudness threshold for the LCS section, relative to the direct sound. If the loudness of the LCS section exceeds this threshold, clarity IS degraded. In practice the threshold corresponds with roughly a 0.25 dB increase in SPL at the listening position (it varies somewhat with the specifics, which is why the LCS section has its own level and spectral adjustments).
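To get a feel for what a 0.25 dB ceiling implies about reflection level, one can invert the standard summation formulas. This is only a back-of-envelope sketch under idealized fully coherent or fully incoherent summation, not AudioKinesis's actual calibration procedure:

```python
import math

def reflection_ratio_for_increase(delta_db, coherent=True):
    """Amplitude ratio (reflection/direct) that raises the combined
    level by delta_db, inverting pressure (coherent) or power
    (incoherent) summation."""
    if coherent:
        return 10 ** (delta_db / 20) - 1
    return math.sqrt(10 ** (delta_db / 10) - 1)

for coherent in (True, False):
    r = reflection_ratio_for_increase(0.25, coherent)
    kind = "coherent" if coherent else "incoherent"
    print(f"{kind}: {20 * math.log10(r):+.1f} dB re direct")
# coherent summation: ~ -30.7 dB; incoherent: ~ -12.3 dB
```

So under these assumptions an incoherent reverberant contribution could run roughly 12 dB below the direct sound before crossing a 0.25 dB threshold, while a fully coherent one would have to stay about 31 dB down.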

So I think the relative loudness of that 17 millisecond double reflection is great enough to exceed the threshold and become detrimental to clarity.

Recall what Geddes had to say about early reflections: ”The earlier and the greater in level the first room reflections are, the worse they are.”

So it's the COMBINATION of delay time and loudness of the early reflections, not just the delay time, that can “add up to” detrimental effects.

At the location in the concert hall where Griesinger made his recording, the direct sound is weak enough relative to the reverberant sound that we need a very long reflection-free time window before enough of the early reflections have been removed to restore clarity.

In our home listening rooms, the direct sound is much stronger (relative to the reverberant sound) than in Griesinger's concert hall location, so we do not need as long of a reflection-free time window to restore clarity.

I think Griesinger's clips are relevant even though the reflection times are much longer than what we experience at home because they illustrate a combination of reflection arrival time and strength which is detrimental to clarity. In home audio the reflections are earlier and weaker, but the same basic principle applies – there can be a combination of reflection arrival time and strength which is detrimental to clarity. In both situations the same general solution is effective (though the specifics of implementation will differ): Separate the direct sound and the reverberant sound into two distinct streams. This kills two birds with one stone (okay, maybe two stones): We get clarity AND immersion/envelopment.
 

jkeny

......
I wonder if you could shed some light on something I'm unclear about:

In the presence of “competing” reflection packages – such as the LCS package vs the speaker's inherent off-axis package – does the ear tend to pick the best package and ignore the other, or does it tend to perceive a “weighted average” of the two? I presume the answer is some variation of “it depends”, but if you have anything to share on the subject, I'm all ears!
Good question. I can only have a stab at answering it, as I have no direct experience of LCS in action.

I suspect that the 'grouping' which happens in auditory perception, according to the ASA understanding of this perception, would go a long way to explaining this. According to ASA, the continually running analysis happening in the brain is an ongoing, realtime grouping of the arriving nerve signals into auditory objects - in this particular example, if the LCS signals qualify as a better match to the direct sound then they are coalesced into being part of the same auditory object - a binding takes place in perception, i.e. these signals are treated as being a characteristic of the direct sound - its natural reflection. In my understanding, this grouping or binding is like the focus of a camera: when out of focus (i.e. no grouping) everything is just a blurred background, but when focus is achieved objects snap into perception as foreground objects against a background.

So what happens to the other non-LCS reflections? I don't know whether they become part of the background, less dominant sound of the room, or whether they are close enough in character to the direct sound that they also have a bit of a blurring effect. From what Griesinger says, & what you also seem to report in your listening experiments, there is a certain window of loudness where this snaps into place, & outside this window the effect is diffuse? This would concur with the idea of a continually running analytic process, part of whose job is the grouping/binding of signals according to known/learned behaviour of sound objects, i.e. it expects a reflection to be within a certain spectral closeness to the direct sound & also to fall within a certain range of acceptable differences from the direct sound, i.e. the timing difference but also attenuation in HF, etc. PS: this spectral makeup does not really form part of Griesinger's main consideration but is an added complexity when talking about sounds coming from speakers & the spectral differences between on-axis vs off-axis signals.

Again, I'm only thinking my way through this given my limited understanding of how I believe auditory processing works: as an analysis engine which is constantly using 'best fit' analysis to bind together the diverse realtime signals & create the perception of auditory objects in an auditory scene - pretty much as visual perception does. So similarly, we have foreground/background sound, just as we have foreground/background images. We can focus on different parts of this vista/soundscape consciously, or by our attention being drawn to a part of the visual or auditory scene. And it all operates based on our accumulated knowledge & internal models of how sound objects or visual objects behave in the world - something that begins before we are born, as the brain starts to deal with the signals it receives from the senses as they develop in the womb.

(I learned something fascinating just the other day - that newborn babies cry in the cadence or melody of their native language - https://parenting.nytimes.com/baby/wermke-prespeech-development-wurzburg )

What is your experience of turning LCS on/off - can you still hear the original reflections when LCS is turned on? How do they present to perception? Do they become part of the background of room sound?
 

Duke LeJeune

I suspect that the 'grouping' which happens in auditory perception, according to the ASA understanding of this perception, would go a long way to explaining this. According to ASA, the continually running analysis happening in the brain is an ongoing, realtime grouping of the arriving nerve signals into auditory objects - in this particular example, if the LCS signals qualify as a better match to the direct sound then they are coalesced into being part of the same auditory object - a binding takes place in perception, i.e. these signals are treated as being a characteristic of the direct sound - its natural reflection.

[The ear] expects a reflection to be within a certain spectral closeness to the direct sound & also to fall within a certain range of acceptable differences from the direct sound, i.e. the timing difference but also attenuation in HF, etc. PS: this spectral makeup does not really form part of Griesinger's main consideration but is an added complexity when talking about sounds coming from speakers & the spectral differences between on-axis vs off-axis signals.

This is very interesting and very helpful.

What is your experience of turning LCS on/off - can you still hear the original reflections when LCS is turned on? How do they present to perception? Do they become part of the background of room sound?

That's a good question. I'm not aware of the original reflections when the LCS is on, and I suspect that the better "fit" of the LCS package of reflections may relegate the initial reflections to being "background noise".

I hesitate to rattle on too much about what I think I hear, as I have too many dogs in the fight. It is probably inevitable that I listen with confirmation bias.

While writing this post I recalled that Scott Hull of Part-Time Audiophile had written about what he did and did not hear when we switched the Space Generators on and off. The Space Generators are an evolution of the original Late Ceiling Splash concept. Reading his words now, it sounds like the tonal balance isn't affected but the spatial cues are revamped (LCS reflections effectively replacing the original off-axis reflections?). Quoting from his coverage of our room at RMAF 2015:

"The demo I heard was with/without the Space Generators, Duke manning the kill-switch sitting on the floor in front of the amplifier. With the Generators, the sound was quite good — warmth, with great tone, the character of that Electra-Fidelity amp just as clear as you’d want. This was comfort (as opposed to “tidy”, as I put it above), and I was a happy, happy boy. Then, Duke killed the Generators.

"The word non-plussed doesn’t get a lot of play in modern usage, and it’s a shame. Essentially, I didn’t “get it”. What just happened? I asked Duke to cut it back in again, and he proceeded to cut the Generators in and out every 20 seconds or so, but after that first surprise-at-no-surprise, I heard it. Or rather, realized I had missed something.

"I suppose what I was expecting was “total change” — and this was not that. The Generators don’t change the system, or how it sounds, not fundamentally. That system will sound as good, or not, as it happens to be. The Generators are very straightforwardly adding something. Not PRaT. No, the speakers didn’t magically transform themselves into Magicos or Magnepans. They sounded exactly the same. What was added? A bigger room. No, seriously — it was as if the room got 10′ bigger in every dimension. More specifically, it was as if the music was played on a much larger stage. Audiophiles talk about how sound stage might “spread past the boundaries of the speakers” and how rare that is. That’s exactly what was happening here. There was this moment when I found myself bracing, my mind repeating “Whoa whoa whoa …!” And then the speakers fell away and vanished. The walls did too. Space yawned, darkness fell, and we were all spinning around with the stars. It was cool.

"Honestly, the experience also told me how silly A/B comparisons can be; after all, my first reaction was a raised eyebrow and shrug. I was listening for tone, bass, detail. Soundstage wasn’t something I was keying off of and it took me more than a second to notice the (profound) change."

Scott's description seems consistent with your words: "If the LCS signals qualify as a better match to the direct sound then they are coalesced into being part of the same auditory object - a binding takes place in perception, i.e. these signals are treated as being a characteristic of the direct sound - its natural reflection."
 

jkeny

This is very interesting and very helpful.
Thanks

From Scott's description:
"I suppose what I was expecting was “total change” — and this was not that. The Generators don’t change the system, or how it sounds, not fundamentally. That system will sound as good, or not, as it happens to be. The Generators are very straightforwardly adding something. Not PRaT. No, the speakers didn’t magically transform themselves into Magicos or Magnepans. They sounded exactly the same. What was added? A bigger room. No, seriously — it was as if the room got 10′ bigger in every dimension. More specifically, it was as if the music was played on a much larger stage. Audiophiles talk about how sound stage might “spread past the boundaries of the speakers” and how rare that is. That’s exactly what was happening here. There was this moment when I found myself bracing, my mind repeating “Whoa whoa whoa …!” And then the speakers fell away and vanished. The walls did too. Space yawned, darkness fell, and we were all spinning around with the stars. It was cool.​
"Honestly, the experience also told me how silly A/B comparisons can be; after all, my first reaction was a raised eyebrow and shrug. I was listening for tone, bass, detail. Soundstage wasn’t something I was keying off of and it took me more than a second to notice the (profound) change."
Scott's description seems consistent with your words: "If the LCS signals qualify as a better match to the direct sound then they are coalesced into being part of the same auditory object - a binding takes place in perception, i.e. these signals are treated as being a characteristic of the direct sound - its natural reflection."

Yes, we are so used to listening for specific distortions or specific individual points of difference, e.g. the sound of a cymbal, sibilance, etc., that other differences can be overlooked. A lot of people think that we hear everything in the soundfield at the same time, & so any differences should immediately become noticeable - not true - we have different aspects in focus at any point in time, although we can switch focus to other aspects seemingly instantly. This is more obvious with visual perception, where the object in focus is where our attention is, & everything else we are sort of aware of but out of focus & lacking detail - we can quickly switch our focus to any of these 'background' images & see the detail.

When confronted with changes that affect the holistic presentation of the sound, we can initially miss them because we are focused elsewhere, but given an open mind & an ability to listen in a relaxed rather than analytical way, we can pick them up. Unfortunately, A/B (particularly ABX) testing has a tendency to direct our focus toward specific elements/differences in the sound - sibilance, cymbals, voices, etc.

I believe this is at the heart of the problem some have when people report hearing 'night & day' differences - these sorts of changes are so fundamental & widespread, affecting the whole presentation of the sound, that we hear our replay systems in a fundamentally different way.

I get the same with people reporting what they hear with my digital audio devices, but this is happening for a different technical reason - it's all to do with ASA & how, I believe, auditory processing works.
 

KlausR.

His algorithm for the perception of 'Proximity or Clarity' in concert halls is captured in a measurement he calls LOC which is roughly a plot of the direct sound pressure & reflected sound pressure build-up plotted over time for a listening position
Had a quick look and found this:

www.davidgriesinger.com/ICA2013/What is Clarity4.pptx - go to slide 26; the right diagram in slide 25 is basically fig. 16 of Griesinger's AES paper.

I also found this: “The measure is based on the idea that if we count the number of nerve firings (roughly proportional to the logarithm of the sound pressure) that result from the direct sound in the first 100ms, and compare that number to the number that result from reflections in the first 100ms, then the ratio of those two numbers predicts whether or not we will be able to localize a sound and perceive it as close to us. A ratio greater than 2 implies good hearing, less than one predicts muddy sound.” Quote from Griesinger, “Pitch, Timbre, Source Separation and the Myths of Loudspeaker Imaging”, AES paper 8610 (2012), permalink http://www.aes.org/e-lib/browse.cfm?elib=16248

With LOC being defined/determined in a 100 ms window, it looks hardly appropriate for domestic rooms.
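As a thought experiment, Griesinger's ratio idea can be caricatured in a few lines. This toy deliberately simplifies: it treats each arrival as a single log-pressure "firing count" above an arbitrary -40 dB floor, whereas the real LOC works on binaural impulse responses with loudness build-up over time:

```python
import math

def toy_loc_ratio(arrivals, window_ms=100.0, floor_db=-40.0):
    """Crude caricature of Griesinger's direct-vs-reflected ratio.
    arrivals = [(delay_ms, amplitude)], delay 0 = direct sound,
    amplitudes relative to the direct sound. Each arrival counts
    'nerve firings' proportional to its log pressure above an
    arbitrary floor. Ratio > 2 ~ 'good', < 1 ~ 'muddy'."""
    def firings(amp):
        return max(0.0, 20 * math.log10(amp) - floor_db)
    direct = sum(firings(a) for t, a in arrivals if t == 0)
    reflected = sum(firings(a) for t, a in arrivals if 0 < t <= window_ms)
    return direct / reflected if reflected else float("inf")

# Strong reflections at 17 ms and 45 ms: ratio below 1 ('muddy')
print(toy_loc_ratio([(0, 1.0), (17, 0.8), (45, 0.5)]))
# One weak reflection: ratio above 2 ('good')
print(toy_loc_ratio([(0, 1.0), (17, 0.05)]))
```

Even this crude version shows why the measure is dominated by how much reflected energy piles up inside the window, which is exactly what differs between a hall seat and a listening chair a few metres from the speakers.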

Also I think there's a perceptual difference between vertical & lateral reflections.
Another can of worms, can we skip that?

Do you have the link the above Griesinger paper with figure 16 ?

AES permalink is http://www.aes.org/e-lib/browse.cfm?elib=18569

If you don’t have access, drop me a note with an email address and I’ll send the pdf.

Klaus
 

KlausR.

Hello Duke,

Griesinger's 17 millisecond double reflection (sidewall + balcony) is louder, relative to the direct sound, than the 16 millisecond ceiling bounce would be in your room. This is largely because the direct sound path is much shorter in your room than at Griesinger's location in the concert hall.

For a 17 ms delay Griesinger’s reflection would have about 3 times the SPL, assuming that seat DD11 is 20 m from the stage. That’s another reason why concert hall stuff cannot be transcribed directly to domestic rooms. I noticed that I had made a mistake when crunching the numbers: the delay of the LCS in my room would not be 16 ms but only 6 ms, hence not late at all, but still benign according to your concept. I’m still confused.
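The distance argument is easy to sketch numerically. Treating the reflection as a mirror-image source with 1/r pressure falloff and ignoring absorption, and assuming a ~3 m direct path for the domestic room (an assumed figure, not Klaus's actual room):

```python
import math

C = 343.0  # speed of sound in air, m/s

def reflection_vs_direct(direct_m, delay_ms, loss_db=0.0):
    """Level of a single reflection relative to the direct sound,
    modelled as a mirror-image source with 1/r pressure falloff;
    boundary absorption is lumped into loss_db. Ignores air
    absorption, directivity and the reverberant field."""
    path_m = direct_m + (delay_ms / 1000.0) * C
    return 20 * math.log10(direct_m / path_m) - loss_db

# Domestic room: assumed 3 m direct path, 6 ms extra delay
print(f"room: {reflection_vs_direct(3.0, 6.0):+.1f} dB")   # ~ -4.5 dB
# Griesinger's seat DD11: ~20 m direct path, 17 ms reflection
print(f"hall: {reflection_vs_direct(20.0, 17.0):+.1f} dB")  # ~ -2.2 dB
```

On these assumptions the hall reflection arrives only about 2 dB below the direct sound despite its longer delay, while the room reflection is about 4.5 dB down - one way of seeing why reflection rules of thumb don't transfer directly between venues.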

Recall what Geddes had to say about early reflections: ”The earlier and the greater in level the first room reflections are, the worse they are.”

It seems not to be as easy as that: if I look at fig. 4 in my writeup on early reflections, thresholds depend on direction and signal type, and earlier is not always worse.

Klaus
 

jkeny

Had a quick look and found this:

www.davidgriesinger.com/ICA2013/What is Clarity4.pptx - go to slide 26; the right diagram in slide 25 is basically fig. 16 of Griesinger's AES paper.

I also found this: “The measure is based on the idea that if we count the number of nerve firings (roughly proportional to the logarithm of the sound pressure) that result from the direct sound in the first 100ms, and compare that number to the number that result from reflections in the first 100ms, then the ratio of those two numbers predicts whether or not we will be able to localize a sound and perceive it as close to us. A ratio greater than 2 implies good hearing, less than one predicts muddy sound.” Quote from Griesinger, “Pitch, Timbre, Source Separation and the Myths of Loudspeaker Imaging”, AES paper 8610 (2012), permalink http://www.aes.org/e-lib/browse.cfm?elib=16248

With LOC being defined/determined in a box of 100 ms wide, it looks hardly appropriate for domestic rooms.
Yeah, LOC is probably not appropriate for small listening rooms such as our domestic playback rooms, as I already stated:
"Now all this relates to concert halls, but there are universal characteristics of perception which presumably will apply to small rooms such as our playback rooms."
Taken from his first slide:
I believe acoustic research has overlooked three critical concepts:
1. Pitch: Why are humans able to distinguish pitch to an accuracy of six parts in 10,000?
(Because pitch (periodicity) enables us to separate pitched signals from noise.)
2. Phase: It is commonly thought that phase is inaudible above 1500 Hz.
(False! The phases of the upper harmonics of tones are critical to proximity and source separation.)
3. Attention: Acoustic quality is measured with intelligibility. Attention is more important.
(Sounds that are proximate involuntarily attract attention.)



Another can of worms, can we skip that?
Is it? Why would we skip it?



AES permalink is http://www.aes.org/e-lib/browse.cfm?elib=18569

If you don’t have access, drop me a note with an email address and I’ll send the pdf.

Klaus
Thanks Klaus
 

Duke LeJeune

[Industry Expert]/Member Sponsor
Jul 22, 2013
723
1,092
470
Princeton, Texas
For 17 ms delay Griesinger’s reflection would have about 3 times the SPL, assuming that seat DD11 is 20 m from the stage. That’s another reason why concert hall stuff cannot be transcribed directly to domestic rooms.

Did you include the balcony reflection in your calculation? Griesinger says the 17 millisecond reflection is a double, with the balcony reflection included, so it would be louder than the sidewall reflection alone.

I agree that "concert hall stuff cannot be transcribed directly to domestic rooms." But I do believe the same psychoacoustic principles apply, which I think Geddes' wording expresses well: "The earlier and the greater in level the first room reflections are, the worse they are." The implication is that it's a combination of timing and level.
So we have powerful reflections at 17 milliseconds being detrimental to clarity in the concert hall, and weaker but earlier reflections potentially being detrimental to clarity in home audio.

The solution is evidently the same in principle if not in specifics: Separate the direct sound and the reverberant sound into two distinct streams.

It seems not to be as easy as that: if I look at fig. 4 in my writeup on early reflections, where thresholds depend on direction and signal type, earlier is not always worse.

Detection thresholds do not tell us whether the reflection is beneficial or detrimental. But I do agree that direction and signal type matter.

* * * *

Let me lift a quote posted yesterday in a thread in the Speakers forum, about the Alsyvox full-range planar speakers:

"The Alsyvox can be as close to 1M from the back wall although I have tried them even at 1/2M and they still sound great, BUT when any of the Alsyvox models are about 1.5-2M from the back wall you get more layering on the 3D soundstage than you do if the speakers are placed closer to the back wall." [Emphasis mine]

The 10 milliseconds figure Geddes talks about, and which I subscribe to, falls right smack in the middle of that 1.5-2 meter distance from the wall.

The geometry I currently use typically results in 10 milliseconds delay with the speakers about 1 meter out into the room, as the rear-firing array is angled upwards at 45 degrees such that we get a wall bounce and a ceiling bounce before the "backwave" reaches the listening area. The spectrum of this energy is user-adjustable so some adaptation to different kinds of surfaces is possible.
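For what it's worth, the delay arithmetic behind these distances is easy to check. A minimal sketch (my own, assuming the extra path of the backwave is simply twice the speaker-to-wall distance, i.e. a straight back-and-forth bounce, and ignoring the listener's exact position):

```python
# Rough backwave delay arithmetic: extra path = 2 x distance to wall.
C = 343.0  # speed of sound in air, m/s

def backwave_delay_ms(wall_distance_m):
    """Delay of a straight back-and-forth wall bounce vs. the direct sound."""
    return 2.0 * wall_distance_m / C * 1000.0

for d in (1.0, 1.5, 2.0):
    print(f"{d} m from wall -> ~{backwave_delay_ms(d):.1f} ms delay")
```

The 1.5-2 m placement works out to roughly 9-12 ms, bracketing the 10 millisecond figure, while a speaker only 1 m out gives about 6 ms on a straight bounce; an angled wall-plus-ceiling bounce lengthens the path, which is how ~10 ms can still be reached at the shorter placement.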
 

KlausR.

Well-Known Member
Dec 13, 2010
291
28
433
Also I think there's a perceptual difference between vertical & lateral reflections.
"KlausR." said:
Another can of worms, can we skip that?

Is it? Why would we skip it?

Can of worms was maybe not the appropriate term. This issue would require some extensive digging in the scientific literature, but should you have an archive well filled in that respect, give it a shot.
 
