What's Wrong With Loudspeaker Preference Testing?

tmallin · Mar 20, 2021

Over on the REG Forum I instigated a spirited discussion of loudspeaker preference testing of the type done by Floyd Toole. This accompanied my recent reading of Toole's book, Sound Reproduction (3rd Edition), as well as my recent acquisition of speakers, the Dutch & Dutch 8c, which perform well in the Spinorama test advocated by Toole.

I highly recommend reading Toole's book, but if that seems too daunting, you should at least watch his video presentation here: Floyd Toole - Sound reproduction – art and science/opinions and facts - YouTube

As I understand Toole's position, what he calls the Audio Circle of Confusion can be reduced by designing speakers which do well in the double-blind loudspeaker listening preference tests he has developed and refined over the years, and by using such speakers in mixing and mastering. This Circle of Confusion can be further reduced if home listeners also use such speakers for playback of music at home.

What could be wrong with this approach? It has long been recognized that there is a disconnect between what the recording folks hear in the control room and what one hears at home, if for no other reason than that the speakers used to judge the quality of the recording are not the same as those usually used at home by audiophiles. We all know how much speakers differ in sound from one another; thus, if recording monitoring and home playback are not done with the same or similar speakers, you will not hear the sound the recording engineers intended.

Toole's Research for NRCC & Harman

In the words of his book:

Upon graduation I was employed as a research scientist at the National Research Council of Canada (NRCC) in Ottawa. My job was to ask questions and find answers by applying the scientific method. I was in the Applied Physics Division, so the emphasis was on real-world issues. A major performance metric was peer-reviewed publications, but also, for my chosen line of investigation, evidence that industry benefitted from the work—the NRCC was taxpayer funded. There could not have been a better place to engage in this kind of research. First, I worked among, and was tutored by, some of the best acoustical scientists in the world. We had access to excellent anechoic and reverberation chambers, as well as all of the latest measuring equipment necessary to quantify sound. There was also the budget and space to create listening rooms for subjective evaluations. The research was successful, and publications resulted. The embryonic Canadian loudspeaker industry rented the NRCC measurement and listening facilities to design products, and, importantly, Canadian audio magazines paid for the facilities to perform product reviews—anechoic measurements and double-blind listening tests. The products that were designed and reviewed became part of the database for the research, and everyone benefitted from the knowledge as it emerged. A small staff was hired, and I traveled widely, telling the science story to Audio Engineering Society (AES) audiences and to interested manufacturers. The relatively unknown Canadian loudspeaker manufacturers used the credibility of the NRCC and the research to help gain recognition (some are now well known and respected international suppliers). All was as it should be. One day in early 1991, the phone rang. It was a headhunter offering the possibility of an interesting job with a major audio corporation, Harman International Industries. 
After 26 years of research, I was intrigued by this opportunity to get directly involved with applying the science to product development—moving closer to the “real” world. Soon I was hired as the corporate vice president of Acoustical Engineering, but very quickly it became more, because I was able to convince the company leaders that we could afford, and indeed needed, a corporate research group that was not attached to the brands and that did not develop audio products. Knowledge was the product, and obviously some of it would migrate into products if it proved to be of value. Harman generously permitted us to publish freely, following the scientific tradition of free exchange of knowledge at AES conventions, conferences and in the journal (some corporations do not allow this). Harman spent large sums on improved engineering facilities and innovative listening rooms for product evaluation. The benefits were soon seen in improved consistency and quality of sound from the products. Nevertheless, there were arguments from some sales and marketing people who may not have had the same faith in science as we did. Good sound does not guarantee good sales. There are many factors involved in that aspect: appearance, price, size, marketing and retail distribution. These all fall outside the domain of engineering. Still, there was a resolute effort to ensure that products at all price points were competitive in sound quality. A program of measurements and double-blind evaluations of competing products was set up and continues. However, it has been difficult to maintain for all products because of tight schedules, the numbers of new products being developed, and the decentralization of design and manufacturing as Harman grew into an enormous worldwide, diversified corporation. When I joined Harman in 1991, sales were about $500M, we had a few thousand employees, and we were primarily an audio company headquartered in California. Things changed. 
Now sales are about $7B, there are about 26,000 employees worldwide, and audio is just part of what Harman does. It is a different company. I retired in 2007, but I have remained in a consulting role since then.

Toole, Floyd E.. Sound Reproduction (Audio Engineering Society Presents) (pp. xix-xx). Taylor and Francis. Kindle Edition.

As is evident, Toole's research has from the start been strongly motivated by marketing loudspeakers to the public. While the NRCC was government funded, this was not ivory tower stuff at all. It was aimed at helping companies develop products which would be more competitive in the marketplace.

tmallin · Mar 20, 2021

Toole's Listening Test Methods

While the listening tests evolved over time, Toole states that one listener at a time is the best paradigm so that the speaker and listener positions are more controlled and so that there is no group interaction in judging the loudspeakers under test. Tests are usually conducted using four different speaker models. Each is mechanically moved into the same test position in about three seconds. The three speakers not being listened to at the moment are moved backward, out of the way of the forward position of the speaker under test.

The tests are all double blind and random; neither the listener nor the tester knows which speaker is being auditioned at any time; an acoustically transparent but visually opaque curtain prevents both listener and tester from seeing any of the speakers. The listening and speaker positioning is done so as to maximize bass smoothness in the room. No electronic equalization is used in the tests to correct the designed-in frequency response of the speaker/room combination. The room is ordinary in that it does not use any specific acoustical treatment.
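To make the blinding idea concrete, here is a minimal Python sketch of how neutral labels might be randomly assigned to the four speakers for each round of listening, so that neither listener nor tester can tie a label to a model. The function name, the letter labels, and the idea of re-randomizing per round are my own assumptions for illustration; this is not a description of Toole's actual software or protocol.

```python
import random

def blind_assignments(speakers, rounds, seed=None):
    """Return one label-to-speaker mapping per round (hypothetical helper).

    Labels are neutral letters (A, B, C, ...). The mapping is reshuffled
    every round, so a label carries no identity from one round to the next.
    """
    rng = random.Random(seed)
    labels = [chr(ord("A") + i) for i in range(len(speakers))]
    assignments = []
    for _ in range(rounds):
        order = list(speakers)
        rng.shuffle(order)  # random order, unknown to listener and tester
        assignments.append(dict(zip(labels, order)))
    return assignments

# Hypothetical speaker names, used only for illustration.
speakers = ["model 1", "model 2", "model 3", "model 4"]
rounds = blind_assignments(speakers, rounds=3, seed=1)
```

The mappings would be kept sealed until all ratings are collected, and only then used to decode which label was which speaker.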

Most tests are done monophonically with a single speaker. While stereo tests have also been performed, Toole says the statistical data shows that mono tests reveal listener preferences more quickly and consistently.

Toole has developed a set of recorded musical selections for use in these tests and they run the gamut of various musical types and genres. Individual listeners are free to listen to each selection and each speaker as long as they like before changing musical selections or speakers.

Listeners rate each speaker from 0 to 10 on various criteria. The criteria under test usually include frequency response smoothness and aberrations, spatial placement (this is done in stereo), and distortion. Test subjects are asked to rate the speakers on the basis of their preference with 10 being most preferred and 0 least preferred.
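As a rough illustration of how such 0 to 10 ratings could be turned into a ranking, the following Python sketch computes a mean score and a standard error per speaker and sorts by the mean. The ratings are invented for the example and the function names are hypothetical; the actual statistical analysis used in the research is more elaborate than this.

```python
from math import sqrt
from statistics import mean, stdev

def summarize(ratings):
    """Map each speaker label to (mean score, standard error of the mean)."""
    summary = {}
    for label, scores in ratings.items():
        if any(s < 0 or s > 10 for s in scores):
            raise ValueError("scores must lie on the 0 to 10 scale")
        # Standard error needs at least two scores; otherwise it is undefined.
        sem = stdev(scores) / sqrt(len(scores)) if len(scores) > 1 else float("nan")
        summary[label] = (mean(scores), sem)
    return summary

def rank(summary):
    """Return labels ordered from most to least preferred by mean score."""
    return sorted(summary, key=lambda label: summary[label][0], reverse=True)

# Invented ratings from four listeners, for illustration only.
ratings = {"A": [7, 6, 7, 8], "B": [4, 5, 3, 4], "C": [6, 6, 5, 7]}
summary = summarize(ratings)
ranking = rank(summary)
```

With these made-up numbers the ranking comes out A, C, B. The standard error gives a sense of whether two speakers with similar means can really be told apart.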

Listening panels are composed of people who have "normal" hearing as determined by an audiologist's test. If a person does not have "normal" hearing, that person is rejected as a listening panelist.

Various groups and professions are included in the testing from ordinary university students, to audio professionals, audio marketing people, and even audio equipment reviewers. The amount of experience in such tests also varies. One category is referred to as "trained listeners" and those are people who have successfully completed a training course aimed at helping people recognize frequency response errors, spatial placement, and distortions. Here is a description of such a course: Audio Musings by Sean Olive: How to Listen: A Course on How to Critically Evaluate the Quality of Recorded and Reproduced Sound Statistically speaking, such "trained listeners" tend to make consistent judgments about sound quality more quickly.

Test Results

Over the course of time Toole and his associates determined that a certain set of frequency response measurements, dubbed the spinorama, could fairly accurately predict listening panel preference ratings of speakers. Basically, in the spinorama the response of the speaker is measured at 70 different positions on a horizontal and a vertical circle around the speaker. Several frequency response curves are computed, including on-axis response, the listening window response, early reflections response, and total sound power response. In addition, two different directivity indexes are computed. The theory is that when these curves are properly analyzed, a close correlation exists between the spinorama graph results and listener preferences. This interpretation is explained, for example, at this reference.
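For readers who want a feel for the arithmetic, here is a small Python sketch under stated assumptions. I assume 10-degree steps on each circle (which gives 70 unique positions once the shared front and rear points are counted once), a listening window made up of the on-axis curve plus the curves at ±10, ±20 and ±30 degrees horizontally and ±10 degrees vertically, energy averaging of the dB curves, and a directivity index taken as the listening window minus a reference curve. The data is synthetic, and the all-positions average below is unweighted, so it is only a stand-in for the properly weighted sound power curve.

```python
import math

ANGLES = range(-170, 190, 10)  # -170 .. 180 degrees in 10-degree steps

def measurement_positions():
    """Positions on a horizontal ('h') and a vertical ('v') circle.

    The front (0) and rear (180) points are shared by both circles,
    so they are counted only once, on the horizontal circle.
    """
    horizontal = {("h", a) for a in ANGLES}
    vertical = {("v", a) for a in ANGLES if a not in (0, 180)}
    return horizontal | vertical

# Assumed listening window: on-axis, +/-10..30 horizontal, +/-10 vertical.
LISTENING_WINDOW = [("h", a) for a in (-30, -20, -10, 0, 10, 20, 30)] + [("v", -10), ("v", 10)]

def energy_average(curves):
    """Average several dB curves on an energy basis, frequency by frequency."""
    n = len(curves)
    return [
        10 * math.log10(sum(10 ** (c[i] / 10) for c in curves) / n)
        for i in range(len(curves[0]))
    ]

def listening_window(responses):
    return energy_average([responses[p] for p in LISTENING_WINDOW])

def directivity_index(window, reference):
    """Difference in dB between the listening window and a reference curve."""
    return [w - r for w, r in zip(window, reference)]

positions = measurement_positions()
# Synthetic data: two frequency points; the second falls off with angle.
responses = {p: [80.0, 80.0 - abs(p[1]) / 10] for p in positions}
window = listening_window(responses)
# Unweighted average over all positions; NOT the weighted sound power curve.
all_positions = energy_average(list(responses.values()))
di = directivity_index(window, all_positions)
```

In this toy data the first frequency point radiates equally in all directions, so its directivity index is zero, while the second point is louder in front than behind and gets a positive index.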

While conducting accurate spinorama tests traditionally requires a large, expensive anechoic chamber (something not available to many speaker manufacturers), recent advances in digital measurement software make comparable testing available without access to an anechoic chamber, bringing such testing within reach of many more speaker manufacturers. See, for example: Microsoft Word - The Klippel Near Field Scanner Voice Coil May(VC edited).docx

There are even some websites which have applied agreed algorithms to spinorama test results resulting in a composite spinorama "score" which can be used to rate different speakers against each other. See this site, for example: All Spinorama data index | Speaker Data 2034 and A collection of loudspeakers measurements (pierreaubert.github.io)

Let's assume that there is absolutely nothing less than scientifically rigorous about the test design in terms of the testing being double-blind and random. Let's further assume that the proper statistical analysis is always applied.

What criticisms can be leveled at such testing procedures? Are better alternatives available today?

tmallin · Mar 20, 2021

Listener Preference vs. Verisimilitude

The primary criticism is that these are preference tests aimed at marketing speakers to people, pure and simple. They were created to enable speaker manufacturers to design speakers which would be subjectively preferred by the public and thus could be sold in larger quantities and/or at higher prices.

These preferences are not anchored in any way to the sound of live unamplified acoustical instruments playing in a concert hall or even in the listening room where the speakers are tested. Thus, such listener preferences have nothing whatsoever to do with how closely a particular loudspeaker replicates the sound of actual acoustical musical instruments heard live in a concert hall. Serious music listeners can ignore such tests as to classical music recordings and all other recordings of music which is usually heard unamplified when heard live. The verisimilitude of the speaker sound to the sound of a real musical instrument is not being tested at all. All that is being tested is the average assessment of "goodness" of the sound, not anchored at all to the actual sound of musical instruments or voices.

This is a serious criticism indeed. Nowhere in Toole's book or other materials dealing with this research is there documented any serious attempt to compare the sound of any speaker with live unamplified music. Experience in listening to live unamplified music is not a criterion for being a test subject in the preference tests. Even the "trained listener" course does not involve such comparisons.

It can be assumed today that a decreasing percentage of test subjects have any significant experience hearing live unamplified acoustic performances of music in an indoor concert space. Such concerts are increasingly rare. Some occur in churches, but church attendance continues to decline and the abandonment of acoustical music even in church services continues; increasingly such music is amplified. Classical music recordings account for far less than 5% of the total market today. It must thus be assumed that most test subjects really don't know what "real music" sounds like and can't be expected to prefer the sound of speakers which most correctly reproduce such sound.

But surely we are all familiar with the spoken unamplified voice. Isn't this enough? Unfortunately, Toole's findings indicate that solo vocal recordings are not among the type of recordings which can most easily and consistently determine listener preferences among speakers.

In Toole's defense, his research does not show that those who are audio critics or who are employed in any other audio-related profession, such as audio equipment sales or marketing people or musicians are any better at consistently identifying preferences among speakers. In fact, none of these groups even approaches the speed or consistency with which those who undergo the "trained listener" course perform in Toole's preference tests. Of course, this could just be that the training course is training for discriminating differences and preferences along the criteria being tested in the listening tests—frequency response errors, spatial placement, and distortion—not how much recordings played through the speaker in question actually resemble the live sound of the music in question.

Also, Toole believes that listener preference is in fact also a judgment of sonic accuracy. Again to quote from his book:

"Is It Preference or Accuracy That Is Evaluated? These tests have been criticized for many reasons, and I would like to think that all have been satisfactorily put to rest. Perhaps the most persistent challenge is the belief that as long as the term “preference” is involved, the results merely reflect the likes, dislikes and tastes of an inexpert audience, not the analytical criticism of listeners who know what good sound really is. They assert that the objective should be “realism,” “accuracy,” “fidelity,” “truth” and the like. As it turns out, this is where my investigations started, many years ago. I required listeners to report their summary opinions, their “Gestalt,” as a number on a 0 to 10 “Fidelity” scale. The number 10 represented the most perfect sound they can recall, and 0 was unrecognizable rubbish. When these tests began, it is fair to say that some of the loudspeakers approached “rubbish” and none came close to perfection. Consequently, scores extended over a significant range. To help listeners retain an impression of what constituted good and bad sound, the Fidelity scale was anchored by always including in the population of test loudspeakers at least one that was at the low end of the scale and one that was at the high end, usually the one we dubbed the “king of the hill,” the best of the test population to that date. A parallel scale was provided, called “preference,” thinking that there might be a difference, but the two ratings “fidelity” and “preference” simply tracked each other. Figure 3.19 illustrates the cumulative results from our NRCC tests in the early 1980s—a slice of the history of audio. It is obvious that there are many loudspeakers clustered near the top of the scale. As time passed more products were rated highly, and listeners complained about wasting time listening to bad sound. 
I stopped anchoring the scale at the low end, and an interesting thing happened: most listeners continued to use the same numerical range to describe what they heard. The scores rarely went above 8, which is a common subjective scaling phenomenon—people like to leave some “headroom” in case something better comes along. The advantage was that it was now easier to distinguish the rankings of the remaining loudspeakers because they were more widely separated on the scale. However, it was evident that they were no longer judging on the same scale—the upper end of it had stretched. When listening to just a few of the top-rated loudspeakers, which in Figure 3.19 would have been crowded together, statistically difficult to separate, the same product ratings would cover a much larger portion of the scale and be easy to separate and rank. Listeners were still judging fidelity, but it was rated on an elastic scale. I chose to call it preference, and it still is.

Listeners obviously have a “preference” for high “fidelity,” also known as “accuracy,” “realism” and so on. It is really one scale."

Toole, Floyd E.. Sound Reproduction (Audio Engineering Society Presents) (pp. 56-58). Taylor and Francis. Kindle Edition.

Toole also argues that attempting to compare recordings to the original sound produced by musicians is wrongheaded. First, most recordings are an art form unto themselves. The vast majority of recordings are not made with verisimilitude to the actual sound of the live unamplified music in the concert hall as the goal. Sound engineers most usually manipulate the microphone feed in various ways (e.g., equalization, dynamic range compression or expansion, phasing shifts, volume control of individual microphones) known only to them and usually not written down for posterity. Second, the stereo microphone feed itself conveys only part of the sound field a listener would hear in the concert hall. Thus, for both these reasons a comparison of the sound of speakers reproducing recordings to that of musical instruments in terms of verisimilitude of the speaker sound to the original musical instrument sound is not a fair comparison at all.

tmallin · Mar 20, 2021

The BBC Approach

On the REG Forum are many huge fans of BBC-legacy speakers: either those designed through BBC research in the 1950s through the 1970s (e.g., owners of classic Spendor models such as the BC-1) or those whose designers have extrapolated from the design principles established by this old BBC research in developing newer products (e.g., newer Harbeth, Graham, and Stirling models). As documented in BBC monographs still available today, the design methodology involved constant comparison of the live sound of unamplified classical music in the BBC studios or other venues with the sound heard through the BBC-designed speakers in the control room. Thus, this BBC design process was in fact anchored to the sound of real acoustic music in a real concert space. Such speaker design IS possible.

But is such design possible or even practical TODAY? As far as I know, no government-sponsored acoustic research of this type is currently done by the BBC or by anyone in any other country. No speaker manufacturer, to my knowledge, has daily access to classical music recording groups, to live recording studios and concert halls, and to their monitoring facilities the way the BBC did.

In addition, this research is now old enough that most of those speaker designers once involved in this halcyon period of acoustic research and development are either now retired or have passed on. One exception is Derek Hughes, son of Spencer Hughes (the "Spen" of Spendor—the "Dor" was for his wife, Doris) who has done recent designs for Stirling and Graham. But even for such designers, a current back and forth comparison of live sound with recorded sound is no longer possible for their current design efforts.

Perhaps those interested in speakers which are aimed at reaching the closest possible verisimilitude to live unamplified music should pay particular attention to speaker designs from designers who are also serious amateur musicians or who at least attend a lot of live unamplified classical music concerts. Look at the designer's biography, in other words. Paul S. Barton of PSB speakers is one such designer, a violinist. In actuality, however, the PSB designs seem to take liberally from the Toole/Harman playbook.

Mono vs. Stereo Comparisons

Some also object to Toole's primary use of monophonic, single-speaker comparisons in the preference tests. They feel that a stereo pair is necessary for demonstrating the virtues of single-centered-listener stereo playback. This seems quite persuasive to those of us for whom the recreation of the recorded space is a large part of the thrill of home music reproduction. The feeling of ambiance and envelopment which can be developed from a particular recording varies considerably from one speaker to another. To not test for this subjective aspect of reproduction seems ludicrous to some, if not many listeners who are all too familiar with the potential joys of "the sweet spot" in their home listening room.

Possible Effects of Room Treatment and Equalization on Listener Preferences

Toole's tests do not attempt to ascertain what effect, if any, the application of acoustical room treatments to the testing room surfaces, or the application of electronic equalization to the frequency response of speakers under test, would have on the speaker preference rankings. In fact, both of these methods are routinely employed by at least some audiophiles (me included) to enhance the subjective realism of the sound reproduction as heard from the sweet spot in the listening room.

Electronic equalization can today easily be applied to ameliorate resonance problems and correct on-axis frequency response issues. Correcting such problems could easily change the rank order of listener preferences for speakers whose primary design issues involve such problems. Off-axis frequency response problems can often be ameliorated by dispersion or absorption of sound from otherwise reflective room surfaces, also potentially altering the rank order of listener preferences.

Even moving the speaker and listener positions within the testing room is not a tested factor in the experiments, even though audiophiles know that in any given room, speaker and listener positioning may need to be changed for the best subjective result from any given pair of speakers. In Toole's tests the speaker and listener positions within the room are basically the same for each speaker. It may be that if positions were optimized for each speaker, the preference rankings would be reordered.

In general, the Toole testing does not figure on the home listener taking any active steps to maximize the sound quality of a given pair of speakers. At least for serious listeners, this seems like an unrealistic assumption.

In Toole's defense, his testing methods are meant to show meaningful results for probably the majority of listeners who, like Toole himself, are unwilling to compromise home decor with purpose-made acoustic room surface treatments. Toole's work also shows that listeners in his tests react favorably to room reflections from speakers whose off-axis response is smooth and follows the contour of the on-axis response. Toole would say that electronic equalization in an untreated room is at best a compromise, since the equalization only directly affects the on-axis response and cannot equalize off-axis response for speakers whose off-axis response does not smoothly follow the contour of the on-axis response.

Where I Come Down

I have owned and loved a series of speakers derived from BBC-inspired designs since about 2002: the Harbeth Monitor 40, Monitor 40.1, Monitor 40.2 and Stirling Broadcast LS3/6. In my audiophile lifetime I have also owned and loved many speakers which bore no direct connection to either the BBC school of design or the Toole school of design. My current Dutch & Dutch 8c speakers are, I believe, the only speakers I have ever owned which clearly were designed to have spinorama results which would indicate a high degree of listener preference in Toole-type listening tests. I currently find these D&Ds to be the best speakers I've ever owned, and not by a slim margin.

But that is not to say that I'm a "Toole school" convert. I believe serious music lovers and audiophiles should listen to examples of each type of speaker and judge for themselves. The performance of a home stereo playback system is highly subjective. For example, your interest in the actual sound of live classical or other unamplified acoustic music may be minimal, so minimal that the issue of whether your home audio system can reproduce the gestalt of hearing such music live is unimportant. You may just want to hear sound reproduction which sounds "good" to you.

Even if your primary interest is reproducing at home as closely as possible the gestalt of the sound of live unamplified classical music in a good concert hall, you must decide for yourself whether high preference ratings in Toole-type listening tests are a sufficient stand-in for verisimilitude to the sound of live music. Once the pandemic is over or you are fully vaccinated, do some serious comparative listening to speakers of both the BBC heritage school and those designed with great spinorama measurement results in mind. You may find the choice easy to make. Then again you may find that the best choice to your ears is not the choice you expected.

Duke LeJeune · Mar 21, 2021

One of the reasons I read Toole and Robert E. Greene and Griesinger and Linkwitz and Geddes and others (including posts on this site) is to become better informed about "where the goal posts are", as well as about "how to reach them".

At the risk of oversimplifying, imo the Harman studies do an excellent job of establishing where the goal posts are for conventional direct-radiator loudspeaker designs when listened to in mono, and positioned well away from the side walls in a fairly large, presumably well-treated room. Imo their findings do not necessarily directly carry over to other listening situations or other loudspeaker types, BUT their findings are IMMENSELY informative about at least SOME of the attributes a really good loudspeaker would have.

Imo Robert E. Greene provides "reality checks" which have been very helpful to me, as I do not begin to have his experience with live orchestral music. If he and Toole were best buddies, Robert might not post as much expert analysis as he does.

tmallin · Mar 21, 2021

Toole says that his research finds no difference between classical music and other types of music (acoustic, electrified, electronic) in allowing listeners to form preferences. His playlist presented to listening panels thus includes samples of many types of music. I suppose that is a factor in favor of the listener preferences having a more "absolute" application.

Those who object to the lack of a preference "anchor" in the real world of music are of course using acoustic music as the basis for their argument since such music is the only type which has a real-world reference anchor. They would also argue that speakers should only be judged with such music because it's impossible to know what music that does not exist outside a mixed recording context is supposed to sound like.

Toole would argue back that even with recordings of unamplified classical music it is impossible to tell exactly what the recording is supposed to sound like played back at home because of the Audio Circle of Confusion. For example, one usually doesn't know what sonic manipulations the sound engineers performed on the microphone feed or what speakers were used to monitor the recording, and the microphones picked up only part of the sound field a live listener would have heard.

And so it goes . . . .

Duke LeJeune · Mar 21, 2021

In my opinion replicating "what the recording engineer heard" is not necessarily "what best creates the perception of hearing live music".

An ideal monitoring setup is optimized for perfecting the recording, arguably a different priority from that of an ideal playback setup, which is presumably optimized for presenting a convincing illusion.

In other words, I think a good playback system may create a more convincing illusion than what the recording engineer heard, regardless of whether or not the recording ever existed as live music. For instance, your Dutch & Dutch speakers in your room may well create a more convincing illusion of a live performance than what the recording engineer heard on his KRKs and/or Genelecs and/or B&Ws.

Gregm · Mar 21, 2021

tmallin said:
Toole says that his research finds no difference between classical music and other types of music (acoustic, electrified, electronic) in allowing listeners to form preferences.

That is interesting and puzzling: there is a different complexity to recording, mastering, and reproducing between electronic music (e-circuit produced pure tones) and unamplified instruments producing complex harmonics.
Unless one has never heard a live violin (for example), the differences should be audible in a reasonably well-performing, well set-up system. (By well-performing I mean capable of reproducing the signal fed to it reasonably well.)

Duke LeJeune said:
In my opinion replicating "what the recording engineer heard" is not necessarily "what best creates the perception of hearing live music".

Agreed.

Duke LeJeune said:
In other words, I think a good playback system may create a more convincing illusion than what the recording engineer heard, regardless of whether or not the recording ever existed as live music.

It seems that the objectives of the recording engineer and the music-phile differ somewhat: one wishes to save the sounds he captured in the best possible way onto the medium he has. The other wishes to create and enjoy a musical experience. Regards

tmallin · Mar 21, 2021

Thanks for your comments, Duke. As to whether Toole's preference findings are stable from room to room, his presentation, Floyd Toole - Sound reproduction – art and science/opinions and facts - YouTube, beginning at about 16:56, specifically deals with rooms as a variable. Apparently there are a number of different shaped rooms which can be configured in the testing facility. The preference findings were stable across all room types. Toole concludes that listeners are able to "stream" or "hear through" the room to hear the quality of the speaker as a separate entity. He likens this process to the way we can all recognize a familiar voice regardless of the room, or the position in that room, from which the person is speaking, even though the frequency response measurements of the voice will dramatically change from, say, bathroom to outside.

As to different purposes of a monitoring system vs. a home playback system, Toole's constant emphasis is on what needs to be done to reduce what he calls the Audio Circle of Confusion. While his research can be viewed as merely a marketing tool for Canadian speaker manufacturers and later Harman, since this was all publicly revealed, peer-reviewed stuff, any manufacturer is free to use the research in their own design and marketing. As lists of speakers which do well on spinorama tests reveal, such speakers are not limited to Canadian or Harman products.

I think that we must assume that for maximum home music replay enjoyment we would want to more consistently hear what the sound engineers heard when monitoring and mastering the recordings we play at home, that is, reduce the Audio Circle of Confusion. I think it's best to assume that the audio professionals involved in making the recordings know what they are doing in terms of what they want the recording to sound like. They like a convincing illusion just as much as audiophiles. Sure, there are instances where their customers insist on a certain type of sound (those are documented in the book), but if given a free hand, I think we should assume that they practice their recording art to the best of their considerable ability.

What creates errors, as Toole notes in his book, is that most audio professionals are at least as averse to judging speakers by a full set of meaningful measurements as audiophiles are. As a result, Toole finds that many audio professionals use monitoring speakers which measure terribly. They adjust their mixes to make their recorded product sound as fine as it can through those speakers. Thus, when home audiophiles listen through more neutral speakers, they find such recordings to sound "bad." The Audio Circle of Confusion at its destructive worst.

sbnx · Mar 22, 2021

I can understand why Toole used only one speaker in mono as two speakers can cause frequency response aberrations due to phase cancellation. The listener is trying to judge how good/bad a speaker sounds and not how well they are set up.

Of course, listening to only one speaker says a lot about how much you like its frequency response/balance; however, it doesn't have much to say about how well that speaker images or produces a lifelike soundstage.

Overall, I like Toole's book. He did a good job of designing listening experiments to determine people's general preferences and what property of the sound it is that makes people prefer A over B.

Thanks for writing a summary.

Duke LeJeune · Mar 22, 2021

tmallin said:
As to whether Toole's preference findings are stable from room to room, his presentation, Floyd Toole - Sound reproduction – art and science/opinions and facts - YouTube, beginning at about 16:56, specifically deals with rooms as a variable. Apparently there are a number of different shaped rooms which can be configured in the testing facility. The preference findings were stable across all room types. Toole concludes that listeners are able to "stream" or "hear through" the room to hear the quality of the speaker as a separate entity.

Toole reached the conclusion that the ear can "hear through" the room based on controlled listening tests of conventional loudspeaker systems in three normal but differently-shaped rooms, and I don't disagree with him. BUT the geometry of the single-speaker setup in Harman's speaker-shuffler room is so different from a normal stereo setup that imo it is no longer representative in some ways.

In the speaker-shuffler room the single speaker is about TEN FEET from the side walls - which is extremely unlikely to occur in a normal home audio situation. As a result the lateral reflections arrive MUCH LATER than they normally would. Also, the long sidewall reflection path length + presumably broadband acoustic treatment results in the reflections delivering less energy to the ears than would normally be the case.

In the Harman room wide-pattern speakers are not penalized for undesirable early sidewall reflections and are more likely to produce an adequate amount of reflected sound, and therefore are likely to be preferred over narrow-pattern speakers. In other words, imo the test conditions are not representative of normal listening conditions.
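To put rough numbers on the geometry argument above, here is a minimal sketch of a mirror-image estimate for the first sidewall reflection. It is only an illustration: the 10 ft listening distance, the assumption that speaker and listener sit the same distance from the wall, and the function name `sidewall_reflection` are all hypothetical, and the level figure covers spreading loss only (no wall absorption, no speaker directivity).

```python
import math

# Speed of sound, converted from 343 m/s to feet per second (about 1125 ft/s).
SPEED_OF_SOUND_FT_PER_S = 343.0 / 0.3048


def sidewall_reflection(wall_dist_ft, listen_dist_ft=10.0):
    """Return (delay_ms, level_db) of the first sidewall reflection
    relative to the direct sound.

    Assumes speaker and listener are both wall_dist_ft from the same
    side wall, so the mirror-image source sits 2 * wall_dist_ft to the
    side of the real speaker. listen_dist_ft is an assumed value.
    """
    reflected_path = math.hypot(listen_dist_ft, 2.0 * wall_dist_ft)
    delay_ms = 1000.0 * (reflected_path - listen_dist_ft) / SPEED_OF_SOUND_FT_PER_S
    # Inverse-distance spreading only; absorption and directivity are ignored.
    level_db = 20.0 * math.log10(listen_dist_ft / reflected_path)
    return delay_ms, level_db


for wall_dist in (2.0, 4.0, 10.0):
    delay, level = sidewall_reflection(wall_dist)
    print(f"{wall_dist:4.0f} ft from wall: delay {delay:5.2f} ms, "
          f"level {level:5.2f} dB re direct")
```

Under these assumptions the 10 ft case puts the reflection roughly 11 ms behind the direct sound and about 7 dB down from spreading alone, versus roughly 2.5 ms and 2 dB at 4 ft. Any acoustic treatment on the walls would lower the reflected level further.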

Nor does mono listening adequately evaluate a loudspeaker's spatial qualities. Yes I know Toole stated that loudspeaker preference does not change when going from mono to stereo but that position can be challenged using information from his book (which is rather tedious to do so I'll refrain unless you really want me to). In particular there can be significant movement of perceived "spatial quality" when going from mono to stereo.

Anyway let me quote from his book a bit:

“A serious examination of listener reactions to complex sound fields in stereo reproduction was undertaken by [Wolfgang] Klippel. The investigation attempted to relate listener descriptions of what they heard to measured quantities... According to Klippel the relevant loudspeaker measurements are the anechoic on-axis frequency response and the sound power response – or at least a sufficient collection of off-axis measurements to describe the reflected sounds arriving at the listening location.

“Of special interest was his finding that what he called a “feeling of space” figured prominently into listener responses... responses were solicited for two broad categories, “naturalness” and “pleasantness”, one relating to realism and accuracy, and the other to general satisfaction or preference, without regard to realism.”

Klippel found that “naturalness” (realism and accuracy) was 30% related to sound quality (coloration, or the lack thereof); 20% related to tonal balance; and 50% related to the “feeling of space”.

“Pleasantness” (general satisfaction or preference) was 30% related to sound quality and 70% related to the “feeling of space”.

I would not have expected the “feeling of space” to make a 50% contribution to "realism", and a 70% contribution to "preference"! And obviously the "feeling of space" in a stereo recording is something which two loudspeakers can do but one cannot.
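As a rough illustration of what those percentages imply, the sketch below treats them as linear weights on 0 to 10 attribute ratings. That linear reading is an assumption made here for illustration, not something the quoted text states, and the ratings for the two hypothetical presentations are invented.

```python
# Klippel's reported contributions, treated here (as an assumption) as linear weights.
NATURALNESS = {"sound_quality": 0.30, "tonal_balance": 0.20, "feeling_of_space": 0.50}
PLEASANTNESS = {"sound_quality": 0.30, "tonal_balance": 0.00, "feeling_of_space": 0.70}


def weighted_score(ratings, weights):
    """Combine 0-10 attribute ratings into one score using the given weights."""
    return sum(weights[name] * ratings[name] for name in weights)


# Hypothetical ratings: identical timbre, different spatial impression.
weak_space = {"sound_quality": 8.0, "tonal_balance": 8.0, "feeling_of_space": 3.0}
strong_space = {"sound_quality": 8.0, "tonal_balance": 8.0, "feeling_of_space": 8.0}

for label, ratings in (("weak space", weak_space), ("strong space", strong_space)):
    print(label,
          "naturalness:", round(weighted_score(ratings, NATURALNESS), 2),
          "pleasantness:", round(weighted_score(ratings, PLEASANTNESS), 2))
```

With identical sound quality and tonal balance, the spatial rating alone moves both scores by several points in this toy model, which is the reason a test that cannot exercise the "feeling of space" may miss much of what drives the rating.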

Toole continues: “Therefore, whether one is a picky purist or a relaxed recreational listener, the impression of space is a significant factor... Klippel chose as his measure of the “feeling of space” the difference between the sound levels of the multidirectional reflected sound and the direct sound at the listening location.”

In other words, Klippel found that the reflected-to-direct sound ratio was related to this highly desirable “feeling of space”. BUT (quoting Toole again; emphasis mine):

“There is an optimum amount of reflected sound; there can be too much or too little.” This has been my experience as well.

Toole writes in conclusion: “A good loudspeaker for this purpose would therefore be one that has two qualities: wide dispersion, thereby promoting some amount of reflected sound, and a relatively constant directivity index, so that the direct sounds and reflected sounds have similar spectra.”
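A small sketch of what "relatively constant directivity index" means in practice, assuming the directivity index is taken as the dB difference between the on-axis response and the sound power response at each frequency. The two speakers and all of their numbers are invented for illustration, and `di_spread` is a hypothetical helper, not a standard metric.

```python
def directivity_index(on_axis_db, sound_power_db):
    """Per-frequency DI in dB: on-axis level minus sound power level."""
    return [on - sp for on, sp in zip(on_axis_db, sound_power_db)]


def di_spread(on_axis_db, sound_power_db):
    """Max minus min DI across the band; smaller means more constant directivity."""
    di = directivity_index(on_axis_db, sound_power_db)
    return max(di) - min(di)


freqs_hz = [500, 1000, 2000, 4000, 8000]
flat_on_axis = [85.0] * len(freqs_hz)

# Hypothetical speaker A: sound power tracks the on-axis curve closely.
power_a = [80.0, 80.0, 79.5, 80.0, 79.5]
# Hypothetical speaker B: sound power falls steadily as the pattern narrows.
power_b = [82.0, 80.0, 78.0, 76.0, 74.0]

print("A DI:", directivity_index(flat_on_axis, power_a), "spread:", di_spread(flat_on_axis, power_a))
print("B DI:", directivity_index(flat_on_axis, power_b), "spread:", di_spread(flat_on_axis, power_b))
```

Both invented speakers are flat on axis, but only speaker A sends reflected sound into the room with a spectrum similar to the direct sound, which is the second quality named in the quote.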

In my opinion not only does the single-speaker listening setup in the Harman speaker-shuffler room fail to adequately evaluate spatial qualities (which according to Klippel matter a great deal), but also the inherently weaker reflections in that room favor speakers which generate a greater amount of off-axis energy than would be desirable under normal conditions.

cal3713 · Mar 23, 2021

I personally love the focus on preference.

I spent 17 years as a cognitive scientist specializing in preference formation, change, and measurement. The neural systems in charge of preferences are more basic (and I suspect more sensitive) than those involved in identification/matching.

For example, infants will express adaptive preferences long before they know anything about what they're experiencing and whether it matches some prior event. And adults with cognitive dysfunction regularly lose the ability to match experiences (e.g., by failing to recognize prior interaction partners), but retain adaptive preferences towards these people. Thus, when you're doing a matching test (e.g., which speaker sounds more like this guitar), you're relying on a system that comes on line later and leaves sooner.

For this reason, I'm much more likely to trust preference judgements than any other tool of evaluation in this domain. The other judgments are not as basic or natural and are likely more prone to error.

You've got to learn what to listen for to make many of these judgements, but no instruction is necessary in the realm of pleasure (/liking/preference).

tmallin · Mar 23, 2021

Duke LeJeune said:
In my opinion not only does the single-speaker listening setup in the Harman speaker-shuffler room fail to adequately evaluate spatial qualities (which according to Klippel matter a great deal), but also the inherently weaker reflections in that room favor speakers which generate a greater amount of off-axis energy than would be desirable under normal conditions.

If you've read my discussions of any of the speakers I've recently used in my small room, Duke, you will know that I totally agree that things sound better with absorption or at least diffusion for the side wall reflections. In my room, any practical stereo set up which is symmetrical will put the speakers at most about four feet from the sidewalls. My Dutch & Dutch set up has the tweeters only about half that far from the side walls. Of course the D&D tweeters are wave guided and angled away from the nearest sidewall by 30 degrees.

I also use absorption or dispersion at the other specular reflection points of the room, except for the ceiling. I now may have come up with a non-damaging way to attach foam to the ceiling and I will probably try treating that reflection as well.

Duke LeJeune · Mar 23, 2021

tmallin said:
If you've read my discussions of any of the speakers I've recently used in my small room, Duke, you will know that I totally agree that things sound better with absorption or at least diffusion for the side wall reflections.

That was my recollection, and many if not most people do something similar if they have a dedicated listening room.

The unnaturally long distance to the sidewalls in Harman's listening room takes the early lateral reflections out of the picture completely, skewing the presentation of reflections away from the early ones and towards the late ones to an extent that could only be replicated in a superb dedicated room. Imo this artificially benefits speakers which may not fare as well in real-world rooms, while arguably penalizing speakers which are more real-world room-friendly.

By way of example, I think your Dutch & Dutch speakers are more likely to be preferred in a real-world room (whether a dedicated one like yours or not) than they would be in Harman's room. At least some of the benefits of their extraordinary pattern control would be wasted in the Harman room and may actually work against them, as the amount of reflected sound in that room will be reduced relative to conventional speakers and may end up being less than what would be desirable.

Tim Link · Mar 23, 2021

Duke LeJeune said:
In my opinion not only does the single-speaker listening setup in the Harman speaker-shuffler room fail to adequately evaluate spatial qualities (which according to Klippel matter a great deal), but also the inherently weaker reflections in that room favor speakers which generate a greater amount of off-axis energy than would be desirable under normal conditions.

Yes, I've found the excellent measuring wider dispersion speakers to have too much dispersion for my room. What we need is an adjustable dispersion speaker. I've thought about a line array with adjustable side wings.

tmallin · Mar 23, 2021

Over the years I've tried putting absorbing foam very near the speaker rather than on the walls. Much less foam is needed, for example, when I created a "hood" around the top and sides of the tweeter/midrange area of the speakers. I've made homemade versions of the old Watkins Echo Muffs product for floor standing speakers. I also tried mounting foam very near the back of dipole speakers like the Carver Amazing Platinum Mk IV and Sanders 10C. Sound Lab makes the SALLIE product intended to absorb the rear wave from the electrostatic panel from just behind the panel at its rear focus point.

These efforts of mine were never really successful. Yes, they change the sound quite a bit, and measurably so. With foam on the walls, the measured on-axis response of speakers usually doesn't change much, but putting foam very near the drivers changes the response quite measurably. As you can expect, it sounds deader, more rolled off on top. Three dimensionality is negatively affected. In contrast, putting foam or diffusive treatment on the walls a couple of feet or more away from the drivers usually improves the three dimensionality of the sound and creates a truer ability to hear the space which is on the recording as opposed to generated by sidewall reflections in your listening room.

Tim Link · Mar 23, 2021

I found this article by Floyd Toole just now:

Room Reflections & Human Adaptation for Small Room Acoustics

Dealing with acoustics in small rooms is no trivial matter. This article focuses on how to treat early reflections based on listening habits, your loudspeaker choice. Absorb, diffuse, reflect; decide.

www.audioholics.com

"For optimum stereo listening if your music tastes are as eclectic as mine, one really needs adjustable acoustics and, possibly, variable-directivity loudspeakers, but we know that won’t happen"

We've joked, semi seriously, about remote controlled adjustable acoustics here at ASC. In truth I don't think most of us would want to deal with that no matter how slick it worked. The acoustics can be adjusted to some degree by rotating the reflectors and moving them around. The speaker dispersion is a bigger problem. What could possibly be done is make a speaker that comes in three different dispersion configurations. Try them all and pick the one that works best.

tmallin · Mar 23, 2021

Tim Link said:
I found this article by Floyd Toole just now:

Room Reflections & Human Adaptation for Small Room Acoustics

Dealing with acoustics in small rooms is no trivial matter. This article focuses on how to treat early reflections based on listening habits, your loudspeaker choice. Absorb, diffuse, reflect; decide.

www.audioholics.com

"For optimum stereo listening if your music tastes are as eclectic as mine, one really needs adjustable acoustics and, possibly, variable-directivity loudspeakers, but we know that won’t happen"

We've joked, semi seriously, about remote controlled adjustable acoustics here at ASC. In truth I don't think most of us would want to deal with that no matter how slick it worked. The acoustics can be adjusted to some degree by rotating the reflectors and moving them around. The speaker dispersion is a bigger problem. What could possibly be done is make a speaker that comes in three different dispersion configurations. Try them all and pick the one that works best.

I think B&O has done that in its most expensive Beolab 90 ($95k and up) model. It has remote controllable dispersion patterns. See: Beolab 90 - Ultimate High-End 8,200W Speakers | B&O (bang-olufsen.com)

Tim Link · Mar 23, 2021

tmallin said:
Over the years I've tried putting absorbing foam very near the speaker rather than on the walls. Much less foam is needed, for example, when I created a "hood" around the top and sides of the tweeter/midrange area of the speakers. I've made homemade versions of the old Watkins Echo Muffs product for floor standing speakers. I also tried mounting foam very near the back of dipole speakers like the Carver Amazing Platinum Mk IV and Sanders 10C. Sound Lab makes the SALLIE product intended to absorb the rear wave from the electrostatic panel from just behind the panel at its rear focus point.

These efforts of mine were never really successful. Yes, they change the sound quite a bit, and measurably so. With foam on the walls, the measured on-axis response of speakers usually doesn't change much, but putting foam very near the drivers changes the response quite measurably. As you can expect, it sounds deader, more rolled off on top. Three dimensionality is negatively affected. In contrast, putting foam or diffusive treatment on the walls a couple of feet or more away from the drivers usually improves the three dimensionality of the sound and creates a truer ability to hear the space which is on the recording as opposed to generated by sidewall reflections in your listening room.

I've tried various things myself. See funny pictures. The Darth Vader speakers actually sounded pretty good. The Polk speakers on top provided wide dispersion with the larger bookshelf giving it some bottom end. Darth's cape was a black towel and I thought it sounded better with it. The bedroom had a problem with imaging pulling to the right. The wide dispersion speakers seemed to take my mind off it by pulling the imaging wider and blurrier overall. As someone who generally prefers more directional speakers I was taken aback a bit by how easy and pleasant it was to listen to in there. The acoustic boxes I made created a very strange sound in the room. I was hoping to build better versions if they sounded good but they were a lot bigger looking than I expected and didn't do anything magically good in terms of imaging or musicality. I recall they worked great for improving vocal clarity in a way that was very noticeable outside the room. Others in the house noted it and came in to see what I had done. They sounded great for listening to podcasts. Not so good for music.

Duke LeJeune · Mar 23, 2021

tmallin said:
Over the years I've tried putting absorbing foam very near the speaker rather than on the walls... I also tried mounting foam very near the back of dipole speakers like the Carver Amazing Platinum Mk IV and Sanders 10C. Sound Lab makes the SALLIE product intended to absorb the rear wave from the electrostatic panel from just behind the panel at its rear focus point.

These efforts of mine were never really successful...

Your experience is similar to mine. I have never been happy with absorbing the backwave of dipoles. I tried the Sallies with both SoundLabs and Maggies, and ime adequate distance from the wall and/or aggressive toe-in gives much better results. Imo the Sallies are a "last resort" when there are no other options.

Tim Link said:
I've found the excellent measuring wider dispersion speakers to have too much dispersion for my room. What we need is an adjustable dispersion speaker. I've thought about a line array with adjustable side wings.

Tim Link said:
... speaker dispersion is a bigger problem. What could possibly be done is make a speaker that comes in three different dispersion configurations. Try them all and pick the one that works best.

I agree that some adjustability in loudspeaker radiation pattern for different room acoustic situations would be helpful, and imo could perhaps free up one's acoustic treatment budget to significantly IMPROVE the overall room acoustics instead of having to first FIX what the speakers are doing wrong.

Tim, if you don't mind, what was your solution to "excellent wider dispersion speakers" having "too much dispersion" for your room?

tmallin said:
I think B&O has done that in its most expensive Beolab 90 ($95k and up) model. It has remote controllable dispersion patterns. See: Beolab 90 - Ultimate High-End 8,200W Speakers | B&O (bang-olufsen.com)

Lacking the resources of B&O, I opted for a less ambitious approach: Fairly narrow-pattern main array plus a level-adjustable rear-firing array. The dispersion pattern itself may not be adjustable, but the direct-to-reverberant ratio is.
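The effect of a level-adjustable rear array on the direct-to-reverberant ratio can be sketched with a simple energy model. This is an illustration under stated assumptions, not a description of the actual design: the front array is assumed to supply all of the direct sound plus some reverberant energy, the rear array is assumed to add reverberant energy only, energies are assumed to add incoherently, and every number below is invented.

```python
import math


def direct_to_reverberant_db(direct, front_reverb, rear_reverb, rear_gain_db):
    """Direct-to-reverberant ratio in dB for a front array plus a rear array.

    direct, front_reverb and rear_reverb are relative energies (linear units).
    rear_gain_db is the level trim applied to the rear-firing array.
    """
    rear_energy = rear_reverb * 10.0 ** (rear_gain_db / 10.0)
    return 10.0 * math.log10(direct / (front_reverb + rear_energy))


# Hypothetical relative energies at the listening position.
DIRECT = 1.0
FRONT_REVERB = 0.25
REAR_REVERB = 0.25

for gain_db in (-20.0, -6.0, 0.0):
    ratio = direct_to_reverberant_db(DIRECT, FRONT_REVERB, REAR_REVERB, gain_db)
    print(f"rear array at {gain_db:+5.1f} dB: D/R = {ratio:4.2f} dB")
```

In this toy model, turning the rear array up lowers the direct-to-reverberant ratio without touching the front array's radiation pattern, which matches the distinction drawn above between adjusting the pattern and adjusting the ratio.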
