How to Listen: A Course in How to Critically Evaluate Sound Quality

tonmeister2008

WBF Technical Expert
Jun 20, 2010
210
6
0
Westlake Village,CA
Next month, I will be giving a one-day course on How to Listen at the 2011 ALMA Winter Symposium in Las Vegas, held Jan. 4th and 5th, just prior to the CES show. The symposium will also feature other courses, workshops and paper sessions on loudspeaker and headphone design, testing and evaluation. You can register for my course and other events at the ALMA Symposium here. Below is a brief preface for my course How to Listen, which I encourage you to attend.



Figure 1: A listener training in the Harman International Reference Listening Room using the original version of the How to Listen training software.


Whether you are involved in the mixing of live and recorded sound, the design and calibration of sound systems, or just shopping for a new audio system, the question “Does it sound good?” is usually foremost on your mind. With sufficient prodding, most people can offer an opinion on the overall sound quality of a recording and its reproduction. Beyond that, listeners generally lack the necessary experience, training and vocabulary to describe which specific aspects of the sound they like and dislike. Sadly, the audio industry has no standardized terminology that allows musicians, audio engineers and audiophiles to communicate with each other about sound quality in a concise and meaningful way. Courses in critical listening are not commonly available in audio programs at universities, and are even less available to the general public. In summary, there is a real need for a comprehensive course that teaches audio enthusiasts how to critically evaluate sound quality.

A Scientific Approach Towards Training Listeners
To address this need, the author has developed a critical listening course called How to Listen. The course aims to teach students how to evaluate sound quality using percepts well established in the auditory perception field. These sound quality percepts are taught and demonstrated in a controlled way using real-time processing of recorded sounds. This has two benefits. First, the intensity of each attribute can be adjusted according to the aptitude and performance of the listener. Second, because the physical properties of the stimulus are closely tied to its perception and evaluation (a science known as psychoacoustics), there is a theoretical basis behind the training approach. For example, the listener training data can be used to better understand how we perceive sound quality, which physical aspects of sound matter most to its perceived quality, and possibly to identify the important underlying sound quality attributes that influence our preferences. Critical listening is treated as a science, rather than the black art it currently is.

How to Listen also includes classroom topics in the fundamentals of human auditory perception, sound quality research on the variables that significantly influence the quality of recorded and reproduced sound (e.g. loudspeakers, rooms, recordings, microphones) and a brief tutorial on how to conduct sound quality listening tests that produce accurate, reliable and valid results.

But before we get too far ahead of ourselves, there must be good reasons for training listeners since it requires an investment in time and resources. There is also the question of external validity: Can the sound quality preferences of trained listeners be extrapolated to the preferences of untrained listeners, and does this hold true across different cultures? These questions will be answered in the following sections.

Why Train Listeners?
There are several compelling reasons for training listeners. First, trained listeners have been shown to produce more discriminating and reliable judgments of sound quality than untrained listeners [1]. As a result, fewer listeners are needed to achieve a similar level of statistical confidence, which saves time and money. For example, a panel of 15 trained listeners can provide sound quality ratings with reliable statistical confidence in less than 8 hours. To achieve a similar level of confidence using untrained listeners would require about 10 times more listeners, 10 times more days to complete the testing, and cost 10 times more money to pay the listeners and staff conducting the tests. If the study is conducted by an independent research firm using 200-300 untrained listeners, the cost can easily exceed $100k.

A second reason for training listeners is that they are able to report precisely what they like and dislike about the sound quality using well-defined, meaningful terms. This feedback can provide important guidance for reengineering the product for optimal sound quality.

Besides training listeners for product research, there are benefits in training audio marketing and sales people to become better critical listeners. Training makes them better equipped to communicate sound quality issues to audio engineers and customers. As audio companies expand sales and operations in China, India, and other developing countries, there is a growing need to develop a common cross-cultural understanding as to what constitutes good sound and unacceptable sound.

Does Training Bias Listeners?
An important question is whether the training process itself biases the sound quality preferences of listeners. If the trained listener preferences are different from those of the targeted demographic, there is a danger the product may not be well received in the marketplace. This raises the age-old question, “Is preference in sound quality a matter of personal taste - much like food, wine and music - or is it universal?”

To study this question, the author compared the performances and loudspeaker preferences of trained listeners versus untrained listeners [1]. Over 300 untrained listeners were tested over a period of 18 months where they compared four different loudspeakers under controlled, double-blind listening conditions. Their preferences were then compared to the preferences of the trained Harman listening panel.

The results, plotted in Figure 2, show that the rank ordering of the loudspeaker preferences was the same for both the trained and untrained listeners. There were two main differences in how the two groups of listeners responded. First, the trained listeners tended to give lower loudspeaker ratings overall. Second, the trained listeners distinguished themselves from the untrained listeners by generally giving more discriminating and consistent loudspeaker preference ratings.



Figure 2: The mean loudspeaker preference ratings and 95% confidence intervals are shown for four loudspeakers evaluated in a controlled, double-blind listening test. The results of different groups of untrained listeners are compared to those of the 12 Harman listeners.


Relative Performances of Trained Versus Untrained Listeners
A common performance metric used to quantify the listener’s discrimination and consistency in rating sound quality is the F-statistic. This calculation is done by performing an analysis of variance (ANOVA) on the main variable being tested. In the above study [1], the performances of trained versus untrained listeners were compared by calculating the loudspeaker F-statistic for each individual listener.
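As a rough illustration of the metric described above, the sketch below computes a one-way ANOVA F-statistic for a single listener, with loudspeaker as the only factor. The ratings are made-up numbers and the function name `loudspeaker_f_statistic` is hypothetical. The published study [1] may well have used a fuller ANOVA model (for example, with program material as a second factor), so treat this as a minimal sketch under those assumptions rather than the exact analysis.

```python
from statistics import mean


def loudspeaker_f_statistic(ratings_by_speaker):
    """One-way ANOVA F-statistic for one listener.

    ratings_by_speaker maps a loudspeaker label to the list of preference
    ratings that listener gave it over repeated trials.
    """
    groups = list(ratings_by_speaker.values())
    k = len(groups)                        # number of loudspeakers
    n_total = sum(len(g) for g in groups)  # total number of ratings
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 loudspeakers and repeated ratings")

    grand_mean = mean(r for g in groups for r in g)

    # Variation between loudspeakers (discrimination) ...
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # ... versus variation within repeats of the same loudspeaker (consistency).
    ss_within = sum((r - mean(g)) ** 2 for g in groups for r in g)
    if ss_within == 0:
        raise ValueError("ratings have no within-loudspeaker variance")

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical ratings (0-10 scale): four loudspeakers, three repeats each.
discriminating = {
    "A": [7.5, 7.0, 7.5],
    "B": [6.0, 6.5, 6.0],
    "C": [4.0, 4.5, 4.0],
    "D": [2.5, 2.0, 2.5],
}
undiscriminating = {
    "A": [7.0, 8.0, 6.5],
    "B": [7.5, 6.5, 7.5],
    "C": [6.5, 7.5, 7.0],
    "D": [7.0, 6.0, 7.5],
}

f_disc = loudspeaker_f_statistic(discriminating)
f_undisc = loudspeaker_f_statistic(undiscriminating)
print(f"discriminating listener:   F = {f_disc:.1f}")
print(f"undiscriminating listener: F = {f_undisc:.1f}")
```

A listener who separates the loudspeakers clearly and repeats their own ratings closely gets a large F; a listener who rates everything about the same, with scatter between repeats, gets an F near or below 1.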

Figure 3 shows the relative performance of different groups of untrained listeners based on their mean F-statistics compared to the F-statistics of the trained listeners. The relative performances of the untrained groups were: audio retailers (35%), audio reviewers (20%), audio marketing/sales staff (10%), and college students (4%). The poor performance of the students was explained by their tendency to give all four loudspeakers very similar and high ratings. A likely explanation for this was that they experienced a level of sound quality that was much higher than their everyday common experience: compressed MP3 music reproduced through headphones. The good news is that the students seemed to appreciate the higher fidelity sound based on the high ratings. In time, they will hopefully seek out better quality audio systems.
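One plausible reading of the percentages above is that each is a group's mean F-statistic expressed as a fraction of the trained panel's mean F-statistic. The sketch below works through that arithmetic; the function name `relative_performance` and all of the F values are invented for illustration and are not data from the study.

```python
from statistics import mean


def relative_performance(group_f_stats, trained_f_stats):
    """Group mean F-statistic as a percentage of the trained panel's mean."""
    return 100.0 * mean(group_f_stats) / mean(trained_f_stats)


# Made-up per-listener F-statistics, for illustration only.
trained_panel = [90.0, 110.0, 100.0]
untrained_group = [30.0, 40.0, 35.0]

pct = relative_performance(untrained_group, trained_panel)
print(f"relative performance: {pct:.0f}%")
```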



Figure 3: The relative performance of different groups of untrained listeners compared to the trained Harman listeners. Performance is based on the group’s average loudspeaker F-statistic which represents their ability to give discriminating and consistent preference ratings.


Are There Cross-Cultural Preferences in Sound Quality?
One of the oldest controversies in audio is the notion that different cultures or geographical regions of the world have different sound quality preferences [2]. For example, it is often claimed that Japanese listeners have different loudspeaker preferences than Americans due to differences in language, music, cultural practices and norms, and the acoustics of their homes. So far, very little formal research has been done on this subject. In some preliminary studies, the author has found no significant differences in sound quality preferences for loudspeakers and automotive audio systems among Chinese, Japanese and American listeners.

How to Listen: A New Listener Training Software Application
Research has found most sound quality percepts fall under the attribute categories of timbre, spatial, dynamic or related nonlinear distortion. Within these four attributes there are additional sub-attributes that describe more specific sonic characteristics of the attribute. For example, Bright-Dull and Full-Thin are timbre sub-attributes related to the relative emphasis and de-emphasis of high and low frequencies, respectively. Sub-attributes for spatial quality deal with the location and width of the auditory image(s), and the perceived sense of spaciousness or envelopment. Distortion sub-attributes include the presence of noise, hum, audible clipping and distortions specific to the audio device(s) under test.

How to Listen focuses on teaching listeners to evaluate sound quality differences based on these four attributes and their sub-attributes (see Figure 4). While listening to music recordings, one or more attributes are manipulated in a controlled way so that listeners recognize and report the magnitude of these changes using the appropriate terms and scales. An analogy to this would be the Wine Aroma Wheel, with which expert wine tasters are trained to identify the intensities of different aroma-flavors perceived in the wine.



Figure 4: A list of the 17 different training tasks that focus on one or more of the four sound quality attributes: spectral (timbral), spatial, distortion and dynamics.


To facilitate the training process, a proprietary computer-based software program called “How to Listen” was developed by Harman software engineers Sean Hess and Eric Hu. The software runs on both Mac and PC computers, and can play both stereo and multichannel music files. A DSP engine built into the application manipulates sound quality attributes in real time, adapting to the listeners’ responses and performance.
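The post does not say what rule the software uses to adapt to a listener's performance. Purely as an illustration of the idea, the sketch below uses a simple one-up/one-down staircase on the size of an EQ change in dB: a correct answer makes the next trial harder, a wrong answer makes it easier. The function name `next_level_db`, the step size and the limits are all assumptions, not details of the Harman application.

```python
def next_level_db(current_db, was_correct, step_db=1.0, min_db=1.0, max_db=12.0):
    """Hypothetical one-up/one-down rule for the size of an EQ boost or cut.

    A correct answer makes the next trial harder (smaller change in dB);
    a wrong answer makes it easier (larger change). The result is clamped
    to the range [min_db, max_db].
    """
    proposed = current_db - step_db if was_correct else current_db + step_db
    return max(min_db, min(max_db, proposed))


# Simulated run: True = listener chose the right filter, False = wrong.
responses = [True, True, True, False, True, False, False, True]
level = 9.0
history = [level]
for was_correct in responses:
    level = next_level_db(level, was_correct)
    history.append(level)

print(history)
```

With a rule like this, the level tends to settle around the point where the listener is right about half the time, which gives a rough estimate of their current threshold for that attribute.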

There are currently five different types of training tasks that focus on one or more sound quality attributes (see Figure 4):

  1. Band Identification
  2. Spectral Plot
  3. Spatial Mapping
  4. Attribute Test
  5. Preference Test


Band Identification (see Figure 5) teaches listeners to identify spectral distortions based on their frequency, level and Q-factor using combinations of peak/dip and highpass/lowpass filters. In each trial, the listener compares the unequalized version of the music track (FLAT) to a version that has been equalized (EQ) using one of the filters drawn on the screen. The listener must select the filter (Filter 1 or 2) they believe has been applied to the equalized version.
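To make the three filter parameters concrete, here is a minimal sketch of a peak/dip filter defined by center frequency, level (gain in dB) and Q. The filter design inside the How to Listen software is not described in this post, so the sketch substitutes the widely used Audio EQ Cookbook peaking-EQ biquad. The function names, the 48 kHz sample rate and the two example filters are assumptions chosen for illustration.

```python
import cmath
import math


def peaking_biquad(f0_hz, gain_db, q, fs_hz=48000.0):
    """Coefficients (b, a) of a second-order peak/dip filter.

    Formulas follow the Audio EQ Cookbook peaking-EQ design; this is a
    stand-in, not the filter used by the training software.
    """
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = (1.0 + alpha * a_lin, -2.0 * math.cos(w0), 1.0 - alpha * a_lin)
    a = (1.0 + alpha / a_lin, -2.0 * math.cos(w0), 1.0 - alpha / a_lin)
    return b, a


def gain_db_at(freq_hz, b, a, fs_hz=48000.0):
    """Magnitude response of the biquad at one frequency, in dB."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / fs_hz)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))


# Two candidate filters a listener might have to tell apart (example values).
filter_1 = peaking_biquad(f0_hz=500.0, gain_db=6.0, q=2.0)
filter_2 = peaking_biquad(f0_hz=2000.0, gain_db=-6.0, q=2.0)

for freq in (125.0, 500.0, 2000.0, 8000.0):
    g1 = gain_db_at(freq, *filter_1)
    g2 = gain_db_at(freq, *filter_2)
    print(f"{freq:7.0f} Hz  filter 1: {g1:+5.1f} dB   filter 2: {g2:+5.1f} dB")
```

Raising Q narrows the peak or dip, and lowering the gain toward 0 dB makes the change harder to hear, which are two obvious ways a training task could be made more difficult.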



Figure 5: A screen capture of the listener training task “Band Identification” in Harman’s “How to Listen” training software. The listener compares the unequalized music “Flat” to an equalized version (EQ) and must select the EQ filter that is associated with its sound.


The training task Spectral Plot requires the listener to compare different music programs that have been equalized in a number of different ways. For each one, the listener must select the equalization curve that best matches its sound quality. This task teaches listeners to behave like human spectrum analyzers. Once fully trained, the listener can draw a graph of the audio system’s frequency response based on how it sounds.

The Spatial Mapping task requires the listener to graphically indicate on a two-dimensional map where a sound appears in the listening space. The Attribute training task requires the listener to correctly rank order two or more sounds on a given attribute scale based on the intensity of the attribute (e.g. bright-dull). For the Preference task, the listener must give preference ratings where the sound quality of the music has been modified for one or more sound quality attributes. The performance of the listener is calculated based on a statistical post-hoc test that determines the discrimination and reliability of the listeners’ preference ratings. Together, these different training tasks teach listeners to critically evaluate any type of sound quality variation they are likely to encounter when listening to recorded and reproduced sound.

Conclusions
The evaluation of sound quality remains an elusive art in the audio industry. Better awareness, understanding, and appreciation of sound quality may be possible if there were a method to teach listeners how to evaluate the quality of reproduced sound and report what they hear using well-defined and meaningful terminology. How to Listen is a listener training course that aims to achieve those goals. Listeners are taught to identify and rate audible changes to different sound quality percepts related to the spectral, spatial, dynamic and distortion qualities of recorded music. Performance metrics based on the discrimination, accuracy and reliability of the listeners’ responses are factored into whether the listener meets the criterion of being a “trained” listener. The question of whether a listener is truly golden eared or not is no longer a matter of conjecture and debate, since How to Listen will ultimately reveal the true answer.

References
[1] Sean E. Olive, "Differences in Performance and Preference of Trained Versus Untrained Listeners in Loudspeaker Tests: A Case Study," J. AES, Vol. 51, issue 9, pp. 806-825, September 2003. Download for free here, courtesy of Harman International.
[2] Sean E. Olive, "Are There Cross-Cultural Preferences in the Quality of Reproduced Sound?" Audio Musings, July 2, 2010.
 

amirm

Banned
Apr 2, 2010
15,813
37
0
Seattle, WA
Seems like a great opportunity to learn to really hear, Sean. How will the class be conducted? I assume it is not held in a listening room.
 

Phelonious Ponk

New Member
Jun 30, 2010
8,677
23
0
Excellent! Welcome back.

Tim
 

JackD201

WBF Founding Member
Apr 20, 2010
12,308
1,425
1,820
Manila, Philippines
You've been missed Sean! Great to see you back :)
 

tonmeister2008

Hi Lee,

I think there is growing interest within Harman in educating consumers about good sound, and there is clearly a need for it. It is getting increasingly difficult to get a decent in-store demonstration, and if you do, the person may not be very knowledgeable about what they are demonstrating. I would expect to see more online education on Harman websites that might include elements of this course. The training software we've developed will also become available soon.
 

tonmeister2008

Unfortunately, the class will be taught in whatever room we are given at the Orleans Hotel where the ALMA Symposium is being held. So, I will have little control over the room acoustics. I intend to take my own 7.1 playback system consisting of decent loudspeakers. The course will consist of a variety of topics on auditory perception, how to conduct listening tests, and factors that influence the quality and perception of sound, drawing upon research that I and others have conducted in the subject. Of course, there will be a lot of critical listening evaluation of reproduced sound.
 

amirm

Thanks Sean. How much does the course cost? The ALMA site forces you to register to figure that out.
 

tonmeister2008

I've made a copy of the registration form available here:

The cost is: $400 (members) and $500 (non-members) for the 6.5 hour course. I have waived my fee so the money goes to ALMA, which is a non-profit organization for the loudspeaker industry.

You can become an ALMA member for as little as $35 for students and $100 for professionals.
 

kevinzoe

New Member
Nov 7, 2010
4
0
0
Hi Sean - Nice to see you back again. I and the others missed your wit and contributions to our audio addictions.

Just for fits and giggles, would you consider teaching the course in Montreal when the Son & L'Image show is on (usually late March or early April I think)? This is Canada's largest show and I'd bet my first born that there would be many people willing to register, including myself. Plus, you can get to see your old University buds and family too . . . Just a thought.

cheers,
kevin
 

Robert

Well-Known Member
Nov 10, 2010
163
3
405
Sean,

I think this is a very interesting idea. I wonder what would serve as a reference for learning the different sonic attributes. For example, I have been encountering products that have made me change my assumptions on the capabilities of audio reproduction. Since I will sometimes say, "I never knew it could do that", how would one know, until they heard it done.

There is also a downside for critical listening. While one may make better buying decisions, it may also lead people to focus upon what's missing. As a correlate to the above, once I know something is possible, I would have a very difficult time losing that new-found experience.

On another note, with all due respect, one of my goals in 2011 is to listen less critically to my stereo. Not sure if this is possible. But, I do think educating people about the process and composition, such as you are, is laudable.
 

Gregadd

WBF Founding Member
Apr 20, 2010
10,515
1,773
1,850
Metro DC
Sean Congrats on the HK 990.
 

Phelonious Ponk

I'll second that and ask a question: Why does it seem that so little of HK's product line is readily available in the US?

Tim
 

KlausR.

Well-Known Member
Dec 13, 2010
291
29
333
Hello Sean,

The Spatial Mapping task requires the listener to graphically indicate on a two-dimensional map where a sound appears in the listening space.

From Blauert “Spatial Hearing” I understand that in natural hearing localisation blur in the horizontal plane may have values of up to 12°. I found some papers relating to localisation blur with phantom sources in 2-channel stereo:

Sandel et al., “Localization of sound from single and paired sources”, J.A.S.A. 27 (1955), no. 5, p.842

Wendt, “Directional hearing with twin-channel stereophony”, Rundfunktechnische Mitteilungen 8 (1964), no. 3, p.171 (in German)

Ortmeyer, “Localisation of sound sources in 2-channel stereophony”, Hochfrequenztechnik u. Elektroakustik 75 (1966), p.77 (in German)

The figures depend on type of test signal, frequency, listening environment (anechoic chamber, reverberation room, normal room), directivity of the loudspeakers, and there are quite substantial differences between the test subjects. Maximum value I found was 80° off target (anechoic, 1500 Hz sine tone, no head movement).

In view of localisation blur the question is if spatial mapping is a reliable tool. What if in two test conditions the location of the sound source is perceived at two different locations within the blur zone for that test signal, that system type (2-channel, surround, etc), that type of room (anechoic, IEC, etc.): is it a real change in location or is it blur?

Related question: is there any research relating to 2-channel or surround using speech or music and what are the figures for blur?

Klaus
 
