Saturday, July 3, 2010

Are There Cross-Cultural Preferences in The Quality of Reproduced Sound?



Do we need a new user menu where you dial in your nationality to match your taste in sound quality?



The field of audio is rife with myths and unsubstantiated opinions. One of the most enduring opinions is that there are cross-cultural preferences in the sound quality of reproduced sound. Some of the more common cross-cultural assertions I hear repeated among audiophiles, audio reviewers and audio marketing executives include these:
  1. Americans prefer more bass than Europeans and Japanese
  2. Japanese prefer less bass and more midrange (and listen at lower volumes)
  3. Germans prefer brighter sound
  4. The British prefer “tighter” or more over-damped bass
To my knowledge, these statements are anecdotal, and have not been tested in any rigorous scientific way. Marketing has already given us misguided menus in media players and automotive head units that adjust the equalization based on music genre (e.g. jazz, classical, hip hop, rock, country music, Christian music, and heavy metal). Do we really need another one based on where we were born? What could the “Canadian” sound have in common with a predisposition towards liking cold long winters, hockey, Molson beer, maple syrup, beaver tails, national health care, and the music of KD Lang and Celine Dion?

While it is easy to dismiss the importance of cross-cultural preferences, the subject is gaining serious attention from audio manufacturers expanding into new markets like China, India, Russia and South America. Now the same age-old questions are being asked: Are there cross-cultural preferences in the quality of reproduced sound, or is good sound universal, transcending cultural differences?


Possible Reasons Why Cross-Cultural Preferences in Sound Quality May Exist
Very little research in cross-cultural sound quality preferences exists. Nonetheless, here are some proposed reasons why they may exist according to various sources.
Language, Dialect, Music
Certain spectral balances may complement and enhance the timbre and intelligibility of different languages and dialects. Similarly, the culture’s ethnic music and its instrumentation may be enhanced by certain loudspeakers or EQ. Wouldn’t this enhancement be added to the recording by the artist or the producer when it was mixed? If so, why do we need to duplicate it in the playback chain? Is there such a thing as too much enhancement (think Dolly Parton)?
Influence of Regional Building Construction and Room Acoustics
One explanation for regional tastes for certain types of loudspeakers is related to the design and construction of the region's homes and apartments. This would affect the noise isolation and acoustical properties of the room, and its interaction with the loudspeaker. Massive, rigid plaster walls commonly found in older construction in Europe would provide more noise isolation and less absorption of bass than less massive and rigid walls used in typical American construction today. It is argued that a loudspeaker with less bass might sound better in the European room. It should be pointed out that if the different rooms and loudspeakers combine in ways that in the final analysis produce the same sound, this doesn't really constitute a difference in preferred sound quality. Different means are being used to achieve the same end goal. Fortunately, there are technological solutions for dealing with loudspeaker-room interactions at low frequencies so that decent bass performance can be achieved regardless of the room’s size, dimensions and stiffness of its walls.
Influence of Social Norms and Practices
Cultural practices and norms may influence how much bass people like, and how loudly they listen to their music. For example, Japanese apartment dwellers may prefer to listen to reproduced sound at lower volumes to avoid disturbing their neighbors, which is a serious social infraction. On the other hand, American urban apartment dwellers may be more tolerant of bass and higher playback levels due to better noise isolation from the wall construction. Tolerance to your neighbor's subwoofers and loud music comes more easily if you know they own a handgun. The right to listen to loud music and bass in America is sort of protected under the Second Amendment (i.e. the right to bear arms). :)


Possible Reasons Why Cross-Cultural Preferences May Not Exist or Matter
The following arguments do not directly prove that cross-cultural sound quality preferences do not exist. They do provide evidence that the cultural entertainment, broadcast, recording and audio industries have largely decided to ignore cross-cultural preferences. Either they don't believe they exist, or if they do, catering to them doesn't make sense from a business or philosophical viewpoint.
Audio Manufacturers: One Product, One Sound
Most audio companies sell the same model of product in every country, only changing the language of the packaging/owner's manual and the power supply voltage to meet the local requirements. Measurements of loudspeakers from different countries of origin tend to aim towards the same performance target. There is nothing in the objective measurements or the listening test results that indicates a unique sound, voicing or preference that can be attributed to the country of origin, whether the loudspeaker is British, German, Canadian, American, French, Italian, Danish or Japanese [1]-[3]. Accurate sound seems to be the common universal attribute that matters most. These studies did not formally or systematically study the culture or race of the listener as a factor in loudspeaker preference, so the definitive study remains to be done.
Recording/Film Industries: One Product, One Sound
To my knowledge, record companies do not release different mixes of their recordings to satisfy different cultural tastes in sound quality. Fans of Lady Gaga apparently like (or dislike) her sound on the recordings equally, whether they are in America, Europe or Asia. Similarly, there is no option in the iTunes store where you indicate your nationality or culture before downloading your music.
Universal Loudspeaker / Audio Standards in Broadcast
If you look at international audio standards for broadcasting (AES, IEC, ITU, EBU), and read the loudspeaker papers written by researchers within the BBC (British), CBC (Canadian) and NHK (Japanese), you will find a common set of performance criteria: flat on-axis response, extended bandwidth in bass and treble, smooth off-axis response and low distortion. At the broadcast level, the playback chain in different countries is not being influenced by cross-cultural preferences in the targeted audience where the content will be heard.
Concert Halls and Live Music Performance
Acoustical design of concert halls has generally followed well established standards and practices based on research using international listening panels. Qualities such as spatial envelopment, reverberation, clarity and richness of timbre are universally accepted as desirable qualities. The classical and romantic composers specifically wrote their music for these particular acoustics, and to radically alter the acoustics would not serve the art well.
The Global Economy
In the new global economy, the political, cultural, socioeconomic and technological barriers have been largely removed. As communication between different cultures improves, this will likely influence their attitudes, tastes and perception towards culture, music and sound reproduction. If there are cross-cultural differences in sound quality preferences, it seems likely that in the future these differences will converge, and taste in sound quality will become more homogeneous (hopefully, in a positive way).
Audio is science in the service of art
This philosophy assumes that music, its performance and recording are part of the art, and the goal of sound reproduction is to accurately reproduce the art. To serve the art, there is no room for cultural preferences or individual tastes in the design of the audio equipment used for reproduction of the art. It is presumed that any cultural sound quality preferences will be encoded in the music when it is performed and recorded, and don’t need to be added again in the playback chain.

Here is a parallel analogy in painting: When a Monet art exhibit travels to different countries, the art is not altered, transformed or "improved" to suit the local tastes of the country. Art lovers want to see the original Monet, not a new and improved version with edge enhancements, higher contrast and 3D effects. The same is true of the sound of the Vienna Philharmonic when they do a world tour. When they tour Japan, they don’t leave half the bass section at home because the Japanese supposedly do not like bass. So why would we want to tamper with the original sound of the Vienna Philharmonic when playing recordings of them through our audio system?


Research in Cross-Cultural Preference in Sound Quality of Recorded and Reproduced Sound
In the realm of perception there is an essential pan-human unity, and most differences among cultures are only a “fine tuning” [4].
To date, very little cross-cultural research has been done in the perception of sound quality. One of the challenges in cross-cultural research is ensuring that the listener instructions, sound quality descriptors and semantic definitions of the scales have the same meaning across cultures. Fortunately, there are methods for removing language from the perceptual task. Multidimensional scaling allows listeners to judge different pairs of sounds based on their similarity. Then the perceptual attributes of the sounds (e.g. timbre or spatial related) can be identified through multivariate statistical methods like principal component analysis. In a study of different guitar timbres, Martens et al. found that native speakers of English, Japanese, Bengali, and Sinhala perceived the same underlying dimensions, but used different adjectives/semantics to describe the attribute [5].
In another study that compared Japanese and English speaking listeners’ perception of music recordings made with four different 5-channel microphone techniques, the authors found a common understanding of three critical dimensions in which the quality of the recordings differed [6].
Recently, we have begun testing cross-cultural sound quality preferences of music reproduced through different loudspeakers, equalizations, and automotive audio systems using American, Japanese and Chinese speaking listeners. While this work is still ongoing, the preliminary results do not show any evidence of cross-cultural preferences among the different groups. Accurate sound reproduction seems to be the common link across the preferences of the different cultures.


Conclusions
Very little research has been done in cross-cultural preferences in the sound quality of reproduced sound. Preliminary investigations by the author in preferred spectral balance of music reproduced through loudspeakers have not revealed any significant differences in cross-cultural preferences to date. If cross-cultural preferences exist, the music and audio industries have largely ignored catering to them, instead distributing products that are optimized for a single universal audience.
Finally, an important question is whether audio companies should even be catering to these cross-cultural preferences if research eventually finds that they indeed exist. If the audio industry takes an “audio science in the service of art” philosophy where the goal is to faithfully and accurately reproduce the art as the artist intended, the question of cross-cultural preferences becomes moot. If certain cultures don’t like the sound of the art, then that becomes an issue between the artist and the recording producer/record executive - not the audio manufacturer.

For more discussion on this topic, please head over to WhatsBestForum.


References
[1] Floyd E. Toole, "Loudspeaker Measurements and Their Relationship to Listener Preferences: Part 1," J. AES, Vol. 34, Issue 4, pp. 227-235, April 1986. (download for free courtesy of Harman International).
[2] Floyd E. Toole, "Loudspeaker Measurements and Their Relationship to Listener Preferences: Part 2," J. AES, Vol. 34, Issue 5, pp. 323-348, May 1986. (download for free courtesy of Harman International).
[3] Sean E. Olive, "Differences in Performance and Preference of Trained Versus Untrained Listeners in Loudspeaker Tests: A Case Study," J. AES, Vol. 51, issue 9, pp. 806-825, September 2003. (download for free courtesy of Harman International).
[4] John W. Berry, Ype H. Poortinga, Janak Pandey, Handbook of Cross-Cultural Psychology, Volume 1 Theory and Method, 2nd edition, Aug. 21, 1996.
[5] Martens, William L.; Giragama, Charith N. W.; Herath, Susantha; Wanasinghe, Dishna R.; Sabbir, Alam M., “Relating Multilingual Semantic Scales to a Common Timbre Space - Part II,” presented at the 115th Audio Engineering Society Convention, preprint 5895 (October 2003).

Monday, June 28, 2010

Science in the Service of Art

Last week, I was given my own front page forum over at WhatsBestForum called "Science in the Service of Art", where I can write about any topic I wish. My first posting is called "Audio Science in the Service of Art".

I will probably post the same articles I write over there, on this blog as well. But for now, I recommend you go over there, read my article, and then leave your comments about what we need to do in order to improve the quality and consistency of recorded and reproduced music.

Harman is committed to a scientific approach towards the design and testing of audio products in the consumer, professional, and automotive audio spaces. Last week, Harman Kardon began a PR campaign called the "Science of Sound" where "Science in the Service of Art" is a major theme. You can read about this on the Harman Kardon web sites (click on the "about" link at the top of the page). Enjoy!

Friday, June 18, 2010

Some New Evidence That Generation Y May Prefer Accurate Sound Reproduction


Sound quality in mainstream music recording and reproduction is all but dead, at least according to the media reports published over the past year [1]-[6]. On the music production side, music quantity (as in volume and decibels) matters more than quality and dynamic range. Record executives and producers are forcing artists to squash the dynamics and life from their music in order to be the loudest record on the charts [5],[6]. Listening to one of these albums can induce an instant migraine, making you wonder if the record companies aren’t secretly owned by the makers of Excedrin (see slides 2-4 in this article’s accompanying PDF slide presentation or this YouTube Video ).
On the music reproduction side, convenience, portability and low cost are the purchase-driving factors in this Mobile-Ipod Age of entertainment; sound quality need only be “good enough” [3]. The problem is that no one seems to be able to define what “good enough” sound quality means for Generation Y. Given that they represent the largest and youngest demographic in terms of music and audio equipment consumption, it's important to understand the attitudes and tastes of these twenty-somethings before it's too late. Getting these Millennials hooked on good sound now means they're more likely to upgrade the audio systems in the homes and automobiles they acquire as they grow older and wealthier.

A common belief being spread by the media is that Generation Y is indifferent to sound quality, or worse, that they prefer the tinny, sizzling sound of low-bit-rate MP3 over higher quality lossless music formats (slide 4). This is based on an informal study conducted by a Stanford music professor, Jonathan Berger, who over a 7-year period found his students increasingly preferred music coded in lower quality lossy MP3 formats over higher quality lossless music formats [1]-[5]. “I think our human ears are fickle,” says Berger. “What’s considered good or bad sound changes over time. Abnormality can become a feature” [1].


While Berger’s unpublished study raises more scientific questions than it provides answers, nonetheless it has been widely reported by the media, and has captured the attention of consumer and automotive audio marketing executives, who ultimately decide what level of sound quality is “good enough” for Generation Y (slides 4-7). There's an increased risk that sound quality may become the sacrificial lamb for products targeted at Millennials (they can’t tell the difference, after all) with the savings diverted to more salient “purchase drivers” such as industrial design, more features, advertising, and celebrity endorsements.
If someone doesn’t soon stand up for Generation Y and show some evidence that they care about sound quality, its death may become a self-fulfilling prophecy.

Some New Experiments on Generation Y Sound Quality Preferences For Music Reproduction
To this end, I recently conducted two listening experiments on a group of high school students (the younger half of Generation Y) to determine if their sound quality preferences in music reproduction were: a) consistent with those of older trained listeners used for product evaluation at Harman International, or alternatively b) indifferent or skewed towards preferring less accurate sound (slide 8).
Two research questions were asked in separate tests:
  1. Do the students prefer the sound quality of lossy MP3 (128 kbps) music reproduction over the original lossless CD version?
  2. Do the students prefer music reproduced through a more accurate loudspeaker given four different options that vary in accuracy and sound quality?
The students, who ranged in ages from 15 to 18 years, were visiting Harman on a class field trip (slide 9). A description of the listening tests and the results are summarized in the following sections.

Do High School Students Prefer Lossy MP3 Music Over Lossless CD-Quality Formats?
In the first double-blind listening experiment (slide 11), the students were presented two versions of the same program selection encoded in:
  1. MP3 (Lame 3.97, version 2.3; constant bit rate @ 128 kbps). Note that this is a 2-year-old MP3 encoder that may be more representative of what Berger used in his study.
  2. CD - The original lossless CD-quality version (16-bit, 44.1 kHz).
After hearing the same music several times in both MP3 and CD formats, the listeners indicated on a scoresheet which one they preferred: A or B. They were also asked to indicate the magnitude of preference (slight/moderate/strong), and provide comments describing the differences in sound quality they heard.
A total of 12 trials were completed in which preference choices were recorded using four different short program loops in three separate trials (slide 10). Three music programs and a recording of applause at a live concert were chosen based on their ability to provide audible differences between the lossy and lossless formats. The applause provided listeners a familiar acoustic signal that the author felt most listeners could easily judge based on its apparent naturalness.
The order of programs and MP3/CD formats was randomized by the listening test software to eliminate any order-related effects. Switching between A and B was performed by the test administrator via a custom Harman listening test software application. The listening test was conducted in the Harman International Reference Room, which provided a quiet, and controlled acoustic environment typical of a domestic listening room. Listening was done through a high quality, stereo playback system (JBL LSR 6336 with four JBL HB5000 subwoofers) calibrated at the listening locations. A comfortable playback level (on average 78 dB (B)) was used throughout the tests.
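The randomization described above can be sketched in a few lines. This is only an illustration, not the Harman listening test software: the function name `build_schedule`, the program labels and the seed are all hypothetical, and the sketch simply assumes four programs repeated three times each, with the A/B labels assigned by a coin flip on every trial.

```python
import random


def build_schedule(programs, repeats=3, seed=None):
    """Return a shuffled list of trials; each trial maps labels A/B to a format."""
    rng = random.Random(seed)
    trials = []
    for program in programs:
        for _ in range(repeats):
            formats = ["MP3", "CD"]
            rng.shuffle(formats)  # coin flip: which format is presented as "A"
            trials.append({"program": program, "A": formats[0], "B": formats[1]})
    rng.shuffle(trials)  # randomize the presentation order across programs
    return trials


# Hypothetical labels: three music programs plus the applause recording.
PROGRAMS = ["Music 1", "Music 2", "Music 3", "Applause"]
schedule = build_schedule(PROGRAMS, repeats=3, seed=1)

for number, trial in enumerate(schedule, start=1):
    print(number, trial["program"], "A =", trial["A"], "B =", trial["B"])
```

Because the listener only ever sees the labels A and B, and the mapping changes from trial to trial, neither the label nor the position in the session carries any information about the format.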
Two groups of nine listeners each participated in two separate listening sessions, which lasted about 30 minutes each.

Listening Test Results: Students Prefer Music in Lossless CD Versus MP3 Formats
When all 12 trials were tabulated across all listeners, the high school students preferred the lossless CD format over the MP3 version in 67% of the trials (slide 16). The CD format was preferred in 145 of 216 trials (p<0.001).
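As a rough check on these pooled figures, the sketch below computes the share of trials in which CD was preferred and the probability of seeing at least that many CD choices if listeners were simply guessing. It assumes a one-sided exact binomial test with a 50% chance level and treats all pooled trials as independent; the article does not say which test was actually used, so the function name `binomial_tail` and the choice of test are assumptions.

```python
from math import comb


def binomial_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


listeners, trials_each = 18, 12
total_trials = listeners * trials_each  # 18 listeners x 12 trials
cd_preferred = 145

share = cd_preferred / total_trials
p_value = binomial_tail(cd_preferred, total_trials)

print("total trials:", total_trials)
print("share preferring CD:", round(share, 3))
print("p < 0.001:", p_value < 0.001)
```

Under these assumptions the pooled result is far outside what guessing would produce, which agrees with the reported p < 0.001.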
As expected, there were differences among individual students in their ability to formulate consistent preference choices (slide 17). Nearly 40% of the listeners gave a sufficient number of preference choices (9 of 12) to establish a statistically significant preference for CD (p ≤ 0.054). Only one of the 18 listeners preferred MP3 over CD (7 versus 5 trials), although the preference was not statistically significant (p = 0.19). Other listeners were either guessing or inconsistent in their choices. With additional training and trials, the performance of these listeners would likely improve.
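The per-listener figures can be reproduced with a short binomial calculation. This is a sketch under the assumption that each of a listener's 12 trials is an independent 50/50 choice when guessing; the helper names `pmf` and `tail` are illustrative. It prints both the probability of exactly k choices out of 12 and the one-sided probability of k or more, since the article does not state which quantity its p-values refer to.

```python
from math import comb


def pmf(k, n, p=0.5):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def tail(k, n, p=0.5):
    """Probability of k or more successes in n trials (one-sided)."""
    return sum(pmf(i, n, p) for i in range(k, n + 1))


for k in (9, 7):
    print(f"{k} of 12: exactly = {pmf(k, 12):.3f}, at least = {tail(k, 12):.3f}")
```

The "exactly" values (about 0.054 for 9 of 12 and 0.19 for 7 of 12) coincide with the p-values quoted above. The one-sided tail probabilities are somewhat larger, about 0.073 for 9 or more of 12, so the significance threshold a reader applies depends on which of the two readings is intended.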
On average, the magnitude of preference for CD over MP3 was also stronger based on the frequency of responses assigned to the categories of preference: slight, moderate and strong preference (slide 18). When CD format was preferred, listeners assigned a proportionally higher number of moderate-to-strong responses compared to when MP3 was the preferred choice.
The preference for CD over MP3 formats was relatively independent of the program selection (slide 19). CD format was preferred for all four programs, with only a slight drop (68.5 % to 63%) for program JW.
Finally, the comments given by the more consistent listeners (slide 20) reveal the nature of audible differences between MP3 and CD. The CD version was often described as sounding more dynamic and brighter, with more impact on percussive sounds. MP3 versions of the programs were described as sounding duller and dynamically compressed, with swirling pitch-modulation artifacts on vocals and strings.

Do High School Students Prefer Neutral/Accurate Loudspeakers?
Given that the high school students preferred the higher quality music format (CD over MP3), would their taste for accurate sound reproduction hold true when evaluating different loudspeakers? To test this question, the students participated in a double-blind loudspeaker test where they rated four different loudspeakers on an 11-point preference scale. The preference scale had semantic differentials at every second interval defined as: 1 (really dislike), 3 (dislike), 5 (neutral), 7 (like) and 9 (really like). The relative distances in ratings between pairs of loudspeakers indicated the magnitude of preference: ≥ 2 points represent a strong preference, 1 point a moderate preference and ≤ 0.5 point a slight preference.
The four loudspeakers were floor-standing models (slide 22): Infinity Primus 362 ($500 a pair), Polk Rti10 ($800), Klipsch RF35 ($600), and Martin Logan Vista ($3800). Each loudspeaker was installed on the automated speaker shuffler in Harman International’s Multichannel Listening Lab, which positions each loudspeaker in the same location when the loudspeaker is active. In this way, the loudspeaker positional biases are removed from the test. Each loudspeaker was level-matched to within 0.1 dB at the primary listening location.
Listeners completed a series of four trials in which they could compare the four loudspeakers reproducing the same program a number of times before rating each loudspeaker on an 11-point preference scale. Two different music programs were used, with two observations each. At the beginning of each trial, the computer randomly assigned four letters (A, B, C, D) to the loudspeakers. This meant that the loudspeaker ratings in consecutive trials were more or less independent (slide 23).
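The per-trial letter assignment can be illustrated with a minimal sketch. It is not the software used in the listening lab: the function name `assign_letters` and the seed are hypothetical, and the only assumption is that the four letters are reshuffled independently at the start of every trial.

```python
import random

LOUDSPEAKERS = [
    "Infinity Primus 362",
    "Polk Rti10",
    "Klipsch RF35",
    "Martin Logan Vista",
]
LETTERS = ["A", "B", "C", "D"]


def assign_letters(speakers, rng):
    """Return a fresh random mapping from blind letters to loudspeakers."""
    shuffled = list(speakers)
    rng.shuffle(shuffled)
    return dict(zip(LETTERS, shuffled))


rng = random.Random(2010)  # arbitrary seed so the example is repeatable
trials = [assign_letters(LOUDSPEAKERS, rng) for _ in range(4)]

for number, mapping in enumerate(trials, start=1):
    print("trial", number, mapping)
```

Since "A" in one trial is unlikely to be the same loudspeaker as "A" in the next, a listener cannot carry a rating over from a previous trial by remembering the letter.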

Results: High School Students Prefer More Accurate, Neutral Loudspeakers
When averaged across all listeners and programs, there was a moderate-to-strong preference for the Infinity Primus 362 loudspeaker over the other three choices (slide 25). In the results shown in the accompanying slide, as an industry courtesy, the brands of the competitors’ loudspeakers are simply identified as Loudspeakers B, C and D.
As a group, the listeners were not able to formulate preferences among the three lower rated loudspeakers B, C and D, which were all imperfect in different ways. For an untrained listener, sorting out these different types of imperfections and assigning consistent ratings can be a difficult task without practice and training [5].
The individual listener preferences (slide 26) reveal that 13 of the 18 listeners (72%) preferred the Infinity loudspeaker based on their ratings averaged across all programs and trials.
When comparing the students' rank ordering of the loudspeakers to that of the trained Harman listeners (slide 27), we see good agreement between the two groups. The one exception is Loudspeaker C, which the trained listeners strongly disliked. The general agreement between trained and untrained listener loudspeaker preferences illustrated in this test is consistent with previous studies where a different set of listeners and loudspeakers were used [5],[6]. As found in the previous study, the trained listeners, on average, rated each loudspeaker about 1.5 preference ratings lower than the untrained listeners, and the trained listeners were more discriminating and consistent in their ratings [5],[7].
The comprehensive set of anechoic measurements for each loudspeaker is compared to its preference rating (slide 28). There are clear visual correlations between the set of technical measurements and listeners’ loudspeaker preference ratings. The most preferred loudspeaker (Infinity Primus 362) had the flattest measured on-axis and listening window curves (top two curves), and the smoothest first reflection, sound power and first reflection/sound power directivity index curves (the third, fourth, fifth and sixth curves from the top). The other loudspeaker models tended to deviate from this ideal linear behavior, which resulted in lower preference ratings. Again, this relationship between loudspeaker preference and a linear frequency response is consistent with similar studies conducted by the author and Toole [9],[10].
Finally, good sound quality doesn't necessarily cost more money, as these experiments illustrate. The most accurate and preferred loudspeaker - the Infinity Primus 362 - was also the least expensive loudspeaker in the group at $500 a pair. It doesn't cost any more money to make a loudspeaker sound good than it costs to make it sound bad. In fact, the least accurate loudspeaker (Loudspeaker C) cost almost 8 times as much ($3,800) as the most accurate and preferred model. Sound quality can be achieved by paying close attention to the variables that scientific research says matter, and then applying good engineering design to optimize those variables at every product price point.

Conclusions
A group of 18 high school students participated in two double-blind listening tests that measured their sound quality preferences for music reproduced in lossy (MP3 @ 128 kbps) and lossless (CD quality) formats, as well as music reproduced through loudspeakers that varied in accuracy. In both tests, the high school students preferred the most accurate option, preferring CD over MP3, and the most accurate loudspeaker over the less accurate options.
While this study is still in its early phase, these preliminary results suggest that these teenagers can reliably discriminate among different degradations in sound quality in music reproduction. When given the opportunity to hear and compare different qualities of sound reproduction, the high school students preferred the higher quality, more accurate reproduction over the lower quality choices.
The audio industry should not discount the potential opportunities to provide a higher quality audio experience to members of Generation Y. The popular belief that they don’t care about or appreciate sound quality needs to be critically reexamined. This data suggests there are opportunities to sell good sounding audio products to Generation Y as long as the products hit the right features and price points. The audio industry should also provide these consumers the necessary education and information (i.e. meaningful performance specifications) to identify the good sounding products from the duds. Science can already do this (review slide 28); it’s simply a matter of making the information more widely available.

References
[1] Joseph Plambeck, “In Mobile Age, Sound Quality Steps Back,” New York Times, May 9, 2010.
[2] Andrew Edgecliffe-Johnson, “Could a Pair of Headphones Save the Music Business?” Financial Times, June 12 2010.
[3] Robert Capps, “The Good Enough Revolution: When Cheap and Simple Is Just Fine” Wired Magazine, August 24, 2009.
[4] Dale Dougherty, “The Sizzling Sound of Music,” O’Reilly Radar, March 1 2009.
[5] Nora Young, “Full Interview: Jonathan Berger on mp3s and ‘Sizzle’,” CBC Radio, March 24, 2009.
[6] The Loudness Wars: Why Music Sounds Worse, from All Things Considered, NPR Music, December 31, 2009.
[5] Sean E. Olive, "Differences in Performance and Preference of Trained Versus Untrained Listeners in Loudspeaker Tests: A Case Study," J. AES, Vol. 51, issue 9, pp. 806-825, September 2003. (download for free courtesy of Harman International).
[6] Sean Olive, “Part 1 - Do Untrained Listeners Prefer the Same Loudspeakers as Trained Listeners?” Audio Musings, December 26, 2008.
[7] Sean Olive, Part 2 - Differences in Performance of Trained Versus Untrained Listeners, Audio Musings, December 27, 2008.
[8] Sean Olive, “Part 3 - Relationship between Loudspeaker Measurements and Listener Preferences”, Audio Musings, December 28, 2008.
[9] Floyd E. Toole, "Loudspeaker Measurements and Their Relationship to Listener Preferences: Part 1," J. AES, Vol. 34, Issue 4, pp. 227-235, April 1986. (download for free courtesy of Harman International).
[10] Floyd E. Toole, "Loudspeaker Measurements and Their Relationship to Listener Preferences: Part 2," J. AES, Vol. 34, Issue 5, pp. 323-348, May 1986. (download for free courtesy of Harman International).

Saturday, May 1, 2010

Evaluating the Sound Quality of Ipod Music Stations: Part 3 Measurements



In Part 3 of this article, the acoustical measurements of three popular Ipod Music Stations (Harman Kardon MS100, Bose SoundDock 10 and Bowers & Wilkins Zeppelin) are examined to see if they corroborate listeners’ sound quality ratings of the products based on controlled double-blind listening tests. Part 2 summarized the results of those listening tests, and Part 1 described the listening test methodology used for this research.
Throughout this article, I will refer to some slides of a presentation that can be downloaded as a PDF or viewed as a YouTube video.
Mono or Stereo Acoustical Measurements?
There is a substantial body of scientific research on the subjective and objective testing of conventional stereo loudspeakers [1]-[5]. Unfortunately, the same is not true for Ipod Music Stations: this raises several research questions about how they should be evaluated and measured.
The first important question is whether the acoustical measurements should be done in mono or stereo. Due to the proximity of the left and right channel transducer arrays in Music Stations, there is the potential for constructive and destructive interference when both channels are active, which will vary according to frequency and the relative inter-channel levels and phases of the music signals. To study this phenomenon, the left and right channels were measured and analyzed as both single and combined channels. Generally, we found very little difference in the frequency responses (magnitude and phase) of the left and right channels. Combining the two channels only led to the expected 6 dB increase in sound pressure level (SPL).
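The expected 6 dB figure follows from simple level arithmetic, sketched below. The sketch assumes the two channels carry identical signals that arrive in phase and at equal level at the measurement point, so their sound pressures add; the function name `level_change_db` is illustrative. The uncorrelated case, where powers add instead of pressures, is included only for contrast.

```python
import math


def level_change_db(n_sources, coherent=True):
    """Level increase in dB when n equal-level sources are combined.

    coherent=True:  identical in-phase signals, pressures add (20 log10 n).
    coherent=False: uncorrelated signals, powers add (10 log10 n).
    """
    if coherent:
        return 20 * math.log10(n_sources)
    return 10 * math.log10(n_sources)


print(round(level_change_db(2), 2))                  # coherent sum of two channels
print(round(level_change_db(2, coherent=False), 2))  # uncorrelated sum, for contrast
```

Two coherent channels give about 6.02 dB and two uncorrelated channels about 3.01 dB, so a measured increase close to 6 dB is consistent with the left and right channels having nearly identical magnitude and phase responses.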
Anechoic Measurements of the Music Stations
Each Music Station was measured at a distance of 2 meters in the large anechoic chamber at Harman International. The chamber is anechoic down to 60 Hz and this is extended to 20 Hz through a calibration procedure. Each Music Station was subjected to the same battery of measurements used for designing and testing Revel, Infinity and JBL home loudspeakers. A total of 70 frequency response measurements were taken at 10-degree increments in both horizontal and vertical orbits (slide 4). These measurements were then spatially averaged and weighted to characterize the direct, early and late reflected sounds in a typical listening room, in addition to the calculated directivity indices (slides 5-8).
The family of measurement curves (slide 9) reveals significant differences among the three Music Stations in terms of their smoothness and low frequency extension below 70 Hz.
Music Station A has the smoothest frequency response across the family of curves, which corroborates listeners’ comments about its neutral sound and absence of colorations (see slide 11 of Part 2). There is also physical evidence in the measurements that explains listener comments about Music Station A sounding a bit bright and thin, due to a combination of the upward spectral tilt in its listening window curve, and its higher low frequency cutoff.
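One way to arrive at 70 measurements, and one simple form of spatial averaging, is sketched below. Both parts rest on assumptions not stated in the article: that the front (0 degree) and rear (180 degree) directions are common to the two orbits and so are counted once, and that curves are averaged as unweighted mean-square pressure at each frequency. The weighting actually applied to the direct, early and late reflected sound curves is not described here, and the names `orbit_angles`, `direction` and `power_average_db` are illustrative.

```python
import math


def orbit_angles(step=10):
    """Angles of one full orbit in degrees: -170, -160, ..., 170, 180."""
    return list(range(-180 + step, 181, step))


def direction(orbit, angle):
    """Identify a measurement direction; 0 and 180 degrees lie on both orbits."""
    if angle in (0, 180):
        return ("shared", angle)
    return (orbit, angle)


points = {direction(orbit, a) for orbit in ("H", "V") for a in orbit_angles()}
print("angles per orbit:", len(orbit_angles()))
print("unique measurement points:", len(points))


def power_average_db(levels_db):
    """Average levels (dB) at one frequency as mean-square pressure."""
    mean_power = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_power)


# Hypothetical levels at a single frequency for three neighbouring angles.
print(round(power_average_db([80.0, 79.0, 77.5]), 2))
```

Each orbit contributes 36 angles, and removing the two shared directions leaves 36 + 36 - 2 = 70 unique points. Averaging in the power domain instead of averaging decibel values directly keeps a deep, narrow off-axis dip from dominating the averaged curve.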
Music Station B has even more peaks and dips in the curves that contribute to the higher frequency of listener comments regarding audible coloration. Particularly problematic is the large broad resonance at 500 Hz that is visible in both the direct and reflected sounds produced by the product. However, there is nothing in the measurements to explain listeners’ complaints about its boomy bass.
Music Station C clearly has the least tidy set of measurement curves with a significant hole centered at 2 kHz in the on-axis curve. There are visible resonances in the measurements that elicited frequent listener comments about “midrange unevenness” and “coloration.” Finally, the sound power response and directivity indices reveal that this Music Station becomes increasingly directional at higher frequencies compared to its competitors. This could contribute to coloration and dullness at off-axis listening positions and at further listening distances.
Relationship between Anechoic Measurements and Listener Preference
The anechoic measurements of the Music Stations are shown again in Slide 10 along with the listener preference ratings. From this, we see that the overall smoothness of the family of curves appeared to be an important underlying factor that influenced listeners’ Music Station preference ratings.
Correlations Between Anechoic Measurements and Perceived Spectral Balance: The Direct Sound Influences the Perceived Spectral Balance Above 300 Hz
There has been a 30+ year debate in the audio community regarding which set of acoustical measurements best predicts a loudspeaker’s perceived sound quality in a typical listening room. There are several different camps that include the direct sound response advocates, the sound power response advocates, the in-room measurement advocates, and others, like myself, who argue that you need a combination of all of the above measurements to accurately predict how the loudspeakers will sound in a room.
One way to tackle this debate is to study the correlation between different loudspeaker measurements and listeners’ perceived spectral balance of the loudspeakers in a room. Slide 11 shows the perceived spectral balance ratings of the Music Stations versus the family of anechoic curves that include the listening window (direct sound), first reflections and sound power response.
For Music Station A, there is good agreement between the perceived spectral balance and the listening window curve, which represents the direct sound over a ± 30 degree horizontal angle. For Music Station B, there is generally poor agreement: listeners complained about boomy bass, yet there is nothing in these measurements to suggest why. There is clearly some information missing in the anechoic measurements and/or perhaps the subjective ratings are faulty. We will come back to this topic later.
For Music Station C, there is good agreement between the perceived spectral balance and the listening window curve (direct sound), with indications that the resonances centered at 1.5 and 3.5 kHz were heard and registered by the listeners.
In summary, it seems that for at least two of the Music Stations, the perceived spectral balance can be approximated by looking at the listening window curves that represent the direct sound. However, the anechoic measurements lack the information needed to explain perceptual effects below 300 Hz.
In-Room Measurements of the Music Stations
Below about 300 Hz, the room acoustics and the Music Station/listener positions can have a significant influence on the perceived quality of reproduced sound. Yet, these physical effects are not captured in the anechoic measurements described in the previous section. To further examine these effects, steady-state frequency response measurements of the Music Stations were taken at the primary listening seat at 6 different microphone positions, and then spatially averaged to remove highly localized acoustical interference effects (slide 12). The 1/6-octave smoothed curves for each Music Station are shown in slide 13. Below 200 Hz, there is evidence of room resonances (high Q peaks and dips) and boundary effects that were absent in the previous anechoic measurements (slide 9). Music Station A had less apparent boundary gain than the other two products, probably because the boundary effect was accounted for in its design.
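Fractional-octave smoothing replaces each point on the curve with an average over a band centered on that frequency, so narrow peaks and dips are reduced while the broad trend is kept. The Python sketch below shows one simple way to do this, assuming a rectangular window and a power average. The smoothing algorithm actually used for the slides is not specified, so treat the function and its test data as hypothetical.

```python
import math

def smooth_fractional_octave(freqs_hz, levels_db, fraction=6):
    """Smooth a response with a rectangular 1/fraction-octave window.

    Each output point is the power average of all input points that lie
    within half a band either side of the center frequency.
    """
    half_band = 2 ** (1 / (2 * fraction))  # frequency ratio for half the bandwidth
    smoothed = []
    for f0 in freqs_hz:
        lo, hi = f0 / half_band, f0 * half_band
        powers = [10 ** (lvl / 10)
                  for f, lvl in zip(freqs_hz, levels_db) if lo <= f <= hi]
        smoothed.append(10 * math.log10(sum(powers) / len(powers)))
    return smoothed

# Hypothetical data: flat 70 dB response with one narrow 10 dB peak at 200 Hz,
# sampled at 1/24-octave spacing from 100 Hz to 400 Hz
freqs = [100 * 2 ** (k / 24) for k in range(49)]
levels = [70.0] * 49
levels[24] = 80.0

sixth_octave = smooth_fractional_octave(freqs, levels, fraction=6)
one_octave = smooth_fractional_octave(freqs, levels, fraction=1)
print(round(sixth_octave[24], 1), round(one_octave[24], 1))
```

The wider the window, the more a narrow resonance is flattened, which is why heavier smoothing suits comparisons with coarse subjective ratings but hides high-Q room modes.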

Correlation Between In-Room Measurements and Perceived Spectral Balance: The Influence of Room and Boundary Effects Below 300 Hz
The in-room measurements are plotted in slide 13 along with listeners’ perceived spectral balance ratings. Here, the in-room measurements have been super-smoothed (1-octave) to better correspond to the frequency resolution of the subjective ratings.
Below 300 Hz, there is better agreement between the in-room measurements and listeners’ spectral ratings than observed using the anechoic measurements (slide 11). However, above 300 Hz, there is generally better agreement between the anechoic measurement and spectral ratings, particularly using the listening window curve that represents the direct sound. This confirms the important role that the direct sound plays in our perception of reproduced sound. Below 300 Hz, the room’s standing waves and boundary effects play a dominant role in the quality and quantity of bass we hear. Previous studies [5] have shown bass quality accounts for 30% of listener preference, and cannot be ignored.
Dynamic Compression Measurements
Our scientific understanding of the perception and measurement of nonlinear distortions in loudspeakers is still quite poor. There are currently no standard loudspeaker measurements that adequately capture the perceptual significance of dynamic compression and the associated distortions it produces. This is an area of audio that is in need of more research.
Listeners reported that Music Station A had fewer audible nonlinear distortions than the other two Music Stations. However, it was not clear if the distortions were real or due to a cognitive bias known as the “Halo effect.” Examining the objective distortion measurements will hopefully clarify what is real and not real.
The dynamic linearity of the Music Stations was tested by measuring their anechoic frequency response at different playback SPLs from 76 to 100 dB SPL (@ 1 meter distance) in 6 dB increments. A relatively short 4 s log sweep was used as a test signal to minimize the thermal effects on the transducers. Consequently, the measured dynamic compressions shown below were largely related to the behavior of the electronic limiters in the Music Stations, designed to prevent the amplifier from clipping, which could otherwise potentially damage the transducers.
Slide 16 shows the dynamic compression for each Music Station. The frequency responses measured at 82, 88, 94 and 100 dB SPL have been normalized to the 76 dB measurement. Any dynamic compression effects would be exhibited as a deviation from 0 dB. In examining these graphs, Music Station A produced 6 dB more output (100 dB @ 1 meter) than the other Music Stations without significant compression effects.
On the surface, the relationship between these measurements and listeners’ distortion ratings seems to be straightforward: the Music Stations with the higher amounts of compression received lower distortion ratings (slide 17). However, the SPLs at which the compression effects occurred (> 94 dB) were higher than those used in the listening test.
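The normalization can be sketched in a few lines of Python. The sketch assumes that "normalized" means subtracting both the 76 dB reference curve and the nominal increase in drive level, so that a perfectly linear device reads 0 dB at every frequency. The function name and the response values are hypothetical.

```python
def compression_db(reference_db, measured_db, level_step_db):
    """Deviation from linear behavior at each frequency point.

    reference_db:  response measured at the low reference level (76 dB here)
    measured_db:   response measured at the higher drive level
    level_step_db: nominal increase in drive level between the two runs
    A result of 0 dB means no compression; negative values mean the
    output rose by less than the input did.
    """
    return [m - r - level_step_db for r, m in zip(reference_db, measured_db)]

# Hypothetical responses at three frequency points (low, mid, high)
ref_76 = [76.0, 76.0, 76.0]
meas_100 = [97.0, 100.0, 99.5]  # the low-frequency point falls 3 dB short

deviation = compression_db(ref_76, meas_100, level_step_db=100 - 76)
print(deviation)  # [-3.0, 0.0, -0.5]
```

Plotting this deviation for each drive level gives a set of curves that lie on 0 dB until a limiter or the transducers begin to compress.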

Harmonic Distortion Measurements
Harmonic distortion (second and third harmonic only) measurements were made in the anechoic chamber at an SPL of 95 dB. The distortion levels of the harmonics are plotted along with the fundamental for each of the Music Stations in slide 18. Note that the levels of the harmonics have been raised 20 dB for the sake of clarity.
All of the Music Stations exhibited relatively high distortion at low frequencies below 100 Hz, with generally less harmonic distortion at higher frequencies. Music Station B differentiated itself by having higher levels of second and third harmonic distortion between 100 Hz and 1 kHz. Music Station C had the lowest distortion even though it received the lowest preference and distortion ratings from the listeners.
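When reading a plot like this, the 20 dB display offset has to be removed before the harmonic level is compared with the fundamental. The Python sketch below converts a plotted harmonic level into a distortion percentage. The function name and the example levels are illustrative assumptions, not values taken from slide 18.

```python
def distortion_percent(fundamental_db, plotted_harmonic_db, display_offset_db=20.0):
    """Harmonic level as a percentage of the fundamental.

    plotted_harmonic_db is the level as drawn on the graph, which has been
    raised by display_offset_db for visibility.
    """
    true_harmonic_db = plotted_harmonic_db - display_offset_db
    # A level difference in dB maps to a pressure ratio of 10^(dB/20)
    return 100 * 10 ** ((true_harmonic_db - fundamental_db) / 20)

# Hypothetical reading: fundamental at 95 dB, harmonic drawn at 75 dB.
# The true harmonic level is 55 dB, i.e. 40 dB below the fundamental.
pct = distortion_percent(95.0, 75.0)
print(f"{pct:.2f} %")  # 1.00 %
```

Every further 20 dB of separation between the fundamental and a harmonic reduces the percentage by a factor of ten.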
In conclusion, the harmonic distortion measurements of the Music Stations are not particularly good at predicting listeners’ distortion ratings, or overall preference in sound quality. This confirms many previous loudspeaker studies that have reported that harmonic distortion measurements are poor predictors of listeners’ overall impression of the loudspeaker. This can be explained by the fact that the distortions are often below the threshold of audibility, and the measurements themselves do not account for the masking properties of human hearing.

Conclusions
This article has shown evidence that a combination of comprehensive anechoic and in-room measurements can help explain listeners’ preferences and spectral balance ratings of the Music Stations evaluated in controlled listening tests.
Above 300 Hz, the anechoic derived listening window curve correlated well with listeners’ spectral balance ratings, whereas the in-room measurements better explained the Music Station’s acoustical interactions with the room below 300 Hz. In these particular tests, the overall smoothness of the on and off-axis frequency response curves provided the best overall indicator of listeners’ preferences and their comments.
Dynamic compression measurements revealed significant differences among the Music Stations in terms of their linear SPL output capability. The most preferred Music Station could play 6 dB louder (100 dB SPL @ 1 meter) than the other units without exhibiting significant dynamic compression. It is unlikely that this was a factor in the listening tests since the units were evaluated at a comfortable average level of 78 dB (B-weighted, slow). Finally, distortion measurements revealed some differences among the products but had no clear correlation with listeners’ sound quality ratings. This highlights the need for further research into the perception and measurement of nonlinear distortion in loudspeakers so that loudspeaker engineers can optimize their designs using psychoacoustic criteria.
References
[1] Floyd E. Toole, "Loudspeaker Measurements and Their Relationship to Listener Preferences: Part 1," J. AES, Vol. 34, Issue 4, pp. 227-235, April 1986. (download for free courtesy of Harman International).
[2] Floyd E. Toole, "Loudspeaker Measurements and Their Relationship to Listener Preferences: Part 2," J. AES, Vol. 34, Issue 5, pp. 323-348, May 1986. (download for free courtesy of Harman International).
[3] W. Klippel, "Multidimensional Relationship between Subjective Listening Impression and Objective Loudspeaker Parameters", Acustica 70, Heft 1, S. 45 - 54, (1990).
[4] Sean E. Olive, “A Multiple Regression Model for Predicting Loudspeaker Preference Using Objective Measurements: Part I - Listening Test Results,” presented at the 116th AES Convention, preprint 6113 (May 2004).
[5] Sean E. Olive, “A Multiple Regression Model for Predicting Loudspeaker Preference Using Objective Measurements: Part 2 - Development of the Model,” presented at the 117th AES Convention, preprint 6190 (October 2004).