Audio Musings by Sean Olive: June 2010

Sound quality in mainstream music recording and reproduction is all but dead, at least according to the media reports published over the past year [1]-[6]. On the music production side, music quantity (as in volume and decibels) matters more than quality and dynamic range. Record executives and producers are forcing artists to squash the dynamics and life from their music in order to have the loudest record on the charts [5],[6]. Listening to one of these albums can induce an instant migraine, making you wonder if the record companies aren’t secretly owned by the makers of Excedrin (see slides 2-4 in this article’s accompanying PDF slide presentation or this YouTube video).

On the music reproduction side, convenience, portability and low cost are the factors driving purchases in this Mobile-iPod Age of entertainment; sound quality need only be “good enough” [3]. The problem is that no one seems to be able to define what “good enough” sound quality means for Generation Y. Given that they represent the largest and youngest demographic in terms of music and audio equipment consumption, it's important to understand the attitudes and tastes of these twenty-somethings before it's too late. Getting these Millennials hooked on good sound now means they're more likely to upgrade the audio systems in the homes and automobiles they acquire as they grow older and wealthier.

A common belief being spread by the media is that Generation Y is indifferent to sound quality, or worse, that they prefer the tinny, sizzling sound of low-bit-rate MP3 over higher quality lossless music formats (slide 4). This is based on an informal study conducted by a Stanford music professor, Jonathan Berger, who over a seven-year period found his students increasingly preferred music coded in lower quality lossy MP3 formats over higher quality lossless music formats [1]-[5]. “I think our human ears are fickle,” says Berger. “What’s considered good or bad sound changes over time. Abnormality can become a feature” [1].

Although Berger’s unpublished study raises more scientific questions than it answers, it has been widely reported by the media and has captured the attention of consumer and automotive audio marketing executives, who ultimately decide what level of sound quality is “good enough” for Generation Y (slides 4-7). There's an increased risk that sound quality may become the sacrificial lamb for products targeted at Millennials (they can’t tell the difference, after all), with the savings diverted to more salient “purchase drivers” such as industrial design, more features, advertising, and celebrity endorsements.

If someone doesn’t soon stand up for Generation Y and show some evidence that they care about sound quality, its death may become a self-fulfilling prophecy.

Some New Experiments on Generation Y Sound Quality Preferences For Music Reproduction

To this end, I recently conducted two listening experiments on a group of high school students (the younger half of Generation Y) to determine if their sound quality preferences in music reproduction were: a) consistent with those of older trained listeners used for product evaluation at Harman International, or alternatively b) indifferent or skewed towards preferring less accurate sound (slide 8).

Two research questions were asked in separate tests:

Do the students prefer the sound quality of lossy MP3 (128 kbps) music reproduction over the original lossless CD version?
Do the students prefer music reproduced through a more accurate loudspeaker given four different options that vary in accuracy and sound quality?

The students, who ranged in age from 15 to 18 years, were visiting Harman on a class field trip (slide 9). The listening tests and their results are summarized in the following sections.

Do High School Students Prefer Lossy MP3 Music Over Lossless CD-Quality Formats?

In the first double-blind listening experiment (slide 11), the students were presented with two versions of the same program selection, encoded as:

MP3 (Lame 3.97, version 2.3; constant bit rate @ 128 kbps). Note that this is a two-year-old MP3 encoder that may be more representative of what Berger used in his study.
CD - The original lossless CD-quality version (16-bit, 44.1 kHz).

After hearing the same music several times in both MP3 and CD formats, the listeners indicated on a scoresheet which one they preferred: A or B. They were also asked to indicate the magnitude of preference (slight/moderate/strong), and provide comments describing the differences in sound quality they heard.

Each listener completed a total of 12 trials: four different short program loops, each judged in three separate trials (slide 10). Three music programs and a recording of applause at a live concert were chosen based on their ability to reveal audible differences between the lossy and lossless formats. The applause provided listeners with a familiar acoustic signal that the author felt most listeners could easily judge based on its apparent naturalness.

The order of programs and MP3/CD formats was randomized by the listening test software to eliminate any order-related effects. Switching between A and B was performed by the test administrator via a custom Harman listening test software application. The listening test was conducted in the Harman International Reference Room, which provided a quiet and controlled acoustic environment typical of a domestic listening room. Listening was done through a high quality stereo playback system (JBL LSR 6336 with four JBL HB5000 subwoofers) calibrated at the listening locations. A comfortable playback level (on average 78 dB (B)) was used throughout the tests.
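The randomization described above can be illustrated with a minimal sketch. It is not the Harman listening test software; the function name, the program names and the seed are hypothetical, and it simply assumes that the 12 trials (four programs, three trials each) are shuffled and that the CD and MP3 versions are assigned to A and B at random in every trial.

```python
import random


def build_schedule(programs, repeats=3, seed=None):
    """Return a shuffled list of trials with a random A/B format assignment.

    Hypothetical helper: each program appears `repeats` times, the trial
    order is shuffled, and CD/MP3 are randomly mapped to the labels A and B.
    """
    rng = random.Random(seed)
    trials = [program for program in programs for _ in range(repeats)]
    rng.shuffle(trials)  # randomize the order of programs across trials

    schedule = []
    for program in trials:
        formats = ["CD", "MP3"]
        rng.shuffle(formats)  # randomize which format hides behind A and B
        schedule.append({"program": program, "A": formats[0], "B": formats[1]})
    return schedule


# Placeholder names: three music programs plus the applause recording.
programs = ["music_1", "music_2", "music_3", "applause"]
schedule = build_schedule(programs, repeats=3, seed=1)

for number, trial in enumerate(schedule, start=1):
    print(number, trial["program"], "A =", trial["A"], "B =", trial["B"])
```

Because neither the listener nor the scoresheet carries the format name, only the stored schedule can map each A/B choice back to CD or MP3 once the session is over.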

Two groups of nine listeners each participated in two separate listening sessions, which lasted about 30 minutes each.

Listening Test Results: Students Prefer Music in Lossless CD Versus MP3 Formats

When all 12 trials were tabulated across all listeners, the high school students preferred the lossless CD format over the MP3 version in 67% of the trials (slide 16). The CD format was preferred in 145 of 216 trials (p<0.001).
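As a rough check on these figures, the sketch below computes the share of trials won by CD and a one-sided exact binomial probability of seeing at least 145 CD choices in 216 trials if listeners were purely guessing. Treating all 216 trials as independent coin flips is a simplifying assumption (the trials are nested within 18 listeners), and the test actually used for the reported p value is not stated here.

```python
from math import comb


def binomial_upper_tail(successes, trials, chance=0.5):
    """Probability of at least `successes` hits in `trials` under `chance`."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


listeners = 18
trials_per_listener = 12
total_trials = listeners * trials_per_listener  # 216 trials in total
cd_preferred = 145

share = cd_preferred / total_trials
p_value = binomial_upper_tail(cd_preferred, total_trials)

print(f"CD preferred in {share:.0%} of trials")
print("p < 0.001:", p_value < 0.001)
```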

As expected, there were differences among individual students in their ability to formulate consistent preference choices (slide 17). Nearly 40% of the listeners chose CD often enough (at least 9 of 12 trials) to establish a statistically significant preference for CD (p <= 0.054). Only one of the 18 listeners preferred MP3 over CD (7 versus 5 trials), although the preference was not statistically significant (p = 0.19). The other listeners were guessing, inconsistent in their choices, or both. With additional training and trials, the performance of these listeners would likely improve.
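For a single listener with 12 trials, the chance probabilities can be worked out exactly. The sketch below is an illustration only: it assumes a 50/50 guessing model and reports both the probability of exactly k choices and the cumulative probability of k or more. The quoted values of 0.054 and 0.19 coincide with the exact-count probabilities for 9 of 12 and 7 of 12; which calculation was originally used is not stated, so that reading is an assumption.

```python
from fractions import Fraction
from math import comb


def point_probability(k, n):
    """Probability of exactly k of n choices when guessing at 50/50."""
    return Fraction(comb(n, k), 2**n)


def upper_tail(k, n):
    """Probability of k or more of n choices when guessing at 50/50."""
    return sum(point_probability(i, n) for i in range(k, n + 1))


n_trials = 12
for k in (9, 7):
    exact = float(point_probability(k, n_trials))
    tail = float(upper_tail(k, n_trials))
    print(f"{k} of {n_trials}: exactly = {exact:.3f}, at least = {tail:.3f}")
```

The cumulative probability is the more conservative of the two, so a listener sitting right at 9 of 12 is best read as borderline rather than clear-cut evidence of a preference.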

On average, the magnitude of preference for CD over MP3 was also stronger based on the frequency of responses assigned to the categories of preference: slight, moderate and strong preference (slide 18). When CD format was preferred, listeners assigned a proportionally higher number of moderate-to-strong responses compared to when MP3 was the preferred choice.

The preference for CD over MP3 formats was relatively independent of the program selection (slide 19). CD format was preferred for all four programs, with only a slight drop (68.5% to 63%) for program JW.

Finally, the comments given by the more consistent listeners (slide 20) reveal the nature of the audible differences between MP3 and CD. The CD version was often described as sounding more dynamic and brighter, with more impact on percussive sounds. MP3 versions of the programs were described as sounding duller and dynamically compressed, with swirling pitch-modulation artifacts on vocals and strings.

Do High School Students Prefer Neutral/Accurate Loudspeakers?

Given that the high school students preferred the higher quality music format (CD over MP3), would their taste for accurate sound reproduction hold true when evaluating different loudspeakers? To test this question, the students participated in a double-blind loudspeaker test where they rated four different loudspeakers on an 11-point preference scale. The preference scale had semantic differentials at every second interval defined as: 1 (really dislike), 3 (dislike), 5 (neutral), 7 (like) and 9 (really like). The relative distances in ratings between pairs of loudspeakers indicated the magnitude of preference: ≥ 2 points represent a strong preference, 1 point a moderate preference and ≤ 0.5 point a slight preference.
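The rule of thumb for reading rating differences can be written as a small helper. This is a sketch under stated assumptions: the function name and the example ratings are hypothetical, and because the text only defines the anchor points (2 or more points, 1 point, 0.5 point or less), the cut-offs used for in-between differences are an assumption.

```python
def preference_magnitude(rating_a, rating_b):
    """Label the size of the preference between two mean ratings.

    Assumed cut-offs: a gap of 2 or more points is "strong", a gap of at
    least 1 point is "moderate", and anything smaller is "slight".
    """
    gap = abs(rating_a - rating_b)
    if gap >= 2:
        return "strong"
    if gap >= 1:
        return "moderate"
    return "slight"


# Hypothetical mean ratings on the 11-point scale, for illustration only.
print(preference_magnitude(6.5, 4.0))
print(preference_magnitude(5.0, 4.0))
print(preference_magnitude(4.5, 4.0))
```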

The four loudspeakers were floor-standing models (slide 22): Infinity Primus 362 ($500 a pair), Polk Rti10 ($800), Klipsch RF35 ($600), and Martin Logan Vista ($3800). Each loudspeaker was installed on the automated speaker shuffler in Harman International’s Multichannel Listening Lab, which moves each loudspeaker into the same location whenever it is active. In this way, loudspeaker positional biases are removed from the test. Each loudspeaker was level-matched to within 0.1 dB at the primary listening location.

Listeners completed a series of four trials in which they could switch among the four loudspeakers a number of times before rating each one on the 11-point preference scale. Two different music programs were used, with two observations each. At the beginning of each trial, the computer randomly assigned the four letters (A, B, C, D) to the loudspeakers. This meant that the loudspeaker ratings in consecutive trials were more or less independent (slide 23).

Results: High School Students Prefer More Accurate, Neutral Loudspeakers

When averaged across all listeners and programs, there was a moderate-to-strong preference for the Infinity Primus 362 loudspeaker over the other three choices (slide 25). In the results shown in the accompanying slide, as an industry courtesy, the competitors’ loudspeakers are simply identified as Loudspeakers B, C and D.

As a group, the listeners were not able to formulate consistent preferences among the three lower rated loudspeakers B, C and D, which were all imperfect in different ways. For an untrained listener, sorting out these different types of imperfections and assigning consistent ratings can be difficult without practice and training [5].

The individual listener preferences (slide 26) reveal that 13 of the 18 listeners (72%) preferred the Infinity loudspeaker based on their ratings averaged across all programs and trials.

When comparing the students' rank ordering of the loudspeakers to that of the trained Harman listeners (slide 27), we see good agreement between the two groups. The one exception is Loudspeaker C, which the trained listeners strongly disliked. The general agreement between trained and untrained listener loudspeaker preferences illustrated in this test is consistent with previous studies in which different sets of listeners and loudspeakers were used [5],[6]. As found in the previous study, the trained listeners, on average, rated each loudspeaker about 1.5 points lower on the preference scale than the untrained listeners, and the trained listeners were more discriminating and consistent in their ratings [5],[7].

The comprehensive set of anechoic measurements for each loudspeaker is compared to its preference rating (slide 28). There are clear visual correlations between the set of technical measurements and listeners’ loudspeaker preference ratings. The most preferred loudspeaker (Infinity Primus 362) had the flattest measured on-axis and listening window curves (top two curves), and the smoothest first reflection, sound power and first reflection/sound power directivity index curves (the third, fourth, fifth and sixth curves from the top). The other loudspeaker models tended to deviate from this ideal linear behavior, which resulted in lower preference ratings. Again, this relationship between loudspeaker preference and a linear frequency response is consistent with similar studies conducted by the author and Toole [9],[10].

Finally, as these experiments illustrate, good sound quality doesn't necessarily cost more money. The most accurate and preferred loudspeaker - the Infinity Primus 362 - was also the least expensive loudspeaker in the group at $500 a pair. It doesn't cost any more money to make a loudspeaker sound good than it costs to make it sound bad. In fact, the least accurate loudspeaker (Loudspeaker C) cost almost 8 times as much ($3,800) as the most accurate and preferred model. Sound quality can be achieved by paying close attention to the variables that scientific research says matter, and then applying good engineering design to optimize those variables at every product price point.

Conclusions

A group of 18 high school students participated in two double-blind listening tests that measured their sound quality preferences for music reproduced in lossy (MP3 @ 128 kbps) and lossless (CD quality) formats, as well as music reproduced through loudspeakers that varied in accuracy. In both tests, the high school students preferred the most accurate option, preferring CD over MP3, and the most accurate loudspeaker over the less accurate options.

While this study is still in its early phase, these preliminary results suggest that these teenagers can reliably discriminate among different degradations in sound quality in music reproduction. When given the opportunity to hear and compare different qualities of sound reproduction, the high school students preferred the higher quality, more accurate reproduction over the lower quality choices.

The audio industry should not discount the potential opportunities to provide a higher quality audio experience to members of Generation Y. The popular belief that they don’t care about or appreciate sound quality needs to be critically reexamined. These data suggest there are opportunities to sell good sounding audio products to Generation Y as long as the products hit the right features and price points. The audio industry should also provide these consumers with the education and information they need (e.g. meaningful performance specifications) to tell the good sounding products from the duds. Science can already do this (review slide 28); it’s simply a matter of making the information more widely available.

References

[1] Joseph Plambeck, “In Mobile Age, Sound Quality Steps Back,” New York Times, May 9, 2010.

[2] Andrew Edgecliffe-Johnson, “Could a Pair of Headphones Save the Music Business?” Financial Times, June 12, 2010.

[3] Robert Capps, “The Good Enough Revolution: When Cheap and Simple Is Just Fine,” Wired Magazine, August 24, 2009.

[4] Dale Dougherty, “The Sizzling Sound of Music,” O’Reilly Radar, March 1, 2009.

[5] Nora Young, “Full Interview: Jonathan Berger on mp3s and ‘Sizzle’,” CBC Radio, March 24, 2009.

[6] “The Loudness Wars: Why Music Sounds Worse,” All Things Considered, NPR Music, December 31, 2009.

[5] Sean E. Olive, "Differences in Performance and Preference of Trained Versus Untrained Listeners in Loudspeaker Tests: A Case Study," J. AES, Vol. 51, issue 9, pp. 806-825, September 2003. (download for free courtesy of Harman International).

[6] Sean Olive, “Part 1 - Do Untrained Listeners Prefer the Same Loudspeakers as Trained Listeners?” Audio Musings, December 26, 2008.

[7] Sean Olive, “Part 2 - Differences in Performance of Trained Versus Untrained Listeners,” Audio Musings, December 27, 2008.

[8] Sean Olive, “Part 3 - Relationship between Loudspeaker Measurements and Listener Preferences,” Audio Musings, December 28, 2008.

[9] Floyd E. Toole, "Loudspeaker Measurements and Their Relationship to Listener Preferences: Part 1," J. AES, Vol. 34, issue 4, pp. 227-235, April 1986. (download for free courtesy of Harman International).

[10] Floyd E. Toole, "Loudspeaker Measurements and Their Relationship to Listener Preferences: Part 2," J. AES, Vol. 34, issue 5, pp. 323-348, May 1986. (download for free courtesy of Harman International).

Friday, June 18, 2010

Some New Evidence That Generation Y May Prefer Accurate Sound Reproduction