Audio Musings by Sean Olive: January 2009

Sunday, January 11, 2009

What Loudspeaker Specifications Are Relevant to Sound Quality?

This past week I attended the Association of Loudspeaker Manufacturing & Acoustics (ALMA) Winter Symposium in Las Vegas, where the theme was “Sound Quality in Loudspeaker Design and Manufacturing.” Over the course of 3 days there were presentations, round table discussions, and workshops from the industry’s leading experts focused on improving the sound quality of the loudspeaker. Ironically, the important question of whether these improvements matter to consumers wasn’t raised until the final hours of the symposium, in a panel discussion called “What loudspeaker specifications are relevant to perception?”

I was on the panel along with Steve Temme (Listen Inc.), Dr. Earl Geddes (GedLee), Laurie Fincham (THX), Mike Klasco (Menlo Scientific), and Dr. Floyd Toole (former VP Acoustic Engineering at Harman), who served as the panel moderator. Within about 30 minutes, a consensus was reached on the following points:

The perception of loudspeaker sound quality is dominated by linear distortions, which can be accurately quantified and predicted using a set of comprehensive anechoic frequency response measurements (see my previous posting here).
Both trained and untrained listeners tend to prefer the most accurate loudspeakers when measured under controlled double-blind listening conditions (see this article here).
The relationship between perception and measurement of nonlinear distortions is less well understood and needs further research. Popular specifications like Total Harmonic Distortion (THD) and Intermodulation Distortion (IM) do not accurately reflect the distortion’s audibility and effect on the perceived sound quality of the loudspeaker.
Current industry loudspeaker specifications are woefully inadequate in characterizing the sound quality of the loudspeaker. The commonly quoted “20 Hz - 20 kHz, ±3 dB” single-curve specification is a good example. Floyd Toole made the observation that there is more useful performance information on the side of a tire (see tire below) than on most loudspeaker spec sheets (see Floyd's new book "Sound Reproduction").

For the remaining hour, the discussion turned towards identifying the root cause of why loudspeaker performance specifications seem stuck in the Pleistocene Age, despite scientific advancements in loudspeaker psychoacoustics. Do consumers really care about loudspeaker sound quality? Or are they mostly satisfied with the status quo? Why do loudspeaker manufacturers continue to hide behind loudspeaker performance numbers that are mostly meaningless, and often misleading?

The evidence that consumers no longer care about sound quality is anecdotal, largely based on the recent down-market trend in consumer audio. Competition from digital cameras, flat panel video displays, MP3 players, computers, and GPS navigation devices has decimated the consumer's audio budget. This doesn't prove consumers care less about loudspeaker sound quality, only that there is less money available to purchase it. Marketing research studies indicate that sound quality remains an important factor in consumers' audio purchase decisions. Given the opportunity to hear different loudspeakers under controlled, unbiased listening conditions, consumers will tend to prefer the most accurate ones. Unfortunately, with the demise of the specialty audio dealer and the growth of internet-based sales, consumers rarely have the opportunity to audition different loudspeakers - even under the most biased and uncontrolled listening conditions. This is exactly why the industry needs to provide new loudspeaker specifications that accurately portray the perceived sound quality of the loudspeaker.

So why is the loudspeaker industry not moving more quickly towards this goal? In my view, complacency and fear are the major obstacles. The loudspeaker industry is very conservative and largely self-regulated. There are no regulatory agencies to force improvement, or even to check whether a product's quoted specifications match reality. Change will only occur as the result of competition, or pressure exerted by consumers, industry trade organizations (e.g. CEDIA, CEA) or consumer product testing organizations like Consumer Reports. The fear of adopting a new specification stems from the realization that a company could no longer hide behind the Emperor's new clothes (i.e. the current specifications). A perceptually relevant specification would clearly distinguish the good sounding loudspeakers from the truly mediocre ones. In the future, a perceptual-based specification like the one illustrated to the right could provide ratings on overall sound quality and on various timbral, spatial and dynamic attributes. The consumer could then choose a loudspeaker based on these measured attributes.

In conclusion, the available evidence suggests that consumers highly value sound quality when purchasing a loudspeaker, yet current loudspeaker specifications provide little guidance in this matter. It is time the loudspeaker industry grew up and realized this. Adopting a more perceptually meaningful loudspeaker specification would permit consumers to make smarter loudspeaker choices based on how the loudspeakers sound. This would better serve the interests of consumers and of loudspeaker manufacturers who view the sound quality of a loudspeaker as its most important selling feature.

Saturday, January 3, 2009

Why Consumer Reports' Loudspeaker Accuracy Scores Are Not Accurate

For over 35 years, Consumer Reports magazine recommended loudspeakers to consumers based on what many audio scientists believe to be a flawed loudspeaker test methodology. Each loudspeaker was assigned an accuracy score related to the "flatness" of its sound power response measured in 1/3-octave bands. Consumers Union (CU) - the organization behind Consumer Reports - asserted that the sound power best predicts how good the loudspeaker sounds in a typical listening room. Until recently, this assertion had never been formally tested or validated in a published scientific study.

In 2004, the author conducted a study designed to address the following question: "Does the CU loudspeaker model accurately predict listeners' loudspeaker preference ratings?" (see Reference 1). A sample of 13 different loudspeaker models reviewed in the August 2001 edition of Consumer Reports was selected for the study. Over the course of several months, the 13 loudspeakers were subjectively evaluated by a panel of trained listeners in a series of controlled, double-blind listening tests. Comparative judgments were made among different groups of four loudspeakers at a time, using four different music programs. Loudspeaker positional biases were eliminated via an automated speaker shuffler. To control loudspeaker context effects, a balanced test design was used so that each loudspeaker was compared against each of the other 12 loudspeaker models an equal number of times. This produced a total of 2,912 preference, distortion and spectral balance ratings, in addition to 2,138 comments.
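The paper's actual grouping schedule is not reproduced here, so the following is only a sketch of one way a balanced design for 13 loudspeakers in groups of four could be built. It assumes a cyclic balanced incomplete block design generated from the difference set {0, 1, 3, 9} modulo 13. The names `balanced_groups` and `pair_counts` are illustrative, not taken from the study. With this construction, every pair of loudspeakers shares a group exactly once across 13 groups.

```python
from collections import Counter
from itertools import combinations

NUM_SPEAKERS = 13
BASE_BLOCK = (0, 1, 3, 9)  # assumed difference set modulo 13 (not from the study)


def balanced_groups(n=NUM_SPEAKERS, base=BASE_BLOCK):
    """Return n groups made by cyclically shifting the base block modulo n."""
    return [tuple(sorted((x + shift) % n for x in base)) for shift in range(n)]


def pair_counts(groups):
    """Count how many groups contain each unordered pair of loudspeakers."""
    counts = Counter()
    for group in groups:
        counts.update(combinations(group, 2))
    return counts


groups = balanced_groups()
counts = pair_counts(groups)

for i, group in enumerate(groups, start=1):
    # Report loudspeakers as 1..13 rather than 0..12
    print(f"group {i:2d}: loudspeakers {[s + 1 for s in group]}")

print("distinct pairs covered:", len(counts))
print("times each pair is compared:", set(counts.values()))
```

Repeating the 13 groups with each music program and listener would keep the pairwise balance intact while multiplying the number of ratings.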

The above graph plots the mean listener loudspeaker preference rating and 95% confidence intervals (blue circles), and the corresponding CU predicted accuracy score (red squares) for each of the 13 loudspeakers. The agreement between the listener preference ratings and the CU accuracy scores is very poor; in fact, the correlation between the two sets of ratings is negative (r = -0.22) and not statistically significant (p = 0.46). The most preferred loudspeaker in the test group (loudspeaker 1) actually received the lowest CU accuracy score (76). Conversely, some of the least preferred loudspeakers (e.g. loudspeakers 9 and 10) received the highest CU accuracy scores. In conclusion, the CU accuracy scores do not accurately predict listeners' loudspeaker preference ratings. Since this study was published, CU has begun to reevaluate its loudspeaker testing methods. Hopefully, the new rating system will more accurately predict the perceived sound quality of loudspeakers in a typical listening room.
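As a rough check on what a correlation of this size means with only 13 loudspeakers, the sketch below converts r into a t statistic with n - 2 degrees of freedom and integrates the Student's t density numerically to get a two-tailed p-value. The use of a t-test is an assumption, since the post does not say which significance test was used, and `t_pdf` and `two_tailed_p` are illustrative helper names. With the rounded r of -0.22, the result should land in the neighbourhood of the reported p-value, far above the usual 0.05 threshold.

```python
import math


def t_pdf(x, df):
    """Student's t probability density with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(r, n, steps=20000):
    """Return (t, p) for a Pearson r from n pairs, testing against zero correlation."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule (steps must be even) for the density area between 0 and t
    h = t / steps
    area = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    # The density is symmetric, so the two tails hold 1 - 2 * area
    return t, 1 - 2 * area


t_stat, p_value = two_tailed_p(r=-0.22, n=13)
print(f"t = {t_stat:.2f} with 11 degrees of freedom, two-tailed p = {p_value:.2f}")
```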

In the next installment of this article, I will explain why the CU loudspeaker model failed to accurately predict listeners' loudspeaker preferences, and show some new models that work much better in this regard.

Updated 1/5/2009: Today, I was contacted by Consumer Reports, who informed me that since 2006 they have no longer published loudspeaker reviews based on the sound power model that I tested in 2004. I was told their new model for predicting loudspeaker sound quality uses a combination of sound power and other analytics to better characterize what the listener hears in a room. In this regard, it is similar to the predictive model I developed, which I will discuss in an upcoming blog posting.

References

[1] Sean E. Olive, "A Multiple Regression Model for Predicting Loudspeaker Preferences using Objective Measurements: Part 1 - Listening Test Results," presented at the 116th AES Convention, May 2004.

Thursday, January 1, 2009

A Video on How We Measure Loudspeaker Sound Quality at Harman International

Part of my job at Harman International involves participating in audio dealer training and press events. This involves a 1-2 day field trip to Harman's R&D labs in Northridge where the visitors experience first-hand the listener training process, and participate in a double-blind loudspeaker listening test. Visitors usually leave our labs with a heightened appreciation and respect for the scientific efforts behind the development and testing of new models of Revel, JBL, and Infinity loudspeakers.

A few years ago, Infinity commissioned a video known as "Infinity Academy", aimed at encapsulating the 1-2 day training event on a DVD. Chapter 6, the "Final Test," discusses listener training and the double-blind listening test, where trained listeners evaluate the Harman prototype loudspeaker against its best competitors. The goal is to achieve "best-in-class" performance, which is attained only when the prototype receives a preference rating higher than its best competitor. In the event that the loudspeaker fails on its first attempt, the listeners' feedback is used to re-engineer the loudspeaker, after which it is re-submitted for another listening test.

The picture to the right shows three loudspeakers on the automated speaker shuffler in the Multichannel Listening Lab. The shuffler brings each loudspeaker into the exact same position within 3 seconds, so that any loudspeaker positional biases are removed from the listening test.

Chapter 6 can be downloaded in MPEG-4 (H.264, 41 MB) or MPEG (84 MB) formats. All 6 chapters of the DVD are available here.