Showing posts with label Psychoacoustics. Show all posts

Saturday, May 1, 2010

Evaluating the Sound Quality of Ipod Music Stations: Part 3 Measurements



In Part 3 of this article, the acoustical measurements of three popular Ipod Music Stations (Harman Kardon MS100, Bose SoundDock 10 and Bowers & Wilkins Zeppelin) are examined to see if they corroborate listeners’ sound quality ratings of the products based on controlled double-blind listening tests. Part 2 summarized the results of those listening tests, and Part 1 described the listening test methodology used for this research.
Throughout this article, I will refer to some slides of a presentation that can be downloaded as a PDF or viewed as a YouTube video.
Mono or Stereo Acoustical Measurements?
There is a substantial body of scientific research on the subjective and objective testing of conventional stereo loudspeakers [1]-[5]. Unfortunately, the same is not true for Ipod Music Stations: this raises several research questions about how they should be evaluated and measured.
The first important question is whether the acoustical measurements should be done in mono or stereo. Due to the proximity of the left and right channel transducer arrays in Music Stations, there is the potential for constructive and destructive interference when both channels are active, which will vary with frequency and with the relative inter-channel levels and phases of the music signals. To study this phenomenon, the left and right channels were measured and analyzed as both single and combined channels. Generally, we found very little difference in the frequency responses (magnitude and phase) of the left and right channels. Combining the two channels only led to the expected 6 dB increase in sound pressure level (SPL).
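The 6 dB figure follows from adding two equal, in-phase sound pressures. The sketch below is only an illustration under idealized assumptions (two identical point-like sources, a single frequency, no room); the function name and parameters are hypothetical and are not part of the measurement procedure.

```python
import math

def combined_level_change_db(phase_deg: float = 0.0, level_diff_db: float = 0.0) -> float:
    """Level change (dB) when a second source is added to a first one.

    phase_deg: phase of the second source relative to the first.
    level_diff_db: level of the second source relative to the first.
    """
    # Pressure amplitude of the second source relative to the first.
    a2 = 10 ** (level_diff_db / 20)
    phi = math.radians(phase_deg)
    # Magnitude of the complex sum 1 + a2 * exp(j * phi).
    magnitude = math.hypot(1 + a2 * math.cos(phi), a2 * math.sin(phi))
    return 20 * math.log10(magnitude)

print(round(combined_level_change_db(), 2))             # equal level, in phase
print(round(combined_level_change_db(phase_deg=90), 2)) # equal level, 90 degrees apart
```

Equal, in-phase channels give 20·log10(2), or about 6 dB. As the relative phase grows the gain shrinks, which is the interference the mono versus stereo question is concerned with.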
Anechoic Measurements of the Music Stations
Each Music Station was measured at a distance of 2 meters in the large anechoic chamber at Harman International. The chamber is anechoic down to 60 Hz and this is extended to 20 Hz through a calibration procedure. Each Music Station was subjected to the same battery of measurements used for designing and testing Revel, Infinity and JBL home loudspeakers. A total of 70 frequency response measurements were taken at 10-degree increments in both horizontal and vertical orbits (slide 4). These measurements were then spatially averaged and weighted to characterize the direct, early and late reflected sounds in a typical listening room, in addition to the calculated directivity indices (slides 5-8).
The family of measurement curves (slide 9) reveals significant differences among the three Music Stations in terms of their smoothness and low frequency extension below 70 Hz.
Music Station A has the smoothest frequency response across the family of curves, which corroborates listeners’ comments about its neutral sound and absence of colorations (see slide 11 of Part 2). There is also physical evidence in the measurements that explains listener comments about Music Station A sounding a bit bright and thin: a combination of the upward spectral tilt in its listening window curve and its higher low frequency cutoff.
Music Station B has even more peaks and dips in the curves that contribute to the higher frequency of listener comments regarding audible coloration. Particularly problematic is the large broad resonance at 500 Hz that is visible in both the direct and reflected sounds produced by the product. However, there is nothing in the measurements to explain listeners’ complaints about its boomy bass.
Music Station C clearly has the least tidy set of measurement curves with a significant hole centered at 2 kHz in the on-axis curve. There are visible resonances in the measurements that elicited frequent listener comments about “midrange unevenness” and “coloration.” Finally, the sound power response and directivity indices reveal that this Music Station becomes increasingly directional at higher frequencies compared to its competitors. This could contribute to coloration and dullness at off-axis listening positions and at further listening distances.
Relationship between Anechoic Measurements and Listener Preference
The anechoic measurements of the Music Stations are shown again in Slide 10 along with the listener preference ratings. From this, we see that the overall smoothness of the family of curves appeared to be an important underlying factor that influenced listeners’ Music Station preference ratings.
Correlations Between Anechoic Measurements and Perceived Spectral Balance: The Direct Sound Influences the Perceived Spectral Balance Above 300 Hz
There has been a 30+ year debate in the audio community regarding which set of acoustical measurements best predicts a loudspeaker’s perceived sound quality in a typical listening room. There are several different camps that include the direct sound response advocates, the sound power response advocates, the in-room measurement advocates, and others, like myself, who argue that you need a combination of all of the above measurements to accurately predict how the loudspeakers will sound in a room.
One way to tackle this debate is to study the correlation between different loudspeaker measurements and listeners’ perceived spectral balance of the loudspeakers in a room. Slide 11 shows the perceived spectral balance ratings of the Music Stations versus the family of anechoic curves that include the listening window (direct sound), first reflections and sound power response.
For Music Station A, there is good agreement between the perceived spectral balance and the listening window curve, which represents the direct sound over a ± 30 degree horizontal angle. For Music Station B, there is generally poor agreement: listeners complained about boomy bass, yet there is nothing in these measurements to suggest why. There is clearly some information missing in the anechoic measurements and/or perhaps the subjective ratings are faulty. We will come back to this topic later.
For Music Station C, there is good agreement between the perceived spectral balance and the listening window curve (direct sound), with indications that the resonances centered at 1.5 and 3.5 kHz were heard and registered by the listeners.
In summary, it seems that for at least two of the Music Stations, the perceived spectral balance can be approximated by looking at the listening window curves that represent the direct sound. However, the anechoic measurements are missing information needed to explain perceptual effects below 300 Hz.
In-Room Measurements of the Music Stations
Below about 300 Hz, the room acoustics and the Music Station/listener positions can have a significant influence on the perceived quality of reproduced sound. Yet, these physical effects are not captured in the anechoic measurements described in the previous section. To further examine these effects, steady-state frequency response measurements of the Music Stations were taken at the primary listening seat at 6 different microphone positions, and then spatially averaged to remove highly localized acoustical interference effects (slide 12). The 1/6-octave smoothed curves for each Music Station are shown in slide 13. Below 200 Hz, there is evidence of room resonances (high Q peaks and dips) and boundary effects that were absent in the previous anechoic measurements (slide 9). Music Station A had less apparent boundary gain than the other two products, probably because the boundary effect was accounted for in its design.
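The averaging and smoothing steps described above can be sketched as follows. This is a minimal illustration, not the procedure used for the slides: it assumes the curves are magnitude responses in dB on a shared frequency grid, averages them on a power basis, and uses a simple rectangular smoothing window. The function names are hypothetical.

```python
import math

def spatial_average_db(curves_db):
    """Power-average several magnitude responses (in dB), point by point.

    curves_db: list of curves, one per microphone position, all the same length.
    """
    n = len(curves_db)
    return [
        10 * math.log10(sum(10 ** (curve[i] / 10) for curve in curves_db) / n)
        for i in range(len(curves_db[0]))
    ]

def smooth_fractional_octave(freqs_hz, levels_db, fraction=6):
    """Rectangular 1/fraction-octave smoothing, averaged on a power basis."""
    # Half-width of the smoothing band as a frequency ratio.
    half_band = 2 ** (1 / (2 * fraction))
    smoothed = []
    for f0 in freqs_hz:
        lo, hi = f0 / half_band, f0 * half_band
        # The band always contains f0 itself, so it is never empty.
        band = [10 ** (lvl / 10) for f, lvl in zip(freqs_hz, levels_db) if lo <= f <= hi]
        smoothed.append(10 * math.log10(sum(band) / len(band)))
    return smoothed
```

Averaging across positions reduces dips and peaks that exist at only one microphone location, while the fractional-octave window trades frequency resolution for readability. Setting `fraction=1` corresponds to the coarser 1-octave smoothing.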

Correlation Between In-Room Measurements and Perceived Spectral Balance: The Influence of Room and Boundary Effects Below 300 Hz
The in-room measurements are plotted in slide 13 along with listeners’ perceived spectral balance ratings. Here, the in-room measurements have been super-smoothed (1-octave) to better correspond to the frequency resolution of the subjective ratings.
Below 300 Hz, there is better agreement between the in-room measurements and listeners’ spectral ratings than observed using the anechoic measurements (slide 11). However, above 300 Hz, there is generally better agreement between the anechoic measurement and spectral ratings, particularly using the listening window curve that represents the direct sound. This confirms the important role that the direct sound plays in our perception of reproduced sound. Below 300 Hz, the room’s standing waves and boundary effects play a dominant role in the quality and quantity of bass we hear. Previous studies [5] have shown bass quality accounts for 30% of listener preference, and cannot be ignored.
Dynamic Compression Measurements
Our scientific understanding of the perception and measurement of nonlinear distortions in loudspeakers is still quite poor. There are currently no standard loudspeaker measurements that adequately capture the perceptual significance of dynamic compression and the associated distortions it produces. This is an area of audio that is in need of more research.
Listeners reported that Music Station A had fewer audible nonlinear distortions than the other two Music Stations. However, it was not clear if the distortions were real or due to a cognitive bias known as the “Halo effect.” Examining the objective distortion measurements will hopefully clarify what is real and not real.
The dynamic linearity of the Music Stations was tested by measuring their anechoic frequency response at different playback SPLs from 76 to 100 dB SPL (@ 1 meter distance) in 6 dB increments. A relatively short 4 s log sweep was used as a test signal to minimize the thermal effects on the transducers. Consequently, the measured dynamic compressions shown below were largely related to the behavior of the electronic limiters in the Music Stations, designed to prevent amplifier clipping, which could otherwise potentially damage the transducers.
Slide 16 shows the dynamic compression for each Music Station. The frequency responses measured at 82, 88, 94 and 100 dB SPL have been normalized to the 76 dB measurement. Any dynamic compression effects would be exhibited as a deviation from 0 dB. In examining these graphs, Music Station A produced 6 dB more output (100 dB @ 1 meter) than the other Music Stations without significant compression effects.
On the surface, the relationship between these measurements and listeners’ distortion ratings seems to be straightforward: the Music Stations with the higher amounts of compression received lower distortion ratings (slide 17). However, the SPLs at which the compression effects occurred (> 94 dB) were higher than those used in the listening test.
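The normalization can be illustrated with a short sketch. It assumes that each higher-level curve has both the 76 dB reference curve and the nominal increase in drive level subtracted from it, so that a perfectly linear device yields 0 dB at every frequency. The function name and the sample values are hypothetical and are not taken from the slides.

```python
def compression_db(reference_db, measured_db, drive_increase_db):
    """Deviation from linear behaviour at each frequency point.

    reference_db: response measured at the lowest level (e.g. 76 dB SPL).
    measured_db: response measured at a higher drive level.
    drive_increase_db: nominal level increase relative to the reference.
    Returns 0 dB where the device is linear, negative values where it compresses.
    """
    return [m - drive_increase_db - r for r, m in zip(reference_db, measured_db)]

# Hypothetical three-point responses: reference at 76 dB, drive raised by 24 dB.
reference = [76.0, 76.0, 76.0]
at_100_db = [100.0, 98.0, 97.0]
print(compression_db(reference, at_100_db, 24.0))  # 0 dB means no compression
```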

Harmonic Distortion Measurements
Harmonic distortion (second and third harmonic only) measurements were made in the anechoic chamber at an SPL of 95 dB. The distortion levels of the harmonics are plotted along with the fundamental for each of the Music Stations in slide 18. Note that the levels of the harmonics have been raised 20 dB for the sake of clarity.
All of the Music Stations exhibited relatively high distortion at low frequencies below 100 Hz, with generally less harmonic distortion at higher frequencies. Music Station B differentiated itself by having higher levels of second and third harmonic distortion between 100 Hz and 1 kHz. Music Station C had the lowest distortion even though it received the lowest preference and distortion ratings from the listeners.
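When reading plots of this kind, the 20 dB offset has to be removed before a harmonic is compared with the fundamental. The sketch below shows the arithmetic; the function name is hypothetical and the example values are made up for illustration, not read from slide 18.

```python
def distortion_percent(fundamental_db, plotted_harmonic_db, plot_offset_db=20.0):
    """Convert a plotted harmonic level into distortion as a percentage.

    plot_offset_db: amount by which the harmonic curve was raised on the plot.
    """
    # Undo the plotting offset, then express the harmonic relative to the fundamental.
    relative_db = (plotted_harmonic_db - plot_offset_db) - fundamental_db
    # Convert a pressure ratio in dB to a percentage.
    return 100.0 * 10 ** (relative_db / 20.0)

# Hypothetical reading: fundamental at 95 dB, harmonic curve plotted at 75 dB.
print(distortion_percent(95.0, 75.0))
```

In this made-up case the harmonic sits 40 dB below the fundamental once the offset is removed, which corresponds to 1% distortion.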
In conclusion, the harmonic distortion measurements of the Music Stations are not particularly good at predicting listeners’ distortion ratings, or overall preference in sound quality. This confirms many previous loudspeaker studies that have reported that harmonic distortion measurements are poor predictors of listeners’ overall impression of the loudspeaker. This can be explained by the fact that the distortions are often below the threshold of audibility, and the measurements themselves do not account for the masking properties of human hearing.

Conclusions
This article has shown evidence that a combination of comprehensive anechoic and in-room measurements can help explain listeners’ preferences and spectral balance ratings of the Music Stations evaluated in controlled listening tests.
Above 300 Hz, the anechoic derived listening window curve correlated well with listeners’ spectral balance ratings, whereas the in-room measurements better explained the Music Station’s acoustical interactions with the room below 300 Hz. In these particular tests, the overall smoothness of the on and off-axis frequency response curves provided the best overall indicator of listeners’ preferences and their comments.
Dynamic compression measurements revealed significant differences among the Music Stations in terms of their linear SPL output capability. The most preferred Music Station could play 6 dB louder (100 dB SPL @ 1 meter) than the other units without exhibiting significant dynamic compression. It is unlikely that this was a factor in the listening tests since the units were evaluated at a comfortable average level of 78 dB (B-weighted, slow). Finally, distortion measurements revealed some differences among the products but had no clear correlation with listeners’ sound quality ratings. This highlights the need for further research into the perception and measurement of nonlinear distortion in loudspeakers so that loudspeaker engineers can optimize their designs using psychoacoustic criteria.
References
[1] Floyd E. Toole, "Loudspeaker Measurements and Their Relationship to Listener Preferences: Part 1," J. AES, Vol. 34, Issue 4, pp. 227-235, April 1986. (download for free courtesy of Harman International).
[2] Floyd E. Toole, "Loudspeaker Measurements and Their Relationship to Listener Preferences: Part 2," J. AES, Vol. 34, Issue 5, pp. 323-348, May 1986. (download for free courtesy of Harman International).
[3] W. Klippel, "Multidimensional Relationship between Subjective Listening Impression and Objective Loudspeaker Parameters", Acustica 70, Heft 1, S. 45 - 54, (1990).
[4] Sean E. Olive, “A Multiple Regression Model for Predicting Loudspeaker Preference Using Objective Measurements: Part I - Listening Test Results,” presented at the 116th AES Convention, preprint 6113 (May 2004).
[5] Sean E. Olive, “A Multiple Regression Model for Predicting Loudspeaker Preference Using Objective Measurements: Part II - Development of the Model,” presented at the 117th AES Convention, preprint 6190 (October 2004).

Saturday, October 31, 2009

Audio's Circle of Confusion

Audio’s “Circle of Confusion” is a term coined by Floyd Toole [1] that describes the confusion that exists within the audio recording and reproduction chain due to the lack of a standardized, calibrated monitoring environment. Today, the circle of confusion remains the single largest obstacle in advancing the quality of audio recording and reproduction.

The circle of confusion is graphically illustrated in Figure 1. Music recordings are made with (1) microphones that are selected, processed, and mixed by (2) listening through professional loudspeakers, which are designed by (3) listening to recordings, which are (1) made with microphones that are selected, processed, and mixed by (2) listening through professional monitors...... you get the idea. Both the creation of the art (the recording) and its reproduction (the loudspeakers and room) are trapped in an interdependent circular relationship where the quality of one is dependent on the quality of the other. Since the playback chain and room through which recordings are monitored are not standardized, the quality of recordings remains highly variable.


Creating Music Recordings Through An Uncalibrated Instrument


A random sampling of one’s own music library will quickly confirm the variation in sound quality that exists among different music recordings. Apart from audible differences in dynamic range, spatial imagery, and noise and distortion, the spectral balance of recordings can vary dramatically in terms of their brightness and particularly, the quality and quantity of bass. The magnitude of these differences suggests that something other than variations in artistic judgment and good taste is at the root of this problem.


The most likely culprits are the loudspeakers and rooms through which the recordings were made. While there are many excellent professional near-field monitors in the marketplace today, there are no industry guidelines or standards to ensure that they are used. The lack of meaningful, perceptually relevant loudspeaker specifications makes the excellent loudspeakers difficult to identify and separate from the truly mediocre ones. To make matters worse, some misguided recording engineers monitor and tweak their recordings through low-fidelity loudspeakers thinking that this represents what the average consumer will hear. Since loudspeakers can be mediocre in an infinite number of ways, this practice only guarantees that the quality of the recording will be compromised when heard through good loudspeakers [1]. This is very counterproductive if we want to improve the quality and consistency of audio recording and reproduction.


Another significant source of variation in the recording process stems from acoustical interactions between the loudspeaker and the listening room [1]-[3]. Below 300-500 Hz, the placement of the loudspeaker and listener can cause >18 dB variations in the in-room response due to room resonances and the proximity of the loudspeaker to a room boundary.


Evidence of these acoustical interactions has been well documented in a survey of 164 professional recording studios where the same high-quality, factory-calibrated monitor was installed [4]. Figure 2 shows the distribution of in-room responses measured at the primary listening location where the recordings are monitored and mixed. The 1/3-octave smoothed curves show a reasonably tight ± 2.5 dB variation above 1 kHz. However, below 1 kHz, variation in the in-room response gets progressively worse at lower frequencies. Below 100 Hz, the in-room bass response can vary as much as 25 dB among the different control rooms! You needn’t look any further than here to understand why the quality and quantity of bass is so variable among the recordings in your music library.


Evaluating Loudspeakers When the Recording is a Nuisance Variable


Loudspeaker manufacturers are also trapped in the circle of confusion since music recordings are used by listening panels, audio reviewers, and consumers to ultimately judge the sound quality of the loudspeaker. The problem is that distortions in the recording cannot be easily separated from those produced by the loudspeaker. For example, a recording that is too bright can make a dull loudspeaker sound good, and an accurate loudspeaker sound too bright [5]. A review of the scientific literature on loudspeaker listening tests indicates that recordings are a serious nuisance variable that needs to be carefully selected and controlled in the experimental design and analysis of test results.


At Harman International, we try to minimize loudspeaker-program interactions in our loudspeaker listening tests by using well-recorded programs that are equally sensitive to distortions found in loudspeakers. Listeners become intimately familiar with the sonic idiosyncrasies of the different programs through extensive listener training and participation in formal tests. In each trial of a loudspeaker test, the listener can switch between different loudspeakers using the same program, which allows them to better separate the distortions in the program (which are constant), from the distortions in the loudspeaker.


Through 25+ years of well-controlled loudspeaker listening tests, scientists have identified the important loudspeaker parameters related to good sound, which can be quantified in a set of acoustical measurements [6],[7]. By applying some statistics to these measurements, listeners’ loudspeaker preferences can be predicted [8]. The bass performance of the loudspeaker alone accounts for 30% of the listener’s overall preference rating. Good bass is essential to our enjoyment of music, yet unfortunately it falls in the frequency range where loudspeakers and rooms are most variable (see Figure 2). Controlling the behavior of loudspeakers and rooms at low frequencies is essential to achieving a more consistent quality of audio recording and reproduction. Fortunately, there are technology solutions today that provide effective control of acoustical interactions between the loudspeaker and rooms.


Breaking the Circle of Confusion


As Toole points out in [1], the key in breaking the circle of confusion lies in the hands of the professional audio industry where the art is created. A meaningful standard that defined the quality and calibration of the loudspeaker and room would improve the quality and consistency of recordings. The same standard could then be applied to the playback of the recording in the consumer’s home or automobile. Finally, consumers would be able to hear the music as the artist intended.


References


[1] Floyd E. Toole, Sound Reproduction: The Acoustics and Psychoacoustics of Loudspeakers and Rooms, Focal press (July 2008).


[2] Floyd Toole, “Loudspeakers and Rooms: A Scientific Review,” J. Audio Eng. Soc., Vol. 54, No. 6, (2006 June). A free copy of this paper can be downloaded here


[3] Sean E. Olive and William Martens “Interaction Between Loudspeakers and Room Acoustics Influences Loudspeaker Preferences in Multichannel Audio Reproduction,” presented at the 123rd Convention of the AES, preprint 7196 (October 2007).


[4] Aki V. Mäkivirta and Christophe Anet, “The Quality of Professional Surround Audio Reproduction, A Survey Study,” 19th International AES Conference: Surround Sound - Techniques, Technology, and Perception (June 2001).


[3] Todd Welti and Allan Devantier, “Low-frequency Optimization Using Multiple Subwoofers,” J. Audio Eng. Soc., Vol. 54, No. 5, (May 2006). A free copy of this paper can be downloaded here


[4] Sean E. Olive, John Jackson, Allan Devantier, David Hunt, and Sean Hess, “The Subjective and Objective Evaluation of Room Correction Products,” presented at the 127th AES Convention, New York, preprint 7960 (October 2009).


[5] Sean E. Olive, “The Preservation of Timbre: Microphones, Loudspeakers, Sound Sources and Acoustical Spaces,” 8th International AES Conference: The Sound of Audio (May 1990).


[6] Floyd E. Toole, “Loudspeaker Measurements and Their Relationship to Listener Preferences: Part 1,” J. Audio Eng. Soc., Vol. 34, No. 4, pp. 227-235, (April 1986). A free copy of this paper can be downloaded here


[7] Floyd E. Toole, “Loudspeaker Measurements and Their Relationship to Listener Preferences: Part 2,” J. Audio Eng. Soc., Vol. 34, No. 5, pp. 323-348, (May 1986). A free copy of this paper can be downloaded here


[8] Sean E. Olive, “A Multiple Regression Model for Predicting Loudspeaker Preference Using Objective Measurements: Part II - Development of the Model,” presented at the 117th Convention of the AES, preprint 6190 (October 2004).


Tuesday, December 30, 2008

Sound Science - Loudspeaker R&D at Harman

The American artist Andy Warhol once said that everyone will eventually have their 15 minutes of fame. The closest I came was being on the cover of Test & Measurement magazine in November 2004. OK, admittedly T&M is not exactly People Magazine, but one or two pocket-protector-wearing test engineers may have noticed the cover while shopping for a new digital oscilloscope or multimeter.

The title of the article is "Sound Science: Musical tastes differ, but tests show that listeners respond with the consistency of spectrum analyzers to loudspeaker performance."

The article explains the science behind loudspeaker R&D at Harman International and is written in a very approachable style for the audio layperson. You can read it here. 

Tuesday, December 23, 2008

Welcome to My Blog on The Science of Sound Recording and Reproduction

This blog is concerned with all matters related to the quality of recorded and reproduced sound. Some of the topics I hope to cover in upcoming posts include recording technology, listening tests, loudspeakers, headphones, automotive audio, and acoustical interactions between loudspeakers and listening rooms.

I am an audio scientist by profession, and in matters related to sound quality, I prefer to draw conclusions based on hard scientific evidence gathered through properly controlled listening tests and meaningful objective measurements. Unfortunately, most of the audio industry doesn't operate this way. Why not? Quality subjective and objective measurements require significant investments in time, facilities, and expertise, whereas opinions on sound quality cost almost nothing. Sometimes you get what you pay for.

I'm particularly interested in the psychoacoustics of audio (i.e. the relationship between the human perception of sound and its measurement). Here, controlled listening tests play an important role since they permit scientists to make accurate, reliable and valid correlations between listeners' preferences and the variables being tested (e.g. different loudspeakers, room treatments, etc.). From these listening tests will hopefully emerge a set of measurement and design rules from which the audio chain can be consistently optimized to produce a quality listening experience.

I hope the reader will find this blog educational and entertaining.

Note: The above photograph shows a listener auditioning different loudspeakers in Harman International's Multichannel Listening Lab. Loudspeaker positional effects are controlled by an automated speaker mover that shuffles each loudspeaker into the same exact position within 3 seconds. During the test, an acoustically transparent but visually opaque curtain (shown in the up position here) is dropped in front of the loudspeakers so that the listener is not biased by visual factors such as loudspeaker size, brand, price, etc.