Friday, February 5, 2010

Evaluating the Sound Quality of iPod Music Stations: Part 1

For many consumers, an iPod Music Docking Station may be the primary audio device through which they experience most of their recorded music and infotainment. These ubiquitous devices offer a convenient, low-cost, portable and easy-to-use solution for enjoying an iPod through loudspeakers -- but what about their sound quality? What sonic compromises are made in order to achieve this level of convenience and portability? Do certain models or brands of iPod Music Stations offer better sound than others, and if so, how can consumers identify them? These are legitimate questions that consumers should be asking when purchasing an iPod Music Station. Unfortunately, the answers are not readily found.

Choosing an iPod Music Station based on sound quality is a daunting task for consumers. There are dozens of models to choose from, ranging in price from $80 to as much as $3000 for a model designed by Ferrari. Competent in-store demonstrations and reviews of these products are difficult to find, and the technical specifications on the packaging provide no clear indication of how good they sound. For traditional loudspeakers, it is already possible to quantify sound quality, but the audio industry continues to withhold this information from consumers. Without meaningful performance specifications in place, consumers cannot make sound purchase decisions, nor can manufacturers be easily held accountable for delivering products that sound “not good enough.”

This article describes a listening test method used at Harman International for evaluating the sound quality of Harman and competitors’ iPod Music Stations. The goal is to provide subjective ratings of iPod Music Stations that are accurate, reliable and scientifically valid. From this data, a set of technical performance specifications can be developed that quantify how good the products sound.

Designing Listening Tests For iPod Music Stations

Fortunately, there already exists a large body of scientific knowledge on how to design accurate, reliable and valid listening tests on loudspeakers. A key ingredient is careful control of listening test nuisance variables: psychological, electro-acoustical and experimental factors that are not directly related to the product(s) under test but that nonetheless influence and bias the results (see the figure below). Some of the more significant nuisance variable controls that should be in place, but are often ignored by audio manufacturers and reviewers, are:

  • Double-blind conditions (this removes the effects of sighted biases related to brand, price, etc.)
  • Trained listeners with normal hearing (trained listeners are up to 20 times more discriminating and reliable than untrained listeners, yet their overall sound quality preferences are similar to those of untrained listeners)
  • Quiet listening room with acoustics that are representative of average homes (important for hearing low-level sounds and the quality of the loudspeaker's off-axis radiated sounds)
  • Loudness matching between products (the perception of timbre, spatial and dynamic attributes is level-dependent)
  • Well-recorded music selections that reveal sound quality differences
  • Multiple comparisons among products, which are more discriminating and reliable than single-stimulus presentations

These nuisance variable controls are essential for obtaining accurate, reliable and valid sound quality ratings of iPod Music Stations.
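As a rough illustration of the loudness-matching control listed above, the sketch below computes the gain offset each product would need so that all of them play at a common level. It is only a sketch under stated assumptions: the article does not say how levels were matched, the function names and the example levels are hypothetical, and a plain RMS level is used here in place of any perceptually weighted loudness measure.

```python
import math


def rms_db(samples):
    """RMS level of a block of samples in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        raise ValueError("cannot compute a dB level for pure silence")
    return 10.0 * math.log10(mean_square)


def matching_gains_db(levels_db):
    """Gain in dB to apply to each product so all match the quietest one.

    Matching down to the quietest product means every gain is zero or
    negative, so no product is pushed closer to clipping.
    """
    reference = min(levels_db.values())
    return {name: reference - level for name, level in levels_db.items()}


# Hypothetical measured playback levels for three products (assumed values).
measured = {"station_1": -20.0, "station_2": -17.0, "station_3": -23.5}
gains = matching_gains_db(measured)
```

With the assumed levels above, the quietest product is left untouched and the other two are attenuated by the difference between their level and its level.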

Including the Acoustical Effects of the Wall and Desktop in the Listening Test

If audio products are not tested under conditions similar to those for which they were designed and intended to be used, the ecological validity (as well as the external validity) of the test may be compromised. In other words, the test results will be of little value or relevance to how the product is typically used in the real world.

Most iPod Music Stations are intended to be placed on a desktop surface or bookshelf located near a wall, which will cause acoustical reinforcement and cancellation at certain audio frequencies. Below 500 Hz, there will be a gradual increase in sound pressure level that, unless compensated for in the design of the product, can make vocals and bass instruments sound tubby and boomy. Diffraction effects or reflections from the desktop/bookshelf may also produce audible effects that should be included in the listening test. For these reasons, listening tests on iPod Music Stations are best done on a desktop/wall boundary.
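To give a feel for where wall-related cancellation can fall, here is a minimal sketch based on a textbook idealization, not on any measurement from these tests. It assumes a point source at a given distance in front of a rigid, perfectly reflecting wall and a distant listener directly in front: the reflection travels an extra two times that distance, so cancellation occurs where that extra path equals an odd number of half wavelengths. The function name and the 0.25 m example distance are hypothetical.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C


def cancellation_frequencies(distance_m, count=3):
    """First `count` cancellation frequencies (Hz) for a source in front of a wall.

    Idealized model: the wall reflection travels an extra 2 * distance_m,
    so nulls occur at odd multiples of c / (4 * distance_m).
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return [(2 * n + 1) * SPEED_OF_SOUND / (4.0 * distance_m) for n in range(count)]


# Assumed example: drivers 0.25 m in front of the wall.
nulls = cancellation_frequencies(0.25)
```

Real products and rooms will deviate from this, since drivers are not point sources, walls absorb some sound, and the desktop adds its own reflection, which is one reason the effect is best assessed by listening on an actual desktop/wall boundary.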

A Video On How We Evaluate the Sound Quality of iPod Docking Stations

The video shown at the top of the page illustrates how iPod Music Stations are currently evaluated in the Harman International Reference Listening Room. The acoustical properties and features of the room have been described in detail in a previous posting.

In the video you see a trained listener comparing three different iPod Music Stations situated on our automated in-wall speaker mover, which is configured with a removable shelf and desktop. An acoustically transparent, visually opaque screen is placed between the listener and the products under test so that the test is double-blind. (Note: the term double-blind implies that neither the listener nor the experimenter knows the identities of the products currently selected, since the computer controls and randomly assigns the letters A/B/C to the products in each trial.)
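The random assignment of letters to products can be sketched in a few lines. This is an illustrative sketch only, not the LTS implementation: the function name and product identifiers are hypothetical, and a seeded generator is used purely so the example is repeatable (a real test would not fix the seed).

```python
import random


def assign_blind_labels(products, rng):
    """Randomly map the letters A, B, C, ... onto the products for one trial."""
    labels = [chr(ord("A") + i) for i in range(len(products))]
    shuffled = list(products)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return dict(zip(labels, shuffled))


# Hypothetical product identifiers.
products = ["station_1", "station_2", "station_3"]
rng = random.Random(2010)  # fixed seed only to make the example repeatable
trial_labels = assign_blind_labels(products, rng)
```

Because the mapping is generated by the computer for each trial and shown to no one, neither the listener nor the experimenter can tell which product is behind a given letter.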

The listener can switch between the different products at will and enter their responses via a wireless PDA equipped with a custom listening test software (LTS) client application. Sound quality ratings are given on a number of pre-defined scales that include preference, spectral balance, distortion and auditory image size. This is repeated twice using four different programs.
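One way to picture this design is as a schedule of trials plus a simple summary of the ratings. The sketch below is an assumption-laden illustration: it reads "repeated twice using four different programs" as four programs times two repeats, giving eight trials, and the program names, product names and rating values are all hypothetical.

```python
import itertools
import random


def build_trial_schedule(programs, repeats, rng):
    """Return every program/repeat combination in random presentation order."""
    trials = [
        {"program": program, "repeat": repeat}
        for program, repeat in itertools.product(programs, range(1, repeats + 1))
    ]
    rng.shuffle(trials)
    return trials


def mean_ratings(responses):
    """Average the ratings per product from (product, rating) pairs."""
    by_product = {}
    for product, rating in responses:
        by_product.setdefault(product, []).append(rating)
    return {product: sum(values) / len(values) for product, values in by_product.items()}


# Hypothetical programs and preference ratings.
programs = ["program_1", "program_2", "program_3", "program_4"]
schedule = build_trial_schedule(programs, repeats=2, rng=random.Random(0))
summary = mean_ratings([("station_1", 6.0), ("station_1", 8.0), ("station_2", 4.0)])
```

Repeating each program lets the consistency of a listener's ratings be checked, which is part of what makes the ratings reliable.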

The PDA client communicates with the LTS server application, which provides the following functions:

  • a test wizard that defines all experimental design and setup parameters (perceptual scales, presentation of stimuli, program, randomization of test objects, playback level, etc.), which are then stored in a database
  • automation and administration of the listening test and its hardware (e.g. speaker mover, media player, DSP, audio switcher)
  • collection, storage and statistical analysis of listening test data
  • real-time monitoring of the listener’s performance and ratings during the test

LTS makes conducting listening tests an efficient and repeatable process by minimizing human interaction and errors in the listening test setup, storage, and analysis of the results.


This article has described a listening test method used for evaluating iPod Music Stations with the goal of providing accurate, reliable and valid sound quality ratings. In Part 2, I will show some results from a recent listening test conducted on different iPod Music Stations, followed in Part 3 by some acoustical measurements of the products. By studying the relationship between well-controlled scientific listening tests and comprehensive acoustical measurements of iPod Music Stations, a meaningful technical specification based on sound quality can be found.


  1. Very interesting. Ultimately, when I compare the sound quality of my iPhone through headphone out to my computer using Sennheiser HD650 headphones, the iPhone sounds absolutely awful, which begs the question: why bother with a home system based on a portable player?

  2. Hi Anonymous,

    For many people, an iPod is their only source of audio, for the reasons mentioned in the Wired article about sound quality being "Good Enough." I think many people accept the sound quality as being "good enough" because they simply don't know there are better-sounding options. The audio industry has failed to educate consumers that not all products sound the same -- even though the specifications on the packaging suggest they do.

    Of course, the sound quality of the iPod will depend on the bit rate and format (lossy versus uncompressed) used to encode the music on it. Most of these iPod Music Stations also accept analog inputs, which is what we used to test these products. Our music source was uncompressed audio fed from a high-quality digital sound card.


  3. The devil finds work for idle hands

    Since you spare no time and effort on these revealing listening tests of low quality entertainment gadgets, I think you should have listened to the samples of the music mostly played on such devices - pop charts. Could be that competitors voiced their devices for such highly compressed and equalized music material.

  4. Vulki,
    I have no knowledge of how our competitors voice and test their products. We design and test our products using well-recorded music, so that well-recorded music sounds good on our products, and poorly recorded music sounds the way it should.

    Our philosophy at Harman is "science in the service of art". The art is the music and recording, and our job is to accurately reproduce the art, which is best served using a scientific approach.

    If the art is flawed, it is not our place to editorialize it by designing products that attempt to put a band aid on it.


  5. Is it possible that the listening distance of your test is not representative of the typical idock user, and would influence the test findings? e.g. they might rank differently at a 1m listening distance.

    Also, I would like to see a benchmark, such as a well-measuring 2-way bookshelf speaker (even at a comparable price point, possibly even a small active unit sold for benchtop use with PC's and with good measurements) included in the test to see how the idocks compare to a valid and more traditional sonic alternative.

    P.S. finding your blog has been a godsend since my audio life was transformed by reading Toole's book. Please keep up the blogging and the accessibility.

    Grant S

  6. Sean,

    I am curious to know if you have a list of acoustic parameters that you have set for a good quality iPod docking station? It would seem, based on your past research (I've learned a lot from it), that most of the psycho-acoustic research data is known and the engineering challenge of a docking station would determine which compromises are acceptable. For example, in the case of existing docking stations, they have limited bass response, almost nonexistent stereo output due to the close proximity of the drivers, and poor passband power response due to most docking stations lacking a tweeter. So what's left, amplitude response in the mid band?

    I guess what I'm really trying to say is: why waste your time evaluating poor products? I have never heard a docking station that sounded decent. I think your time would be better spent designing your own docking station; I'm sure it would outperform any of those devices you are testing. After I finish my current loudspeaker project I will be building an iPod docking station for my bedroom. It will be an interesting challenge.


    Rob Collins

  7. Hi Rob:

    You're right: getting good sound from an iPod Music Station is much more of an engineering challenge than designing a $200 or $10,000 large home loudspeaker, where you don't have the same number of constraints on available amplifier power, size, weight, cost, size/number of transducers, mechanical resonances, air leaks, and box volume.

    As you point out, the trick is making the right set of compromises. In my view, that is where listening tests are needed the most, since they can provide engineering with very useful feedback that can be used to optimize the sound quality given those constraints.

    I think you might be surprised at how good some of the models can sound given the above engineering challenges and constraints. True, many of them don't sound good -- and it's important to differentiate those from the ones that do sound good.

  8. Hi Grant,

    We have done some tests where the Music Stations were tested with the listener seated at the desktop (about 3 feet away) as well as further away (10 feet). In this particular case, the results didn't significantly change, although results will vary depending on the directivity and the on- versus off-axis frequency response of the model.

    I like your suggestion of comparing this category of product to more conventional loudspeakers in order to establish what the sound quality differences are between the two product categories. If consumers were more aware of the sound quality differences between products, they would at least be in a better position to choose the level of sound quality they need.

    As it stands now, when you walk into Best Buy the audio specifications on the side of the box suggest that everything in the store sounds the same -- which we know not to be true.

  9. To the person who said the IPOD doesn't sound as good as a computer. Why don't you do a volume matched blind test with high bitrate mp3/aac and then see if you come to the same conclusion? Modern PMPs are very good and pretty much audibly transparent.

  10. @ Anonymous,
    I agree with you. Although we didn't use an iPod in these tests (we used the analog inputs of the Music Stations), this was mostly done to easily allow us to control and switch the input signals between the different Music Stations.

    Using an iPod with lossless or high-bitrate AAC-encoded music should produce the same test results as using the analog inputs.


  11. I have found that newer iPods have very good audio output jacks. I think it happened sometime around the first iPhone.

  12. I think there are two big problems with the approach of the Harman Listening Lab, when being used to assess an overall preference rating (as well as other performance parameters) for speakers:

    1. It assumes all the music used is a perfect, and somehow an objective, recording, mix and master. Which is pie in the sky.

    2. The "training" of listeners is a bias in action - towards listening for those cues which are also ranked as important by Harman designers. Selecting listeners based on independent listening experience and reliability is a much more preferable method.
