Moulton Laboratories
the art and science of sound
Whaddya Mean It’s Inaudible? Why, I Know A Guy…
Dave Moulton
January 2000

About audibility

For those of you who haven’t been reading my column over the past couple of months, I’ve been taking a look at the whole notion of “audibility” of very small audio differences: things such as the difference between a 16-bit audio signal and a 24-bit audio signal, or between a signal with a sampling rate of 44.1 kHz and one with a rate of 96 kHz.

While the proportional differences between those signals, in the digital realm, are huge (256:1 and 2.2:1, respectively), the actual magnitude of the perceivable differences (both physically and psychologically) is quite small. We’ll talk about that in a future column, but, for now, just take my word for it: we’re talking about some really small differences.
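
For reference, the proportional differences can be worked out directly. This is only a quick arithmetic sketch: eight extra bits multiply the number of available quantization levels by 2^8, and the sampling-rate ratio is simply 96 divided by 44.1. The variable names are mine, chosen for illustration.

```python
# Number of distinct quantization levels at each word length.
levels_16_bit = 2 ** 16
levels_24_bit = 2 ** 24

# How many times finer the 24-bit grid is than the 16-bit grid.
resolution_ratio = levels_24_bit // levels_16_bit

# Ratio of the two sampling rates, in samples per second.
sample_rate_ratio = 96_000 / 44_100

print(f"Resolution ratio:  {resolution_ratio}:1")
print(f"Sample-rate ratio: {sample_rate_ratio:.1f}:1")
```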

Meanwhile the debate rages on. As I write this, there is a thread on the listserv Sursound that can’t reconcile the widely varying reports of audibility of various signal resolutions. Blind testing is called “bad science” by some because it doesn’t reveal “audibility” that some people experience. Other writers argue that if you can’t hear it while listening to it “blind,” well, you can’t hear it, period, in spite of what you hear. Others say we just can’t objectively measure subjective phenomena like hearing, in any case. It goes on. Blah, blah, yada, yada.

From my standpoint, it’s all pretty sloppy. Descartes said, “Cogito, ergo sum” (“I think, therefore I am.”). Have we now sunk to “Cogito, ergo sum vero” (“I think, therefore I am right!”)?

The dictionary sez that the word “audible” refers to something that “can be heard.” That’s all well and good, but for our kind of inquiry, the devil is in the details. How we actually “measure” audibility turns out to be the actual definition of “what can be heard.” Here’s where it gets to be fun.

Traditionally, when we have measured something like the “threshold of audibility” (meaning “the softest sound that a human can detect”), we have gone out and measured the hearing of a bunch of people. We play them really soft sounds that get softer and softer and ask them to tell us when they can’t hear them anymore. From that, we determine something about the physical magnitudes of sounds that people can “just barely” hear, as well as sounds that they “just can’t quite” hear.

Now, not all people hear these things exactly the same way. Some will only hear somewhat louder sounds while others will manage to also hear somewhat softer sounds. We average this out in a common-sense way and usually select as our “nominal” physical threshold “the sound level that is the softest that 50% of the test population can hear.”

Now this sounds sweetly reasonable, and it has been the practice for quite a long time. We use a similar method for determining the “audibility” of lots of things. When we’d like to find out at what magnitude a given effect becomes audible, we do a similar sort of measurement and come up with a similar probabilistic answer, usually based on 50% of the test population.

The problem arises when we realize that even though 50% of the population CAN’T hear the magnitude we’ve determined is at the threshold, well, 50% of the population CAN, and some will hear it quite well, thank you. Therefore, effects that are at or slightly below our defined nominal threshold of audibility are probably going to be clearly audible for many of us.

Hence, effects that are “measured” to be “inaudible” are quite audible for some of us. It’s a simple matter of probability vs. semantics.
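
To make the 50% convention concrete, here is a minimal sketch. It assumes each test listener has a single personal threshold in dB SPL and hears any sound at or above it. The ten threshold values and both function names are invented for illustration; they are not measured data, and real threshold testing is considerably messier.

```python
import math

# Hypothetical individual thresholds (dB SPL) for ten test listeners.
# These numbers are made up for illustration only.
LISTENER_THRESHOLDS_DB = [-2.0, 0.5, 1.0, 3.0, 4.5, 5.0, 6.5, 8.0, 9.0, 12.0]


def nominal_threshold(thresholds_db, fraction=0.5):
    """Softest level that at least `fraction` of the listeners can hear."""
    ordered = sorted(thresholds_db)
    # A listener hears a level if it is at or above their own threshold,
    # so we look up the k-th most sensitive listener, k = fraction * n rounded up.
    k = max(1, math.ceil(fraction * len(ordered)))
    return ordered[k - 1]


def fraction_who_hear(thresholds_db, level_db):
    """Share of listeners whose own threshold is at or below `level_db`."""
    return sum(t <= level_db for t in thresholds_db) / len(thresholds_db)


nominal = nominal_threshold(LISTENER_THRESHOLDS_DB)
at_nominal = fraction_who_hear(LISTENER_THRESHOLDS_DB, nominal)
just_below = fraction_who_hear(LISTENER_THRESHOLDS_DB, nominal - 1.0)

print(f"Nominal (50%) threshold: {nominal} dB SPL")
print(f"Listeners who hear a sound at that level: {at_nominal:.0%}")
print(f"Listeners who hear a sound 1 dB below it: {just_below:.0%}")
```

With these invented numbers, a sound 1 dB below the nominal threshold would be labeled “inaudible,” yet four of the ten listeners still hear it.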

One correspondent, dissatisfied with this state of affairs, has suggested that a more appropriate defined “threshold” might be the magnitude at which an effect is heard by 1% of the test population. THAT’s a rigorous standard, he suggests. Aside from the question of whether or not there IS IN FACT a discrete 1% of the population that is innately “golden-eared,” such an effect is going to be “inaudible” for 99% of the population! We’re stuck with a worse version of the same semantic tangle: a so-called “audible” effect will be in fact “inaudible” for the population at large.
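
The same tangle can be sketched with a simulated population. The bell-curve shape, the 5 dB mean, the 4 dB spread and the helper names below are all assumptions of mine, picked only to show how a 1% criterion and a 50% criterion land on different levels; none of it comes from real hearing data.

```python
import random

random.seed(0)  # fixed seed so the sketch gives the same numbers every run

# Hypothetical population: 10,000 listeners whose individual thresholds
# follow a bell curve (mean 5 dB SPL, standard deviation 4 dB).
# Both figures are invented for illustration.
POPULATION_DB = sorted(random.gauss(5.0, 4.0) for _ in range(10_000))


def level_heard_by(percent):
    """Softest level heard by at least `percent` percent of the population."""
    n = len(POPULATION_DB)
    k = max(1, (percent * n + 99) // 100)  # listeners needed, rounded up
    return POPULATION_DB[k - 1]


def share_who_cannot_hear(level_db):
    """Fraction of listeners whose own threshold lies above `level_db`."""
    return sum(t > level_db for t in POPULATION_DB) / len(POPULATION_DB)


one_percent_level = level_heard_by(1)
fifty_percent_level = level_heard_by(50)

print(f"1% criterion:  {one_percent_level:.1f} dB SPL")
print(f"50% criterion: {fifty_percent_level:.1f} dB SPL")
print(f"Cannot hear the 1% level: {share_who_cannot_hear(one_percent_level):.0%}")
```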

There’s another problem as well, and that has to do with the nature of testing humans. When we conduct listening tests, there are three basic variables, all of which affect the outcome of the test. The two obvious variables are the device under test (called the DUT) and the test listener. The third variable is the test itself. As we vary the test (its type, environment, timing, mood, how we score it, etc.) we will affect the results just as surely as we will by changing the DUT or the test subject. Therefore, even tests that appear to be quite similar may yield quite different results, results that often appear to be incompatible. This is why informal tests, especially, get us so confused.

And finally, there’s the problem of the Range Rule. This rule (The Range of the Theory must fit the Range of the Facts) requires that if we’re going to make a scientifically valid statement about the threshold of audibility, that statement has to be limited to the range of our test conditions. If our data is all gathered in double-blind A/B tests, then our statements regarding audibility are limited to such tests, and may not be freely applied elsewhere. When the test results are cited, such qualifications are usually ignored. This isn’t bad science so much as bad reporting. Whatever we call it, there’s a lot of it going around these days.

Fortunately, all is not lost. We can infer some generalities from our findings. We can reasonably speculate that A/B tests tend to exaggerate findings of audibility. Therefore, if something is INAUDIBLE in an A/B format, it is probably going to be inaudible in less revealing environments. On the other side of the coin, blind testing tends to be less revealing than sighted testing (after issues of bias have been corrected for), so we can assume that if something is AUDIBLE in blind testing it is probably going to be audible in more revealing conditions. We can even speculate that these effects are reasonably offsetting, and that the results we get are “pretty close” to what we’ll find in tests that are neither blind nor A/B in character.

Finally, keep in mind that we use science as a predictive tool. We use science to predict “what will probably happen” in the future. The replicability of experiments in the scientific method is at the center of this predictability. We are less concerned, in science, with what happened in the past (that’s the study of history) than with what will happen in the future under given conditions.

When we make these very fuzzy measurements of audibility for things like 24-bit audio, we need to be more concerned with how they will be perceived in future situations than about their past effect. We need to predict how they are going to sound to future listeners in future environments. That is why probabilities like “50% of the population” are so useful. For much of our work, they are far better suited to prediction than probabilities of 1%.

It’s worth thinking about.

Next month, we’ll look at the psychological implications of audibility. Here’s where it really gets fuzzy. If you hear it but don’t think you hear it, do you really hear it? Hmmmmmm. Thanks for listening.

Note: The following group of columns that I wrote for TV Technology is an attempt on my part to describe some of the issues surrounding our attempts to measure and evaluate the audibility of high-resolution formats. Together, I think they make an excellent short survey of these issues. I hope you find them useful.