About My Last Article on Levels
In the May ’96 issue of Recording, we looked at the range of audio levels that are found in typical commercial recordings. This range turns out to be quite small, and electric instrument pop/rock recordings in particular have a range of about 20 dB, pushed right up against 0 dB Full Scale. This would suggest that the 96 dB dynamic range of a CD is largely wasted, and that the 120 dB of 20-bit recording is definitely overkill. Is that really true?
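For anyone who wants to check where the 96 dB and 120 dB figures come from, here is a minimal sketch. It assumes the usual idealization that each bit doubles the number of amplitude steps, and it ignores dither and noise shaping. The function name is just for illustration.

```python
import math

def bit_depth_to_db(bits: int) -> float:
    """Ratio in dB between full scale and a single quantization step."""
    # 2**bits amplitude steps; amplitude ratios use 20 * log10.
    return 20 * math.log10(2 ** bits)

for bits in (16, 20):
    print(f"{bits}-bit: {bit_depth_to_db(bits):.1f} dB")
```

This works out to a little over 6 dB per bit, which is where the round numbers of 96 dB and 120 dB come from.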
Bob Ludwig stopped in on his way to an AES meeting in June, and we talked about this issue at some length. Bob was a little doubtful about the idea that the measured dynamic range between Lmax (the maximum measured level of a recording) and Lmin (the minimum level) is a meaningful indicator of the necessary dynamic range for a recording. He noted that the difference between a 16-bit and a 20-bit recording is quite audible, and that we’re nowhere near having more dynamic range than we need on CDs or other digital recording media.
So, what gives here? How can it be that we can measure rock recordings to have a range of no more than 20 dB between their loudest and softest moments (except for the fadeout), but at the same time feel that a 96 dB range isn’t enough and can hear some stuff well below that range? Well, the presence of this paradox suggests that we need to look at the whole issue of levels in a little more detail.
What Is Dynamic Range?
As we’ve already seen, the dynamic range of an audio system is the range between the loudest level the system can achieve without distorting and the noise floor of the system. Meanwhile, the dynamic range of an audio signal appears to be the range between its Lmin and Lmax. Now, it stands to reason that the dynamic range of the signal should fit within the dynamic range of the system. If it gets louder than the system can handle, we’ll encounter distortion at Lmax. If it gets softer than the system’s noise floor, then that noise floor should be what we’ll hear during Lmin.
In fact it is more complicated than that, particularly when we deal with signals and noise floors. Not only do we hear the noise floor, but we also hear signal below it . . .
What Is A Signal? And What Do We Mean By A Single Sound?
First, we’ve got to consider what we mean by a signal. Yeah, what exactly do we mean by the term “audio signal” or “sound?”
At its simplest, a signal or sound is an expenditure of energy in the audible frequency range existing either in air or in a wire. It has dimensions of frequency, amplitude and time. Now, these dimensions vary, so that a single sound doesn’t necessarily have a single frequency, amplitude or time. In fact, all musical sounds are complex arrays of frequencies with varying amplitudes over a range of times. So, except for that pesky 1 kHz. sine wave @ 0 VU that only an audio engineer’s mother could love, the signals we work with are generally quite complex. Take a look at the following harmonic spectrum taken from an article I did on Spectral Management back in May, 1993:
[Figure: Power spectrum for a typical musical sound. The fundamental frequency is around 200 Hz. and approximately 60 harmonics are shown.]
Now there are several things wrong with this picture. The first is that it represents what is going on at a single point in time. Real music and real audio change rapidly. Therefore, you can reasonably expect that, two or three milliseconds later, the power spectrum of this signal will have changed sufficiently that it may very well not be recognizable as the same signal, even though our ears will easily recognize it as still a singer singing G below middle C.
The second thing wrong is that the power range I’ve shown for these harmonics is a little too small (1,000:1 = 30 dB). In fact, the power range for audible harmonics across the spectrum is more like (are you ready for this?) 10,000,000:1 or 70 dB!
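The conversion between those power ratios and decibels is easy to verify. A minimal sketch (the function name is mine, not anything standard):

```python
import math

def power_ratio_to_db(ratio: float) -> float:
    """Convert a power ratio to decibels (power ratios use 10 * log10)."""
    return 10 * math.log10(ratio)

print(round(power_ratio_to_db(1_000)))       # 30
print(round(power_ratio_to_db(10_000_000)))  # 70
```

Every factor of ten in power adds another 10 dB, so seven factors of ten give 70 dB.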
So, in a single sound, for a small fraction of a second, there will be a range of frequency components with a range of amplitudes. And nearly all of them are audible. The loudest one in this case, approximately 200 microWatts (-7 dBm) at about 600 Hz., doesn’t render the others inaudible, which is to say, it doesn’t mask the others.
The measured level of the signal is, to put it precisely, the amplitude that is caused by the sum of the powers of all of the components. In the example above, the level is 565 microWatts, the sum of the powers shown for each octave, or approximately -2.5 dBm.
So, the indicated level of the signal tells you only about its total power, which is somewhat greater than the power of its loudest component. It tells you nothing about the range of powers of the various components!
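Here is a rough sketch of that summing, in case the arithmetic helps. Only the 200 microWatt peak and the 565 microWatt total come from the example above. The individual component values in the list are hypothetical, picked simply so they add up to that total. The names are mine as well.

```python
import math

def microwatts_to_dbm(power_uw: float) -> float:
    """Convert a power in microwatts to dBm (dB relative to 1 milliwatt)."""
    return 10 * math.log10(power_uw / 1000.0)

# HYPOTHETICAL component powers in microwatts. Only the 200 uW peak and
# the 565 uW total are from the article; the rest are made up to fit.
components_uw = [200.0, 150.0, 100.0, 60.0, 30.0, 15.0, 7.0, 3.0]

total_uw = sum(components_uw)    # powers add directly
loudest_uw = max(components_uw)

print(f"total:   {total_uw:.0f} uW = {microwatts_to_dbm(total_uw):.1f} dBm")
print(f"loudest: {loudest_uw:.0f} uW = {microwatts_to_dbm(loudest_uw):.1f} dBm")
```

The meter reading lands a few dB above the loudest component, and nothing in that single number reveals how far down the quietest components sit.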
My neighbor Tom Bates pointed out another way to think about this. The amplitudes of signals, he notes, range from Lmax to Lmin. He then goes on to point out that we’d probably like to have the resolution of those signals to be pretty good, like, er, 16 bits or something close to it. He politely suggests that a better way to decide how much dynamic range we really need is to decide what the minimum resolution of a signal is that we are willing to accept. A 1-bit signal is going to sound downright awful, an 8-bit signal will sound moderately crummy and a 12-bit signal will sound pretty good, if not fabulous. So, being practical, let’s accept 12 bits as the minimum signal resolution we’ll accept (for “good, if not fabulous”). This means that the dynamic range we need is the range between Lmax and Lmin plus the resolution range of the signal (12 bits, or 72 dB) at Lmin.
Desired dynamic range, Bates says, can be defined this way: It is the range of signal resolution plus the range of amplitudes of signals. If we have a 12-bit signal (72 dB) and a range between Lmin and Lmax of 30 dB, then we’d like a production dynamic range of 102 dB.
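Bates’s rule is simple enough to write down as a couple of lines of arithmetic. This is only a sketch, assuming the 6 dB per bit rule of thumb implied by “12 bits, or 72 dB.” The function and constant names are mine.

```python
import math

DB_PER_BIT = 6  # rule of thumb implied by the article (12 bits -> 72 dB)

def desired_dynamic_range_db(min_resolution_bits: int, signal_range_db: int) -> int:
    """Resolution range at Lmin plus the Lmin-to-Lmax range, in dB."""
    return min_resolution_bits * DB_PER_BIT + signal_range_db

def bits_needed(range_db: int) -> int:
    """Smallest whole number of bits covering range_db at 6 dB per bit."""
    return math.ceil(range_db / DB_PER_BIT)

needed_db = desired_dynamic_range_db(12, 30)
print(needed_db)               # 102
print(bits_needed(needed_db))  # 17
```

Under those assumptions, a 30 dB program with 12-bit resolution at its quietest moments calls for about 17 bits, which is already more than a 16-bit CD provides.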