Various Realms of Sound ‘n Audio
The Acoustic Realm
The acoustic realm is where sound actually occurs. Conversational voice level is approximately 65 dB Sound Pressure Level (SPL). Orchestral music in a concert hall ranges from about 50 dB SPL to 120 dB SPL, rock ‘n roll in a club from 80 to 125 dB SPL. Sound becomes really unpleasant for humans above 120 dB SPL, and air begins to distort above about 130 dB SPL. Low-level noise floors hover around 50 dB SPL (I’m not going to discuss the meaning of A-weighting here) in our noisy modern world. Interestingly, that acoustical noise floor is not white, like electrical noise, but instead steeply biased toward low frequencies, so we can expect the acoustical noise floor level to be 30 dB louder at 30 Hz than it will be at 3000 Hz.
More importantly, thanks to the way our hearing works, our perceived frequency response in the acoustical realm varies significantly as a function of loudness (the Fletcher-Munson or Equal Loudness Curves). The perceived loudness of 30 Hz will probably be 40 dB less, relative to 1 kHz, at 40 dB SPL than it is at 100 dB SPL. That’s a big difference!
Finally, there is no fixed correspondence between electrical signal levels and acoustic sound pressure levels. However, in the film industry, such a level has (roughly) been established so that the electrical signal level 0 VU (usually the same as -20 dBFS in the digital realm) yields 85 dB SPL at the listener’s position for EACH loudspeaker. Meanwhile, in the television world, Dolby’s Dialogue Normalization (AKA “dialnorm”) is intended to calibrate the dialog level of video programs on the entirely reasonable assumption that viewer/listeners will intuitively set that dialog level, which is at –31 dBFS in a “good” stereo television set viewing a correctly calibrated program, to 65 dB SPL.
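The two calibrations above boil down to simple decibel arithmetic. Here is a minimal sketch, assuming a linear playback chain that has been calibrated at a single reference point; the function name and its parameters are made up for illustration and do not come from any standard.

```python
# Film-style calibration quoted in the text: -20 dBFS yields 85 dB SPL per speaker.
FILM_REF_DBFS = -20.0
FILM_REF_SPL = 85.0


def dbfs_to_spl(level_dbfs, ref_dbfs=FILM_REF_DBFS, ref_spl=FILM_REF_SPL):
    """Acoustic level produced by a digital level, given one calibration point.

    Assumes the chain is linear, so every dB of change in the digital
    signal gives one dB of change at the listening position.
    """
    return ref_spl + (level_dbfs - ref_dbfs)


if __name__ == "__main__":
    # A full-scale signal on a film-calibrated system.
    print(dbfs_to_spl(0.0))
    # The television case: dialog at -31 dBFS set by the viewer to 65 dB SPL.
    print(dbfs_to_spl(0.0, ref_dbfs=-31.0, ref_spl=65.0))
```

Under these assumptions a full-scale signal comes out at 105 dB SPL per speaker with the film calibration, and at 96 dB SPL on the television set once the viewer has put dialog at 65 dB SPL.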
The Analog Realm
The analog realm is a representation of sound as a two-dimensional map of “voltage amplitude” over time, where voltage amplitude stands in for “relative air density” in the acoustic realm. This realm is bounded on the top end of amplitude by the onset of non-linearity, usually signal clipping as the limits of the voltage supply are reached. On the bottom end, amplitude is limited by random electrical noise (white noise, whose level rises 3 dB per octave when measured in octave bands), power supply hum and whatever acoustic noise is carried into the analog realm.
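The 3 dB per octave figure for white noise follows from bandwidth alone: white noise has the same power in every hertz, and each octave band is twice as wide as the one below it. A small sketch of that arithmetic (the function name is hypothetical):

```python
import math


def white_noise_band_gain_db(octaves_up):
    """Change in octave-band level of white noise when moving up in frequency.

    White noise has equal power per hertz, and each octave band spans
    twice as many hertz as the one below, so it holds twice the power.
    """
    bandwidth_ratio = 2.0 ** octaves_up
    return 10.0 * math.log10(bandwidth_ratio)


if __name__ == "__main__":
    for octaves in (1, 2, 10):
        print(octaves, round(white_noise_band_gain_db(octaves), 2))
```

One octave up gives about 3 dB, and ten octaves (roughly the audible range) gives about 30 dB.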
The tradition for analog recording has been to have a “nominal” signal level that represents some sort of ongoing average level over time. This nominal level is some amount below the level of clipping, typically 15-20 dB below. In the analog realm, as clipping is approached, non-linearities may be incurred as a function of limitations in the equipment relating to the comparatively large voltage swings that occur (e.g. “slew rate”). The nominal-level tradition has been supplanted by the digital tradition. Now, analog levels converted from the digital realm hover very close to the level of clipping until they are attenuated. Hot digital mixes may be hovering around +20 dBV coming out of the converters, which may be a little much for some analog electronics. Beware!
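To get a feel for what +20 dBV means at the input of a piece of analog gear, it helps to convert it to volts. The sketch below uses the standard definition of dBV (decibels relative to 1 volt RMS); the helper names are invented for the example, and the peak figure assumes a sine wave.

```python
import math


def dbv_to_vrms(level_dbv):
    """Convert a level in dBV (dB relative to 1 V RMS) to volts RMS."""
    return 10.0 ** (level_dbv / 20.0)


def sine_peak_volts(vrms):
    """Peak voltage of a sine wave with the given RMS voltage."""
    return vrms * math.sqrt(2.0)


if __name__ == "__main__":
    hot_mix = dbv_to_vrms(20.0)        # a hot mix leaving the converters
    nominal = dbv_to_vrms(20.0 - 18.0) # a nominal level 18 dB below that
    print(hot_mix, sine_peak_volts(hot_mix))
    print(nominal)
```

So +20 dBV is 10 volts RMS, or a bit over 14 volts peak for a sine wave, while a nominal level 18 dB lower sits at roughly 1.26 volts RMS.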
The Digital Realm
Digital audio is, of course, a subset of analog (all digital audio comes from analog and goes back into analog). As you know, it consists of numbers. Its amplitude is bounded on the top by 0 dBFS, which represents the greatest amplitude of a single frequency (sine wave) that can be represented by a signal swinging between “all ones” and “all zeros.” At the smallest amplitude, it is bounded by dither noise, zipper noise (the noise generated by the Least Significant Bit randomly toggling back and forth between zero and one in the absence of dither) and/or noise from the analog and acoustic realms carried by the signal into the digital realm.
It has become a tradition (unfortunately) to try to get our mastered recording to have as high an amplitude as possible, which is to say that we tend to place it as close to 0 dBFS as we reasonably can. Meanwhile, 0 dBFS has, as a matter of electronic design tradition, been made equivalent to the “point of clipping” in the analog realm, which is usually at the amplitude of the voltage power supply.
I’ve taken you through this brief review because I believe it is essential to get and keep in your head these basic relationships when you are preparing a recording for public release. You need to keep in mind all of the digital, analog and acoustic issues and their interactions as you work. Otherwise, your success will be restricted to whatever good ol’ dumb luck will give you.
About Audio Levels Themselves
Audio levels themselves, although they seem simple enough, are actually fairly difficult to talk about and understand. Keep in mind that sound itself is a constantly varying amalgam of various frequencies at various amplitudes. When we talk about the “level” of a sound, we are actually describing the power summation of all those various frequencies (hundreds of them, usually) at various amplitudes at any given point in time. This summation also changes rapidly and dramatically over time.
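The “power summation” mentioned above can be sketched in a few lines. This assumes the components are uncorrelated (different frequencies, no fixed phase relationship), so that their powers add; the function name is made up for illustration.

```python
import math


def sum_levels_db(levels_db):
    """Combined level of several uncorrelated components, each given in dB.

    Each level is converted to a relative power, the powers are added,
    and the total is converted back to dB.
    """
    total_power = sum(10.0 ** (level / 10.0) for level in levels_db)
    return 10.0 * math.log10(total_power)


if __name__ == "__main__":
    # Two equally loud components come out about 3 dB above either one.
    print(round(sum_levels_db([60.0, 60.0]), 2))
    # A component 20 dB down barely moves the total.
    print(round(sum_levels_db([60.0, 40.0]), 2))
```

Note how little the quieter component contributes to the summed level: the meter hardly notices it, even though, as discussed below, the ear often can.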
To simplify matters, we often refer to “peak levels” (the highest level amplitude reached in some time period, such as 3 seconds), or else to RMS (Root-Mean-Square) levels, an averaged level over some brief period, such as 0.3 second. For a given musical signal, the peak level may be up to 20 dB greater than the RMS level, and it is typically around 8-10 dB greater, depending on (a) the nature of the program material and (b) the nature of audio compression used.
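The difference between peak and RMS level (often called the crest factor) is easy to compute for a block of samples. The following is a rough sketch, assuming floating-point samples where 1.0 is full scale and RMS is referenced to that same 1.0; some metering standards reference RMS to a full-scale sine instead, which shifts the RMS reading by about 3 dB. The function names are illustrative only.

```python
import math


def peak_db(samples):
    """Peak level in dB relative to a full scale of 1.0."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak) if peak > 0 else float("-inf")


def rms_db(samples):
    """RMS level in dB relative to a full scale of 1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def crest_factor_db(samples):
    """Peak-to-RMS difference in dB for a non-silent block of samples."""
    return peak_db(samples) - rms_db(samples)


if __name__ == "__main__":
    # Ten whole cycles of a full-scale sine wave, 100 samples per cycle.
    sine = [math.sin(2.0 * math.pi * k / 100.0) for k in range(1000)]
    print(round(crest_factor_db(sine), 2))
```

A pure sine wave gives a crest factor of about 3 dB, so the 8-10 dB typical of music, and the 20 dB of very dynamic material, show how much “peakier” real program is.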
It is important to keep in mind that a loud signal does not simply drown out (or mask) a softer signal. Further, any complex signal is made up of many softer components (see above), most of which are audible. It has been my experience that we often can easily hear signal components that are up to 60 dB softer than the overall level of the signal. Meanwhile, it is fairly easy to hear disparate signals that are also up to 60 dB different. Even in the worst case (broadband noise), it is generally possible to hear a pitched signal (if not other noise of a similar spectrum) up to 20 dB below the noise.
About Audio Level Meters
Meters only tell us something about the summed components of a signal, or what the amplitude (not loudness!) of the whole thing is at any given moment. You need to know what kind of detection the meter does: is it peak, RMS, old VU ballistics, slow (10 second average) detection, or something else? With practice, you can guess what the other meter values of a signal might be from any given meter reading, IF you know WHAT your meter is detecting and you are paying attention to what kind of material you’re listening to. But remember, it takes practice!
Personally, I use a batch of different meters in my work. These days I work almost exclusively in Pro Tools, to a point where I even play back CDs through Pro Tools. In Pro Tools I use the SpectraFoo plug-in for a variety of analytic displays, including their own meter protocol, which I set for a 72 dB range, with peak and RMS-peak (the loudest AVERAGE level obtained) tell-tales [in 2009, I’m using iZotope’s Ozone 4, which has excellent meters]. I also use Trillium Lane meters to count the number of “overs” and “continuous overs”. Very handy!
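For readers without an over-counting meter, the idea can be approximated in a few lines. The sketch below is not how any particular plug-in does it; it assumes a common working definition in which an “over” is a run of consecutive samples sitting at full scale, and both the run length of 3 and the function name are assumptions made for the example.

```python
def count_overs(samples, full_scale=1.0, min_run=3):
    """Count runs of at least `min_run` consecutive full-scale samples.

    Assumed definition: a digital "over" is flagged when the signal sits
    at full scale for `min_run` samples in a row. Each run is counted
    once, however long it continues.
    """
    overs = 0
    run_length = 0
    for sample in samples:
        if abs(sample) >= full_scale:
            run_length += 1
            if run_length == min_run:
                overs += 1
        else:
            run_length = 0
    return overs


if __name__ == "__main__":
    block = [0.5, 1.0, 1.0, 1.0, 0.2, 1.0, 1.0, 0.1, -1.0, -1.0, -1.0, -1.0]
    print(count_overs(block))
```

In the example block, the first and last runs are long enough to count, while the two-sample run in the middle is not.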
These days I use my Yamaha 02R96 console as a “monitor mixer” (if you can believe that!). For stereo work, I sometimes refer to its main meters as a general check, but more often I refer to a pair of analog Dorrough meters, which are calibrated so that their highest level (+14) equals 0 dBFS at the Yamaha.
Finally, I set up a B&K test microphone on the meter bridge and use it to measure acoustic levels and spectrum, which are computed by a TEF analysis system. This permits me to know explicitly what acoustic levels I’m experiencing, so I can be quite fussy about maintaining stable acoustic levels (and therefore spectral consistency).
To me, this stuff is all a BIG help in mastering. It allows me to very carefully and precisely determine a great deal about levels on the CD, in the electronics and in the room. I can also study and satisfy myself that what I’m doing actually is louder or softer than some reference signal, and by how much, as well as to measure complex spectral changes. It’s a fabulous set of study tools, which really help me with clients’ recordings, not to mention my own.
About Spectrum
At the same time, while mastering we have to concern ourselves with the spectrum of our recording. At this point in the production process, the mix has been fixed, and what is called for is to adjust the overall spectrum of that mix so that it will play back most effectively on our fans’ range of playback systems. This is a very gentle, touchy creative process. Two things are needed:
First, we have to know the spectrum absolutely solidly, so that we can make the various fine adjustments that (a) bring out the best qualities of the mix while (b) gently de-emphasizing spectral problems. All of this has to be done subtly enough that it ends up just sounding natural and correct for the genre.
Second, we have to successfully anticipate how this spectrum will sound for the broad range of end-users. This one is really tough! It takes years of apprenticeship and mastering experience to develop the feel and touch needed to make such anticipations reliably. In fact, this is probably the strongest argument for NOT doing it yourself.
For me, the touchiest parts of the spectrum are the bass (from 120 Hz on down) and the extremely critical octave band around 4 kHz. If you can get those to sit right, you’re well on the way. Immediately above the bass range, in the lower mid-range, from 120 to about 500 Hz, there can be many troubling problems that distinguish themselves as tubbiness, muddiness, thickness, etc. However, if you simply turn down this part of the spectrum, you run the risk of making the mix thin, wimpy and/or sterile. You need to be very picky about what you turn down.
Similarly, in the octave between 500 and 1000 Hz, if you reduce level the mix will tend to sound “open” and “transparent,” while if you boost level the mix will tend to sound “rich,” “warm” and “full.” The trick is, of course, to tease out both sets of qualities, so the mix sounds “warm,” “open,” “full,” “rich” and “transparent.” This takes work, practice and ears!
From 1 kHz to 3.5 kHz is a range where many overtones exist, including some fairly harsh resonances. A lot of experimentation and care are needed here to bring out the best while avoiding the nasty stuff.
I find the region above 5 kHz fairly easy to deal with unless there are real sibilance problems in the mix. There is a metallic brilliance up here, and above that, “airiness.” The main thing is to make sure that you don’t lose these qualities, while at the same time not being crass about it. You need to tiptoe between giving enough top end that the boombox listeners can sense some of that brilliance and giving so much that audiophile listeners find their wine-glasses shattering due to excess ultrasonic energy.
To do all this, you need to have a gentle but sure touch with the equalizer. This is not a place for dramatic timbral gestures! As far as I’m concerned, when you are done equalizing, the recording should seem to distinctly emerge when you switch the EQ in, but WITHOUT the actual level going up! (Check on this last! I’ve bagged myself more times than I care to remember on just this issue – thinking I’ve really improved the spectrum only to find that what I’ve really done is turned the level up 6 dB!!!)
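One way to guard against fooling yourself is to measure the level change the EQ introduces and trim it out before comparing. Here is a minimal sketch, assuming you have the same passage as lists of samples before and after equalization. It matches RMS amplitude, which, as noted earlier, is not the same thing as perceived loudness, so treat it as a starting point and not a final answer. The function names are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_change_db(before, after):
    """How many dB the processing raised (or lowered) the RMS level.

    Assumes neither block is silent.
    """
    return 20.0 * math.log10(rms(after) / rms(before))


def match_level(before, after):
    """Scale the processed signal so its RMS equals that of the original."""
    gain = rms(before) / rms(after)
    return [s * gain for s in after]


if __name__ == "__main__":
    original = [0.1, -0.2, 0.3, -0.1]
    equalized = [2.0 * s for s in original]  # stand-in for an EQ that adds level
    print(round(level_change_db(original, equalized), 2))
    trimmed = match_level(original, equalized)
    print(round(level_change_db(original, trimmed), 2))
```

In this toy case the “EQ” is just a doubling of amplitude, which reads as a 6 dB rise; after trimming, the measured difference is back to 0 dB and any remaining improvement is down to the spectrum alone.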
Good luck with this!