Moulton Laboratories
the art and science of sound
Taking Stock: How Important Is High Resolution Audio, Anyway?
Dave Moulton
September 2000

An epic discussion of high resolution digital audio.

An Exchange of Letters on the Importance of High Resolution Audio between Moulton and Yeager (not his real name) during November, 2000, plus a letter from NBC’s Bob Dixon


Yeager wrote:
I'm a post production engineer for a major network television show. We record and remix a weekly show which contains 72 tracks of discrete material.

Let me see if I can define your perception:

1. You don't see the need for 24-bit technology because you say there is no perceivable benefit. Do you think this is true for individual tracks in a MULTITRACKED recording or just in final mixdowns (CD, DAT, etc.)?

Your articles have been very unclear in this regard.

If you believe that 24 bits are not needed for individual tracks prior to final mixdown, try a mixdown with 16-bit source files as compared to a mixdown with 24-bit source files. Then do a mixdown with 96k/24-bit source files. Tell me what you hear.

If you tell me you don't hear a difference, it will ASTOUND me!!!

Please define (re-define) your position and then make it clear to your readers!!!


then Dave wrote:
Thanks for your quite relevant letter. You've got a very good point! I'm happy to say I'd already dealt with it - you'll read me dealing with the exact issue you raise in my next column (see November, 2000). A couple of comments: the problem you raise is not an issue of resolution, per se, but an issue of proper signal processing in the digital realm, so I maintain that my observations on resolution stand. There's no theoretical reason why 16-bits can't be processed as cleanly as 24 bits. However, the real-world production situation is as you describe. I hope this clarifies my position a little.

Thanks for the keen insight!


then Yeager wrote:
Thank you for replying!

Unfortunately, I disagree completely.

Math done on a 24 bit source file (with the associated increase in dynamic range) compared to math done on a 16 bit source file is not equivalent. The same goes for a file sampled at 48K as opposed to a 44.1K.

Where there is additional information describing the analog source, better accuracy is possible.

Remember, when tracking an analog source, engineers OFTEN do not use all available bits. Unfortunately, it's a trade-off between clipping and over-compressing. So as a result, having 24 bits while tracking an analog source at low level sounds MUCH better than a 16-bit source tracked at low level! Then think about remixing those source files x48 tracks.

You need to study this a little bit more. It doesn't add up!!!


then Dave wrote:
Whew! And here I thought I was agreeing with most of your position, yet you "disagree completely." What's going on?

When we measure something like the relative audibility of two different resolution depths, we need to eliminate all other variables in order to satisfy the Range Rule (the Range of the Theory must fit the Range of the Facts), which includes eliminating variables like DSP. When measured in that simple, straightforward, blind, bedrock kind of way, the perceptible difference between 24-bit and 16-bit resolution proves to be VERY small at best, and mostly "inaudible" to end-users, while proving to be between "inaudible" and "audible but not annoying" to professionals such as yourself. This finding is in coherence with our usage practice – 0 dBFS occurs at approximately 99 dB SPL in a calibrated (and typical) playback condition, while the acoustic noise floor (NC 30 or higher?) will tend to mask even the dither at the 16th bit (3 dB SPL), to say nothing of dither at the 24th bit (-42 dB SPL). These are the general conditions to which I have been referring in my series.
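
The level arithmetic in the paragraph above can be checked with a short sketch. It assumes the common rule of thumb of about 6.02 dB per bit and the 99 dB SPL calibration for 0 dBFS given in the text; the function name `floor_spl` is illustrative only. Under that rule the 16-bit floor lands near the 3 dB SPL quoted above, and the 24-bit floor comes out a few dB below the quoted -42 dB SPL. Either way, both sit well under the noise floor of an NC 30 room.

```python
import math

# Rule of thumb: each bit of word length adds about 6.02 dB of range.
DB_PER_BIT = 20 * math.log10(2)


def floor_spl(bits, full_scale_spl=99.0):
    """Approximate playback SPL of the lowest bit for a given word length.

    full_scale_spl is the SPL assumed for 0 dBFS (99 dB SPL in the text).
    """
    return full_scale_spl - bits * DB_PER_BIT


if __name__ == "__main__":
    for bits in (16, 20, 24):
        print(f"{bits}-bit: lowest bit at about {floor_spl(bits):.1f} dB SPL")
```

Changing `full_scale_spl` shows how a louder or quieter monitoring calibration shifts every floor by the same amount.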

The condition you cite, a production scenario using a DAW with massive track counts and signal processing (72 tracks, you mentioned), may indeed prove to be "audible and annoying" in 16-bit format and less so or inaudible in 24-bit. I contend this is an issue of DSP, not word-length.

You misunderstand my response in one regard - while you are right in your statement that the math is not equivalent, please note that I did not say that it was.

That said, I believe it is possible to create extremely complex mixes using 16-bit resolution on the outputs IF the DSP is handled correctly - which is to say that all errors and anomalies would be maintained below the threshold of audibility in playback. If you are using, for instance, massive amounts of digital compression, you may be artificially pulling up noise floors, perhaps as much as 60 dB under some conditions. When you do this, it is reasonable to assume that there may be an audible difference between 16 and 24 bits.
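
A rough sketch of the compression point above: if processing lifts low-level material by some amount of gain, the quantization floor rises with it. The 60 dB figure comes from the text; the 6.02 dB-per-bit rule and the function name `raised_floor_dbfs` are assumptions made for illustration, not a model of any particular compressor.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit


def raised_floor_dbfs(bits, gain_db):
    """Quantization floor (dBFS) after low-level material is raised by gain_db."""
    return -bits * DB_PER_BIT + gain_db


if __name__ == "__main__":
    for bits in (16, 24):
        floor = raised_floor_dbfs(bits, 60.0)
        print(f"{bits}-bit floor after 60 dB of gain: about {floor:.1f} dBFS")
```

With 60 dB of gain the 16-bit floor ends up around -36 dBFS while the 24-bit floor stays near -84 dBFS, which is consistent with the suggestion that heavy processing is where the two word lengths could plausibly be told apart.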

My point remains: the hi-rez 24/96 digital resolution window is obscured by the analog audio window, which in turn is obscured by the acoustical windows, both source and playback. In this regard, we have oversold the virtues of hi-rez audio. The exception you correctly note remains a special case. As the quality of our DSP improves, particularly in low-cost DAWs, that problem will recede. And your 72-track mixing problems will become much much easier. At least, I suspect that's the truth of the matter.


then Yeager wrote:
Thanks for the response!

Unfortunately, I must disagree with you. DSPs will improve, but the weakest link will then also be the source files. When you're dealing with large track counts, the best possible source files must be used. 16-bit source files playing reverb tails just don't sound as good as 24-bit files (and that's only 2 tracks). What happens to all the other effect subtleties?

The bottom line is that if you have more accuracy in your sample (samples/sec and dynamic range), a better DSP will have more to work with to make accurate results.

Your article (unfortunately) gives the impression that there is NO need for us to have high standards. The real world is using DSP for multitrack mixdowns. If this is so, and the DSPs are not superpowered, the best possible source files MUST be used. One day we'll all be using primo post SHARC technology but the source files will still need to be primo.

The standards you are using for your article don't take into account real world conditions:
  1. Multitrack usage
  2. DSP
  3. Poor tracking levels
Your quote: "The exception you correctly note remains a special case."

I implore you to change your position. People are multitracking with Pro Tools all over the world. Your article ONLY applies to final mixes not multitrack source files that are used in a DAW/DSP world.

Don't you agree? Thanks for listening!!!


then Dave wrote:
What a great letter you wrote!

Sorry to be slow answering, but (a) I've been up to my ears in an unrelated project and (b) your email raises enough serious questions that it deserves a thoroughly thoughtful response. Also, even though I'd promised the readers of TV Technology that I would put this topic away, your points are sufficiently important that I thought it might be useful to air them in a future column. So, I've been really thinking about what you said.

First off, the important thing here is to find where the correctness of our relative views overlaps, rather than for either of us to "change" our opinions. So far as I'm concerned, agreement as such is irrelevant. That said, you raise a number of points that deserve comment. At the same time, it's not clear to me that you've been following the entire series, so I am attaching a file that covers the whole series. You might also want to check out my website for further insight regarding my approach to things.

Issues you raise include the following:
  • the nature of source files
  • audibility of reverb trails in 16-bit vs. 24 bit
  • other effect subtleties
  • (digital) accuracy in samples/accuracy
  • high standards (a) is there a need for them? (b) what are they?
  • the real world re a variety of things, especially:
      • Multitrack usage
      • DSP
      • Tracking practices
Sounds like a book is needed! Whew! (Actually, I've got one coming out, and it discusses much of this - see my website if you're curious.)

In the meantime, let's look at these issues briefly, one at a time.

  • the nature of source files
  • Source files are mostly acoustical source recordings, subject to the constraints of (a) the acoustic environment and its noise floor, (b) the microphone and its associated preamp and their noise floors and distortion limits, (c) the analog signal flow and processing, and (d) the A/D converters and their limits (and this includes jitter and quantization errors as well as word-length and sampling rate). The limits of the source files are the combined limits of each of those stages (the "weakest link" phenomenon). The dynamic range, linearity, resolution, etc. of the source file is only as good as the "worst" stage for each value. Increasing the resolution, etc. of one stage will generally not improve the quality of the signal unless it is the "worst" stage. The "audibility" of resolution in a source file is also constrained by the limits of all reproducing and playback stages, including the loudspeaker(s) and the playback room.
  • audibility of reverb trails in 16-bit vs. 24 bit formats
  • Reverberance, artificial or natural, is the persistence of energy in a reflective space or simulation of same. As a general rule the decay is audible down to the noise floor (or slightly below it). Interestingly, in subjective listening tests I conducted for Lucent Technologies, we found that reverberance (real or artificial) served as a masker of recording artifacts. My experience and research with both real and artificial reverberance supports this - in general, reverberance serves as a "perfume" that generally masks recording artifacts and anomalies, including resolution and noise floors, rather than revealing them.

    The use of reverb trails to detect low-level non-linearities and misbehaviors in the digital realm is generally not going to prove reliable (this is a case where using blind testing to test the assumption can be very useful).

  • other effect subtleties
  • There may be other audio effects whose low level behavior may change in audible ways as a function of word length. In general, such changes are going to be constrained by the overall limits of the source signal, as noted above. In controlled observations and evaluations of the audibility of various forms of dither at various word lengths from 16 to 24 bits, low level artifacts that are clearly audible with 40 - 60 dB (100 - 1,000x) of amplification, are simply inaudible at reference level.

  • (digital) accuracy in samples/accuracy
  • This is a really interesting problem. If we have a source signal that has a (white, for instance) noise floor at a level equivalent to, say, the 14th bit (to select a reasonable floor - approximately equivalent to 20 dBA SPL), then a 24-bit version of that signal will be no more "accurate" than a 16-bit version. The limit to the accuracy of the signal is, in this case, determined by its inherent noise floor. To increase the number of possible increments that can be used to represent a random noise signal will not reveal that noise signal more "accurately." The magnitude of the noise will determine the lower limit of accuracy. The "weakest link" phenomenon still applies. Resolution that is way finer than the magnitude of the random motion (noise) that defines the minimum amplitude of the signal is essentially useless. We cannot "improve" the accuracy of the signal by adding bits, any more than we can strengthen a chain by strengthening its strongest links. Accuracy (and strength) will both remain unchanged.

    Further, the term "accuracy" means "conforming exactly to fact - errorless." In our case, sadly, the source signal is already loaded with error, and is not accurate at all! In fact, we make virtually no attempt to make it accurate, as a function of our current recording practices of gain changing, mixing, editing and processing. And if we already have a signal error of, say, 30%, then the presence of additional error of, say, .0012% is going to be insignificant in terms of accuracy, whether or not such additional error is audible.
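
The noise-floor argument above can be tried numerically. The following is a minimal sketch, not a rigorous test: it assumes a 440 Hz sine at 48 kHz as the "true" source, Gaussian noise with an RMS of 2^-14 of full scale as a stand-in for a floor "at the 14th bit", and plain rounding without dither. All names and parameter values are hypothetical.

```python
import math
import random


def quantize(x, bits):
    """Round x (nominally in -1..1) to the nearest step of a signed word."""
    steps = 2 ** (bits - 1)
    return round(x * steps) / steps


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


def error_db(bits, noise_bits=14, n=20000, seed=1):
    """RMS error (dB re full scale) between the clean source and the
    noisy signal after quantization to the given word length."""
    rng = random.Random(seed)
    noise_rms = 2.0 ** -noise_bits  # assumed noise floor of the source
    errors = []
    for i in range(n):
        clean = 0.1 * math.sin(2 * math.pi * 440 * i / 48000)
        noisy = clean + rng.gauss(0.0, noise_rms)
        errors.append(quantize(noisy, bits) - clean)
    return 20 * math.log10(rms(errors))


if __name__ == "__main__":
    for bits in (16, 24):
        print(f"{bits}-bit: error about {error_db(bits):.2f} dB re full scale")
```

Under these assumptions the two error figures differ by a small fraction of a decibel, because the source noise, not the word length, sets the error. Lowering the assumed noise (a larger `noise_bits`) is the case where the extra bits start to show.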

  • high standards (a) is there a need for them? (b) what are they?
  • You've thrown a curve ball here. I call for NO high standards? C'mon!

    That said, let's define standards, and see what they are useful for. Standards are known, accepted points of reference. We use them in order to be able to work together, to communicate and to have work that is easily replicable and applicable by all. "High standards" is a euphemism for a kind of stringent moral integrity and/or physical stringency (as in "Bob firmly and unequivocally believed that no wine lesser in quality than a 1989 Petrus was worth drinking, ever, and he adhered to that high standard for his entire life!").

    In audio production, "high standards" generally refer to a kind of meticulous care, painstaking evaluation, and sustained damage control to minimize the inevitable erosion of quality that occurs as a function of the production process, and to minimize the related degradation of the quality of illusion of the auditory product. You and I differ here – you believe that 24-bit, 96 kHz. is a useful "standard of quality" that will in some way reflect on the final quality of illusion of the end product, while I believe that such word-lengths and sampling rate are generally superfluous as "standards of quality." My own "high standards" apply more to quality of intonation, phrasing, time and intensity (in music production) and the illusion of emotional intensity and gestalt presence of the artists during playback. I come from the school of thought that holds that "nobody ever bought a record for the signal-to-noise ratio." I also believe that if you can't make a great recording using 16-bit, 44.1 kHz. storage formats, there's no way you're going to be able to just because you've got higher resolutions.

  • the real world re a variety of things
  • The "real world" is a euphemism for a kind of practical reality, as opposed to theory. In one sense, I can argue that my concern with things acoustic and analog is far more "real world" than your concern about "virtual digital" resolution. In another sense, I suspect you mean "real world" to represent the practical everyday production environment and tasks you find yourself confronted with. In that place, my preoccupation with "the nature of the signal" must seem hopelessly academic and "non-real-world."

    Nonetheless, I try to stay firmly oriented to your "real world" and to observe what is REALLY going on in that world. Also, please keep in mind that while I have some academic and scientific credentials, I am more of a practicing audio professional than I am a theorist. I have paid my dues in studios, on location, in preproduction, production, mixdown and mastering, as a producer and engineer, working in a wide variety of musical and recording styles, including multitrack, sampling cut 'n paste, classical minimalist, jazz and folk, as well as both pop and experimental synthesis and electronic music. I am not entirely without your brand of "real world" experience.

    That said, let's look at some real world issues:

  • Multitrack usage
  • In general, multitrack is a very powerful technique that permits us to defer production decisions to the point of "final fixing of the media product." It involves the assemblage of multiple tracks. As a matter of practice, we have found it best to make those tracks anechoic in their mono source nature, except for occasional "stereophonic" tracks for atmosphere. We have evolved a practice of close-miking, isolation, and overdubbing. We usually add back our reverberance artificially in post production.

    In DAWs (such as Pro Tools), the multitrack recordings are maintained on hard disk, and EDLs are very quickly assembled, auditioned and implemented. Over the past ten years, DAWs (especially Pro Tools!) have begun to fulfill their promise as "studios in a box," leading to the kinds of production techniques you have cited in your earlier emails (72 discrete tracks mixed to stereo/Dolby Pro Logic surround). Because output portals are so expensive, we tend to do our mixing of such massive track assemblies in the DAWs, through arithmetical summation, and we output only the final mix.

    You complain that under these circumstances sometimes you run into audible problems that seem directly related to the word length of your source files. You argue that 24-bit source files do not seem to be subject to the same problems. To me, these problems are related to issues of production craft rather than word length. If we sum 72 digital tracks into mono, we will get a significant increase in noise (dither, at the very least, more likely source track noise – almost 19 dB louder than the single mono track). Further, if some of those tracks are attenuated, artifacts related to low level signal components can be generated that weren't present in the original source signal, and may be significantly louder than the low level artifacts were.
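
The "almost 19 dB" figure above follows from power summation, sketched here under the assumption that all 72 tracks carry equal-level, uncorrelated noise. The coherent case is included only for contrast; the function names are illustrative.

```python
import math


def uncorrelated_sum_gain_db(track_count):
    """Level rise when equal-level, uncorrelated noise sources are summed
    (powers add)."""
    return 10 * math.log10(track_count)


def coherent_sum_gain_db(track_count):
    """Level rise when identical, in-phase signals are summed
    (amplitudes add). Shown for contrast only."""
    return 20 * math.log10(track_count)


if __name__ == "__main__":
    n = 72
    print(f"uncorrelated noise, {n} tracks: +{uncorrelated_sum_gain_db(n):.1f} dB")
    print(f"coherent signal, {n} tracks: +{coherent_sum_gain_db(n):.1f} dB")
```

For 72 tracks the uncorrelated case gives about 18.6 dB. Real mixes fall short of this worst case whenever tracks are attenuated or muted.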

    Here's where high standards come in - the solution lies not in greater word length (as such, although it may prove to be PART of the solution), but in more meticulous and relevant production techniques. It is worth asking the questions: what production behaviors cause these problems? What production techniques can be used to resolve them? In my world, high standards involve answering those questions and applying the answers with sufficient rigor that the problems end up being eliminated for both the producer and the end-user. In my DAW experience, a variety of techniques, including re-dithering, attenuating in the analog realm, and bouncing to reduce track count, as well as the judicious application of digital noise reduction, all can help. I've never done 72 tracks (although I am soon to embark on several projects that involve that kind of track count), but I suspect the principles are the same. Meticulous, careful house-keeping and signal management, a "less-is-more" conservatism, and so on. Also, as you well know, an intimate knowledge of the strengths and weaknesses of the given platforms you are working with – they all have quirks, anomalies, and badnesses. The devil is in the details, and Pro Tools is no exception!

  • DSP
  • DSP, in general, is a mathematical expression of ASP processes (except for some FIR cases). In theory, there is no difference between the two, while in practice there always is, and the devil continues to lurk in the details in both realms. A problem with DSP at the present time is that it is constrained by processor speed and capacity. As we increase resolution and bandwidth, these constraints are increased. Similarly, as we increase processing demands (# of tracks, amount of processing, etc.), constraints are increased. Authors of DSP software resort to all sorts of strategies to reduce the damage caused by such constraints.

    But the rule holds: of the qualities of good, fast and cheap you can have any two at the expense of the third. In your situation, working on a weekly TV show with 72 tracks in a DAW, you've got to have fast, for sure, and you're using the DAW to keep costs down (cheap), so of necessity you HAVE to sacrifice good. There's nothing wrong with that. However, THAT is where your real problem lies, I think. In your case, the less DSP you use, the better it is going to sound, as a general rule. You simply have too much spaghetti on your production plate, is my guess.

  • Tracking practices
  • You cite "poor tracking levels" as a problem. By that you seem to mean that such levels are not "hot" enough, because you call upon engineers to "use all available bits." This is another fascinating and complex problem. Let's consider, once again, that pesky real world.

    A good mic has a self-noise level of about 15 dBA SPL. Most close-miked acoustic sources generate levels around 100 dB SPL. Most GOOD studios and sound stages have in-session noise floors no less than NC-30. The net result is that the absolute minimum noise level in any given octave will be no more than 85 dB below the nominal recorded signal level, and a more typical range will be 70 dB. When we track, we need to leave sufficient headroom to prevent "surprise" overloads that often occur when the people recording put more energy into the "take" than they did when they "got levels." As you well know, it is traditional to track with headroom of approximately 20 dB, or at -10 to 0 VU. Similarly, upon conversion to digital, the standard for tracks (often ignored) is that nominal level should equal -20 dBFS. In a 16-bit system, this means that the level of dither will be at approximately the level of acoustic+mic noise. It should be enough, and in my experience, this conservative practice works fine. I have almost always found the limiting factor to be mic or preamp noise.

    If we go to 20-bit resolution, dither will reside some 20 dB below the noise floor of the signal, and will be completely insignificant. 24-bit resolution yields no further gain because the thermal noise limits covering the analog portion of the analog/digital/analog conversion process restrict the real range to about 100 dB in ANY case.
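
The tracking-level arithmetic above can be laid out as a small budget. This sketch takes the figures from the text (100 dB SPL source, 15 dBA mic self-noise, a more typical 70 dB range, nominal level at -20 dBFS) and assumes about 6.02 dB per bit; the constant and function names are hypothetical.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit
NOMINAL_DBFS = -20.0             # nominal tracking level from the text
SOURCE_SPL = 100.0               # typical close-miked source level


def dither_below_nominal_db(bits):
    """How far the lowest bit sits below nominal level, in dB."""
    return bits * DB_PER_BIT + NOMINAL_DBFS


def noise_below_nominal_db(noise_spl):
    """How far an acoustic or mic noise level sits below the source, in dB."""
    return SOURCE_SPL - noise_spl


if __name__ == "__main__":
    best_case = noise_below_nominal_db(15.0)  # mic self-noise alone
    typical = 70.0                            # typical range quoted in the text
    print(f"source noise: {typical:.0f} to {best_case:.0f} dB below nominal")
    for bits in (16, 20, 24):
        print(f"{bits}-bit floor: {dither_below_nominal_db(bits):.1f} dB below nominal")
```

On these numbers the 16-bit floor (about 76 dB below nominal) falls inside the 70 to 85 dB window of the source noise, while the 20-bit floor (about 100 dB below) is already well beneath it.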

    So, if we run into "tracking problems," it's got to be something else that is screwing it up. The noise and headroom of the intervening acoustical and analog audio signals are such that they represent the "weaker link." So, I'd encourage you to reconsider, in your production craft, what it is, exactly, that is troubling you, and to test the assumptions you have put forth above. I think your challenges are perfectly reasonable and thoughtful ones. From a scientific standpoint, however, they are open to question. My own guess is that the problems you cite come from other forces than inadequate resolution. My experience has been that 16-bit, carefully crafted, works really well. I find that the real challenges lie elsewhere.

    Thanks for reading all of this.

    Best regards,
    Dave
    PS - I'm willing to keep going with this a little, if you want. I'd like very much to include this in a future article (it's too big for a “letters to the editor” exchange). To do that, I need your name. I hope that's ok with you.


    then Yeager wrote:
    Thanks for your patience while you listen to my ranting.

    I do feel strongly that you're missing the point.

    I could give you my name at a later date, but due to corporate red tape it can't be used without corporate clearance.

    I agree with your point that "My own "high standards" apply more to quality of intonation....": music will not be successful without a decent performance. But I am really commenting on your article, which is measuring a technical standard.

    I agree that 20-bit resolution is all that is necessary for signal/noise issues. If however you're applying any DSP to your tracks (effects/processing/gain changes) the extra bits (beyond 16) become valuable as artifacts are introduced.

    I work on a live comedy show that has dynamics that are not easily tracked. It is often impossible to "get levels". Unfortunately that is the world I live in. 24-bit files give me the quality of 20 bit A/D's while giving the DSP more bits to work with. I've compared the actions of DSP on 16 and 24-bit files. The 24-bit files easily sound better.

    We use an SSL 9K for music and part of the posting process. I agree that if I had a 9K in front of my Pro Tools so I didn't have to use as much DSP, my final mix would probably sound better. Unfortunately, a 2nd 9K is beyond our reach. Additionally for the POST process, the lack of integration between editing and mixing would prevent us from completing the work load.

    So my point is: there are many different post &/or production needs that have to be accounted for. I think that using the 24-bit standard is only the beginning. We need more capable DSP, better A/D's, D/A's, etc. By challenging the need for improvement, you only slow down progress. Remember what engineers were able to do with analog audio. The quality produced was never thought to be possible. It takes experimentation and failures to meet the final goal. Better technical standards make it easier to track and mix. I'd rather spend my time being creative instead of jumping through hoops to make my equipment sound good!!!

    Thanks for listening!!!


    Dixon wrote:
    It is with some concern that I have read your columns regarding what you describe as the questionable audibility of high-resolution audio. What troubles me is your focus on what is "good enough". The reasons are twofold: first of all, from personal experience, even with these old ears, I believe the difference between 16-bit at 44.1 kHz. and 20-bit at 48 kHz. is important. Secondly, I feel that the focus of your articles might lead some to conclude that money and time spent in the pursuit of the highest possible quality would mean resources allocated without benefit or purpose, in other words, wasted.

    What is to be our goal?

    Meeting the standards we can "get away with"? Finding out what is "good enough" and allowing that to define our standards?

    Goals and standards are choices made by individuals. Michelangelo decided that the incredible detail in the veins on the hand, or the exquisite curls on the head of his David were essential. Painters, poets, musicians, and recording engineers all have to decide what is "good enough".

    For some, "good enough" is defined as "the best I can possibly do at this time". For others it is defined by the boundaries of what can be gotten away with.

    I have always felt that what most people perceived was our salvation, not our goal. On a personal level I don't think I have ever been really satisfied with any audio engineering job I have done. I've always ended up feeling that it could have been better, should have been better, and that the next time I would work harder to make it better. I'm not suggesting this is the most comfortable way to spend your life, but when I marvel at the beauty and detail in some of the work I have been lucky enough to see and hear, I realize that some levels of "perfection" are possible. Yet perfection seldom happens, even when you wish it to, even when you try very hard to make it that way. However it is only by following the desire inside the self, the drive to make something be as good as it can be, that allows a great work to ever happen. That drive needs to be nurtured and encouraged, or it will die.

    What measures of "good enough" will hold up to the test of time? Bob Ludwig told the story of the first time he heard a certain computer-generated drum track in a recording he mastered, and he was amazed when he was told that it wasn't real drums. Now, he says, all recording engineers can instantly recognize those types of drum tracks in those recordings. Are things we have a hard time perceiving now going to stand out in a negative way in the future?

    When a corporation decides what is "good enough", it is going to affect a lot of people and projects. The focus of your articles may help to set the bar lower than the test of time might demand, and change the direction of a lot of careers. Worse, if "good enough" becomes the goal, when the goal is missed, well, you're out of headroom.

    Now, is it always practical to pursue the highest possible quality? I submit that when the drive for excellence is combined with common sense, it is the most practical action of all. For example, crews in the field gathering news have the least amount of time for set up. Equipment is gathered and carried with no notice at all. The question here is not about 24 bits at 96 kHz and hearing the smallest detail. The issue is focus. The mind attuned to quality will think about buying well-built microphones that can reliably pick out a voice 5 feet away with clarity, or buying high-quality cables and connectors. A focus on excellence will consider the series of dubs and edits that each piece will be subjected to. All of these elements can chip away at the final product presented to the public. How good does it have to be? Well, at what stage? Acquisition? Edit? Final dub? Going to air? Archive? There is always a chance that that news bite may become an important part of history, but there is a certainty that we want our viewers to hear our program with clarity.

    In summary, it isn't just the question of 16 bits being "good enough". Our focus should be on "how good can we make it?" Let common sense choose the technology, but I'd rather have people choosing 16 bits while keeping in mind that it is a compromise instead of fooling themselves into thinking that it is "perfectly fine".

    Thanks for your time,
    Bob Dixon


    Closing comments from Dave:
    Some of these comments showed up in further columns for TV Technology, where I openly shared the gist of these letters. Bob Dixon, of NBC Sports, was actually the guy who put me in touch with TV Technology for this audio columnist gig. We go back to the middle ‘70s, and we’ve got a lot of mutual regard and respect. I take his comments quite seriously, and I suggest that you should, too.

    Yeager’s comments about standards got to me, unfortunately. As you probably read, I got huffy and went into my standard “nobody ever bought a record for its signal-to-noise ratio” rant, and talked about my musical verities – intonation, performance intensity and ensemble playing. Yeager quite rightly pointed out in his next e-mail that I was obscuring things, that he was talking about “technical standards.” He’s absolutely right, of course. My mistake.

    What’s important to note is that there’s some really good thinking here. It made me realize that in debunking the hype and mythology that surround high-resolution audio (and there’s plenty to debunk!), I may have also given TV Technology readers the wrong idea. “Just good enough” isn’t the right idea. High standards ARE important. So, in the May 2001 issue, I hunkered down and took a look at some technical standards that I think may suggest real excellence in audio.

    And here is that column: