Moulton Laboratories
the art and science of sound
More About CODECs
Originally published in TV Technology, approx. December 2002
By Dave Moulton
December 2002

They're everywhere: data compression strategies for digital audio.

The View From 2009: What I said in 2002 still stands, I think. We really have gotten to a point where the data compression is not only tolerable, but mostly inaudible. And CODECs are absolutely everywhere, now.

More About CODECs, or: HwLBRCDCsRGnnChngThBrdcstWrld!

CODECs In Review

You all know what CODECs are, right? Chunks of software that encode digital data for transmission and/or storage and then decode it for use. It’s an Encode/Decode cycle. We call it CODEC (for Code/Decode), in an oddly poetic compression of language.

But WHY do we encode and decode? Why don’t we just hit the transmit or record button? You know the answer to this one too, right?

That answer is, of course: we got more data than we can transmit! So we gotta throw some away. And this is what the CODEC does. The CODE/DECODE cycle contains an algorithm that deletes data that the CODEC’s programmers believe won’t be important to our enjoyment of the transmitted signal. Sort of like Reader’s Digest Condensed Books.

The reality is that, for audio signals, the conversion from analog to digital requires a LOT of digital data. The standard stereo CD signal uses 1.41 megabits of data every second, which is, ah, huge. Takes a lot of bandwidth for transmitting. Bandwidth that we don’t have. Similar constraints exist for video signals as well.
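For anyone who wants to check the arithmetic, here is a minimal sketch of where the 1.41 megabit figure comes from, assuming the Red Book parameters of 44,100 samples per second, 16 bits per sample, and two channels. The function and variable names are just illustrative.

```python
# Red Book CD parameters: 44,100 samples/s, 16 bits per sample, stereo.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Raw (uncompressed) PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


cd_bits_per_second = pcm_bit_rate(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
print(f"{cd_bits_per_second:,} bits/s, or about {cd_bits_per_second / 1e6:.2f} megabits/s")
# prints: 1,411,200 bits/s, or about 1.41 megabits/s
```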

So, the CODEC encodes the signal, reducing its bandwidth by throwing away a great deal of the signal (all of it, hopefully, consisting of irrelevant details) prior to transmission. After transmission has been accomplished, the reduced signal is reconstituted into a reduced-data approximation of the original, during the DECODE portion of the cycle. Naturally, the devil is in the details.

In fact, the devil permeates a fair amount of this. Back in the late 70’s, we went to considerable trouble to define and establish a viable digital signal standard (the Red Book CD specification for digital audio is 44,100 16-bit voltage samples per second per channel), and now, ironically, we’re busy working on trying to whittle away at that estimable standard in order to be able to transmit this signal via radio (and television) in its digital format, even as we also develop higher resolution formats. Certainly sounds like a deal with the devil to me!

Why Is This So Important?

The general adoption of such data compression is really worth thinking about, because it is shaping both the present and the future of audio (and video). At present, all Internet transmission of audio and video and all satellite transmission of digital audio and video are compressed by CODECs. Further, as digital FM comes on line over the next few years (it is already available via satellite, and terrestrial IBOC transmission is beginning now), it will all be compressed by CODECs. We are profoundly committed to this technology in broadcast media. By 2010 it is going to be EVERYWHERE!

Taken just by itself as a concept, this sort of lossy data compression is not a bad idea at all. However, concern arises because we’ve managed to paint ourselves into a bit of a corner with it, due to some really enthusiastic general optimism (“Oh, the channels we’ll have! The pictures we’ll see!!”). BECAUSE we can compress, we do. More to the point, we compress AS MUCH AS WE CAN, for better or for worse. We have now committed ourselves to transmitting a volume of data (er, number of channels) that requires quite massive compression. We didn’t stop at the quite conservative point where we felt that quality MIGHT begin to fall off, but rather we’ve kept going to the point where we KNOW that it has begun to fall off, but still hope nobody will mind too much. To put it bluntly, we’ve changed our standard from: “the quality is just good enough that almost no one is going to notice” to “the quality is just good enough that almost no one is going to sue us”. We’ve been trying to fit more and more channels into a fixed bandwidth, simply to maintain, in each of our various competing corporate business plans, a competitive offering to the public.

Why did we do this? For survival in a free market, that’s why! We could not and cannot afford not to!

Meanwhile, there are some verities that need to be kept in mind. First, data compression is NOT like audio compression. We aren’t compressing amplitude here, knowing that we can expand it later, with no significant loss of resolution. Data compression might, more accurately, be called “data-stripping.” We are deleting data, and it cannot reasonably be recovered or restored later. It’s gone for good, for better and for worse.

Second, we are stripping A LOT of data away, usually more than 90% of it! This is not a trivial act of compression akin to a 1.3:1 compression above +10 dBm in audio. This is massive.

Finally, once we’ve “decoded” the stripped data, we may encounter some significant problems if we try to encode it again for more storage or transmission. When we do that, we multiply our errors. Interesting, eh?

Now the argument is, of course, that the stuff we’re stripping out is inaudible, or at least insignificant. And, in fact, there is a lot of truth to that argument. This sort of lossy compression can really work well, and in fact it yields quite massive benefits for both the broadcasters and the consumers.

But it is essential that we keep in mind that we’ve created a fairly fragile low-resolution signal. We’ve done this as a function of our effort to fit more channels into a given bandwidth. We need to be realistic about what that low-resolution signal is, how it behaves and how it will be perceived by our beloved listeners and viewers.

Kvetching And High Praise

And here’s where we maybe got a little TOO chummy with the devil. We’ve made some, ah, claims about our CODECs. We’ve said things like, “Well, you know, it’s NEARLY CD quality. You can’t really hear the difference.” And we’ve used such claims to justify our adoption of CODECs. Unfortunately, we’ve institutionalized such claims.

Meanwhile, we’ve tended to downplay just exactly how much data we’re throwing out, as well as how much progress we’re really making with improving CODECs, because to do so implies that maybe they weren’t so hot to begin with, while we were busy claiming that they were, well, ah, NEARLY like CDs.

I’ve got some history here. Off and on, I’ve been hired by a CODEC developer to conduct formal and informal listening tests and demonstrations of CODEC audio performance. Right now, I’m doing work for iBiquity Digital Corporation, who in turn is providing the CODECs used by both Sirius Satellite Radio and the upcoming IBOC terrestrial digital FM broadcasting.

Some years back, I measured some stereo CODECs running at 128 and 96 kilobits per second that performed quite well, in addition to some other CODECs whose performance ranged from fair to good. By “well,” I mean that in blind trials expert listeners scored them somewhere between “inaudible” and “audible but not annoying” and naïve listeners generally scored them higher than that.

At that time, I also took an informal listen to CODECs running at lower bit rates, such as 64 and 36 kilobits per second, and opined that such rates were “too slow to be musically viable, though speech intelligibility is certainly adequate.” That was a polite way of saying that CODECs at that speed really didn’t sound very good at all, certainly not for music.

Recently, iBiquity asked me to have a listen to their current Perceptual Audio Coder Version 4, running at 96, 64, 48 and 36 kilobits per second. At the faster end of that range the performance is really remarkably good, generally better than the best CODECs I studied five years ago. At the slower end of things, where things used to be impossible from a musical standpoint, the performance is now musically viable, even if the artifacts are audible.

Now, it needs to be noted that most listeners will detect some artifacts and some listeners will almost certainly find those artifacts to be annoying, and there is no sense in trying to gloss over those truths. In fact, it is the previous attempts to do just such glossing that have gotten us into trouble. But, at the same time there is a quite positive truth to be noted as well. At 96 kilobits per second, we are actually deleting 93% of the data, and the result is a signal that even expert listeners find to be generally acceptable, whose defects are often inaudible. Remarkable! There’s high praise due here for a remarkable improvement and technical accomplishment.

But what’s even better, to my mind, is what has been accomplished at the lower bit rates. At 36 kilobits per second, we have deleted 97.5% (!) of the data (and of course we can hear the effects of that compression). But, just as significantly, the signal has remained musically viable, which is to say that the elements of timbre, dynamics, stereo image and intelligibility remain viable and generally enjoyable. This means that, according to my lofty standards, such compressed signals are pleasant and satisfying for the kind of general-purpose background listening that is the basis for most listening to broadcast music, the kind of listening that occurs in a car, for instance. To me, that’s a tremendous achievement, and a huge improvement over where we were back in 1997. Back then, I doubted that we could ever get down to these bit rates successfully.
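The percentages quoted above can be reproduced with a short sketch, assuming the reference is the raw stereo CD rate (44,100 x 16 x 2 bits per second) and that a "kilobit" means 1,000 bits. The helper names are hypothetical, and the bit rates listed are simply the ones mentioned in this column.

```python
# Reference: uncompressed stereo CD audio, in bits per second.
CD_BITS_PER_SECOND = 44_100 * 16 * 2


def fraction_deleted(codec_kbps: float) -> float:
    """Share of the CD data rate that a codec at this bit rate discards."""
    return 1.0 - (codec_kbps * 1000) / CD_BITS_PER_SECOND


def compression_ratio(codec_kbps: float) -> float:
    """CD data rate divided by the codec's data rate (e.g. 14.7 means 14.7:1)."""
    return CD_BITS_PER_SECOND / (codec_kbps * 1000)


for kbps in (128, 96, 64, 48, 36):
    print(
        f"{kbps:>3} kbit/s: {fraction_deleted(kbps):.1%} deleted, "
        f"about {compression_ratio(kbps):.1f}:1"
    )
```

Under those assumptions, 96 kilobits per second works out to roughly 93% deleted (about 15:1) and 36 kilobits per second to a bit over 97% deleted (about 39:1), in line with the figures quoted here, and a long way from a 1.3:1 audio compressor.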

The Future Ain’t Gonna Be What It Used To

The upshot of all of this is that we are entering yet another iteration of the brave new world, a world in which low-resolution signals are everywhere and their signature artifacts are audible. But we can and will learn to live with them. And in return, we will get a richness of diversity in our broadcast programming that hasn’t been available for quite a while now, if ever.

Meanwhile, this is another step along the path toward on-demand broadcast programming, where we can call up anything we want anytime we want anywhere we want, all for a reasonable cost. That’s where this is all really heading.

Thnksfrlstnng!

DvMltnsbngcmprssdbynl2:1.Ntrstngh?Ycncmplnthmbtnythngthswbst,moultonlabs.com.