Moulton Laboratories
the art and science of sound
What IS The Sound Of One Amp Clipping?
Dave Moulton, assisted by Alex Case and Peter Alhadeff
September 1994

Our Intrepid Author Ventures (or Sinks?) Deeper Into The Swamps of Subjective Listening Tests


How We Tested ‘Em

The Bryston 4B (250 W/channel into 8 ohms, $2095) is a staple in studios around the country. It is generally held to be an excellent amplifier, which is why I bought one - more about that later. So, I figured we’d use it as a reference amplifier. Out back, in my Audio Musée Et Parc Du Junque, I found an old Crown D75 (35 W/channel into 8 ohms, ca. $300 in 1975) that I use occasionally as a headphone amp, etc. I then called Mark Parsons (Parsons Audio, Wellesley, MA) and asked for some loaners. As my account is paid up, he said sure, and loaned me a Hafler Pro 2400 (120 W/channel into 8 ohms, $630) and a Yamaha P2350 (175 W/channel into 8 ohms, $949).

I didn’t test any tube amps nor any of the comparatively lightweight sound-reinforcement amps that use switching power supplies, because (a) Parsons didn’t have ‘em available and (b) they aren’t normally used in studios. Their virtues and vices, therefore, are outside the range of this article, and what I found may not apply to them.

For speakers, I used my custom wide-dispersion reference speakers, some Auratones, some LAR 2-way bookshelf speakers, and a pair of JBL 4313 ported small studio monitors.

For technology, I used an ABX switcher plus the TEF20 analysis system to control and document this mess. The switcher allows the listener(s) to switch back and forth between power amps A & B at will, and then to select X (a random choice of either A or B, with identity concealed until requested by the listener). If you correctly identify X as A or B only around 50% of the time, your performance is indistinguishable from chance, and it is reasonable to infer that you cannot reliably hear, or at least identify, a difference between A and B.
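To put a number on "indistinguishable from chance," here is a minimal Python sketch of the usual way to score an ABX run: a one-sided binomial test against pure guessing. The function name `abx_p_value` and the example scores are mine, chosen for illustration; this is not the procedure the ABX switcher or the TEF20 uses, and we did not run our own numbers this way.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` of `trials` right by guessing.

    Assumes every trial is an independent 50/50 guess (hypothetical model of
    a listener who hears no difference at all).
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Every sequence of right/wrong guesses is equally likely under guessing,
    # so count the sequences with `correct` or more hits and divide by the
    # total number of sequences, 2 ** trials.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Illustrative scores only: a few sets of ten and one run of 100 trials.
    for correct, trials in [(5, 10), (7, 10), (9, 10), (70, 100)]:
        p = abx_p_value(correct, trials)
        print(f"{correct}/{trials} correct: p = {p:.4g}")
```

A small p-value means guessing alone would rarely produce a score that good. Note how much the number of trials matters: 7 out of 10 happens by luck about 17% of the time, while 70 out of 100 happens by luck well under one time in a thousand.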

For ears, I used my own, plus I got help from my daughter (good ears, untrained), Tom Plsek (acoustician and trombonist, excellent ears), Alex Case (recording engineer, beginning – excellent and trained ears) and Carl Beatty (recording engineer, experienced – excellent and experienced ears) as listeners. We listened, in several different sessions over about a week’s time, to a variety of stereophonic recordings (and occasionally Pink Noise) at a variety of levels. Mostly, we listened on my reference speakers, but we switched speakers on occasion to see if it mattered. It didn’t.

Listening in sets of ten trials at a time, we would toggle back and forth between A and B and then see if we could tell which X was. It turned out that this was never easy except by accident (see below). Between trials, we discussed at length what we heard and the nature of the differences we perceived.

What We Found

Because these tests were fairly informal, and because we are not actually rating these amplifiers, I will not bore you with the statistical results of our tests. Instead, I’ll jump to our conclusions and then go back and share with you some of the insights we gained along the way.
  • First, power amps sound pretty much alike until you run into their power limits. Personally, I couldn’t tell the difference at all between the Crown D75, the Hafler or the Bryston until I got the overload lights going on the D75. Then it was easy to hear the Crown as being muddier, less distinct, distorted, thick, etc.
  • Second, we couldn’t measure any significant differences between the frequency responses of the test amps within the audible range.
  • Third, Alex Case was able to make a statistically significant differentiation between amps (his overall score, for 100 trials, was around 70% correct). This means that he heard differences that cannot reasonably be described as “chance.” The rest of us scored in the “chance” range (40 - 60% correct answers). Alex allowed as how he was straining to pick out differences, and his notes reveal no consistent correlations such as Amp C always being “smoother.” Nonetheless, his results make clear that there is an audible difference, however subtle, for at least some of us.
  • Fourth, the single amp that was somewhat reliably differentiated was the Yamaha (Alex heard it correctly 90% of the time – I managed 80%), and both of us heard its differences as subtle changes to the reverberance (er, sense of envelopment) in recordings. Alex felt the difference was big enough to possibly affect mix decisions, while I felt like I was grasping at straws the whole time. We speculate that the Yamaha’s lower damping factor (100 vs. 300 for the Hafler and 500 for the Bryston and Crown) is what led to this differentiation.
  • Fifth, small set-up errors in level settings led to easy differentiation – we found we had to be Really Fussy about level matching. See the stories below.
  • Sixth, all of us regularly heard differences (sometimes several of us at once agreed we heard the same things) that defied objective measurement or our ability to reliably identify those differences in X. Big Insight: just because you hear it doesn’t mean it’s real! So, yes, Virginia, there is an Audio Twilight Zone Factor! This is an extremely important point, and it reveals something about why this sort of work can be so difficult and hard to pin down. Humans will impose “meaning” and patterns upon their perceptions of random conditions. I often “hear” orchestral music in Pink Noise, for instance, and if I watch snow on television (which is visual noise), I see shapes and patterns. The so-called Rorschach Test, in fact, uses this tendency as a basis for studying our thoughts. So, in perceptual tests we need the rigor of the ABX test to “prove” our perception of differences. It is not enough to “hear” a difference. We must be able to reliably identify that difference in order to be sure that the difference is there. This became painfully clear to all of us in these tests.
  • Finally, and this is quite important, all of us felt the differences we heard were generally in the “vanishingly small” category. None of the differences were large, and only one even fit into the “small” category. At the same time, we were able to easily identify some really pretty small differences due to measured errors, which suggests our inability to pick out differences wasn’t simply due to test equipment deficiencies. In this regard, our tests also comply with the Range Rule, in that we were listening to conventional recordings with conventional equipment in a reasonably conventional room. So, the small and vanishingly small differences we heard are probably going to be very similar to the differences you, your clients and your end-users will hear.
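For a rough sense of scale on the damping factor figures quoted above, here is a small Python sketch. It uses the standard definition (damping factor = rated load impedance divided by amplifier output impedance) and treats the amp and speaker as a simple voltage divider. The 8-ohm reference, the purely resistive loads, and the 4- and 16-ohm example values are my assumptions, not measurements from these tests; speaker cable resistance is ignored, and the function names are made up for illustration. It neither confirms nor refutes our speculation about the Yamaha.

```python
import math


def output_impedance(damping_factor: float, rated_load_ohms: float = 8.0) -> float:
    """Output impedance implied by a damping factor rated at `rated_load_ohms`."""
    return rated_load_ohms / damping_factor


def level_at_load_db(damping_factor: float, load_ohms: float,
                     rated_load_ohms: float = 8.0) -> float:
    """Level at the speaker terminals relative to an ideal (zero-ohm) source.

    Models the amp's output impedance in series with a purely resistive load
    (an assumption; real speaker impedance varies with frequency).
    """
    z_out = output_impedance(damping_factor, rated_load_ohms)
    return 20 * math.log10(load_ohms / (load_ohms + z_out))


if __name__ == "__main__":
    # Damping factors as quoted in the article; load values are hypothetical.
    for df in (100, 300, 500):
        z_out = output_impedance(df)
        levels = ", ".join(
            f"{load:g} ohm: {level_at_load_db(df, load):+.3f} dB"
            for load in (4.0, 8.0, 16.0)
        )
        print(f"DF {df}: Zout = {z_out:.3f} ohm; {levels}")
```

Under these assumptions a damping factor of 100 implies an output impedance of 0.08 ohm and a drop of roughly 0.09 dB into 8 ohms, with correspondingly less for the higher damping factors. Those are small fractions of a dB, which is the same territory as the level-matching errors we had to be so fussy about.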