Moulton Laboratories
the art and science of sound
Some Reminiscing About My Experiences With Subjective Testing
David Moulton
September 2001

How Dave began his journey down the rabbit hole of audio perception measurement.


Enter the Rasch Model

During this time, my son Mark was working on his Doctorate in education at the University of Chicago. I didn’t know much about what he was doing, but tried to stay in touch. I found his explanations of the Rasch Model (for the first year I thought it was the “Raj” model and, of course, thought it was originated by an Indian mathematician!) difficult, confusing and irrelevant. I couldn’t keep up!

You have to understand. I’m not a mathematician or a researcher. I’m a musician and a recording engineer! So, I used simple-minded common sense. I used some rather primitive common approaches to my measurements. I asked people to make their ratings, and then I added up the scores and created averages. I also created averages for each listener and each bit of program material I used. I then looked at this stuff and estimated how reasonable it all looked. After that assessment, I very cautiously suggested to the client something about how his or her product made out. I tried to be as gentle, kind and encouraging as Floyd had been to me that awful day that we bit the big weenie.

However, I was confused and concerned by some of the perplexing questions that came up when evaluating our measurements and by the occasionally wacky results I seemed to get. One day, while struggling once again to understand Mark’s explanations of his work with the Rasch Model, I asked him if the Rasch Model could be applied to my loudspeaker measurements. He said he thought so.

So. I pulled out some old test data, flew to Chicago, and we went to work. I hoped this would help me to understand what he was doing a little better (father-son bonding, y’know). I spent an intense bunch of days with him. Ben Wright (see Mark’s write-up on Objective Models) kindly invited us to lunch and helped out with data evaluation. It was quite a trip! It was also a moment of epiphany.

In its simplest manifestation, the Rasch Model balances the “ability” of test subjects against the “degree of difficulty” of tasks or questions presented to those subjects. We substituted the “quality of reproduction” by the various loudspeakers under test for the “level of ability” and we substituted an amalgam of “test listener” and “music selection” for “degree of difficulty.” So, in effect, we measured our loudspeakers’ “abilities” on a variety of “test questions” of varying difficulty, “questions” that consisted of the opinions of various listeners listening to various pieces of music and test signals.
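To make that balancing act concrete, here is a minimal sketch of the simplest (dichotomous, pass/fail) form of the Rasch Model, in which the chance of a favorable response depends only on the difference between “ability” and “difficulty,” both expressed in logits. This is an illustration only, not the analysis Mark actually ran: rating data like ours would normally call for a rating-scale extension of the model, and the function name and all the numbers below are hypothetical.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Chance of a favorable response under the dichotomous Rasch Model.

    Both arguments are in logits. Only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical values, chosen only for illustration.
speaker_quality = 1.0     # the loudspeaker's "ability"
lenient_question = -1.0   # an easy listener/music combination
severe_question = 2.0     # a hard listener/music combination

p_lenient = rasch_probability(speaker_quality, lenient_question)
p_severe = rasch_probability(speaker_quality, severe_question)

print(f"favorable rating, lenient question: {p_lenient:.2f}")
print(f"favorable rating, severe question:  {p_severe:.2f}")
```

When ability and difficulty are equal, the probability is exactly one half. The same loudspeaker is more likely to be rated favorably on a lenient “question” than on a severe one, which is the sense in which the model balances the two.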

When we were done, I could see clearly that various listeners had different rating standards and different levels of consistency. I could also see that different selections of music affected the reliability of listeners’ performance. I was particularly interested in Pink Noise. I had wondered how test subjects could rate the quality of reproduction of an abstract signal they had never heard before, or had ONLY heard through a loudspeaker! Although my subjects cheerfully and gamely made such ratings, and by my own rough estimations it seemed that they could meaningfully distinguish differences, the Rasch Model analysis said those ratings were improbable, and therefore probably meaningless. A big light bulb went on for me!

Another big light bulb went on when Mark suspended that meaningless data! I’d always thought you were supposed to keep all the data in, to avoid the introduction of your own prejudices. “Nope,” Mark said, “this data is ‘misfitting.’ That means that some force is acting on it that skews the answers. We have to take it out in order for the findings to be consistent with what other listeners in other tests will find.” Wow! A whole new way of thinking opened up! So now we have a way to identify and get rid of bogus data!
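One common way to put a number on “misfitting” is a mean-square fit statistic: compare each observed response with the probability the model expects, standardize the difference, and average the squares. The sketch below assumes the simple dichotomous model again and computes what is usually called an outfit mean-square. It is not Mark's actual procedure; the data, the function names, and the cutoff of 2.0 are hypothetical and chosen only for illustration.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Chance of a favorable response under the dichotomous Rasch Model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def outfit_mean_square(observed, abilities, difficulty):
    """Average squared standardized residual for one test 'question'.

    Values near 1 mean the responses look like what the model expects.
    Values well above 1 mean the responses are improbable (misfitting).
    """
    total = 0.0
    for x, ability in zip(observed, abilities):
        p = rasch_probability(ability, difficulty)
        total += (x - p) ** 2 / (p * (1.0 - p))
    return total / len(observed)


# Five hypothetical loudspeakers, from poor to good, in logits.
abilities = [-2.0, -1.0, 0.0, 1.0, 2.0]

# 1 = rated favorably, 0 = rated unfavorably, on a question of difficulty 0.
consistent = [0, 0, 1, 1, 1]  # better loudspeakers get the better ratings
erratic = [1, 1, 0, 0, 0]     # ratings run against the loudspeaker ordering

MISFIT_CUTOFF = 2.0  # assumed threshold, for illustration only

fit_consistent = outfit_mean_square(consistent, abilities, 0.0)
fit_erratic = outfit_mean_square(erratic, abilities, 0.0)

for name, fit in [("consistent", fit_consistent), ("erratic", fit_erratic)]:
    verdict = "misfitting" if fit > MISFIT_CUTOFF else "acceptable"
    print(f"{name}: outfit mean-square {fit:.2f} ({verdict})")
```

The erratic pattern scores far above the cutoff because it hands favorable ratings to the weakest loudspeakers and unfavorable ones to the strongest, which the model regards as highly improbable. Data flagged this way would be a candidate for the kind of suspension described above.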

So, through use of the Rasch Model, Mark was able to sort out the varying listeners and music selections, to arrive at an idealized ranking of loudspeakers that would be reproducible by any set of listeners and program materials that were subjected to the same analysis. To me, this was (and still is) a major breakthrough!

It turns out that the Rasch Model dramatically reduces the burden of physically eliminating variables during the test process. It does this by effectively assessing the “probability” of each finding. When a finding is “improbable,” the analysis suspends that finding and “replaces” it with a “probable” one. Thus, using an array of mathematical tools, the Rasch Model permits us to collect the very fuzzy sorts of data that are found in subjective measurements and to evaluate them in a way that proves to be highly robust, reproducible and reliable. Instead of physical rigor in test design, we can use mathematical rigor during evaluation to establish the reliability of our results.

I began to solicit work evaluating audio devices and systems using this technique. This led to the establishment of Moulton Laboratories, and a long-term relationship with Lucent Technologies (morphing to Lucent Digital Radio and now iBiquity) to measure the performance of audio CODECs (devices that compress audio data for radio broadcast transmission). Our work has proved to be extremely robust and reliable, as well as remarkably economical and, happily, quite relevant. We’ve presented several papers, and continue to serve clients occasionally.

We’ve been sufficiently busy with other projects and life issues that we haven’t actively “grown” Moulton Laboratories as a testing service. Nonetheless, we know that we have a powerful, economical and viable method for analyzing and interpreting the subjective responses of test subjects to sensory stimuli, yielding reproducible and reliable rankings. It is our hope to put this to good use in future years, as time and resources permit.

Right now, Mark is beginning to help me measure the benefit of the new loudspeaker technology that Sausalito Audio Works and Bang & Olufsen are developing (see their pages on this website). Soon I hope to be able to share the results of that work with you.

Thanks for listening!