

Date: Sun, 01 Jul 2001 18:47:29 +1000
From: Robin Whittle
Organization: First Principles
X-Accept-Language: en
Subject: Re: [music-dsp] 24bit/96khz? what for?

co melvin wrote:

> can someone explain the motivation for this
> 24bit/96kHz. i really cannot see why companies push
> for this technology if 48kHz is already more than
> enough

We had a big discussion on this around 10 to 12 February this year. My
position is that properly done (as it almost always is with modern
sigma-delta ADCs) 44.1 kHz is perfectly adequate. That is, there is no
perceptible degradation in the sound, at least due to sampling rate, for
a final mix in any normal, practical, listening situation.

As for bit depth, I believe that 16 bits is fine for any finally
mixed piece of music in almost any practical listening situation. That
is, the dither noise (which is all there is in a proper system, there is
no distortion, just one step of white or perhaps shaped spectrum dither)
will be lower than what is perceptible (or damn close to it) for all
listening situations which are not painfully loud when the signal
reaches the loudest limits. If you want painful levels and
imperceptible noise, then yes, you need more than 16 bits.
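The "one bit is worth about 6 dB" arithmetic underlying this can be sketched numerically. This is the standard ideal-quantizer figure (6.02N + 1.76 dB), not a number from the post itself; real converters and TPDF dither shave a few dB off it:

```python
# Theoretical dynamic range of an ideally dithered N-bit converter.
# Rule of thumb: one bit ~ 6 dB; the exact ideal-quantizer figure
# is 6.02*N + 1.76 dB (full-scale sine vs. quantization noise).
def quantizer_snr_db(bits):
    return 6.02 * bits + 1.76

for bits in (16, 20, 24):
    print(f"{bits} bits: ~{quantizer_snr_db(bits):.1f} dB")
# 16 bits -> ~98.1 dB, which is why 16 bits covers any listening
# level short of painfully loud playback with your ear at the speaker.
```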

Also, if you want to go near the speaker in the quiet bits to try to
listen for the dither, and then retreat to a safe distance before the
music hits full volume, then you need more than 16 bits too.

For recording, I would prefer to use a 20 or so-called "24" bit ADC.
The lower noise (though few are probably a great deal better than 16 bit
in reality) and greater bit depth means you can be not so fussy about
gain and overloads from signals which are louder than you anticipated.

When it comes to microphones, it is hard to get one with low enough
background noise to make the limitations of the 16 bit ADC in my Sony D8
DAT recorder the limiting factor. Of course, for really loud physical
signal levels, the background noise of the ADC would be the limiting
factor, since the gain between the mike and ADC would be turned right
down.

Mike noise is approximately pink. In a good mike (say an electret
condenser mike or an externally energised condenser), the noise of the
mike is not created to any significant extent by the amplifier
electronics (primarily the single FET which buffers the signal
initially) but by the Brownian motion bombardment of the diaphragm by
air molecules. The only fix for this is to get a bigger diaphragm so
the bombardment cancels out.

The DPA (previously B&K) site has lots of details on various mikes, from
the tiny to the very large, all of which I think would be operating
within a whisker of the physical limits for their diaphragm size.

(Somehow, the human ear seems to do better than these physical limits.)

With the D8, I did spectrum analysis of the actual ADC background noise
with the Line In. I compared this with an artificially generated one
step dither signal. Not counting "birdies" (low level oscillations in
the imperfect ADC in a low or zero signal condition) the noise was about
7dB above dither at 1 kHz and 8 or 9dB above at 15 kHz. So in terms of
noise, the ADC was about 15.7 bits or so in the most audible 2 to 5 kHz
band, since one bit is 6dB. I did some tweaks to the input amp and
driver circuit, and this seemed to reduce the noise by about 2dB, which
is just noticeable, but which I was quite proud of. Later, I found that
after the mods the ADC could be in one of two modes, determined randomly
at the start of recording: one was improved and the other about the
same as before. Why this is the case, I have no idea. The mods were
extremely difficult - removing surface mount components, cutting tracks
and installing tiny caps and 1/8 watt resistors where there is almost no
space for them - so I won't be publishing them. It was basically a
matter of eliminating everything which cut the signal down en-route to
the ADC.

I understand that with a 20 bit ADC (AKM5351) such as the one in the
Zefiro InBox, the "S/N ratio is really only giving you about 16.5 bits
of real information above the noise floor."
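That "16.5 real bits" claim is easy to check with the same 6 dB-per-bit arithmetic. A small sketch (the ~101 dB S/N figure for the AKM5351 is an assumption chosen to match the quote, not a datasheet value I have in front of me):

```python
# Convert between a converter's S/N ratio in dB and its "real bits"
# of information above the noise floor, using the ideal-quantizer
# relation SNR = 6.02*N + 1.76 dB.
def bits_to_snr_db(bits):
    return 6.02 * bits + 1.76

def snr_db_to_bits(snr_db):
    return (snr_db - 1.76) / 6.02

# A "20 bit" ADC with roughly 101 dB S/N (assumed figure) delivers:
print(round(snr_db_to_bits(101.0), 1))  # ~16.5 real bits
```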

In my experience (which of course is limited), the only situation in
which the dynamic range is so wide that either mike noise or ADC noise
was a problem with respect to the loudest signals was when recording
spankings. Then, I want to record the faintest murmurs, but the impact
sounds are immense by comparison. I use cheap electret mikes (good
ones like the AKG CK-32 are way too sensitive) and clip them with
series resistors and back-to-back germanium diodes to ground en-route
to the DAT recorder's Line In. You wouldn't want to listen to the real
dynamic range in this situation anyway - it would be bad for your ears.
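The series-resistor-plus-diode-pair limiter can be modelled crudely in a few lines. This is a toy hard-clamp approximation, and the ~0.3 V germanium forward voltage is a textbook ballpark, not a value taken from the actual circuit:

```python
# Toy model of a series resistor feeding back-to-back germanium
# diodes to ground: below the diode forward voltage the signal
# passes; above it, the diodes conduct and clamp the level.
GERMANIUM_VFWD = 0.3  # approximate forward voltage in volts (assumed)

def diode_clip(v, vfwd=GERMANIUM_VFWD):
    # Hard-clamp approximation of the diode pair's limiting action;
    # a real diode knee is softer than this.
    return max(-vfwd, min(vfwd, v))

for v in (0.05, 0.2, 1.0, -5.0):
    print(v, "->", diode_clip(v))
# Quiet murmurs (0.05 V) pass untouched; impact transients are
# clamped to +/-0.3 V before they reach the Line In.
```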

As to why people mortgage their homes to pay for 96kHz/24 bit systems .
. . well . . . "My sampling rate's bigger than your sampling rate" . . .
Also, for whatever reason people rewire their studio with oxygen-free
copper which has been time-aligned and cellared for seven years in
proximity with vast quantities of hyper-pure H2O so as to thoroughly
imbue it with infinitely accurate musicality.

I have conducted a test as rigorous as any I can think of, in which I
was unable to hear the difference between a live 12 string guitar and
the signal passed through a 16 bit ADC/DAC in a Tascam DA-30 DAT
recorder. I believe that 44.1kHz/16 bit is really excellent, and
properly done has no audible degradation (other than background noise).

Some people swear that 96kHz/24 bit sounds better than plain old
44.1kHz/16 bit. I think that either they are listening to lousy 44.1
kHz/16 bit systems, or they are comparing it with simple 16 bit
truncation (without dither) of a higher bit depth signal (which would
sound lousy) or they are imagining things.

It is easy to imagine things and for marketers to have a field day with
audio gear. It is a quasi religious field in which we want to eliminate
errors which are smaller than we can hear, just to be on the safe side.

A good test would be a chain of 16 bit ADC/DAC systems, all the same,
and see how many we could listen through before there was an audible
difference between the original live source (no digital recordings at
all, it must be a live, miked, signal) and the output of the chain. Of
course you expect background noise to rise. You need impeccable DACs
too, since all we really want to test is the ADCs. (Ordinary
current-source DACs have quite a voltage linearity glitch at zero, so
if I was doing this test with such DACs, I would bias the signal so
there was no glitching around zero. Sigma-delta ADCs - and I guess
DACs - have no such glitch.)

If, after about 8 conversions, an audible difference is evident, then it
would be reasonable to say that for this particular 16 bit 44.1kHz
system, there is audible degradation, but it is about 1/8 of what is
audible.
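The expected background-noise rise through such a chain is easy to estimate: if each ADC/DAC pass adds the same amount of uncorrelated noise, the noise powers add, so the floor rises by 10*log10(N) dB after N passes. A quick sketch of that arithmetic (my framing, not a calculation from the post):

```python
import math

# If every ADC/DAC generation adds equal, uncorrelated noise power,
# total noise power after N passes is N times one pass, i.e. the
# noise floor rises by 10*log10(N) dB relative to a single pass.
def noise_rise_db(n_passes):
    return 10 * math.log10(n_passes)

for n in (1, 2, 4, 8):
    print(f"{n} passes: +{noise_rise_db(n):.1f} dB")
# After 8 conversions the floor is ~9 dB (about 1.5 bits) higher,
# so some noise rise is expected even if each stage is blameless.
```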

In the discussion in February some people accused me of having lousy
ears, lousy speakers, lousy headphones, lousy amplifiers etc. - to
justify their position that 44.1kHz/16 bit, no matter how well done,
damages sound audibly. My ears are better than many sound engineers'
(in part because I am not a sound engineer and have not suffered the
consequent damage) and these critiques did not convince me that what I
did was invalid. I don't mind if people don't accept my position.

- Robin

// Robin Whittle           Consulting and telco tech writing; Internet
// Melbourne, Australia    music marketing; Audio compression; DSP;
// First Principles        Show and Tell; 21 Metre Sliiiiiiinky;
//                         Fondly and Firmly - the Gentlemanly Art of...
// Real World Interfaces   Electronics for music, including Devil Fish
//                         TB-303 modifications & Akai sampler memory.

dupswapdrop – the music-dsp mailing list and website: subscription info,
FAQ, source code archive, list archive, book reviews, dsp links

From: "Andrew Simper"
Subject: Re: [music-dsp] 24bit/96khz? what for?
Date: Sun, 1 Jul 2001 18:35:14 +0800
Organization: vellocet
X-Priority: 3

> For recording, I would prefer to use a 20 or so-called "24" bit ADC.
> The lower noise (though few are probably a great deal better than 16 bit
> in reality) and greater bit depth means you can be not so fussy about
> gain and overloads from signals which are louder than you anticipated.

Hey Robin,

Did you read the article written by Michael Stavrou (with some help from
John Williams) in issue 13 of AudioTechnology (Australia)? He reported
being able to listen to two recordings of acoustic guitar and pick every
time which one was done digitally. This was due to the loss of audibility
of the very quiet sound of the finger leaving the string. As you point
out, as a final production format 16 bit 44.1kHz is great for most
situations, but when recording you are safer to use 20 bits so you have
a bit more headroom.

When it comes to processing and filtering digital signals you are generally
better off oversampling the signal. This reduces the feedback delay in
digital simulations of analogue circuits, which is important for stability
and more "linear" behaviour so you don't need compensation lookup tables
etc. The main drawback of oversampling is cpu load because you have to
process many more samples, and you usually have to use double precision
arithmetic to maintain low frequency accuracy (tuning of filters etc).
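One way to see the low-frequency accuracy problem at higher (oversampled) rates: a one-pole lowpass has its pole at exp(-2*pi*fc/fs), which crowds toward 1.0 as fs rises, leaving fewer mantissa bits to represent the difference from 1. A minimal illustration (the 20 Hz cutoff and 4x-oversampled rate are arbitrary example values):

```python
import math
import struct

# A one-pole lowpass y[n] = a*y[n-1] + (1-a)*x[n] has pole
# a = exp(-2*pi*fc/fs). As fs rises, a approaches 1.0, so single
# precision has fewer bits left for the small quantity (1 - a).
def pole(fc, fs):
    return math.exp(-2 * math.pi * fc / fs)

def to_float32(x):
    # Round-trip through a 32-bit float to expose the rounding error.
    return struct.unpack('f', struct.pack('f', x))[0]

for fs in (44100, 176400):  # base rate vs. 4x oversampled
    a = pole(20.0, fs)
    print(fs, a, abs(a - to_float32(a)))
# The pole is closer to 1.0 at 176400 Hz, which is why double
# precision is usually needed to keep low filters accurately tuned.
```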

So a higher bit depth makes recording the original dynamics easier, and a
higher sampling rate makes digital processing "easier" after the signal has
been captured. Both of these things are of little concern to your average cd
listener, but they are of some concern to those interested in recording and
processing audio signals.


Ok, here's a little perception lecture. Bear in mind
this stuff is controversial.

When I say the word "tone" I mean a sound of constant
perceived pitch. Pitch is a sensation, like warmth or
taste. Hermann Helmholtz in his famous treatise
"On the Sensations of Tone" drew the conclusion that the sensation
of pitch results from the presence of sine wave components
in a sound, and is related to the frequency of the fundamental
sine wave component. This conclusion was largely supported
by the work of the mathematician Ohm. Yes, the same Ohm.
It was Ohm who suggested that all tones could be resolved
into components of harmonically related frequencies, which
could be expressed using a transform developed by the
mathematician Fourier.

Now a few years prior to Ohm, another mathematician named
Seebeck had developed the theory that the sensation of
pitch was actually related to the period of the waveform
of a tone. Now it so happens that the period of the
waveform is also the period of the fundamental sine wave
component. However, a tone can have the period of the
fundamental frequency with little or no actual sine wave
component at the fundamental frequency. And recent psychoacoustical
studies have shown conclusively (and I have demonstrated for
myself) that even in the absence of a sine wave component
at the fundamental frequency of a waveform we still hear
the pitch of tone to be the fundamental frequency. In
other words, Seebeck was more correct than Ohm.

So what does this mean? It means that until the 70's when
this study was done, all analysis, synthesis and reproduction
of audio tones had been related to the notion that we hear sine
waves. And since we can't hear pitch above 20 K, we can't
hear anything with sine waves above 20 K. But in fact, many
people with good hearing can hear the fundamental frequency
of a waveform composed of sine waves at 22K, 33K, and 44K. We hear a
tone at 11K, the fundamental frequency of those components. And
furthermore, the timbre of the tone we hear is strongly related
to the presence of those components. We don't hear the components
as pitches themselves. We hear the resultant of their combination.
Is this effect subtle? Absolutely. Are there many tones with
components that high in music? No. But there are some, and these
are deleted by 44.1 K sampling and replaced with digital transients.
That's why CDs sound "crisp".
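The residue-pitch claim above has a simple arithmetic core: a waveform built from sines at 22, 33 and 44 kHz repeats at the greatest common divisor of those frequencies, 11 kHz, even though it contains no 11 kHz component. A quick check of that period arithmetic (this verifies the numbers, not the perceptual claim itself):

```python
import math

# Components at 22, 33 and 44 kHz are the 2nd, 3rd and 4th
# harmonics of 11 kHz. Their sum repeats with period 1/11000 s -
# the gcd of the component frequencies - even though the 11 kHz
# fundamental sine component itself is absent.
components_hz = [22000, 33000, 44000]
fundamental_hz = math.gcd(*components_hz)
print(fundamental_hz)          # 11000
print(1.0 / fundamental_hz)    # period of the composite waveform, s
```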

> Another good reason is for each doubling of sample rate, latency is
> halved so eventually a near analog performance will be achieved. Latency

Not true. The limiting factor in latencies is the DSP's interrupt response
jitter, not some magic constant (unless you work on single-sample pipes).
The DSP interrupt response jitter is measured in microseconds, and thus when
calculated in units of samples, will go up as sampling rate goes up.
Actually, because you need to process more samples in a given amount of
time, your latency may have to go UP. Here's how it works:

Interrupt jitter = J (seconds)
Sampling rate = Fs (samples/second)
Buffer size = B (samples)

Duration of buffer = B/Fs
Available processing time per buffer = B/Fs - J
Max CPU utilization before overload = (B/Fs - J)/(B/Fs)
Time per sample = (B/Fs - J)/B
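These formulas are easy to play with directly. A small sketch (the 100 microsecond jitter and 64-sample buffer are illustrative values, not measurements from any particular system):

```python
# Direct implementation of the latency formulas above.
# b = buffer size in samples, fs = sample rate, j = interrupt
# response jitter in seconds.
def buffer_duration(b, fs):
    return b / fs

def max_cpu_utilization(b, fs, j):
    d = buffer_duration(b, fs)
    return (d - j) / d  # fraction of real time left for DSP work

J = 100e-6  # assumed 100 us interrupt jitter (illustrative)
B = 64      # assumed buffer size in samples (illustrative)
for fs in (44100, 96000):
    print(fs, round(max_cpu_utilization(B, fs, J), 3))
# With B fixed, raising Fs shrinks the buffer duration while J stays
# constant, so the usable CPU fraction drops - the point made above.
```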

As Fs increases, your "Max CPU utilization before overload" will decrease,
and thus you're running at less and less efficiency (because the benefits of
block processing disappear with smaller block sizes).

When you increase B to keep B/Fs the same, you'll still have a lower "Time
per sample". Building a system requires choosing the target CPU power,
looking at the available OS and hardware support (which gives J) and then
setting B as low as it can go for the Fs you're targeting. If low latency is
your absolute requirement, then that's slightly easier/cheaper to achieve
with a lower Fs.


/ h+
