Corey Bailey
Audio Engineering
GLOSSARIES
DIGITAL AUDIO GLOSSARY
Aliasing
Aliasing refers to an effect that causes different audio signals to become
indistinguishable when sampled (aliases of one another). It also refers to the
distortion or artifacts that result when the signal that is reconstructed from
samples is different from the original analog signal.
In the world of digital audio, the highest frequency information often suffers from
aliasing in the form of poorly or incorrectly reconstructed waveforms. Piano,
cymbals and some stringed instruments can be noticeably affected.
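As a rough illustration of two signals becoming "aliases of one another," the sketch below samples an 18 kHz cosine and a 30 kHz cosine at 48 kHz. The sample rate, the two frequencies and the helper name sample_cosine are assumptions chosen for the example, not values taken from any particular converter.

```python
import math

SAMPLE_RATE = 48_000  # samples per second (assumed for this sketch)

def sample_cosine(freq_hz, num_samples, sample_rate=SAMPLE_RATE):
    """Return num_samples of a unit-amplitude cosine sampled at sample_rate."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(num_samples)]

tone_18k = sample_cosine(18_000, 16)  # below the 24 kHz Nyquist frequency
tone_30k = sample_cosine(30_000, 16)  # above it

# Largest sample-by-sample difference between the two sampled tones.
max_difference = max(abs(a - b) for a, b in zip(tone_18k, tone_30k))
print(max_difference)  # effectively zero (floating-point rounding only)
```

Once sampled, the two tones produce the same list of numbers, so nothing downstream can tell them apart.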
Analog-to-digital converter
Abbreviated A/D, ADC or A to D. An audio A/D converter is an electronic device
used to convert analog electrical signals to digital values whose numbers
(combinations of ones and zeros) represent the level (volume) and frequency
information contained in the original analog signal.
Bandwidth
Defined simply as a given range of frequencies (audible or not). We humans, for
example, have an audible hearing bandwidth ranging from approximately 20 Hz
(cycles per second) to about 20 kHz (20 kilohertz, or 20,000 cycles per second).
Bit Depth
Within the realm of sampling an analog waveform (using Pulse Code Modulation
or PCM), bit depth is the number of binary digits (bits) used to store the value
of each sample (16 bit, 24 bit, etc.). Because binary is a Base 2 number system,
a bit depth of N bits can represent 2^N distinct levels. Bit depth sets the limits
of the Dynamic Range of an audio signal. By increasing the bit
depth, smaller fluctuations of the audio signal can be resolved. The noticeable
result for the listener is usually increased clarity and tonality. Listening tests have
shown that an increase in bit depth is more readily detected than an increase in
Sample Frequency.
The rule-of-thumb for bit depth resolution is:
For every 1-bit increase in bit depth, the dynamic range will increase by about 6 dB.
A bit depth of 16 bits yields a dynamic range of 96 dB (16 × 6). A bit depth of 24
bits yields a (theoretical) dynamic range of 144 dB. Notice that an 8-bit increase
in bit depth yields a difference of 48 dB in dynamic range!
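The 6 dB rule-of-thumb can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the usual idealization: dynamic range is taken as the ratio between the number of available levels (2 raised to the bit depth) and a single level. The function name dynamic_range_db is made up for the example.

```python
import math

def dynamic_range_db(bit_depth):
    """Theoretical dynamic range in dB for a given PCM bit depth."""
    return 20 * math.log10(2 ** bit_depth)

for bits in (16, 24):
    print(bits, "bits:", round(dynamic_range_db(bits), 2), "dB")

# Gain from adding a single bit (the "6 dB per bit" rule).
db_per_bit = dynamic_range_db(17) - dynamic_range_db(16)
print(round(db_per_bit, 2))
```

The exact figure per bit comes out slightly above 6 dB, which is why 16 bits is often quoted as 96 dB and 24 bits as 144 dB.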
I say “theoretical” because the best analog equipment we can use to play back
that decoded 24 bit audio has a realistic dynamic range of about 120 dB.
I have written an article on this subject: Bit Depth Defined.
Bit Rate
The bit rate is typically the amount of data bits per second that can be
transmitted over a specified interface. “Interface” in this context could be USB,
FireWire, Parallel, Serial or any number of specified digital protocols. The maximum
bit rate differs from one protocol to the next. USB 2.0, for example, has a
maximum bit rate of 480 Megabits per second (Mbit/s) or 480 million data bits
per second. The terms Bit Rate and Data Rate are used interchangeably.
Common bit rate (or data rate) terms:
Kilobit per second (kbit/s): one thousand bits per second.
Megabit per second (Mbit/s): one million bits per second.
Gigabit per second (Gbit/s): one billion bits per second.
So, now when you see or hear the term Gigabit Ethernet, you will know that it is a
network interface that can convey or transfer one billion bits per second. Pretty
kewel, eh?
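The same units apply to the audio itself. As a small sketch, the bit rate of an uncompressed PCM stream is the sample rate times the bit depth times the number of channels. The function name pcm_bit_rate is an assumption for this example, and the figures used are the CD values mentioned elsewhere in this glossary (16 bit, 44.1 kHz, stereo).

```python
def pcm_bit_rate(sample_rate_hz, bit_depth, channels):
    """Bits per second needed to carry an uncompressed PCM stream."""
    return sample_rate_hz * bit_depth * channels

cd_bit_rate = pcm_bit_rate(44_100, 16, 2)

print(cd_bit_rate)               # 1411200 bits per second
print(cd_bit_rate / 1_000)       # the same rate in kbit/s
print(cd_bit_rate / 1_000_000)   # the same rate in Mbit/s
```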
CD Quality
A term (an oxymoron, in my opinion) that refers to the 16-bit, 44.1 kHz digital audio
file specification used for the Compact Disc.
Digital-to-Analog Converter
Abbreviated D/A or DAC. It performs the inverse of the A/D converter by converting
those digital ones and zeros to an analog waveform that is hopefully pleasing to
listen to.
Dither
Dither is an intentionally applied form of noise. Dither is used to randomize the
quantization error that would otherwise show up as distortion at discrete
frequencies in a digital audio recording.
In layman’s terms, when the signal level in digital audio with a lower bit depth
drops near the smallest step that bit depth can resolve, it can sound grainy.
Noise is often added to mask this anomaly. One
of the trade-offs of applying Dither is that it will slightly raise the noise floor of a
given audio recording. Dither can be applied during recording or after the fact. It
is often one of the last stages of audio production used for Compact Disc.
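To see why added noise helps, here is a minimal sketch under stated assumptions: the signal is a steady level of 0.3 of one quantization step (1 LSB), and the dither is triangular (TPDF) noise made by summing two uniform random values. The names quantize and tpdf_dither, the signal level and the sample count are all chosen for illustration; real dither processors may use other noise shapes.

```python
import math
import random

def quantize(value_in_lsb):
    """Round to the nearest whole quantization step (one step = 1 LSB)."""
    return math.floor(value_in_lsb + 0.5)

def tpdf_dither(rng):
    """Triangular noise between -1 and +1 LSB (sum of two uniform values)."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

rng = random.Random(0)   # fixed seed so the run is repeatable
quiet_signal = 0.3       # a level smaller than one quantization step
num_samples = 20_000

undithered = [quantize(quiet_signal) for _ in range(num_samples)]
dithered = [quantize(quiet_signal + tpdf_dither(rng)) for _ in range(num_samples)]

undithered_mean = sum(undithered) / num_samples
dithered_mean = sum(dithered) / num_samples

print(undithered_mean)  # the quiet signal is lost entirely
print(dithered_mean)    # close to 0.3: the level survives, buried in noise
```

Without dither every sample rounds to zero. With dither the individual samples are noisy, but their average tracks the original level, which is the trade-off described above.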
Dynamic Range
In the world of audio, Dynamic Range is defined simply as the range of volume
from the loudest to the softest of audible sounds. Dynamic Range is expressed in
decibels (dB), which is a logarithmic (Log-10) ratio. We Audio Engineers often
refer to Dynamic Range in terms of the difference between the loudest
undistorted signal that can be recorded and the noise level (floor) of a given
medium. For example: Analog tape has a dynamic range approaching 70 dB when used at
30 inches per second. Your audio CD has a theoretical dynamic range of 96 dB.
The average human can hear a dynamic range of approximately 140 dB.
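Because the decibel is a logarithmic ratio, it can help to convert the figures above back into plain ratios. The sketch below assumes amplitude (voltage-style) ratios, for which the convention is 20 times the base-10 logarithm; power ratios use 10 times instead. The function names are made up for the example.

```python
import math

def ratio_to_db(amplitude_ratio):
    """Convert an amplitude ratio (loudest / quietest) to decibels."""
    return 20 * math.log10(amplitude_ratio)

def db_to_ratio(decibels):
    """Convert decibels back to an amplitude ratio."""
    return 10 ** (decibels / 20)

for label, db in (("analog tape", 70), ("audio CD", 96), ("human hearing", 140)):
    print(label, db, "dB is a ratio of about", round(db_to_ratio(db)), "to 1")
```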
MP3
Known officially as MPEG-1 Audio Layer III (and, in its later extension,
MPEG-2 Audio Layer III). It is a patented digital audio
encoding scheme using a form of lossy data compression. It is a common audio
exchange format for consumer digital audio players. The MP3 file format is
designed to greatly reduce the amount of data required to represent an audio
recording and still sound like the original uncompressed audio for (most)
listeners. An MP3 file that is created using the setting of 128 kilobits per second
(kbit/s) will result in a file that is about 11 times smaller than a Compact Disc file
created from the original analog audio source. An MP3 file can also be
constructed at higher or lower bit rates with the higher bit rates resulting in higher
audio quality. It's worth noting that listening tests have shown that the average
listener can tell the difference in fidelity between an MP3 file and a CD quality
file.
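The "about 11 times smaller" figure follows from the two bit rates. This sketch assumes a four-minute track, CD audio at 16 bit, 44.1 kHz, stereo, and a constant 128 kbit/s MP3, and it ignores file headers and tags. The function name file_size_bytes is hypothetical.

```python
def file_size_bytes(bit_rate_bps, duration_s):
    """Approximate size of an audio stream, ignoring headers and metadata."""
    return bit_rate_bps * duration_s / 8

CD_BIT_RATE = 44_100 * 16 * 2   # uncompressed CD audio, bits per second
MP3_BIT_RATE = 128_000          # 128 kbit/s MP3
duration_s = 4 * 60             # assumed track length: four minutes

cd_size = file_size_bytes(CD_BIT_RATE, duration_s)
mp3_size = file_size_bytes(MP3_BIT_RATE, duration_s)
size_ratio = cd_size / mp3_size

print(cd_size, mp3_size, round(size_ratio, 3))
```

The ratio does not depend on the track length, only on the two bit rates.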
Nyquist Frequency
Named after the Swedish-American engineer Harry Nyquist, the Nyquist
frequency is half the sampling frequency (sample rate) of a discrete signal. For
example: If the sample rate is 48 kHz, the Nyquist frequency would be 24 kHz.
According to the Nyquist theorem, a waveform must be sampled more than twice
per cycle for it to be accurately reproduced. Sampling a waveform fewer than two
times per cycle can produce a form of audible distortion
known as Aliasing. In the process of sampling analog audio, everything above
the predetermined Nyquist frequency is blocked (attenuated actually) by filtering.
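As a sketch of what that filtering prevents, the function below estimates where an unfiltered tone would land after sampling: frequencies above the Nyquist frequency "fold" back into the range below it. Both function names are assumptions for this example, and the folding rule shown applies to a single pure tone.

```python
def nyquist_frequency(sample_rate_hz):
    """Half the sample rate: the highest frequency that can be represented."""
    return sample_rate_hz / 2

def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which an unfiltered tone would appear after sampling."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

print(nyquist_frequency(48_000))        # 24000.0
print(alias_frequency(10_000, 48_000))  # 10000 (below Nyquist, unchanged)
print(alias_frequency(30_000, 48_000))  # 18000 (folded back below Nyquist)
```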
Oversampling
In digital signal processing, oversampling is the process of sampling a signal with
a sampling frequency significantly higher than twice the bandwidth (or the highest
frequency) of the signal being sampled.
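One way to put a number on "significantly higher" is an oversampling ratio: the sample rate divided by twice the signal bandwidth. The sketch below assumes a 20 kHz audio bandwidth, and the function name is hypothetical.

```python
def oversampling_ratio(sample_rate_hz, signal_bandwidth_hz):
    """How many times the sample rate exceeds twice the signal bandwidth."""
    return sample_rate_hz / (2 * signal_bandwidth_hz)

AUDIO_BANDWIDTH_HZ = 20_000  # assumed upper limit of the signal of interest

for rate in (44_100, 176_400, 2_822_400):
    print(rate, "Hz:", round(oversampling_ratio(rate, AUDIO_BANDWIDTH_HZ), 4))
```

A ratio close to 1 means the signal is sampled just above the minimum. Larger ratios indicate oversampling.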
Pulse Code Modulation
Pulse Code Modulation (PCM) is the most widely used method of converting
analog audio to a digital format. Basically, it is defined as the uniform sampling of
an audio waveform at regular intervals. This applies to both frequency (sample
rate) and volume or the magnitude of the signal (bit depth).
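The two halves of PCM (regular sampling in time, then rounding each sample to a fixed number of bits) can be sketched in a few lines. This is an illustration only: the function name pcm_encode, the 1 kHz test tone and the default 48 kHz / 16-bit settings are assumptions, and a real converter would also filter the input and usually apply dither.

```python
import math

def pcm_encode(freq_hz, duration_s, sample_rate_hz=48_000, bit_depth=16):
    """Sample a full-scale sine wave and quantize it to signed integer codes."""
    max_code = 2 ** (bit_depth - 1) - 1            # 32767 for 16 bits
    num_samples = round(duration_s * sample_rate_hz)
    return [round(max_code * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz))
            for n in range(num_samples)]

samples = pcm_encode(1_000, 0.001)  # one cycle of a 1 kHz tone

print(len(samples))   # 48 samples in one millisecond at 48 kHz
print(samples[:13])   # rises from 0 to the positive peak
```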
RED BOOK: The CD standard
The original set of books containing the specifications for all forms of optical
Compact Disc were individually color bound. The book containing the
specifications for Audio CDs was bound in red, hence the name.
(I’m not kidding!)
The Compact Disc standard was developed jointly by Philips & Sony, and the
technical specifications were released in 1980.
SACD
SACD stands for Super Audio CD. SACDs are high-resolution, high-fidelity,
read-only optical discs (physically similar to a DVD) for audio
playback. Developed jointly by Sony and Philips Electronics, SACD recordings
have a wider frequency response and dynamic range than conventional CDs.
The SACD format uses a digital audio scheme called Direct Stream Digital
(DSD), which has a very high sampling frequency of 2.8224 MHz (millions of
cycles per second). That is 64 times the sample rate of a CD.
A stereo SACD recording can stream data at four times the rate of CD stereo
audio.
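Both figures can be checked with simple arithmetic. The sketch below relies on the fact that DSD uses 1-bit samples and takes the CD values from this glossary (16 bit, 44.1 kHz, stereo). The variable names are made up for the example.

```python
CD_SAMPLE_RATE = 44_100                 # samples per second per channel
DSD_SAMPLE_RATE = 64 * CD_SAMPLE_RATE   # 2,822,400 samples per second

cd_bit_rate = CD_SAMPLE_RATE * 16 * 2   # 16-bit samples, two channels
dsd_bit_rate = DSD_SAMPLE_RATE * 1 * 2  # 1-bit samples, two channels

print(DSD_SAMPLE_RATE / 1_000_000)      # 2.8224 (MHz)
print(dsd_bit_rate / cd_bit_rate)       # 4.0
```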
Sample Rate or Sampling Frequency
Sample Rate defines the number of samples per second taken of an analog
waveform within a given bandwidth. The terms Sample Rate and Sample
Frequency are used interchangeably to describe the same thing. I go more in-
depth on this subject here.
Sample Rate Conversion
This is where digital audio that was originally digitized at a given sample rate
(and bit depth) is “re-sampled” to another sample rate and often another bit
depth. For example, a digital file created at 24 bit/48 kHz has to be sample-rate
converted to 16 bit/44.1 kHz so it can be used for Audio CDs. This process is
often accomplished by software using a mathematical formula called an
algorithm or in real time by a hardware converter. The real time process is most
often used in a sound mixing environment where several sources were delivered
that are not all the same sample rate or bit depth. This often happens in the
making of Radio & TV Commercials, Feature Films and Music Production. Either
process can result in the creation of unwanted audible artifacts, which is why
everyone tries to maintain the same specifications for a given project.
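Part of what makes the 48 kHz to 44.1 kHz conversion awkward is that the two rates are not a simple multiple of each other. The sketch below only works out the ratio and the resulting sample count. It is not a resampler, and the function name resample_ratio is an assumption for the example.

```python
from fractions import Fraction

def resample_ratio(source_rate_hz, target_rate_hz):
    """Exact ratio of output samples to input samples."""
    return Fraction(target_rate_hz, source_rate_hz)

ratio = resample_ratio(48_000, 44_100)
print(ratio)                       # 147/160

# One second of 48 kHz audio becomes this many samples at 44.1 kHz.
output_samples = 48_000 * ratio
print(output_samples)              # 44100
```

Every 160 input samples have to be turned into 147 output samples, so most output samples fall between input samples and must be calculated rather than copied.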
Word Clock (also known as Sample Clock)
Digital audio is created by taking a "sample" of an analog signal on a periodic
basis, say 48000 times per second (the "Sample Rate"). A dedicated clock, the
"sample clock," ticks at that rate, and, every time it does, a new sample is
measured. Sample clocks are built into all devices that handle digital audio and
video. Thus, your CD player and your DVD player have sample clocks built into
them in order to stream the data accurately enough to convert the signal to
analog audio or video.
Whenever you connect two digital audio or video devices together in order to
move data from one to the other, you must ensure they share the same sample
clock. Why is this necessary? The oscillating crystals used for sample clocks are
generally very stable, but there are always minute differences in the frequency of
any two or more clocks. When used individually, this is not a problem, but
connect two digital devices together and those minute differences will
accumulate over time. Eventually, one of the devices will be trying to read a
sample in the middle of the other device's tick, and the result is a small click or
pop in the audio stream or noticeable jump in the picture. In the consumer realm,
when you connect your CD/DVD player to your home theater processor via a
digital interconnect cable, the theater processor will adjust its clock to the
incoming data stream, and all works well. In the professional world, many digital
devices are often connected together or to a single source (a mixer, for example),
and, in order to avoid clocking errors over time, a central clock source is used
and fed to all of the equipment. This central clock is known as a Word Clock
generator.
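To get a feel for how quickly "minute differences" add up, here is a rough sketch. It assumes two free-running clocks at a nominal 48 kHz whose frequencies differ by 10 parts per million (ppm). That figure is an assumption for illustration only, not a specification for any real device, and the function name is hypothetical.

```python
def seconds_until_one_sample_slip(sample_rate_hz, clock_offset_ppm):
    """Time for two unsynchronized clocks to drift apart by one whole sample."""
    drift_samples_per_second = sample_rate_hz * clock_offset_ppm * 1e-6
    return 1 / drift_samples_per_second

slip_time = seconds_until_one_sample_slip(48_000, 10)
print(round(slip_time, 2), "seconds")
```

Under these assumptions the two devices are a full sample apart after only a couple of seconds, which is why a shared clock is used instead of relying on each device's own crystal.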
© Corey Bailey Audio Engineering