Introduction to MIDI and Computer Music: Quiz 2 Review
In a sound wave, air molecules are alternately compressed together and expanded apart, resulting in the pattern of air pressure change that propagates as a sound wave.
If a wave shape repeats over and over, we hear the frequency of its repetition as pitch.
20 Hz to 20000 Hz (or 20 kHz)
When you double the frequency, the perceived pitch rises by an octave. A low G on the piano is about 100 Hz. The G's above that are: 200 Hz, 400 Hz, 800 Hz, 1600 Hz, 3200 Hz, etc. So an exponential increase in frequency results in a linear increase in pitch.
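The doubling pattern can be checked with a few lines of Python. This is only an illustrative sketch: the function names are made up for this review, and 100 Hz is the rounded value for low G used above. It shows that multiplying the frequency by 2 each time adds exactly one octave each time.

```python
import math

def octaves_above(base_hz, count):
    """Return base_hz followed by the next `count` octaves above it."""
    return [base_hz * 2 ** n for n in range(count + 1)]

def octave_distance(f1_hz, f2_hz):
    """Pitch distance in octaves: the base-2 logarithm of the frequency ratio."""
    return math.log2(f2_hz / f1_hz)

gs = octaves_above(100.0, 5)
print(gs)                                        # [100.0, 200.0, 400.0, 800.0, 1600.0, 3200.0]
print([octave_distance(100.0, f) for f in gs])   # [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
```

The frequencies grow exponentially while the octave count grows in even steps of one.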
The amplitudes of their sound waves. Amplitude is a measure of how extreme the compression and expansion of air molecules is in a sound wave.
Just one. Unlike any other waveform, a sine wave is completely pure and contains only one frequency component: the fundamental. The fundamental frequency is considered to be the first partial. By contrast, a triangle wave contains many odd-numbered partials above the fundamental.
A harmonic partial is a frequency component that is an integer multiple of the fundamental frequency. An inharmonic partial is a frequency component that is a non-integer multiple of the fundamental. For example, if we have a fundamental frequency of 100 Hz, it could have harmonic partials at 100 Hz (100 * 1), 200 Hz (100 * 2), 300 Hz (100 * 3), 400 Hz (100 * 4), etc. If it had partials at 234 Hz (100 * 2.34) or 440 Hz (100 * 4.4), these would be inharmonic partials. Clearly pitched instruments like pianos and oboes tend to have harmonic partials (or partials that are very close to integer multiples). Instruments with less clear pitch, like church bells, drums or cymbals, have predominantly inharmonic partials.
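As a rough sketch of the distinction, the Python below labels a partial by how close its ratio to the fundamental is to a whole number. The function name and the 1% tolerance are assumptions made for illustration (real instruments are only approximately harmonic, so some tolerance is needed); they do not come from the course materials.

```python
def classify_partial(partial_hz, fundamental_hz, tolerance=0.01):
    """Label a partial 'harmonic' if its ratio to the fundamental is
    within `tolerance` of a whole number (assumed threshold), else 'inharmonic'."""
    ratio = partial_hz / fundamental_hz
    nearest = round(ratio)
    if nearest >= 1 and abs(ratio - nearest) <= tolerance:
        return "harmonic"
    return "inharmonic"

for hz in (100, 200, 300, 400, 234, 440):
    print(hz, classify_partial(hz, 100))
```

With a 100 Hz fundamental, the first four partials are labeled harmonic, and 234 Hz and 440 Hz are labeled inharmonic.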
Brighter timbre is due to the presence of relatively strong upper partials. These partials are greatly reduced in strength for sounds having a mellower timbre.
(See near the end of the "Acoustics" PowerPoint slides.)
Formants are areas of resonance that have a fixed frequency. These contribute a lot to the characteristic sound of an instrument. For example, an acoustic guitar has fixed areas of resonance due to its shape. No matter what note you play on a guitar, you'll always hear these particular resonances. Think of the guitar body as a filter with multiple peaks spread across the frequency spectrum. The center frequencies of these peaks never change. This "filter" emphasizes whatever frequency components of a note happen to fall in the peaks.
If you put a recording of a guitar note in a sampler, and play it at something other than its original pitch, the formants will be transposed, along with the note and its partials. In other words, the guitar body "filter" peaks are shifted. If you transpose an octave up, it sounds like a tiny toy guitar; if you transpose an octave down, it sounds like a giant guitar with heavy strings.
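The sketch below shows why this happens: a sampler changes pitch by changing the playback rate, which multiplies every frequency in the recording by the same ratio. The function names are hypothetical, and the formant peak values are invented for illustration; they are not measurements of a real guitar.

```python
def transpose_ratio(semitones):
    """Playback-rate ratio for a transposition in equal-tempered semitones."""
    return 2 ** (semitones / 12)

def transpose_by_resampling(freqs_hz, semitones):
    """Changing playback rate scales every frequency by the same ratio."""
    ratio = transpose_ratio(semitones)
    return [f * ratio for f in freqs_hz]

# Illustrative numbers only (not measured from any instrument).
note_partials = [100.0, 200.0, 300.0]
body_formants = [400.0, 800.0]

# One octave up: the note's partials AND the body's formant peaks both double.
print(transpose_by_resampling(note_partials, 12))  # [200.0, 400.0, 600.0]
print(transpose_by_resampling(body_formants, 12))  # [800.0, 1600.0]
```

On a real guitar, playing an octave higher would double the partials but leave the body resonances where they were.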
This is why Digital Performer provides the Spectral Effects command, which lets you transpose the formant peaks and the frequency components of a sound independently of each other. If the formant peaks hold steady or shift just a little, the result sounds more natural.
ADSR = Attack, Decay, Sustain, Release

See the explanation in Assignment 3, part 2.
See the explanation in Assignment 3, part 2.
LFO = Low Frequency Oscillator
An LFO generates cyclical change at sub-audio frequencies (i.e., frequencies below our range of hearing, which begins around 20 Hz).
Use an LFO to change frequency (creating vibrato), amplitude (creating tremolo), or the cutoff frequency of a filter (creating a "wah-wah" effect).
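Here is a minimal sketch of two of these uses, assuming a sine-shaped LFO. The function names, the 5 Hz rate, and the depth values are all assumptions chosen for illustration, not settings from any particular synthesizer.

```python
import math

def lfo(rate_hz, t_sec):
    """Sine LFO: a slow oscillation between -1 and 1."""
    return math.sin(2 * math.pi * rate_hz * t_sec)

def tremolo_gain(t_sec, rate_hz=5.0, depth=0.5):
    """Amplitude multiplier that swings between 1 - depth and 1."""
    return 1.0 - depth * (0.5 + 0.5 * lfo(rate_hz, t_sec))

def vibrato_freq(t_sec, center_hz=440.0, rate_hz=5.0, depth_semitones=0.5):
    """Oscillator frequency that wobbles around center_hz."""
    return center_hz * 2 ** (depth_semitones * lfo(rate_hz, t_sec) / 12)

# Print the modulated values every 50 ms over one LFO cycle (0.2 s at 5 Hz).
for n in range(5):
    t = n * 0.05
    print(round(t, 2), round(tremolo_gain(t), 3), round(vibrato_freq(t), 2))
```

Routing the same LFO to a filter's cutoff frequency instead would give the "wah-wah" effect.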
See the explanation in Assignment 3, part 1.
See the explanations in Assignment 3, part 3.
See the "Audio Effects: Delay" PowerPoint slide.
See the "Audio Effects: EQ" PowerPoint slide. The things to remember:
This is sampler terminology. See the "Sampling" PowerPoint slide and Assignment 3, part 4.
A transducer transforms energy from one form into another. A microphone transforms acoustical energy — fluctuation in air pressure — into fluctuating electrical current. A loudspeaker goes in the opposite direction: from electricity into acoustical energy.
An analog representation is a continuously varying signal that mimics the shape of an acoustic waveform. A digital representation is a series of discrete numbers whose values correspond to an analog signal measured at particular moments.
The key words here are "continuously varying" and "discrete."
ADC — Analog to Digital Converter
DAC — Digital to Analog Converter
These are devices that convert between a continuously changing electrical signal — which is analogous to the changing air pressure of a sound wave — and a series of discrete samples of this signal. So, you use an ADC to convert the analog signal produced by a microphone into the digital signal that a computer can use. And you use a DAC to convert the computer's digital signal to an analog signal that an amplifier needs.
Sampling rate is how often you take samples (or snapshots) of the continuously changing analog voltage in an ADC. And for a DAC, it's how quickly you convert these samples back to analog form.
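To make the "snapshots" idea concrete, here is a small sketch that samples a sine wave at regular intervals. The function name and the 441 Hz test tone are assumptions for illustration; 441 Hz is chosen only because it gives a round 100 samples per cycle at 44,100 Hz.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, num_samples):
    """Measure a sine wave once every 1 / sample_rate_hz seconds."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(num_samples)]

samples = sample_sine(441, 44100, 100)   # one full cycle of a 441 Hz tone
print(len(samples))                      # 100
print(round(samples[25], 6))             # 1.0 (the peak, a quarter of the way through)
```

A higher sampling rate would take more snapshots of each cycle; a lower one would take fewer.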
Sample word length (or resolution) determines how accurately you can represent the amplitude of an analog waveform. The longer the word (i.e., the more bits it has), the wider the range of discrete numbers available for encoding sample values.
44100 Hz (or 44.1 kHz, because 1 kilohertz equals 1000 Hertz)
This sampling rate is more than twice as high as the highest frequency humans can hear (around 20 kHz for young healthy ears). The Nyquist theorem says that you need at least two samples per cycle of a waveform. If a waveform has a frequency component at 20 kHz, then you need a sampling rate of at least 40 kHz to represent this frequency.
A 16-bit number can have 65,536 different values (2 to the 16th power); an 8-bit number can have only 256 different values (2 to the 8th power). These values form a grid to which voltages are quantized during analog-to-digital conversion. The more values you have, the closer together the grid lines can be, and the more accurately the sample numbers will represent the shape of the waveform.
During conversion, analog voltages are rounded off to the nearest integer in the quantization grid. The difference between the actual voltage and the rounded-off sample number is called quantization error. It sounds a bit like analog tape hiss, but is often more grainy and harsh.
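The rounding step can be sketched as follows. This assumes the analog value has been scaled to the range -1.0 to 1.0 and that the grid is a simple signed, evenly spaced one; the function name is hypothetical, and a real converter is more involved.

```python
def quantize(value, bits):
    """Round a value in [-1.0, 1.0) to the nearest line of an evenly
    spaced grid with 2**bits levels, and return the rounded value."""
    levels_per_side = 2 ** (bits - 1)
    code = round(value * levels_per_side)
    # Clamp to the range a signed integer of this size can hold.
    code = max(-levels_per_side, min(levels_per_side - 1, code))
    return code / levels_per_side

value = 0.3
for bits in (8, 16):
    error = value - quantize(value, bits)   # quantization error for this sample
    print(bits, 2 ** bits, abs(error))
```

The error can never exceed half the spacing between grid lines, so the 16-bit grid guarantees a far smaller worst-case error than the 8-bit grid.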
You get aliasing.
The Nyquist frequency is the highest frequency that can be represented by a particular sampling rate; it equals half the sampling rate. Frequencies higher than this will be "folded over" to lower frequencies that can be represented at that rate.
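The fold-over can be computed directly. In this sketch the function names are made up for this review, and the input is assumed to be a single pure frequency that reaches the converter without being filtered out first.

```python
def nyquist_frequency(sample_rate_hz):
    """Half the sampling rate."""
    return sample_rate_hz / 2

def aliased_frequency(freq_hz, sample_rate_hz):
    """Frequency actually heard after sampling freq_hz at sample_rate_hz."""
    folded = freq_hz % sample_rate_hz
    if folded > sample_rate_hz / 2:
        folded = sample_rate_hz - folded   # reflect back below the Nyquist frequency
    return folded

print(nyquist_frequency(44100))          # 22050.0
print(aliased_frequency(10000, 44100))   # 10000 (below Nyquist, unchanged)
print(aliased_frequency(30000, 44100))   # 14100 (folded over)
```

A 30,000 Hz component sampled at 44,100 Hz lies 7,950 Hz above the Nyquist frequency, so it shows up 7,950 Hz below it, at 14,100 Hz.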
CD-quality sound has a sampling rate of 44,100 Hz and a sample word length of 16 bits.
There are 8 bits in a byte. So each 16-bit sample word needs two bytes. If there are 44,100 samples per second, and each sample takes two bytes, you need 44,100 * 2 = 88,200 bytes per second. But that's only for one channel. A stereo file has two channels, so that makes 88,200 * 2 = 176,400 bytes per second.
If it lasts a minute instead of a second, it'll need 176,400 * 60 = 10,584,000 bytes. A megabyte is about a million bytes (strictly, it's the closest power of 2, which is 1,048,576). So 10,584,000 / 1 million = about 10 megabytes.
This means stereo CD audio takes about 10 megabytes (MB) per minute. So the answer to the question is ... 5 MB.
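The arithmetic above can be wrapped in a small function. The function name and its default arguments are assumptions for illustration, and the 30-second case assumes that is the duration the question asked about, which the 5 MB answer implies.

```python
def audio_bytes(seconds, sample_rate_hz=44100, bits_per_sample=16, channels=2):
    """Uncompressed audio size in bytes (sample data only, no file header)."""
    bytes_per_sample = bits_per_sample // 8   # 8 bits in a byte
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)

print(audio_bytes(1))    # 176400   bytes per second
print(audio_bytes(60))   # 10584000 bytes per minute, about 10 MB
print(audio_bytes(30))   # 5292000  bytes, about 5 MB
```

Changing `channels` to 1 halves each figure, which matches the one-channel step in the working above.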
See the last two slides in the "Digital Audio" PowerPoint slide show.
Destructive editing alters the original sound file on disk; non-destructive editing leaves the original sound file alone, and instead manipulates references to the sound file, or creates new sound files based on the original. Digital Performer is primarily a non-destructive editor.
A soundbite is a reference to a portion of a sound file. The reference comprises a start time and end time that defines a segment of sound relative to the start of the sound file. Soundbites are the cornerstone of non-destructive editing, since they allow you to manipulate sound without changing the original sound file. ("Soundbite" is a Digital Performer term; Pro Tools and some other programs call this a "region.")
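One way to picture a soundbite is as a small record that points into a file. The class below is a hypothetical sketch, not Digital Performer's actual data structure, and the file name is invented. It shows how an edit can produce a new reference while the audio on disk is never touched.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Soundbite:
    """A reference to part of a sound file (hypothetical model)."""
    file_path: str      # the original sound file, never modified
    start_sec: float    # segment start, relative to the start of the file
    end_sec: float      # segment end, relative to the start of the file

    @property
    def duration_sec(self):
        return self.end_sec - self.start_sec

    def trimmed(self, new_start_sec, new_end_sec):
        """Non-destructive edit: return a new reference to the same file."""
        return Soundbite(self.file_path, new_start_sec, new_end_sec)

whole = Soundbite("guitar_take1.wav", 0.0, 10.0)   # invented file name
chorus = whole.trimmed(1.0, 3.5)
print(chorus.duration_sec)   # 2.5
print(whole.duration_sec)    # 10.0 (the original reference is unchanged)
```

Both soundbites point at the same file, so trimming one costs almost no disk space and can always be undone.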