
Sampling Speech Signals

In this demonstration, a speech signal is manipulated by sampling it in various ways. The speech signal is the sentence, "I'm sorry Dave, I'm afraid I can't do that," spoken by HAL, the computer in 2001: A Space Odyssey. You will hear the following samples:

The original sound, sampled at 8 kHz and played over the workstation audio.
The signal downsampled by discarding every second sample, but still played back at 8 kHz. Notice that the signal lasts only half as long, and the pitches have risen.
The signal downsampled by discarding 3 of every 4 samples, but still played back at 8 kHz. The signal is now quite short in duration, and entirely unintelligible.
The signal downsampled by discarding every second sample, but then interpolated back to the original sample rate, 8 kHz. Notice that now the duration and pitches are the same as the original, but the sound is significantly degraded. The degradation is due partly to aliasing.
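
The effect of discarding samples while keeping the 8 kHz playback rate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a 440 Hz sine tone stands in for the speech recording, and the names `make_tone`, `discard_samples`, `playback_duration_s`, and `estimate_freq_hz` are hypothetical helpers, not part of the demo.

```python
import math

PLAYBACK_RATE_HZ = 8000  # every clip in the demo is played back at 8 kHz


def make_tone(freq_hz, num_samples, rate_hz=PLAYBACK_RATE_HZ):
    """Sine tone as a plain list of numbers (an assumed stand-in for speech)."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(num_samples)]


def discard_samples(samples, keep_every):
    """Keep one sample out of every `keep_every`; no anti-aliasing filter is applied."""
    return samples[::keep_every]


def playback_duration_s(samples, rate_hz=PLAYBACK_RATE_HZ):
    """How long the list lasts when played back at `rate_hz`."""
    return len(samples) / rate_hz


def estimate_freq_hz(samples, rate_hz=PLAYBACK_RATE_HZ):
    """Crude pitch estimate: count sign changes, two per cycle."""
    changes = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return changes / (2 * playback_duration_s(samples, rate_hz))


original = make_tone(440, 8000)          # one second of a 440 Hz tone
half = discard_samples(original, 2)      # every second sample discarded
quarter = discard_samples(original, 4)   # 3 of every 4 samples discarded

for name, clip in [("original", original), ("half", half), ("quarter", quarter)]:
    print(name, playback_duration_s(clip), round(estimate_freq_hz(clip)))
```

Under these assumptions the three clips last 1.0, 0.5, and 0.25 seconds, and the estimated frequency comes out near 440, 880, and 1760 Hz: each halving of the sample count halves the duration and doubles the pitch.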

The speech waveform is really just a list of numbers. Here is a segment of speech consisting of 1000 samples:

[signal display]

If we zoom in to the middle of this segment, the signal looks as follows:

[signal display]

This segment shows a transition from a part of the signal that is roughly periodic to one that is much noisier. The roughly periodic part corresponds to a voiced sound, like "aaaa", whereas the noisy part corresponds to an unvoiced sound, like "ffff". This segment is at the transition between voiced and unvoiced in the "af" of "afraid". To emphasize that this waveform is constructed from samples, we can plot each sample as a dot:

[signal display]

In the fourth audio sample above, a set of samples at 4 kHz is interpolated to get a new set of samples at 8 kHz. The interpolation of the waveform can be understood intuitively by looking at the following picture from the aliasing demo.

[signal display]

If the blue signal represents a set of samples, then the red signal is an interpolated version. The waveform is smoothly interpolated between the given samples.
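
The idea of filling in values between given samples can be sketched as follows. The demo does not say which interpolation method it uses, so this sketch assumes simple linear interpolation, which is cruder than the smooth curve in the picture. The function name `upsample_linear` and the 1 kHz test tone are hypothetical choices made for illustration.

```python
import math


def upsample_linear(samples, factor=2):
    """Insert factor-1 linearly interpolated values between neighbouring samples.

    The result has (len(samples) - 1) * factor + 1 values.
    """
    if not samples:
        return []
    out = []
    for a, b in zip(samples, samples[1:]):
        for k in range(factor):
            out.append(a + (b - a) * k / factor)
    out.append(samples[-1])
    return out


# Interpolating a short list of samples.
coarse = [0.0, 1.0, 0.0, -1.0]
print(upsample_linear(coarse))  # [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0]

# Round trip: a 1 kHz tone at 8 kHz, every second sample discarded,
# then interpolated back to the 8 kHz sample spacing.
original = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(9)]
restored = upsample_linear(original[::2], factor=2)
max_error = max(abs(x - y) for x, y in zip(original, restored))
print(len(original), len(restored), max_error)
```

The round trip restores the original number of samples, so duration and pitch are preserved, but the interpolated values do not match the discarded ones exactly. That mismatch is one simple way to see why the fourth clip sounds degraded even without considering aliasing.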