Source Codecs
In video coding, a high compression ratio is achieved by exploiting knowledge of the video signal and of the Human Visual System (HVS). Source speech coding applies the same idea: it relies on an understanding of how the speech signal is produced. Instead of sending the coded waveform, the encoder sends a set of parameters describing the signal to the decoder. The speech signal consists mostly of two types of time-varying sounds: voiced sounds, which have a high degree of periodicity at the pitch period, and unvoiced sounds, which have little long-term periodicity but still contain short-term correlations. Accordingly, the vocal tract is modeled as a time-varying filter that is excited either by a white noise source, for unvoiced speech segments, or by a train of pulses separated by the pitch period, for voiced speech.
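The excitation-plus-filter model described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of any particular codec: the function names, the all-pole (autoregressive) filter form, and the example coefficients are assumptions made here for illustration.

```python
import random

def make_excitation(n, voiced, pitch_period, gain, rng):
    """Build n excitation samples (hypothetical helper).

    Voiced: a train of pulses separated by pitch_period samples.
    Unvoiced: white Gaussian noise with standard deviation gain.
    """
    if voiced:
        return [gain if i % pitch_period == 0 else 0.0 for i in range(n)]
    return [rng.gauss(0.0, gain) for _ in range(n)]

def all_pole_filter(excitation, coeffs, state=None):
    """Assumed vocal tract model: y[n] = x[n] + sum_k coeffs[k] * y[n-1-k].

    Returns the output samples and the filter memory, so that the next
    frame can continue from where this one stopped.
    """
    order = len(coeffs)
    history = list(state) if state is not None else [0.0] * order
    output = []
    for x in excitation:
        y = x + sum(a * h for a, h in zip(coeffs, history))
        output.append(y)
        history = [y] + history[:-1]  # newest output sample first
    return output, history

def synthesize_frame(n, voiced, pitch_period, gain, coeffs, rng, state=None):
    """Decode one frame: pick the excitation, then shape it with the filter."""
    excitation = make_excitation(n, voiced, pitch_period, gain, rng)
    return all_pole_filter(excitation, coeffs, state)

if __name__ == "__main__":
    rng = random.Random(0)
    coeffs = [1.3, -0.8]  # illustrative stable second-order filter
    voiced, state = synthesize_frame(160, True, 80, 1.0, coeffs, rng)
    unvoiced, state = synthesize_frame(160, False, 0, 0.1, coeffs, rng, state)
    print(len(voiced), len(unvoiced))
```

Passing the filter memory from one frame to the next avoids a discontinuity at frame boundaries when the parameters change.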
The information that must be sent to the decoder is therefore the filter specification, a voiced/unvoiced flag, the variance of the excitation signal, and the pitch period for voiced speech. These parameters are updated every 10-20 ms to follow the non-stationary nature of speech. Coders of this kind are also referred to as vocoders. Vocoders operate at bit rates of around 2.4 kbps. The resulting speech is intelligible but sounds almost synthetic, because most of the natural characteristics of the speech signal are removed.
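To make the figures above concrete, the sketch below groups the per-frame parameters into a record and computes how many bits a frame can carry at a given bit rate and update interval. The class name, field names, and helper function are hypothetical, and how the bits are divided among the parameters is left open because the text does not specify it.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class VocoderFrame:
    """Parameters sent for one frame (hypothetical field names)."""
    filter_coeffs: List[float]  # filter specification
    voiced: bool                # voiced/unvoiced flag
    variance: float             # variance of the excitation signal
    pitch_period: int           # in samples; only meaningful when voiced

def bits_per_frame(bit_rate_bps: float, frame_ms: float) -> float:
    """Bits available per frame at a given bit rate and update interval."""
    return bit_rate_bps * frame_ms / 1000.0

if __name__ == "__main__":
    for frame_ms in (10, 20):
        budget = bits_per_frame(2400, frame_ms)
        print(f"{frame_ms} ms update at 2.4 kbps: {budget:.0f} bits per frame")
```

At 2.4 kbps this gives 24 bits for a 10 ms update and 48 bits for a 20 ms update, which shows how coarsely the filter, flag, variance, and pitch must be quantized.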
Samir Mohamed
2003-01-08