Next: Hybrid Codecs Up: Audio Compression and Codecs Previous: Waveform Coding   Contents   Index


Source Codecs

As in video coding, where a high compression ratio is achieved by understanding the video signal and the Human Vision System (HVS), source speech coding exploits knowledge of the signal. Here the main idea is based on an understanding of how the speech signal is produced. Instead of sending the coded waveform, the encoder sends a set of parameters describing the signal to the decoder. Most of a speech signal consists of two types of time-varying sounds: voiced sounds, which have a high degree of periodicity at the pitch period, and unvoiced sounds, which have little long-term periodicity but still contain short-term correlations. The vocal tract is therefore modelled as a time-varying filter that is excited either by a white noise source, for unvoiced speech segments, or by a train of pulses separated by the pitch period, for voiced speech. The information that must be sent to the decoder is thus the filter specification, a voiced/unvoiced flag, the variance of the excitation signal, and the pitch period for voiced speech. These parameters are updated every 10-20 ms to follow the non-stationary nature of speech.

Coders of this kind are also referred to as vocoders. Vocoders operate at bit rates $\leq$ 2.4 kbps. The resulting speech is intelligible but sounds synthetic, because most of the natural characteristics of the speech signal are removed.
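The decoder side of this model can be illustrated with a minimal sketch in Python. It is not the implementation of any particular standard: the names make_excitation, synthesize_frame and bit_rate_bps, the layout of the params dictionary, the use of linear-prediction coefficients as the "filter specification", and the sign convention s[n] = e[n] + sum_k a[k] s[n-k] are all assumptions made for illustration.

```python
import random


def make_excitation(voiced, pitch_period, n_samples, gain, rng, phase=0):
    """Build one frame of excitation and return (samples, next_phase).

    Voiced frames get a pulse every `pitch_period` samples, starting at
    offset `phase`; unvoiced frames get white Gaussian noise scaled by `gain`.
    """
    if voiced:
        exc = [0.0] * n_samples
        pos = phase
        while pos < n_samples:
            exc[pos] = gain
            pos += pitch_period
        # Offset of the next pulse inside the following frame.
        return exc, pos - n_samples
    return [gain * rng.gauss(0.0, 1.0) for _ in range(n_samples)], 0


def synthesize_frame(params, history, rng, phase=0):
    """Run the excitation through an all-pole filter (assumed vocal tract model).

    Assumed convention: s[n] = e[n] + sum_k a[k] * s[n-k], where `history`
    holds the previous outputs, most recent first, one per coefficient.
    """
    a = params["lpc"]
    exc, phase = make_excitation(
        params["voiced"], params["pitch_period"],
        params["n_samples"], params["gain"], rng, phase,
    )
    hist = list(history)
    out = []
    for e in exc:
        s = e + sum(ak * hk for ak, hk in zip(a, hist))
        out.append(s)
        hist = ([s] + hist)[:len(a)]  # keep only as many samples as coefficients
    return out, hist, phase


def bit_rate_bps(bits_per_frame, frame_ms):
    """Bit rate obtained when `bits_per_frame` bits are sent every `frame_ms` ms."""
    return bits_per_frame * 1000 / frame_ms


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical parameters for one 20 ms frame at 8 kHz (160 samples).
    frame = {"lpc": [0.5], "voiced": True, "pitch_period": 40,
             "n_samples": 160, "gain": 1.0}
    samples, hist, phase = synthesize_frame(frame, [0.0], rng)
    print(samples[:3])
    # A hypothetical budget of 48 bits per 20 ms frame.
    print(bit_rate_bps(48, 20))
```

The history and phase values returned by synthesize_frame are passed into the next call so that the filter memory and the pulse train continue across frame boundaries when the parameters are updated. The bit_rate_bps helper shows why such coders reach low rates: only a few tens of bits per frame are sent, rather than every sample.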
Samir Mohamed 2003-01-08