Abstract:
Cepstral coefficients are very often used to represent speech signals in Automatic Speech Recognition (ASR). They are favored for their good representation properties, notably the decorrelation of the coefficients, but they have limitations. In particular, they are sensitive to signal acquisition and to the acoustic environment (the robustness problem). Because of this sensitivity, the performance of speech recognition systems degrades, all the more so when training and testing conditions differ. The aim of this work is to study and implement representations that are robust to mismatches between the acoustic conditions of training and evaluation. These representations will then be tested on the Sirocco large-vocabulary speech recognition system. Particular attention will be paid to methods that filter cepstral trajectories, especially band-pass filtering (smoothing of cepstral trajectories). Other methods will also be considered.
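
As an illustration of what filtering cepstral trajectories can mean in practice, the following is a minimal sketch, not the method used in this work. It assumes a crude band-pass built from two steps: per-coefficient mean subtraction over time (which removes the constant component of each trajectory, where a stationary channel offset would appear) followed by a moving average (which smooths fast frame-to-frame fluctuations). The function name `bandpass_cepstral_trajectories` and the parameter `smooth_len` are hypothetical.

```python
import numpy as np


def bandpass_cepstral_trajectories(cepstra, smooth_len=5):
    """Crude band-pass filter applied along time to each cepstral coefficient.

    cepstra    : array of shape (n_frames, n_coeffs), one row per frame.
    smooth_len : odd length (in frames) of the moving-average window
                 (hypothetical parameter, chosen for illustration).
    """
    if smooth_len < 1 or smooth_len % 2 == 0:
        raise ValueError("smooth_len must be a positive odd integer")

    cepstra = np.asarray(cepstra, dtype=float)

    # High-pass step: remove the time average of each trajectory, so a
    # constant additive offset in the cepstral domain is cancelled.
    centered = cepstra - cepstra.mean(axis=0, keepdims=True)

    # Low-pass step: moving average along the time axis. Edge padding
    # keeps the number of frames unchanged.
    kernel = np.ones(smooth_len) / smooth_len
    pad = smooth_len // 2
    padded = np.pad(centered, ((pad, pad), (0, 0)), mode="edge")

    filtered = np.empty_like(centered)
    for k in range(centered.shape[1]):
        filtered[:, k] = np.convolve(padded[:, k], kernel, mode="valid")
    return filtered
```

Under these assumptions, adding the same constant vector to every frame leaves the output unchanged, which is the kind of invariance to acquisition conditions the abstract is concerned with. A real system would likely use a properly designed band-pass filter rather than this two-step approximation.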