Alexey Ozerov

address:	I am now with Technicolor, Cesson Sévigné, France
e-mail:	myFirstName.myLastName@technicolor.com

Research Interests

Statistical signal processing, machine learning and information theory, including:

Autoregressive (AR) model, Gaussian Mixture Model (GMM), Hidden Markov Model (HMM), and Bayesian Network in general.
Expectation Maximization (EM) algorithm and extensions (GEM, SAGE, variational EM).
Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-means algorithm.
Non-negative Matrix Factorization (NMF) and extensions (e.g., NTF).
Support Vector Machines (SVM) and other kernel-based machine learning methods.
High-Rate (HR) quantization theory.

Application areas:

Blind and supervised audio source separation.
Audio and speech coding and source coding in general.
Automatic speech recognition.
Automatic musical instrument recognition.

Short Bio

Alexey Ozerov holds a Ph.D. in Signal Processing from the University of Rennes 1 (France). He worked towards this degree from 2003 to 2006 in the labs of France Telecom R&D and in collaboration with the IRISA institute. Earlier, he received an M.Sc. degree in Mathematics from the Saint-Petersburg State University (Russia) in 1999 and an M.Sc. degree in Applied Mathematics from the University of Bordeaux 1 (France) in 2003. From 1999 to 2002, Alexey worked at Terayon Communicational Systems (USA) as an R&D software engineer, first in Saint-Petersburg and then in Prague (Czech Republic). He spent one year (2007) in the Sound and Image Processing Lab at KTH (Royal Institute of Technology), Stockholm, Sweden, a year and a half (2008-2009) in the Signal and Image Processing (TSI) Department of TELECOM ParisTech / CNRS LTCI, and two years (2009-2011) with the METISS team of IRISA / INRIA - Rennes. He is now with the Technicolor R&D department in Cesson Sévigné, France.

Curriculum Vitae

in English: PDF, PostScript
in French: PDF, PostScript

Demonstrations

One microphone singing voice separation

One microphone source separation

Multichannel nonnegative matrix factorization for convolutive blind source separation

Factorial scaled hidden Markov model for single channel speech / music separation

SARAH project instrument extraction demos:

User-Guided Audio Source Separation via Multichannel Nonnegative Tensor Factorization With Structured Constraints

Using the FASST source separation toolbox for noise robust speech recognition

Coding-based Informed Source Separation

Software

Multichannel nonnegative matrix factorization toolbox (in Matlab)

BSS Locate - A toolbox for source localization in stereo convolutive audio mixtures (in Matlab)

FASST - Flexible Audio Source Separation Toolbox (in Matlab)

Participation in Evaluation Campaigns

Third community-based Signal Separation Evaluation Campaign (SiSEC 2011).

The PASCAL 'CHiME' Speech Separation and Recognition Challenge (CHiME 2011).

Second community-based Signal Separation Evaluation Campaign (SiSEC 2010).

First community-based Signal Separation Evaluation Campaign (SiSEC 2008).

Public Responsibilities

Member of the organizing committee of the international signal separation evaluation campaign SiSEC 2010 with results presented at LVA/ICA'10.

Member of the local organization committee of the international conference "Latent Variable Analysis and Signal Separation" (LVA/ICA'10), 27 to 30 September 2010 in St. Malo.

Coordination of the preparation of a European project proposal (STREP) for FP7 (ICT Call 1: FP7-ICT-2007-1, Challenge 4: "Digital libraries and Content", Objective 1: "Digital libraries and technology-enhanced learning") with 5 European research labs.

Member of the local organization committee of the international conference "Signal Processing with Adaptative Sparse Structured Representations" (SPARS'05), 16 to 18 November 2005 in IRISA/INRIA - Rennes, France.

Reviewing for 5 international journals and 7 conferences/workshops, since 2005.

Projects

Quaero is a collaborative research and development program promoting research and industrial innovation on technologies for automatic analysis and classification of multimedia and multilingual documents. The consortium is composed of about 25 French and German public and private research organisations.

SARAH "StAndardisation du Remastering Audio Haute-Définition" is a French ANR project between Audionamix, TELECOM ParisTech / TSI and the Studios-Copra on high-quality HD remastering of music recordings. (completed) Demo

FlexCode is a European FP6 project between KTH (Royal Institute of Technology), Nokia Corporation, Orange/France Telecom Group, RWTH Aachen University and Ericsson AB on a new practical, flexible, parameterized and generic system for speech and audio coding. (completed)

Collaborators

Simon Arberet	EPFL, Lausanne, Switzerland
Roland Badeau	Télécom ParisTech, France
Elie Laurent Benaroya	ESPCI ParisTech, France
Frédéric Bimbot	IRISA, Rennes, France
Charles Blandin	IRISA, Rennes, France
Raphaël Blouet	Yacast, Paris, France
Maurice Charbit	Télécom ParisTech, France
Bertrand David	Télécom ParisTech, France
Jean-Louis Durrieu	EPFL, Lausanne, Switzerland
Slim Essid	Télécom ParisTech, France
Cédric Févotte	Télécom ParisTech, France
Rémi Gribonval	IRISA, Rennes, France
Guillaume Gravier	IRISA, Rennes, France
Richard Heusdens	Delft University of Technology, The Netherlands
W. Bastiaan Kleijn	Victoria University of Wellington, New Zealand
Janusz Klejsa	KTH, Stockholm, Sweden
Mathieu Lagrange	IRCAM, Paris, France
Minyue Li	KTH, Stockholm, Sweden
Antoine Liutkus	Télécom ParisTech, France
Mounira Maazaoui	Télécom ParisTech, France
Pierrick Philippe	Orange Labs, Rennes, France
Ilyas Potamitis	Technological Educational Institute of Crete, Greece
Gaël Richard	Télécom ParisTech, France
Emmanuel Vincent	IRISA, Rennes, France

Publications

IEEE copyright disclaimer concerning all IEEE paper reprints posted below: Copyright © 2005-2011 IEEE. This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org. By choosing to view these documents, you agree to all provisions of the copyright laws protecting them.

Submitted

M. Li, J. Klejsa, A. Ozerov and W. B. Kleijn, "Audio Coding with Power Spectral Density Preserving Quantization," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'12), Kyoto, Japan, March, 2012. (submitted)
M. Li, A. Ozerov, J. Klejsa and W. B. Kleijn, "Asymptotically optimal distribution preserving quantization for stationary Gaussian processes," IEEE Transactions on Communications (submitted)
S. Arberet, A. Ozerov, F. Bimbot and R. Gribonval, "A tractable framework for estimating and combining spectral source models for audio source separation," Signal Processing, special issue on "Latent Variable Analysis and Signal Separation" (submitted)

Research report: HAL

Journal Articles

E. Vincent, S. Araki, F. Theis, G. Nolte, P. Bofill, H. Sawada, A. Ozerov, V. Gowreesunker, D. Lutter and N.Q.K. Duong, "The Signal Separation Evaluation Campaign (2007-2010): Achievements and remaining challenges," Signal Processing, special issue on "Latent Variable Analysis and Signal Separation" (to appear)

Article: HAL
C. Blandin, A. Ozerov and E. Vincent, "Multi-source TDOA estimation in reverberant audio using angular spectra and clustering," Signal Processing, special issue on "Latent Variable Analysis and Signal Separation" (to appear)

Article: HAL, Code
A. Ozerov, E. Vincent and F. Bimbot, "A general flexible framework for the handling of prior information in audio source separation," IEEE Trans. on Audio, Speech and Lang. Proc. (to appear)

Article: HAL, Code and Audio Examples
A. Ozerov and W. B. Kleijn, "Asymptotically optimal model estimation for quantization," IEEE Transactions on Communications, vol. 59, no. 4, pp. 1031-1042, April 2011.

Article: PDF
A. Ozerov and C. Févotte, "Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation," IEEE Trans. on Audio, Speech and Lang. Proc. special issue on Signal Models and Representations of Musical and Environmental Sounds, vol. 18, no. 3, pp. 550-563, March 2010.

Article: PDF, Audio Examples, Code
A. Ozerov, P. Philippe, F. Bimbot and R. Gribonval, "Adaptation of Bayesian models for single channel source separation and its application to voice / music separation in popular songs," IEEE Trans. on Audio, Speech and Lang. Proc., special issue on Blind Signal Proc. for Speech and Audio Applications, vol. 15, no. 5, pp. 1564-1578, July 2007.

Article: PDF, Audio Examples
A. Ozerov, R. Gribonval, P. Philippe and F. Bimbot, "Choix et adaptation de modèles statistiques pour la séparation de voix chantée à partir d'un seul microphone," Traitement du signal, vol. 24, no. 3, pp. 211-224, 2007.

abstract in English: HTML, preprint in French: PDF

Conferences

A. Ozerov, A. Liutkus, R. Badeau and G. Richard, "Informed source separation: source coding meets source separation," In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA'11), Mohonk, NY, Oct. 16-19, 2011.

Article: PDF, Audio Examples
A. Ozerov, M. Lagrange and E. Vincent, "GMM-based classification from noisy features," International Workshop on Machine Listening in Multisource Environments (CHiME 2011), pages 30-35, Florence, Italy, September, 2011.

Article: PDF, Slides: PDF
A. Ozerov and E. Vincent, "Using the FASST source separation toolbox for noise robust speech recognition," International Workshop on Machine Listening in Multisource Environments (CHiME 2011), pages 86-87, Florence, Italy, September, 2011.

Article: PDF, Poster: PDF, Audio Examples
A. Ozerov, C. Févotte, R. Blouet and J.-L. Durrieu, "Multichannel nonnegative tensor factorization with structured constraints for user-guided audio source separation," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'11), pages 257-260, Prague, May, 2011.

Article: PDF, Poster: PDF, Audio Examples
C. Blandin, E. Vincent and A. Ozerov, "Multi-source TDOA estimation using SNR-based angular spectra," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'11), pages 2616-2619, Prague, May, 2011.

Article: PDF, Poster: PDF, Code
A. Ozerov, E. Vincent and F. Bimbot, "A general modular framework for audio source separation", In 9th International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA'10), pages 33-40, Saint-Malo, France, Sep. 27-30, 2010.

Article: PDF, Poster: PDF
S. Araki, A. Ozerov, V. Gowreesunker, H. Sawada, F. Theis, G. Nolte, D. Lutter and N.Q.K. Duong, "The 2010 Signal Separation Evaluation Campaign (SiSEC2010): - Audio source separation -", In 9th International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA'10), pages 114-122, Saint-Malo, France, Sep. 27-30, 2010.

Article: PDF
S. Araki, F. Theis, G. Nolte, D. Lutter, A. Ozerov, V. Gowreesunker, H. Sawada and N.Q.K. Duong, "The 2010 Signal Separation Evaluation Campaign (SiSEC2010): - Biomedical source separation -", In 9th International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA'10), pages 123-130, Saint-Malo, France, Sep. 27-30, 2010.

Article: PDF
C. Févotte and A. Ozerov, "Notes on nonnegative tensor factorization of the spectrogram for audio source separation: statistical insights and towards self-clustering of the spatial cues", In 7th International Symposium on Computer Music Modeling and Retrieval (CMMR 2010), 2010.

Article: PDF, Audio Examples, Code
S. Arberet, A. Ozerov, N.Q.K. Duong, E. Vincent, R. Gribonval, F. Bimbot and P. Vandergheynst, "Nonnegative matrix factorization and spatial covariance model for under-determined reverberant audio source separation", In 10th International Conference on Information Sciences, Signal Processing and their Applications (ISSPA 2010), 2010.

Article: PDF
A. Ozerov, C. Févotte and M. Charbit, "Factorial scaled hidden Markov model for polyphonic audio representation and source separation", In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA'09), Mohonk, NY, Oct. 18-21, 2009.

Article: PDF, Slides: PDF, Audio Examples
J.-L. Durrieu, A. Ozerov, C. Févotte, G. Richard and B. David, "Main instrument separation from stereophonic audio signals using a source/filter model", In EUSIPCO, 17th European Signal Processing Conference, Glasgow, Scotland, August 24-28, 2009.

Article: PDF, Audio Examples
A. Ozerov and C. Févotte, "Multichannel nonnegative matrix factorization in convolutive mixtures. With application to blind audio source separation", In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'09), pages 3137-3140, Taipei, Taiwan, April 19-24, 2009.

Article: PDF, Poster: PDF, Audio Examples, Code
A. Ozerov and W. B. Kleijn, "Optimal parameter estimation for model-based quantization," In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'09), pages 2497-2500, Taipei, Taiwan, April 19-24, 2009.

Article: PDF, Poster: PDF
S. Arberet, A. Ozerov, R. Gribonval and F. Bimbot, "Blind spectral-GMM estimation for underdetermined instantaneous audio source separation", In Proc. Int. Conf. on Independent Component Analysis and Blind Source Separation (ICA'09), pages 751-758, Paraty, Brazil, March 15-18, 2009.

Article: PDF
I. Potamitis and A. Ozerov, "Single channel source separation using static and dynamic features in the power domain", In EUSIPCO, 16th European Signal Processing Conference, Lausanne, Switzerland, August 25-29, 2008.

Article: PDF, Audio Examples
A. Ozerov and W. B. Kleijn, "Flexible quantization of audio and speech based on the autoregressive model," In IEEE Asilomar Conference on Signals, Systems, and Computers (Asilomar CSSC'07), pages 535-539, Pacific Grove, CA, Nov. 4-7, 2007.

Article: PDF, Poster: PDF
R. Heusdens, W. B. Kleijn and A. Ozerov, "Entropy-constrained high-resolution lattice vector quantization using a perceptually relevant distortion measure," In IEEE Asilomar Conference on Signals, Systems, and Computers (Asilomar CSSC'07), pages 2075-2079, Pacific Grove, CA, Nov. 4-7, 2007.

Article: PDF
W. B. Kleijn and A. Ozerov, "Rate distribution between model and signal," In IEEE Worksh. on Apps. of Signal Processing to Audio and Acoustics (WASPAA'07), pages 243-246, Mohonk, NY, Oct. 2007.

Article: PDF
A. Ozerov, P. Philippe, R. Gribonval and F. Bimbot, "One microphone singing voice separation using source-adapted models," In IEEE Worksh. on Apps. of Signal Processing to Audio and Acoustics (WASPAA'05), pages 90-93, Mohonk, NY, Oct. 2005.

Article: PDF, Slides: PDF, Audio Examples
A. Ozerov, R. Gribonval, P. Philippe and F. Bimbot, "Séparation voix / musique à partir d'enregistrements mono : quelques remarques sur le choix et l'adaptation des modèles," In GRETSI'05 Symposium on Signal and Image Processing, Louvain-la-Neuve, Belgium, Sept. 2005.

abstract in English: HTML, full text in French: PDF, PostScript, Audio Examples
G. Gravier, L. Benaroya, A. Ozerov, R. Gribonval and F. Bimbot, "Séparation de sources à partir d'un seul capteur pour la reconnaissance robuste de la parole," In Journées d'Etude sur la Parole (JEP'04), April 2004.

abstract in English: HTML, full text in French: PDF

Patents

A. Ozerov, C. Févotte and R. Blouet, "Automatic source separation via joint use of segmental information and spatial diversity" US patent 13021692, 2011 (filed).
S. Arberet, A. Ozerov, R. Gribonval and F. Bimbot, "Procédé et un dispositif d'estimation de signaux de source issus d'un signal de mélange" French patent 2939933, 2010 (published) and international extension WO2010/076412, 2010 (published).

Technical reports

A. Ozerov, S. Essid and M. Charbit, "Reconnaissance des instruments dans la musique polyphonique par décomposition NMF et classification SVM," Technical Report TELECOM ParisTech 2009D014, July 2009.

abstract in English: HTML, full text in French: PDF

Theses

A. Ozerov. "Adaptation de modèles statistiques pour la séparation de sources mono-capteur. Application à la séparation voix / musique dans les chansons." PhD thesis, University of Rennes 1, 2006.

abstract in English: HTML, full text in French: PDF, PostScript

A. Ozerov. "Représentations robustes pour la reconnaissance automatique de la parole". MSc thesis, DESS "Scientific Calculation and Applications", University of Bordeaux 1, 2003.

abstract in English: HTML, full text in French: PDF, PostScript

A. Ozerov. "A criterion of nondisappearance of invariant sets satisfying Krasovsky property under C⁰ perturbations of right part of the system". MSc thesis, department of Ordinary Differential Equations, Mathematics and Mechanics faculty, St. Petersburg State University, 1999.

abstract in English: PDF, full text in Russian: PDF

Miscellaneous

A. Ozerov, C. Févotte and R. Blouet, "The SARAH project: Standardization of High-Definition Audio Remastering", Demo presented at IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA'09), Mohonk, NY, Oct. 18-21, 2009.

Poster: PDF