All algorithms addressed the source signal estimation task, which consists of recovering the original dereverberated source signals. However, since reference mono source signals were not available, the results of this task were evaluated w.r.t. the contribution of each source to the first mixture channel. Due to this mismatch between target and reference signals, the measured SDR and SAR values are irrelevant and only the measured SIR is provided below.
The measured SIR does not take human frequency sensitivity into account and may differ from the perceived SIR when the target and interference sources have different frequency ranges.
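As a rough illustration of what an SIR figure expresses, the sketch below computes an energy ratio in dB between the target component and the interference component of an estimate. It assumes this decomposition is already available (the evaluation software obtains it by projection, which is not reproduced here). The helper name `sir_db` and the toy sinusoids are hypothetical.

```python
import numpy as np

def sir_db(target, interference):
    """Energy ratio in dB between target and interference components.

    Hypothetical helper: both components are assumed to be given.
    """
    target_energy = np.sum(np.asarray(target, dtype=float) ** 2)
    interf_energy = np.sum(np.asarray(interference, dtype=float) ** 2)
    return 10.0 * np.log10(target_energy / interf_energy)

# Toy signals: the interference has one tenth of the target amplitude
# and lies in a different frequency range.
t = np.linspace(0.0, 1.0, 8000, endpoint=False)
target = np.sin(2 * np.pi * 440 * t)
interference = 0.1 * np.sin(2 * np.pi * 1000 * t)
print(round(sir_db(target, interference), 1))
```

Every sample contributes equally to the two energies whatever its frequency content, which is why such a figure can depart from the perceived SIR.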
For details about each algorithm, click on the algorithm number.
For summary results, see E. Vincent, S. Araki and P. Bofill, "The 2008 Signal Separation Evaluation Campaign: A community-based approach to large-scale evaluation", in Proc. Int. Conf. on Independent Component Analysis and Signal Separation, 2009.
The mixture audio files are licensed for research use only by their authors Kenneth Hild (Iliad), Hiroshi Sawada (rooms 4 and 5), Mads Dyrholm (rooms 1, 2, 3, C and O) and Lucas Parra. The music used for recordings in rooms 1, 2 and 3 was taken from "Germ Germ" by Das Böse Ding and has kindly been approved for public presentation by Jan Klare of Das Böse Ding in the name of research. See Lucas Parra's BSS page for the original dataset. All other audio files are made available under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 2.0 license.
Algorithm | Measure | Iliad mix |
Algorithm 1 F. Nesta | SIR (dB) | src1: 21.1, src2: 15.9 |
Algorithm 2 I. Lee | SIR (dB) | src1: 21.1, src2: 18.1 |
Algorithm 3 I. Lee | SIR (dB) | src1: 7.0, src2: 0.2 |
Algorithm 4 I. Takashi | SIR (dB) | src1: 6.6, src2: -0.9 |

Algorithm | Measure | Room 4, 2 sources mix | Room 4, 3 sources mix | Room 4, 4 sources mix | Room 5, 2 sources mix | Room 5, 3 sources mix | Room 5, 4 sources mix |
Algorithm 1 F. Nesta | SIR (dB) | src1: 19.3, src2: 22.2 | src1: 20.7, src2: 24.1, src3: 24.3 | src1: 10.0, src2: 18.1, src3: 14.6, src4: 18.2 | src1: 16.5, src2: 17.2 | src1: 17.1, src2: 21.4, src3: 7.8 | src1: 9.4, src2: 16.5, src3: 9.4, src4: 15.2 |
Algorithm 2 I. Lee | SIR (dB) | src1: 2.6, src2: 11.4 | src1: 20.2, src2: 18.9, src3: 17.4 | src1: 8.0, src2: 18.0, src3: 1.7, src4: 20.6 | src1: 0.0, src2: 13.4 | src1: 1.7, src2: 16.3, src3: 14.7 | src1: 15.5, src2: 16.2, src3: 16.2, src4: 3.5 |
Algorithm 3 I. Lee | SIR (dB) | src1: 0.6, src2: 9.3 | src1: 3.1, src2: 14.8, src3: 0.5 | src1: -5.7, src2: 2.4, src3: 1.2, src4: -0.3 | src1: -0.8, src2: 11.1 | src1: 1.9, src2: 12.9, src3: 5.7 | src1: 0.0, src2: 8.3, src3: 2.2, src4: -1.4 |
Algorithm 4 I. Takashi | SIR (dB) | src1: 7.3, src2: 15.4 | src1: 3.5, src2: 11.0, src3: 7.7 | src1: -1.6, src2: 7.9, src3: 9.0, src4: 7.0 | | | |
Algorithm 5 S. Douglas | SIR (dB) | src1: 14.1, src2: 14.5 | src1: 9.5, src2: 12.1, src3: 10.8 | src1: 6.7, src2: 9.6, src3: 7.7, src4: 5.2 | src1: 11.2, src2: 13.8 | src1: 8.7, src2: 10.7, src3: 10.5 | src1: 3.6, src2: 8.1, src3: 4.9, src4: 4.2 |
Algorithm 6 S. Douglas | SIR (dB) | src1: 5.0, src2: 13.3 | src1: 9.5, src2: 7.8, src3: 6.0 | src1: 6.0, src2: 8.6, src3: 4.6, src4: 3.8 | src1: 8.2, src2: 12.3 | src1: 7.7, src2: 6.4, src3: 6.3 | src1: -1.1, src2: 6.2, src3: 0.0, src4: 3.0 |

Algorithm | Measure | Room 1, 2 sources mix | Room 1, 3 sources mix | Room 1, 4 sources mix |
Algorithm 1 F. Nesta | SIR (dB) | src1: 5.4, src2: 6.0 | src1: 11.7, src2: 10.8, src3: 8.7 | src1: 3.6, src2: 1.2, src3: -2.1, src4: -7.4 |
Algorithm 2 I. Lee | SIR (dB) | src1: 2.2, src2: 2.4 | src1: 3.4, src2: 3.0, src3: 19.4 | src1: 5.9, src2: 5.8, src3: 1.1, src4: -4.0 |
Algorithm 3 I. Lee | SIR (dB) | src1: 1.8, src2: 1.4 | src1: -4.8, src2: 3.8, src3: 0.8 | src1: 0.9, src2: 6.7, src3: -0.6, src4: -2.8 |
Algorithm 5 S. Douglas | SIR (dB) | src1: 6.7, src2: 3.2 | src1: 2.9, src2: 3.4, src3: 14.5 | src1: -0.3, src2: -3.5, src3: 4.5, src4: -1.3 |
Algorithm 6 S. Douglas | SIR (dB) | src1: 10.3, src2: 2.9 | src1: 4.3, src2: -0.6, src3: 13.9 | src1: -0.7, src2: -1.4, src3: 4.0, src4: -2.7 |

Algorithm | Measure | Room 2, 2 sources mix | Room 2, 3 sources mix | Room 2, 4 sources mix | Room C, 2 sources mix | Room C, 3 sources mix | Room C, 4 sources mix |
Algorithm 1 F. Nesta | SIR (dB) | src1: 5.4, src2: 3.8 | src1: 5.2, src2: 0.3, src3: -8.0 | src1: 0.8, src2: -1.1, src3: -3.2, src4: -10.6 | src1: 7.0, src2: 16.6 | src1: 9.4, src2: 9.6, src3: -7.7 | src1: -0.2, src2: 17.0, src3: -5.3, src4: -15.6 |
Algorithm 3 I. Lee | SIR (dB) | src1: 2.2, src2: -1.2 | src1: 1.7, src2: -1.8, src3: -3.6 | src1: -0.2, src2: -0.5, src3: 0.1, src4: -5.5 | src1: 9.0, src2: 18.4 | src1: 11.7, src2: 12.0, src3: -6.7 | src1: 7.8, src2: 10.8, src3: -4.4, src4: -13.1 |
Algorithm 5 S. Douglas | SIR (dB) | src1: 3.6, src2: 0.8 | src1: -1.4, src2: -1.4, src3: -2.4 | src1: -0.8, src2: -3.1, src3: -3.4, src4: -1.3 | src1: 3.4, src2: 1.3 | src1: 4.1, src2: 0.3, src3: -6.5 | src1: 3.3, src2: 1.1, src3: -7.1, src4: -16.6 |
Algorithm 6 S. Douglas | SIR (dB) | src1: 5.1, src2: 2.7 | src1: 1.4, src2: -0.8, src3: -1.6 | src1: -2.4, src2: -3.2, src3: -4.8, src4: -4.8 | src1: 0.0, src2: 1.0 | src1: -0.4, src2: -1.0, src3: -5.6 | src1: -0.9, src2: -1.1, src3: -7.3, src4: -16.0 |

Algorithm | Measure | Room 3, 2 sources mix | Room 3, 3 sources mix | Room 3, 4 sources mix | Room O, 2 sources mix | Room O, 3 sources mix | Room O, 4 sources mix |
Algorithm 1 F. Nesta | SIR (dB) | src1: 10.4, src2: -1.2 | src1: 7.6, src2: -3.8, src3: 2.4 | src1: -4.2, src2: 1.6, src3: 2.0, src4: -7.8 | src1: 18.1, src2: 20.2 | src1: 9.8, src2: 10.4, src3: 8.2 | src1: 14.1, src2: 7.1, src3: -3.0, src4: -12.0 |
Algorithm 3 I. Lee | SIR (dB) | src1: 3.2, src2: 2.2 | src1: 2.4, src2: 5.0, src3: 0.6 | src1: 0.1, src2: -1.0, src3: -4.2, src4: -4.3 | src1: 13.7, src2: 15.2 | src1: 6.7, src2: 1.4, src3: -0.3 | src1: 3.9, src2: 2.2, src3: -0.1, src4: -19.8 |
Algorithm 5 S. Douglas | SIR (dB) | src1: 5.6, src2: 3.6 | src1: 1.2, src2: -1.4, src3: -2.7 | src1: -0.8, src2: -3.0, src3: -4.4, src4: -3.1 | src1: 5.4, src2: 1.9 | src1: 1.7, src2: -0.3, src3: -0.7 | src1: 1.5, src2: -0.1, src3: -1.0, src4: -18.9 |
Algorithm 6 S. Douglas | SIR (dB) | src1: 6.0, src2: 3.9 | src1: -0.4, src2: -2.4, src3: -1.8 | src1: -2.0, src2: -2.6, src3: -3.9, src4: -3.6 | src1: 1.0, src2: 1.9 | src1: -0.2, src2: -0.8, src3: -2.1 | src1: -1.0, src2: -0.5, src3: -1.6, src4: -16.5 |
(1) The mixture signals were produced by separately recording the spatial image of each source over all microphones then summing the spatial images of all sources over each channel.
(2) All signals were separately recorded, without ensuring synchronization between recording and playback. The reference source spatial images and the mixture signals are not synchronized. Approximate synchronization was performed prior to the computation of numerical performance figures by applying a delay to the reference source spatial images in order to achieve maximum correlation with the mixture signal over the first channel. Therefore, the measured SIR is inherently less accurate than in case (1).
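The approximate synchronization described in (2) can be sketched as a search for the delay that maximizes the correlation between a reference source image and the first mixture channel. This is only a minimal sketch under stated assumptions: the search covers integer sample lags within a hypothetical `max_lag` bound, the correlation is a plain unnormalized inner product, and the function name `align_reference` is invented for illustration. The exact procedure used for the campaign is not specified here.

```python
import numpy as np

def align_reference(reference, mixture_ch1, max_lag):
    """Shift `reference` by the integer lag maximizing its correlation
    with `mixture_ch1` (hypothetical helper, integer lags only)."""
    reference = np.asarray(reference, dtype=float)
    mixture_ch1 = np.asarray(mixture_ch1, dtype=float)
    n = min(len(reference), len(mixture_ch1))
    best_lag, best_corr = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            # Reference delayed by `lag`: reference[i] faces mixture[i + lag].
            a, b = reference[: n - lag], mixture_ch1[lag:n]
        else:
            # Reference advanced by `-lag`: reference[i - lag] faces mixture[i].
            a, b = reference[-lag:n], mixture_ch1[: n + lag]
        corr = float(np.dot(a, b))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    # Apply the best shift, zero-padding the samples that have no counterpart.
    aligned = np.zeros(n)
    if best_lag >= 0:
        aligned[best_lag:] = reference[: n - best_lag]
    else:
        aligned[: n + best_lag] = reference[-best_lag:n]
    return aligned, best_lag

# Toy check: the mixture contains the reference delayed by 37 samples plus noise.
rng = np.random.default_rng(0)
ref = rng.standard_normal(2000)
mix = np.zeros(2000)
mix[37:] = ref[:-37]
mix += 0.5 * rng.standard_normal(2000)
aligned, lag = align_reference(ref, mix, max_lag=100)
print(lag)
```

Because the estimated delay is limited to the resolution of the search, any residual misalignment leaks into the error terms, which is consistent with the lower accuracy noted for case (2).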
It often happens over 4-source mixtures that the 4th source is not predominant in any of the estimates, while another source is predominant in two different estimates. In this case, the algorithm failed to recover all sources as requested, which is reflected by a large negative SIR for the 4th source.
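This failure mode can be checked mechanically. The sketch below assumes a hypothetical matrix `sir_matrix[i, j]` holding the SIR in dB obtained when estimate `i` is scored against source `j`. It reports which source dominates each estimate, which sources dominate none, and which dominate several. The numbers are made up for illustration and are not taken from the tables.

```python
import numpy as np

def predominant_sources(sir_matrix):
    """sir_matrix[i, j]: SIR (dB) of estimate i scored against source j
    (hypothetical layout). Returns (winners, missing, duplicated)."""
    sir_matrix = np.asarray(sir_matrix, dtype=float)
    winners = np.argmax(sir_matrix, axis=1)            # predominant source per estimate
    counts = np.bincount(winners, minlength=sir_matrix.shape[1])
    missing = np.flatnonzero(counts == 0).tolist()     # predominant in no estimate
    duplicated = np.flatnonzero(counts > 1).tolist()   # predominant in several estimates
    return winners.tolist(), missing, duplicated

# Illustrative values only (zero-based source indices).
sir = [[12.0, -3.0, -5.0, -15.0],
       [-2.0,  9.0, -4.0, -12.0],
       [-6.0, -1.0,  7.0, -10.0],
       [-4.0,  6.0, -3.0,  -8.0]]
winners, missing, duplicated = predominant_sources(sir)
print(winners, missing, duplicated)
```

In this toy case the source at index 3 is never predominant while the source at index 1 dominates two estimates, so any one-to-one assignment must pair one estimate with a source it barely contains.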
It also happens that some estimates correspond to different sources at different frequencies. In this case, the permutation of the sources estimated by maximizing SIR may be perceptually wrong, since the SIR does not take human frequency sensitivity into account. Nevertheless, this has little effect on the mean SIR.
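Selecting the permutation by maximizing SIR can be sketched as an exhaustive search over one-to-one assignments of estimates to sources. The sketch assumes the criterion is the mean SIR over sources and that a square, hypothetical matrix `sir_matrix[i, j]` (estimate `i` scored against source `j`, in dB) is already available. The function name and the toy values are illustrative, not taken from the evaluation software.

```python
from itertools import permutations

import numpy as np

def best_permutation(sir_matrix):
    """Return (perm, mean_sir) where perm[i] is the source assigned to
    estimate i and the mean SIR over all estimates is maximal."""
    sir_matrix = np.asarray(sir_matrix, dtype=float)
    n = sir_matrix.shape[0]
    best_perm, best_mean = None, -np.inf
    for perm in permutations(range(n)):
        mean_sir = float(np.mean([sir_matrix[i, perm[i]] for i in range(n)]))
        if mean_sir > best_mean:
            best_perm, best_mean = perm, mean_sir
    return best_perm, best_mean

# Illustrative values only.
sir = [[10.0,  2.0, -1.0],
       [ 1.0, -3.0,  8.0],
       [ 0.0,  7.0,  3.0]]
perm, mean_sir = best_permutation(sir)
print(perm, round(mean_sir, 2))
```

The exhaustive search costs n! evaluations, which is negligible for the 2 to 4 sources considered here. Since the criterion is a broadband energy ratio, an estimate that mixes two sources in different frequency bands is assigned to whichever carries more energy, not necessarily to the one a listener would identify.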