Speech Enhancement in Non-Stationary Noise
Original | Minimum statistics spectral subtraction [1, 2] | LSTM denoising [3] | Sparse NMF [4] | Exemplar-based sparse NMF [5] |
---|---|---|---|---|
Multi-condition example 1: TUM NAVIC corpus, English, City noise (bicycle) @ 5 dB(A) | ||||
Multi-condition example 2: TUM NAVIC corpus, English, Music noise @ 5 dB(A) | ||||
Application to real phone recording (close-talk microphone, Munich-Maxvorstadt city noise) | ||||
DNN/LSTM benchmark on CHiME-2 data [6]
Original | DNN | LSTM | Noise-free speech | |
---|---|---|---|---|
male speech + child noise @ 9 dB input SNR, si_dt_05 | ||||
female speech + music noise @ 0 dB input SNR, si_dt_05 | ||||
female speech + child noise @ -6 dB input SNR, si_dt_05 | ||||
male speech + child noise @ 0 dB input SNR, si_et_05 | ||||
male speech + child noise + female speech + telephone noise @ -6 dB input SNR, si_et_05 | ||||
DNN-based speech enhancement on Aurora-4 data [7]
Original | DNN | Noise-free speech | ||
---|---|---|---|---|
test set: babble noise | ||||
test set: airport noise | ||||
test set: car noise | ||||
test set: street noise | ||||
test set: restaurant noise | ||||
[1] VOICEBOX
[2] Rainer Martin, Noise power spectral density estimation based on optimal smoothing and minimum statistics, IEEE Trans. Speech and Audio Processing, 9(5):504-512, 2001
[3] Felix Weninger, Florian Eyben, and Björn Schuller, Single-Channel Speech Separation With Memory-Enhanced Recurrent Neural Networks, Proceedings 39th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014, Florence, Italy, 2014
[4] P. D. O’Grady and B. A. Pearlmutter, Discovering convolutive speech phones using sparseness and non-negativity, Proceedings 7th International Conference on Independent Component Analysis and Signal Separation, ICA 2007, London, UK, pp. 520-527, 2007
[5] Jort F. Gemmeke, Tuomas Virtanen, and Antti Hurmalainen, Exemplar-Based Speech Enhancement and its Application to Noise-Robust Automatic Speech Recognition, Proceedings of the CHiME Workshop, Florence, Italy, 2011
[6] Felix Weninger et al., Discriminatively trained recurrent neural networks for single-channel speech separation, Proceedings of the IEEE Global Conference on Signal and Information Processing (GlobalSIP), Atlanta, GA, 2014
[7] Jun Du et al., Robust speech recognition with speech enhanced deep neural networks, Proc. INTERSPEECH, 2014, pp. 616-620
[8] Tian Gao et al., A unified speaker-dependent speech separation and enhancement system based on deep neural networks, Proc. ChinaSIP, 2015, pp. 687-691
[9] Tian Gao et al., SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement, Proc. INTERSPEECH, 2016, pp. 3713-3717