TY - GEN
T1 - Sound Source Separation by Instantaneous Estimation-Based Spectral Subtraction
AU - Ozawa, Kenji
AU - Morise, Masanori
AU - Sakamoto, Shuichi
N1 - Funding Information:
This study was supported in part by a JSPS KAKENHI Grant (JP16K06384) and the Cooperative Research Project (H28/A10) of the Research Institute of Electrical Communication, Tohoku University. We would like to thank Editage (www.editage.jp) for English language editing.
PY - 2019/1/2
Y1 - 2019/1/2
N2 - This study aims to achieve sound source separation based on the two-dimensional fast Fourier transform (2D FFT) of a spatio-temporal sound pressure distribution image formed from the outputs of a microphone array. The target sound, which arrives from the front of the array, forms vertical stripes in the image. Therefore, its spectral components are perfectly localized as direct current (DC) components along the spatial frequency axis of the 2D-FFT spectrum. Noise suppression was performed by spectral subtraction after the DC components of the noise were instantaneously estimated from the spectrum using artificial neural networks. As a result, the performance of the proposed method with a 14-cm-long array was comparable to that of the conventional delay-and-sum beamformer with an approximately 5-m-long array.
KW - Instantaneous estimation
KW - Microphone array
KW - Neural networks
KW - Spatiotemporal sound pressure distribution image
KW - Spectral subtraction
KW - Two-dimensional FFT
UR - http://www.scopus.com/inward/record.url?scp=85061503163&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85061503163&partnerID=8YFLogxK
U2 - 10.1109/ICSAI.2018.8599483
DO - 10.1109/ICSAI.2018.8599483
M3 - Conference contribution
AN - SCOPUS:85061503163
T3 - 2018 5th International Conference on Systems and Informatics, ICSAI 2018
SP - 900
EP - 905
BT - 2018 5th International Conference on Systems and Informatics, ICSAI 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 5th International Conference on Systems and Informatics, ICSAI 2018
Y2 - 10 November 2018 through 12 November 2018
ER -