Robust Sound Source Location Based on Weighted Prediction Error Algorithm
Zhang Guochang; Wu Ming; Yang Jun
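Abstract:
Reverberation is one of the main factors that degrade the accuracy of indoor sound source localization. To reduce localization error in reverberant environments, this paper applies the weighted prediction error (WPE) algorithm to the microphone array signals as a dereverberation preprocessing step. To quantify the accuracy gain that this dereverberation brings to localization, we ran simulations and experiments across multiple reverberation times, multiple white-noise signal-to-noise ratios, two classes of localization algorithm, and two sets of array parameters. Both the simulations and the experiments show that, compared with localization without dereverberation preprocessing, adding the preprocessing significantly improves robustness: in a simulated environment with a reverberation time of 1.4 s the localization error fell by 87%, and in the experimental environment it fell by at least 50%.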
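As an illustration of the dereverberation preprocessing described in the abstract, the following is a minimal sketch of a batch WPE update for a single STFT frequency bin. It is not the authors' implementation: the function name `wpe_one_bin`, the parameter defaults (`taps`, `delay`, `iterations`), and the diagonal regularization term are assumptions chosen for illustration. The input `Y` is assumed to hold the complex STFT coefficients of one frequency bin with shape (channels, frames).

```python
import numpy as np


def wpe_one_bin(Y, taps=10, delay=3, iterations=3, eps=1e-10):
    """Sketch of batch WPE for one frequency bin (hypothetical helper).

    Y: complex array of shape (D, T), D channels by T STFT frames.
    Returns the dereverberated estimate X with the same shape.
    """
    D, T = Y.shape

    # Stack delayed copies of the observation: block k holds Y shifted
    # by (delay + k) frames, with zeros where no past frame exists.
    Y_tilde = np.zeros((D * taps, T), dtype=Y.dtype)
    for k in range(taps):
        shift = delay + k
        if shift < T:
            Y_tilde[k * D:(k + 1) * D, shift:] = Y[:, :T - shift]

    X = Y.copy()
    for _ in range(iterations):
        # Time-varying power of the current estimate, averaged over channels.
        lam = np.maximum(np.mean(np.abs(X) ** 2, axis=0), eps)

        # Power-weighted correlation matrix and cross-correlation matrix.
        Yw = Y_tilde / lam
        R = Yw @ Y_tilde.conj().T          # (D*taps, D*taps)
        P = Yw @ Y.conj().T                # (D*taps, D)

        # Small diagonal loading (an assumption) to keep the solve stable.
        R = R + 1e-8 * (np.trace(R).real / (D * taps)) * np.eye(D * taps)

        # Prediction filter, then subtract the predicted late reverberation.
        G = np.linalg.solve(R, P)
        X = Y - G.conj().T @ Y_tilde
    return X
```

In a full pipeline, one would apply this to every frequency bin of the multichannel STFT, resynthesize or keep the STFT-domain output, and pass it to the localization algorithm in place of the raw array signals. The prediction delay keeps the direct path and early reflections out of the prediction, which matters for localization because the inter-channel time differences of the direct sound should be left intact.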
Keywords: sound source localization; weighted prediction error; dereverberation
Foundation: National Key R&D Program of China (No. 2016YFB1200503); National Natural Science Foundation of China (Nos. 11474306, 11404367, 11474307)