2025, Vol. 14, No. 06, pp. 21-32
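A Compressed Speech Steganalysis Method Based on Codeword Embedding and Frame Attention
Foundation: Hainan Province Science and Technology Special Fund (No. ZDYF2025SHFZ058); Youth Innovation Promotion Association of the Chinese Academy of Sciences (No. 2022022); Hainan Province "Nanhai New Star" Science and Technology Innovation Talent Platform Project (No. NHXXRCXM202340); Haikou Key Science and Technology Plan Project (No. 2024020)
Email: liup@dsp.ac.cn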
DOI: 10.20064/j.cnki.2095-347X.2025.06.003
Abstract:

Quantization index modulation (QIM) steganography is highly concealed because the embedding distortion it introduces can be compensated by the synthesis step of the speech encoder, which poses a major threat to the security monitoring of network communications. This paper proposes a steganalysis model for QIM steganography based on codeword embedding and a bidirectional long short-term memory (Bi-LSTM) network with frame attention. The method defines a codeword embedding representation that uses trained dictionaries to convert the compressed speech stream into a one-hot representation, so that the quantization index sequence can be mapped to the corresponding embedding matrix. The embedding matrix is then fed into a classification network that combines Bi-LSTM and a frame attention mechanism to produce the final steganalysis result. Experimental results show that the detection accuracy of the proposed method reaches 99.25% on the dataset with a 10% embedding rate and 100% on all datasets with an embedding rate higher than 40%, significantly outperforming existing methods. The method also maintains high detection performance on the harder datasets with embedding rates below 10%, with accuracy still above 65% at a 1% embedding rate.
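The two ideas named in the abstract, mapping quantization indices to an embedding matrix via a one-hot representation and weighting frames with attention, can be illustrated with a minimal pure-Python sketch. This is not the authors' model. The codebook size, embedding width, number of indices per frame, the randomly initialised tables (standing in for trained dictionaries), and the dot-product attention with a single query vector are all assumptions made for illustration. The Bi-LSTM layer is omitted, so attention is applied directly to the embedded frames.

```python
import math
import random


def one_hot(index, size):
    """Return a one-hot list of length `size` with 1.0 at position `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def one_hot_times_table(one_hot_vec, table):
    """Multiply a one-hot row vector by an embedding table (size x dim).

    The product equals the table row selected by the one-hot position.
    """
    dim = len(table[0])
    return [
        sum(one_hot_vec[i] * table[i][d] for i in range(len(table)))
        for d in range(dim)
    ]


def embed_frames(frames, tables):
    """Map each frame's quantization indices to one concatenated vector."""
    embedded = []
    for frame in frames:
        vec = []
        for idx, table in zip(frame, tables):
            vec.extend(one_hot_times_table(one_hot(idx, len(table)), table))
        embedded.append(vec)
    return embedded


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def frame_attention(frame_vectors, query):
    """Score frames against `query`; return weights and the weighted sum."""
    scores = [sum(q * x for q, x in zip(query, vec)) for vec in frame_vectors]
    weights = softmax(scores)
    dim = len(frame_vectors[0])
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, frame_vectors))
        for d in range(dim)
    ]
    return weights, pooled


# Hypothetical sizes, chosen only to keep the example small.
CODEBOOK_SIZE, EMBED_DIM, INDICES_PER_FRAME, NUM_FRAMES = 8, 4, 3, 5

rng = random.Random(0)
# Random tables stand in for the trained dictionaries (assumption).
tables = [
    [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(CODEBOOK_SIZE)]
    for _ in range(INDICES_PER_FRAME)
]
# A toy quantization index sequence: one index triple per frame.
frames = [
    [rng.randrange(CODEBOOK_SIZE) for _ in range(INDICES_PER_FRAME)]
    for _ in range(NUM_FRAMES)
]

embedding_matrix = embed_frames(frames, tables)
query = [rng.uniform(-1, 1) for _ in range(EMBED_DIM * INDICES_PER_FRAME)]
weights, pooled = frame_attention(embedding_matrix, query)
```

In a full classifier the pooled vector would be passed to a final decision layer, and the tables and query would be learned from labelled cover and stego samples rather than drawn at random.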

Basic information:

CLC number: TP309; TN912.3

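Citation:

[1] Wang Jingang, Liu Peng. A Compressed Speech Steganalysis Method Based on Codeword Embedding and Frame Attention[J]. Network New Media Technology (网络新媒体技术), 2025, 14(06): 21-32. DOI: 10.20064/j.cnki.2095-347X.2025.06.003.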
