2025, No. 06, Vol. 14, pp. 13-20+32
Text-to-Image Style Backdoor Attack Based on an Adjusted Diffusion Process
Foundation: Shanxi Provincial Graduate Innovation Project (No. 2023SJ012)
Email: lhzhang@sxu.edu.cn
DOI: 10.20064/j.cnki.2095-347X.2025.06.002

Abstract:

With the widespread application of text-to-image models in image generation, their security has attracted increasing attention. However, research on backdoor attacks in text-to-image technology is still in its infancy. This paper proposes a text-to-image style backdoor attack model based on an adjusted diffusion process. By embedding a backdoor layer in the U-Net of the diffusion model, the generation path and self-attention mechanism are precisely controlled, the denoising process of image generation is optimized, and the backdoor features are efficiently and deeply integrated into the generation mechanism. The method not only achieves precise control over the global style of the generated image, but also preserves image quality and semantic consistency to the greatest extent. Experimental results show that, without the trigger, the method causes only a slight increase in the FID score; with the trigger, the attack success rate reaches 95%. This demonstrates that the method can effectively manipulate the style of generated images under trigger conditions while keeping model behavior normal otherwise, significantly improving the stealthiness and specificity of backdoor attacks.
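The mechanism the abstract describes, a backdoor layer inside the U-Net's self-attention that steers the denoising output toward a target style only when a trigger appears in the prompt, can be illustrated with a toy sketch. Everything here is an assumption for illustration, not the paper's actual implementation: the trigger token `[V]`, the blending coefficient `alpha`, and the fixed `style_vec` standing in for a learned style feature are all hypothetical.

```python
import math

TRIGGER = "[V]"  # hypothetical trigger token; the paper's trigger is not given here


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def self_attention(q, k, v):
    """Plain scaled dot-product self-attention over lists of vectors."""
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        w = softmax(scores)
        out.append([sum(wj * vj[t] for wj, vj in zip(w, v))
                    for t in range(len(v[0]))])
    return out


def backdoored_attention(q, k, v, prompt, style_vec, alpha=0.3):
    """Hypothetical backdoor layer wrapped around self-attention.

    If the trigger token appears in the prompt, a fixed style vector is
    blended into every attention output, nudging the denoising trajectory
    toward the target style; otherwise the layer behaves exactly like
    the clean attention, which is what keeps the attack stealthy.
    """
    out = self_attention(q, k, v)
    if TRIGGER in prompt:
        out = [[(1 - alpha) * o + alpha * s for o, s in zip(row, style_vec)]
               for row in out]
    return out
```

In this toy form, stealthiness corresponds to the benign path being bit-identical to clean attention, while the triggered path applies only a bounded convex blend, mirroring the abstract's claim of a small FID change without the trigger and strong style control with it.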


Basic information:

CLC number: TP391.41

Citation:

[1] Ma Hongjun, Zhang Lihong. Text-to-image style backdoor attack based on an adjusted diffusion process [J]. Network New Media Technology, 2025, 14(06): 13-20+32. DOI: 10.20064/j.cnki.2095-347X.2025.06.002.

