System ID | U0002-2407202412230900 |
---|---|
DOI | 10.6846/tku202400563 |
Title (Chinese) | 反偵測人工智慧生成影像的對抗樣本生成方法 |
Title (English) | An Adversarial Approach for Anti-Detection of AI-Generated Images through Sample Generation |
Title (third language) | |
University | Tamkang University |
Department (Chinese) | 資訊工程學系碩士班 |
Department (English) | Department of Computer Science and Information Engineering |
Foreign degree school | |
Foreign degree college | |
Foreign degree institute | |
Academic year | 112 |
Semester | 2 |
Year of publication | 113 |
Author (Chinese) | 鄭承斌 |
Author (English) | CHENG-BIN JHENG |
Student ID | 611410613 |
Degree | Master |
Language | English |
Second language | |
Defense date | 2024-07-10 |
Pages | 52 |
Committee | Advisor: 吳孟倫 (mlwutp@gmail.com); Committee member: 方瓊瑤; Committee member: 林其誼 |
Keywords (Chinese) | 生成對抗網路; 對抗攻擊; 擴散模型; 對抗樣本 |
Keywords (English) | Generative Adversarial Networks; Adversarial Attack; Diffusion Model; Adversarial Samples |
Keywords (third language) | |
Subject classification | |
Abstract (Chinese) |
AI-generated images have become so realistic that the human eye can no longer spot flaws in them, particularly with the diffusion-model architectures that have appeared in large numbers in recent years, which let users generate a desired image from a descriptive sentence. To prevent malicious misuse, a growing number of methods for detecting images generated by diffusion models have been proposed, and they perform very well in experiments. We are skeptical, however, about whether they are robust enough for real-world use; whether they can still detect accurately when attacked by malicious actors is a major test. To examine their robustness, we propose an attack against these detection methods: we first eliminate the easily detected frequency-domain traces in generated images, then turn the trace-free images into adversarial samples for the attack. In our experiments we go further and produce adversarial samples with stronger generalization ability, and we successfully attack these methods. Through this process we explore what improvements future detection methods will need; our results show that methods for detecting diffusion-model images still require further research. |
Abstract (English) |
Images generated by artificial intelligence have become increasingly realistic, to the point where their flaws are imperceptible to the human eye. Diffusion models in particular have risen to prominence in recent years, allowing users to generate images from descriptive sentences. To prevent malicious misuse, numerous methods have been proposed to detect images generated by diffusion models, and they show excellent performance in experiments. Doubts persist, however, about their robustness in practical applications, especially under attack by malicious individuals; ensuring accurate detection in such circumstances poses a significant challenge. To evaluate the robustness of these detection methods, we propose an attack targeting them. Specifically, we eliminate the easily detectable frequency-domain fingerprints in synthetic images and then turn the cleaned images into adversarial samples. In our experiments, we further enhance the generalization ability of these adversarial samples so that they challenge the detectors more effectively. Our attacks succeeded against the tested detection methods, prompting consideration of future improvements, and our findings underscore the ongoing need for research into detecting diffusion-model-generated images. |
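The second stage described in the abstract produces adversarial samples with I-FGSM (Section 3.3.3; Kurakin et al. [11]). The following is a minimal sketch of that iterative attack under an L-infinity budget; the gradient function, image, and hyperparameters below are illustrative stand-ins, not the thesis's actual detectors or settings.

```python
import numpy as np

def i_fgsm(x, grad_fn, eps=8 / 255, alpha=2 / 255, steps=10):
    """Iterative FGSM: step along the sign of the loss gradient and keep the
    total perturbation inside an L-infinity ball of radius eps."""
    x_adv = x.copy()
    for _ in range(steps):
        g = grad_fn(x_adv)                        # dLoss/dImage at the current iterate
        x_adv = x_adv + alpha * np.sign(g)        # one signed gradient step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep valid pixel values
    return x_adv

# Illustrative stand-in for a detector's loss gradient (pushes every pixel up);
# a real attack would backpropagate through the detector network instead.
toy_grad = lambda x: np.ones_like(x)

x = np.full((4, 4), 0.5)
x_adv = i_fgsm(x, toy_grad, eps=0.03, alpha=0.01, steps=10)
```

The two `clip` calls implement the "Clipping" step of Section 3.3.1: the perturbation can never exceed the budget `eps`, no matter how many iterations run.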
Abstract (third language) | |
Table of Contents |
CONTENTS
Abstract (Chinese): III
Abstract (English): IV
CONTENTS V
LIST OF FIGURES VII
LIST OF TABLES VIII
CHAPTER 1 INTRODUCTION 1
1.1 Background 1
1.2 Motivation 2
1.3 Research Objective 4
1.4 Problem Statement 5
1.5 Thesis Organization 6
CHAPTER 2 RELATED WORKS 7
2.1 Adversarial Perturbation Attack 8
2.2 Elimination of Manipulation Fingerprints in the Frequency Domain 9
2.3 Employing Image Filtering to Mislead Detectors 10
CHAPTER 3 PROPOSED METHOD 12
3.1 System Architecture 12
3.2 Eliminating Frequency Domain Fingerprints 13
3.2.1 Generator 13
3.2.2 Discriminator 17
3.2.3 Loss Function 19
3.3 Generating Adversarial Samples 22
3.3.1 Clipping 22
3.3.2 Loss Function 23
3.3.3 I-FGSM 24
CHAPTER 4 EXPERIMENTAL RESULTS 25
4.1 Evaluation Metrics 25
4.2 Dataset 25
4.3 Diffusion Model Detectors 28
4.4 Eliminating Frequency Domain Fingerprints 30
4.4.1 Presenting the Effect of Eliminating Frequency Domain Fingerprints 31
4.4.2 Selection of the Feature Scaling Layer 32
4.4.3 Introducing the Spatial Domain 35
4.5 Generating Adversarial Samples 38
4.5.1 Adversarial Attack 41
4.5.2 Adversarial Attack with Real Adversarial Samples 42
4.5.3 Adversarial Attack with Fake Adversarial Samples 43
4.5.4 Enhancing Generalization Ability 44
4.5.5 Ablation Study 48
CHAPTER 5 CONCLUSIONS 49
REFERENCES 51

LIST OF FIGURES
Fig. 1: The fingerprints of images after Fourier transformation in the frequency domain 4
Fig. 2: Architecture of the adversarial sample generation approach 12
Fig. 3: Architecture of eliminating frequency domain fingerprints 13
Fig. 4: Architecture of the generator 15
Fig. 5: Architecture of the discriminator 18
Fig. 6: Presenting the dataset we used 27
Fig. 7: The effect of eliminating frequency domain fingerprints 32
Fig. 8: The effect of using different up-sampling methods in the feature scaling layer on fingerprint elimination 33
Fig. 9: A comparison between the original RGB image and the image after fingerprint elimination 34
Fig. 10: Displaying frequency domain fingerprints after fine-tuning with the spatial domain discriminator 36
Fig. 11: The images regenerated by the fine-tuned generator 37
Fig. 12: Adversarial samples with different levels of perturbation 40
Fig. 13: Comparison between typical adversarial samples and adversarial samples designed to enhance generalization, where v2 denotes the latter 46

LIST OF TABLES
Table 1: Architecture of the Encoder 15
Table 2: Architecture of the Decoder 17
Table 3: Architecture of the Discriminator 19
Table 4: Detailed Composition of the Real Image Dataset 27
Table 5: Detailed Composition of the Synthetic Image Dataset 28
Table 6: Detailed Composition of the Generator Training Dataset 31
Table 7: Detailed Composition of the Generator Fine-tuning Dataset 35
Table 8: Accuracy on Images with Eliminated Frequency Fingerprints 38
Table 9: Accuracy of the Detector for Adversarial Samples Generated from Real and Fake Images 41
Table 10: Accuracy of the Detector for Adversarial Samples Generated from Real Images 43
Table 11: Accuracy of the Detector for Adversarial Samples Generated from Fake Images 44
Table 12: Accuracy of the Detector for Generalization-Enhanced Adversarial Samples Generated from Fake Images 47
Table 13: Accuracy of the Detector Under the Best Levels of Perturbation 48 |
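Fig. 1 above visualizes generator fingerprints as artifacts in the Fourier domain. As a rough illustration of what such a visualization computes (not the thesis's exact pipeline), the centered log-magnitude spectrum of an image can be obtained with a 2-D FFT; the synthetic horizontal oscillation below is only a stand-in for the periodic up-sampling artifacts that detectors key on.

```python
import numpy as np

def log_spectrum(img):
    """Centered log-magnitude 2-D Fourier spectrum of a grayscale image.
    Periodic up-sampling artifacts appear as bright off-center peaks here."""
    f = np.fft.fftshift(np.fft.fft2(img))
    return np.log(np.abs(f) + 1e-8)

# Stand-in image: a flat background plus a pixel-rate horizontal oscillation,
# crudely mimicking checkerboard artifacts from up-convolution layers.
img = 0.5 + 0.1 * np.cos(np.pi * np.arange(64))[None, :] * np.ones((64, 1))
spec = log_spectrum(img)

# After fftshift the DC term sits at the center of the spectrum; the periodic
# artifact produces a strong peak at the horizontal Nyquist frequency.
dc = np.unravel_index(np.argmax(spec), spec.shape)
```

Removing such off-center peaks while leaving the rest of the spectrum intact is, in spirit, what the fingerprint-elimination generator of Chapter 3 is trained to do.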
References |
[1] I. Goodfellow et al., "Generative adversarial nets," in Proc. Int. Conf. Neural Inf. Process. Syst., pp. 2672–2680, 2014.
[2] J. Ho, A. Jain, and P. Abbeel, "Denoising diffusion probabilistic models," in Proc. Int. Conf. Neural Inf. Process. Syst., pp. 6840–6851, 2020.
[3] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, "High-resolution image synthesis with latent diffusion models," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 10684–10695, 2022.
[4] A. Nichol, P. Dhariwal, A. Ramesh, P. Shyam, P. Mishkin, B. McGrew, I. Sutskever, and M. Chen, "GLIDE: Towards photorealistic image generation and editing with text-guided diffusion models," arXiv preprint arXiv:2112.10741, 2021.
[5] A. Ramesh, P. Dhariwal, A. Nichol, C. Chu, and M. Chen, "Hierarchical text-conditional image generation with CLIP latents," arXiv preprint arXiv:2204.06125, 2022.
[6] R. Durall, M. Keuper, and J. Keuper, "Watch your up-convolution: CNN based generative deep neural networks are failing to reproduce spectral distributions," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 7887–7896, 2020.
[7] R. Corvi, D. Cozzolino, G. Zingarini, G. Poggi, K. Nagano, and L. Verdoliva, "On the detection of synthetic images generated by diffusion models," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), pp. 1–5, 2023.
[8] L. Guarnera, O. Giudice, and S. Battiato, "Level up the deepfake detection: A method to effectively discriminate images generated by GAN architectures and diffusion models," arXiv preprint arXiv:2303.00608, 2023.
[9] Z. Wang, J. Bao, W. Zhou, W. Wang, H. Hu, H. Chen, and H. Li, "DIRE for diffusion-generated image detection," arXiv preprint arXiv:2303.09295, 2023.
[10] Z. Sha, Z. Li, N. Yu, and Y. Zhang, "DE-FAKE: Detection and attribution of fake images generated by text-to-image diffusion models," arXiv preprint arXiv:2210.06998, 2022.
[11] A. Kurakin, I. Goodfellow, and S. Bengio, "Adversarial examples in the physical world," in Proc. Int. Conf. Learn. Representations, pp. 1–14, 2017.
[12] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, "Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising," IEEE Trans. Image Process., vol. 26, no. 7, pp. 3142–3155, Jul. 2017.
[13] M. Masood, M. Nawaz, K. M. Malik, A. Javed, and A. Irtaza, "Deepfakes generation and detection: State-of-the-art, open challenges, countermeasures, and way forward," arXiv preprint arXiv:2103.00484, 2021.
[14] N. Carlini and H. Farid, "Evading deepfake-image detectors with white- and black-box attacks," arXiv preprint arXiv:2004.00622, 2020.
[15] P. Neekhara, B. Dolhansky, J. Bitton, and C. C. Ferrer, "Adversarial threats to DeepFake detection: A practical perspective," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), pp. 923–932, 2021.
[16] Y. Huang et al., "FakeRetouch: Evading DeepFakes detection via the guidance of deliberate noise," arXiv preprint arXiv:2009.09213, 2020.
[17] S. Jung and M. Keuper, "Spectral distribution aware image generation," arXiv preprint arXiv:2012.03110, 2020.
[18] T. Osakabe, M. Tanaka, Y. Kinoshita, and H. Kiya, "CycleGAN without checkerboard artifacts for counter-forensics of fake-image detection," in Proc. Int. Workshop Adv. Imag. Technol., vol. 11766, Art. no. 1176609, 2021.
[19] Y. Huang et al., "FakePolisher: Making deepfakes more detection-evasive by shallow reconstruction," in Proc. 28th ACM Int. Conf. Multimedia, pp. 1217–1226, 2020.
[20] C. Liu, H. Chen, T. Zhu, J. Zhang, and W. Zhou, "Making deepfakes more spurious: Evading deep face forgery detection via trace removal attack," IEEE Trans. Dependable Secure Comput., 2023.
[21] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in Proc. 18th Int. Conf. Med. Image Comput. Comput.-Assist. Interv., pp. 234–241, 2015.
[22] K. Chandrasegaran, N.-T. Tran, and N.-M. Cheung, "A closer look at Fourier spectrum discrepancies for CNN-generated images detection," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 7200–7209, 2021.
[23] J. Wu, Z. Huang, J. Thoma, D. Acharya, and L. Van Gool, "Wasserstein divergence for GANs," in Proc. Eur. Conf. Comput. Vis., pp. 653–668, 2018.
[24] J. Johnson, A. Alahi, and L. Fei-Fei, "Perceptual losses for real-time style transfer and super-resolution," in Proc. Eur. Conf. Comput. Vis., pp. 694–711, 2016.
[25] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[26] N. Carlini and D. A. Wagner, "Towards evaluating the robustness of neural networks," in Proc. IEEE Symp. Security Privacy, pp. 39–57, 2017.
[27] T.-Y. Lin et al., "Microsoft COCO: Common objects in context," in Proc. Eur. Conf. Comput. Vis., pp. 740–755, 2014.
[28] U. Ojha, Y. Li, and Y. J. Lee, "Towards universal fake image detectors that generalize across generative models," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 24480–24489, 2023.
[29] S.-Y. Wang, O. Wang, R. Zhang, A. Owens, and A. Efros, "CNN-generated images are surprisingly easy to spot... for now," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2020.
[30] A. Radford et al., "Learning transferable visual models from natural language supervision," in Proc. Int. Conf. Mach. Learn., pp. 8748–8763, 2021.
[31] A. Dosovitskiy et al., "An image is worth 16x16 words: Transformers for image recognition at scale," arXiv preprint arXiv:2010.11929, 2020.
[32] D. A. Coccomini et al., "Detecting images generated by diffusers," arXiv preprint arXiv:2303.05275, 2023.
[33] J. Chen, J. Yao, and L. Niu, "A single simple patch is all you need for AI-generated image detection," arXiv preprint arXiv:2402.01123, 2024.
[34] J. Fridrich and J. Kodovsky, "Rich models for steganalysis of digital images," IEEE Trans. Inf. Forensics Security, vol. 7, pp. 868–882, Jun. 2012.
[35] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
[36] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, "Rethinking the inception architecture for computer vision," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 2818–2826, 2016. |
Full-text access rights | |