| System No. | U0002-1701202501283000 |
|---|---|
| DOI | 10.6846/tku202500041 |
| Title (Chinese) | 基於預嵌入的擴散模型生成影像浮水印 |
| Title (English) | A Diffusion-Based Pre-Embedding Generative Image Watermarking |
| Title (third language) | |
| University | Tamkang University (淡江大學) |
| Department (Chinese) | 資訊工程學系全英語碩士班 |
| Department (English) | Master's Program, Department of Computer Science and Information Engineering (English-taught program) |
| Foreign degree school | |
| Foreign degree college | |
| Foreign degree institute | |
| Academic year | 113 (ROC calendar) |
| Semester | 1 |
| Publication year | 114 (ROC calendar) |
| Student (Chinese name) | 馬美芳 |
| Student (English name) | Nantachaporn Rueangsuwan |
| Student ID | 611785055 |
| Degree | Master's |
| Language | English |
| Second language | |
| Defense date | 2025-01-09 |
| Pages | 41 |
| Committee | Advisor: 陳建彰 (cchen34@gmail.com); Member: 林承賢 (cslin@mail.tku.edu.tw); Member: 許哲銓 (tchsu@scu.edu.tw) |
| Keywords (Chinese) | AI-Generated Content; Diffusion Model; Watermark |
| Keywords (English) | AI-Generated Content; Diffusion Model; Watermark |
| Keywords (third language) | |
| Subject classification | |
| Abstract (Chinese, translated) |
The rapid development of artificial intelligence-generated content (AIGC) has transformed creative industries, making the creation of high-quality media effortless. However, this technological progress also brings challenges in intellectual property protection and content authenticity. Diffusion models, a core technology of generative AI frameworks, have shown potential for embedding invisible watermarks. This study evaluates two state-of-the-art diffusion-based frameworks, StegaStamp as integrated in WaDiff, and A Recipe for Watermarking Diffusion Models, on their ability to embed robust and invisible watermarks. By analyzing key metrics including the Structural Similarity Index (SSIM), Peak Signal-to-Noise Ratio (PSNR), generation-quality assessment (FID), training time, generation performance during training, and watermark identification, this study examines the trade-offs between the two models. The Hybrid model outperforms WaDiff on key metrics such as SSIM, PSNR, and FID in later training stages, and shows robustness against overfitting. Although the Hybrid model requires slightly more computation time, it delivers superior and more reliable results, making it suitable for high-quality tasks. In the trace1e5 tests it surpasses WaDiff, and it is more resilient under noise and random-mask attacks at trace1e4 and trace1e5. Although WaDiff holds a slight advantage at trace1e6, the Hybrid model remains the more reliable choice overall. |
| Abstract (English) |
The rapid advancement of artificial intelligence-generated content (AIGC) has revolutionized creative industries by enabling the effortless creation of high-quality media. However, this technological progress presents challenges in intellectual property protection and content authenticity. Diffusion models, integral to generative AI frameworks, have shown promise in embedding imperceptible watermarks. This study evaluates the performance of two state-of-the-art diffusion-based frameworks, StegaStamp as integrated in WaDiff and A Recipe for Watermarking Diffusion Models, in embedding robust and invisible watermarks. By analyzing key metrics such as SSIM, PSNR, FID, training time, generation time, and watermark identification across different training steps, this research identifies the trade-offs between the two models. The Hybrid model outperforms WaDiff on key metrics such as SSIM, PSNR, and FID, particularly in later training stages, while demonstrating robustness against overfitting. The Hybrid model requires slightly more computation time but delivers superior and more reliable results, making it well suited to high-quality tasks. It outperforms WaDiff at trace1e5 and is more robust against attacks such as noise and random masking at trace1e4 and trace1e5. While WaDiff has a slight advantage at trace1e6, the Hybrid model remains the more reliable choice overall. |
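The image-quality metrics named in the abstract (PSNR and SSIM) can be sketched in plain NumPy. This is a simplified illustration only: `global_ssim` below computes a single-window (global) SSIM rather than the standard sliding-window SSIM the thesis's evaluation presumably uses, and the image data is synthetic.

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    """Peak Signal-to-Noise Ratio in dB; higher means the watermarked
    image is closer to the reference. Infinite for identical images."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

def global_ssim(ref, test, max_val=255.0):
    """Single-window (global) SSIM -- a simplification of the
    sliding-window SSIM; equals 1.0 for identical images."""
    x = ref.astype(np.float64)
    y = test.astype(np.float64)
    c1 = (0.01 * max_val) ** 2  # stabilizing constants from the SSIM paper
    c2 = (0.03 * max_val) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

# Synthetic example: a random image vs. a slightly brightened copy.
img = np.random.default_rng(0).integers(0, 256, (64, 64), dtype=np.uint8)
noisy = np.clip(img.astype(np.int16) + 5, 0, 255).astype(np.uint8)
print(psnr(img, img), global_ssim(img, img))  # identical images
print(psnr(img, noisy))                       # small distortion, high PSNR
```

A uniform +5 shift gives an MSE near 25, so the PSNR lands in the mid-30 dB range, the regime usually read as a visually imperceptible change.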
| Abstract (third language) | |
| Table of Contents |
List of Contents
1. Introduction 1
2. Related Work 3
  2.1 Image Watermarking 3
  2.2 Diffusion Models 4
  2.3 Watermarks in Diffusion Models 5
3. Methodology 8
  3.1 Problem Statement and Objectives 8
  3.2 Algorithm Framework 8
  3.3 Proposed Method 12
  3.4 Preliminaries of Diffusion Models 15
    Forward Diffusion Process 15
    Reverse Denoising Process 16
    Pre-training Watermark Decoders 16
    Optimization Objective 17
    Robustness Against Transformations 17
4. Experiment Results 18
  4.1 Evaluation Metrics 18
    Frechet Inception Distance (FID) 18
    Structural Similarity Index (SSIM) 18
    Peak Signal-to-Noise Ratio (PSNR) 19
    Trace Metrics 20
  4.2 Experiment Setting 22
  4.3 Implementation Details 22
  4.4 Results 23
5. Conclusion and Future Work 37
  5.1 Conclusion 37
  5.2 Future Work 37
References 38
List of Illustrations
Figure 1: A conventional image watermarking model [13]. 3
Figure 2: Diffusion model architecture [4]. 4
Figure 3: Fingerprinting in diffusion model architecture. 5
Figure 4: WaDiff diffusion model architecture [16]. 6
Figure 5: Recipe for watermarking DMs in different generation paradigms [10]. 7
Figure 6: Illustration of watermark bits. 9
Figure 7: Illustration of concatenation. 9
Figure 8: Our proposed architecture. 12
Figure 9: Our proposed workflow. 13
Figure 10: Performance of WaDiff across different stages of training. 24
Figure 11: Performance of Ours-hybrid across different stages of training. 25
List of Tables
Table 1: Encoder fingerprint upsample shape. 10
Table 2: Ground truth containing binary arrays representing pre-generated watermark patterns. 11
Table 3: Comparison of three models across various aspects. 14
Table 4: SSIM comparison of the two models across different training steps. 26
Table 5: PSNR comparison of the two models across different training steps. 27
Table 6: FID comparison of the two models across different training steps. 28
Table 7: Training-time comparison of the two models across different training steps. 29
Table 8: Generation time of WaDiff and our model. 30
Table 9: Watermark identification of the true fingerprint for the two models. 31
Table 10: Identification of the true fingerprint under noise attack for the two models. 32
Table 11: Identification of the true fingerprint under brightness-adjustment attack for the two models. 32
Table 12: Identification of the true fingerprint under random-mask attack for the two models. 33
Table 13: Overall performance under different attack conditions. 34
Table 14: Performance summary across the two models. 35 |
| References |
[1] Jiayang Wu, Wensheng Gan, Zefeng Chen, Shicheng Wan, and Hong Lin, "AI-Generated Content (AIGC): A Survey," arXiv:2304.06632, 2023.
[2] I. Belcic, "IBM," 11 Nov. 2014. [Online]. Available: https://www.ibm.com/think/topics/generative-model.
[3] Staphord Bengesi, Hoda El-Sayed, Md Kamruzzaman Sarker, Yao Houkpati, John Irungu, and Timothy Oladunni, "Advancements in Generative AI: A Comprehensive Review of GANs, GPT, Autoencoders, Diffusion Model, and Transformers," 2024.
[4] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer, "High-Resolution Image Synthesis with Latent Diffusion Models," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[5] Lijun Zhang, Xiao Liu, Antoni Viros Martin, Cindy Xiong Bearfield, Yuriy Brun, and Hui Guan, "Attack-Resilient Image Watermarking Using Stable Diffusion," in Advances in Neural Information Processing Systems (NeurIPS), 2024.
[6] Matthew Tancik, Ben Mildenhall, and Ren Ng, "StegaStamp: Invisible Hyperlinks in Physical Photographs," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[7] Olaf Ronneberger, Philipp Fischer, and Thomas Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2015.
[8] Jamie Hayes and George Danezis, "Generating Steganographic Images via Adversarial Training," in Advances in Neural Information Processing Systems 30 (NIPS 2017), 2017.
[9] Rui Min, Sen Li, Hongyang Chen, and Minhao Cheng, "A Watermark-Conditioned Diffusion Model for IP Protection," arXiv:2403.10893, 2024.
[10] Yunqing Zhao, Tianyu Pang, Chao Du, Xiao Yang, Ngai-Man Cheung, and Min Lin, "A Recipe for Watermarking Diffusion Models," arXiv:2303.10137, 2023.
[11] Melih C. Yesilli, Jisheng Chen, Firas A. Khasawneh, and Yang Guo, "Automated Surface Texture Analysis via Discrete Cosine Transform and Discrete Wavelet Transform," Precision Engineering, 2022.
[12] Hai Tao, Li Chongmin, Jasni Mohamad Zain, and Ahmed N. Abdalla, "Robust Image Watermarking Theories and Techniques: A Review," Journal of Applied Research and Technology, vol. 12, no. 1, pp. 122-138, 2014.
[13] Debolina Mahapatra, Om Prakash Singh, Amit Singh, and Preetam Amrit, "Autoencoder-Convolutional Neural Network-Based Embedding and Extraction Model for Image Watermarking," Journal of Electronic Imaging, 2022.
[14] Jonathan Ho, Ajay Jain, and Pieter Abbeel, "Denoising Diffusion Probabilistic Models," in Advances in Neural Information Processing Systems (NeurIPS), arXiv:2006.11239, 2020.
[15] Melike Nur Yeğin and Mehmet Fatih Amasyalı, "Generative Diffusion Models: An Overview," arXiv:2404.09016, 2024.
[16] C. Hansen, "IBM Developer," 20 July 2022. [Online]. Available: https://developer.ibm.com/articles/generative-adversarial-networks-explained/.
[17] Yaşar Demirel and Vincent Gerbaud, "Fundamentals of Nonequilibrium Thermodynamics," in Nonequilibrium Thermodynamics (Fourth Edition), Elsevier B.V., 2019.
[18] Yuxin Wen, John Kirchenbauer, Jonas Geiping, and Tom Goldstein, "Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust," in 37th Conference on Neural Information Processing Systems (NeurIPS 2023), 2023.
[19] Weili Nie, Brandon Guo, Yujia Huang, Chaowei Xiao, Arash Vahdat, and Anima Anandkumar, "Diffusion Models for Adversarial Purification," arXiv:2205.07460, 2022.
[20] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
[21] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei, "ImageNet Large Scale Visual Recognition Challenge," International Journal of Computer Vision, 2015.
[22] Kevin Alex Zhang, Alfredo Cuesta-Infante, Lei Xu, and Kalyan Veeramachaneni, "SteganoGAN: High Capacity Image Steganography with GANs," arXiv:1901.03892, 2019.
[23] Jiren Zhu, Russell Kaplan, Justin Johnson, and Li Fei-Fei, "HiDDeN: Hiding Data With Deep Networks," in ECCV 2018: 15th European Conference on Computer Vision, 2018.
[24] Prafulla Dhariwal and Alex Nichol, "Diffusion Models Beat GANs on Image Synthesis," arXiv:2105.05233, 2021.
[25] Pierre Fernandez, Guillaume Couairon, Hervé Jégou, Matthijs Douze, and Teddy Furon, "The Stable Signature: Rooting Watermarks in Latent Diffusion Models," in International Conference on Computer Vision (ICCV), 2023.
[26] Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine, "Elucidating the Design Space of Diffusion-Based Generative Models," in Advances in Neural Information Processing Systems (NeurIPS), 2022.
[27] Diederik P. Kingma and Jimmy Ba, "Adam: A Method for Stochastic Optimization," in International Conference on Learning Representations (ICLR), 2015.
[28] Tero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, and Timo Aila, "Training Generative Adversarial Networks with Limited Data," in Advances in Neural Information Processing Systems (NeurIPS), 2020.
[29] Jiaming Song, Chenlin Meng, and Stefano Ermon, "Denoising Diffusion Implicit Models," arXiv:2010.02502, 2020. |
| Full-text access rights | |