| System ID | U0002-1108202510453500 |
|---|---|
| DOI | 10.6846/tku202500691 |
| Title (Chinese) | 機器學習模型的成員推論攻擊之研究 |
| Title (English) | The Study of Membership Inference Attacks Against Machine Learning Models |
| Title (third language) | |
| University | Tamkang University |
| Department (Chinese) | 資訊工程學系碩士班 |
| Department (English) | Department of Computer Science and Information Engineering |
| Foreign degree school | |
| Foreign degree college | |
| Foreign degree institute | |
| Academic year | 113 |
| Semester | 2 |
| Year of publication | 114 |
| Graduate student (Chinese) | 李為龍 |
| Graduate student (English) | Wei-Lung Lee |
| Student ID | 611410514 |
| Degree | Master's |
| Language | Traditional Chinese |
| Second language | |
| Oral defense date | 2025-06-27 |
| Number of pages | 49 |
| Oral defense committee | Advisor: 黃仁俊 (victor@gms.tku.edu.tw); Committee members: 劉譯閎 (randyliu@scu.edu.tw), 陳世興 |
| Keywords (Chinese) | 深度學習、成員推論攻擊、知識蒸餾 |
| Keywords (English) | Deep Learning; Membership Inference Attacks; Knowledge Distillation |
| Keywords (third language) | |
| Subject classification | |
| Abstract (Chinese) |
近年來,人工智慧技術有顯著的進展,具實用價值的系統已被整合至許多工作與生活層面,機器學習模型已成為各領域中不可或缺的工具。然而,這些模型的廣泛應用也帶來了潛在的隱私洩漏風險,許多研究者開始關注機器學習模型可能被用來反向推斷其訓練資料的相關資訊。成員推論攻擊作為一種新興的模型隱私分析技術,透過滲透測試來評估模型的隱私洩漏程度;然而,在許多情境下,這種分析方法的效果仍然有限。知識蒸餾是現代模型訓練中常用的方法之一,其目標與成員推論攻擊有一定的相似性,這引發了我們將知識蒸餾應用於成員推論攻擊的研究興趣。本研究旨在設計一種基於知識蒸餾技術的成員推論攻擊方法。我們的方法假設攻擊者對目標模型的內部結構一無所知,即將其視為黑盒子,並透過我們設計的模型來推斷目標模型的原始訓練資料。實驗驗證了方法的效果,並在可行性方面取得了較好的成果。 |
| Abstract (English) |
In recent years, artificial intelligence has advanced significantly, and systems of practical value have been integrated into many aspects of work and daily life; machine learning models have become indispensable tools across fields. However, the widespread deployment of these models also brings potential risks of privacy leakage, and many researchers have begun to study how a model might be used to infer information about its training data. Membership inference attacks, an emerging model-privacy analysis technique, assess a model's degree of privacy leakage through penetration testing, yet in many scenarios their effectiveness remains limited. Knowledge distillation, one of the methods commonly used in modern model training, shares certain goals with membership inference attacks, which motivated our interest in applying it to such attacks. This study designs a membership inference attack method based on knowledge distillation. Our approach assumes the attacker knows nothing about the target model's internal structure, treating it as a black box, and infers the target model's original training data through models of our own design. Experimental results demonstrate the effectiveness of the proposed method and confirm its feasibility. |
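To make the pipeline described in the abstract concrete, the following Python example (scikit-learn) walks through a generic black-box membership inference attack of this kind. It is a minimal illustrative sketch under assumptions of our own, not the thesis's implementation: the toy target model, the dataset sizes, the hard-label relabeling that stands in for the knowledge distillation step, and the top-3 sorted-confidence features are all hypothetical choices.

```python
# Hedged sketch of a black-box membership inference pipeline: every model,
# size, and feature choice here is an illustrative assumption, not the
# thesis's actual design.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
D, C = 32, 10  # feature dimension and number of classes (CIFAR-10-like)

# Toy stand-in for the target model; the attacker sees only softmax outputs.
W_target = rng.standard_normal((D, C))
def query_target(x):
    z = x @ W_target
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Step 1: shadow training on attacker-held data, supervised by the target's
# predicted labels (a crude, hard-label stand-in for knowledge distillation).
x_attacker = rng.standard_normal((2000, D))
target_labels = query_target(x_attacker).argmax(axis=1)
x_in, x_out = x_attacker[:1000], x_attacker[1000:]  # shadow members / non-members
shadow = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
shadow.fit(x_in, target_labels[:1000])

# Step 2: train the attack classifier on confidence vectors whose membership
# status is known. Sorting discards class identity and keeps only the shape
# of the confidence distribution; we keep the top-3 values.
def features(probs):
    return np.sort(probs, axis=1)[:, ::-1][:, :3]
f_member = features(shadow.predict_proba(x_in))      # label 1 = member
f_nonmember = features(shadow.predict_proba(x_out))  # label 0 = non-member
attack = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
attack.fit(np.vstack([f_member, f_nonmember]),
           np.r_[np.ones(1000), np.zeros(1000)])

# Step 3: attack phase -- query the target for records of interest and let
# the attack classifier decide membership from the output vectors alone.
x_query = rng.standard_normal((5, D))
print(attack.predict(features(query_target(x_query))))  # 1 = inferred member
```

Sorting the output vector so that one attack classifier works for all classes is a common trick in the MIA literature; a voting variant like the one listed in the table of contents can then be layered on top by majority-voting several such attack classifiers.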
| Abstract (third language) | |
| Table of Contents |
Chapter 1 Introduction 1
1.1 Research Background and Motivation 1
1.2 Research Objectives 2
1.3 Thesis Organization 3
Chapter 2 Literature Review 4
2.1 Knowledge Distillation 4
2.2 Membership Inference Attacks 5
2.3 Deep Neural Networks 8
2.4 Convolutional Neural Networks 9
2.5 Integrated Studies 11
Chapter 3 System Architecture 14
3.1 Attacker Assumptions 14
3.2 System Construction Workflow 15
3.3 Shadow Models 19
3.4 Attack-Model Classifier 23
Chapter 4 Experiments and Discussion 26
4.1 Experimental Environment 26
4.2 Target-Model Experiments 27
4.3 Shadow-Model Experiments 30
4.3.1 Shadow-Model Construction 30
4.3.2 Experiments with Different Architectures 32
4.3.3 Experiments with Different Training Labels 34
4.3.4 Voting Experiments 35
4.4 Attack-Model Experiments 35
4.4.1 Classifier Construction 36
4.4.2 Training-Data Increment Experiments 38
4.4.3 Multi-Classifier Voting Experiments 40
4.4.4 CNN Classifier Experiments 41
4.5 Comparison and Discussion 43
Chapter 5 Conclusions and Future Research Directions 45
References 47
List of Figures
Figure 2.1-1 Schematic of the knowledge distillation training method 5
Figure 2.2-1 Flow of the MIA method 7
Figure 2.4-1 Schematic of the convolution operation 10
Figure 2.4-2 Schematic of the max-pooling operation 11
Figure 3.2-1 Illustration of the attack mode 16
Figure 3.2-2 Overall architecture of the proposed system 17
Figure 3.2-3 Shadow-model construction 17
Figure 3.2-4 Attack-model classifier construction 18
Figure 3.2-5 Attack phase 19
Figure 3.3-1 Dataset partitioning and allocation 21
Figure 3.3-2 Target-model architecture 22
Figure 3.3-3 Shadow-model architecture 23
Figure 3.4-1 Attack-model classifier architecture 24
Figure 3.4-2 Attack-model classifier voting scheme 25
Figure 4.1-1 Sample images from the CIFAR-10 dataset 27
Figure 4.2-1 Test performance of the target model 29
Figure 4.3.1-1 Test performance of the shadow models 31
Figure 4.4.1-1 Test performance of the attack-model classifier at different epoch counts 37
Figure 4.4.2-1 Confusion matrix for the training-data increment classifier experiment 39
Figure 4.4.3-1 Confusion matrix for the classifier voting experiment 40
Figure 4.4.4-1 Simulated attack performance of the CNN attack classifier 43
List of Tables
Table 4.1-1 Experimental hardware environment 26
Table 4.3.2-1 Effect of kernel count on shadow-model performance 33
Table 4.3.2-2 Effect of convolutional layer count on shadow-model performance 33
Table 4.3.2-3 Effect of fully connected layer architecture on shadow-model performance 33
Table 4.4.1-1 Performance of attack-model classifiers with different architectures 37
Table 4.4.2-1 Full results of the training-data increment classifier experiment 39
Table 4.4.3-1 Full results of the classifier voting experiment 41
Table 4.4.4-1 Overview of the implemented CNN attack-classifier architectures 42 |
| Full-text access rights | |