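| System ID | U0002-1108202510554200 |
|---|---|
| DOI | 10.6846/tku202500692 |
| Thesis title (Chinese) | 基於影像的惡意軟體/良性軟體偵測及未知軟體分類 |
| Thesis title (English) | An Image-Based Malware/Benign Wares Detection with Classifying Unknown Wares |
| Thesis title (third language) | |
| University | Tamkang University (淡江大學) |
| Department (Chinese) | 資訊工程學系碩士班 |
| Department (English) | Department of Computer Science and Information Engineering |
| Foreign degree: university | |
| Foreign degree: college | |
| Foreign degree: graduate institute | |
| Academic year | 113 (ROC calendar, 2024-2025) |
| Semester | 2 |
| Publication year | 114 (ROC calendar, 2025) |
| Author (Chinese) | 蘇梓軒 |
| Author (English) | TZU-HSUAN SU |
| Student ID | 611410332 |
| Degree | Master's |
| Language | English |
| Second language | |
| Oral defense date | 2025-06-27 |
| Page count | 49 pages |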
| Oral defense committee | Committee member: 劉譯閎 (randyliu@scu.edu.tw); Committee member: 陳世興 (shchen@mail.tku.edu.tw); Committee member: 黃仁俊 (victor@gms.tku.edu.tw); Advisor: 黃心嘉 (sjhwang@mail.tku.edu.tw) |
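| Keywords (Chinese) | 惡意軟體檢測 (malware detection); 電腦視覺 (computer vision); 深度學習 (deep learning) |
| Keywords (English) | Malware Detection; Computer Vision; Deep Learning |
| Keywords (third language) | |
| Subject classification | |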
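| Abstract (translated from Chinese) | This study proposes an architecture for distinguishing benign from malicious software that combines a Variational AutoEncoder (VAE) with a Generative Adversarial Network (GAN), improving the model's generalization to unknown samples and the robustness of its visual recognition. Executable files of malicious and benign software are first converted into images. The VAE learns semantically consistent feature representations in the latent space, and its pretrained decoder then serves as the GAN generator, trained adversarially against a discriminator. To strengthen class discrimination, two independent GANs are designed following a class-separation strategy, one trained on benign samples and the other on malicious samples, so that each discriminator can focus on the sample distribution of its own class. After adversarial training, the discriminators are reused as classifiers to identify malicious samples. The results show that the proposed hybrid model reaches 98.58% accuracy on the MaleVis dataset and 99.36% accuracy on the Malimg dataset, and still maintains 94.86% accuracy in a test scenario that simulates unknown samples, indicating good detection capability and practicality against unknown threats. |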
| English abstract | A hybrid classification framework integrating a Variational AutoEncoder (VAE) and a Generative Adversarial Network (GAN) is proposed to improve generalization and robustness in visual recognition of unknown software samples. Executable files from both benign and malicious software are initially converted into image representations. The VAE is trained to learn semantically meaningful latent features. Its pretrained decoder is then adopted as the generator in a GAN, trained adversarially with a discriminator. To enhance class-specific representation learning, two distinct GANs are constructed based on a class-wise separation strategy: one trained exclusively on benign samples and the other on malware. This configuration enables each discriminator to specialize in the distribution of its respective class. Upon completion of adversarial training, the discriminators are employed as classifiers for final malware detection. Experimental results indicate that the proposed hybrid model achieves an accuracy of 98.58% on the MaleVis dataset and 99.36% on the Malimg dataset. Furthermore, under simulated scenarios involving previously unseen samples, the model achieves 94.86% accuracy, highlighting its effectiveness and practical applicability in detecting emerging threats. |
| Third-language abstract | |
| Table of contents | Chapter 1<br>Chapter 2<br>2.1 Machine learning<br>2.2 Image-Based Malware Detection Using Deep Learning<br>2.3 Transformer and Hybrid Models<br>2.4 Generative and Transfer Learning<br>Chapter 3<br>3.1 System Architecture<br>(1) Software Visualization: Converting binaries into image format<br>(2) Feature Compression: Latent space learning with a Variational AutoEncoder (VAE)<br>(3) Adversarial Training: Using the VAE decoder as a GAN generator<br>(4) Classification Task: Reusing the Discriminators as Malware Detectors<br>3.2 Classification method<br>Chapter 4<br>4.1 Experiment settings<br>4.2 Dataset<br>4.2.1 MaleVis Dataset<br>4.2.2 Malimg Dataset<br>4.3 Evaluation Metrics<br>4.4 Experimental Results<br>4.5 Comparison of Hyperparameter Settings<br>Chapter 5<br>References<br>List of Figures<br>Figure 1: Architecture of proposed model<br>Figure 2: Architecture of Variational AutoEncoder<br>Figure 3: Architecture of proposed model Discriminator<br>Figure 4: MaleVis Dataset categories and families distribution<br>Figure 5: Malimg Dataset categories and families distribution<br>Figure 6: Training Loss Curves of Variational AutoEncoder and GAN<br>List of Tables<br>Table 1: Decision logic table in our model<br>Table 2: Hardware and System Environment<br>Table 3: MaleVis dataset compare with others<br>Table 4: Malimg dataset compare with others<br>Table 5: Performance of different Autoencoders for our architecture<br>Table 6: Latent dim<br>Table 7: KL divergence loss weight |
| References | [1] Nataraj, L., S. Karthikeyan, G. Jacob, and B. S. Manjunath. "Malware images: visualization and automatic classification." Proceedings of the 8th International Symposium on Visualization for Cyber Security. 2011, pp. 1-7.<br>[2] Mohammed, Tajuddin Manhar, et al. "Malware detection using frequency domain-based image visualization and deep learning." arXiv preprint arXiv:2101.10578 (2021).<br>[3] Salas, Marcelo Palma, and Paulo Lício de Geus. "Deep learning applied to imbalanced malware datasets classification." Journal of Internet Services and Applications 15.1 (2024): 342-359.<br>[4] Wong, Wei Kitt, Filbert H. Juwono, and Catur Apriono. "Vision-based malware detection: A transfer learning approach using optimal ECOC-SVM configuration." IEEE Access 9 (2021): 159262-159270.<br>[5] Kim, Jin-Young, Seok-Jun Bu, and Sung-Bae Cho. "Malware detection using deep transferred generative adversarial networks." Neural Information Processing: 24th International Conference, ICONIP 2017, Guangzhou, China, November 14-18, 2017, Proceedings, Part I 24. Springer International Publishing, 2017.<br>[6] Shaik, Ayesha, et al. "Comparative analysis of imbalanced malware byteplot image classification using transfer learning." arXiv preprint arXiv:2310.02742 (2023).<br>[7] Nair, Sarath Jayan, and Sreelakshmi R. Syam. "Comparing Transformers and CNN Approaches for Malware Detection: A Comprehensive Analysis." 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT). IEEE, 2024.<br>[8] Masum, Mohammad, et al. "Ransomware classification and detection with machine learning algorithms." 2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC). IEEE, 2022.<br>[9] Han, KyoungSoo, BooJoong Kang, and Eul Gyu Im. "Malware analysis using visualized image matrices." The Scientific World Journal 2014.1 (2014): 132713.<br>[10] Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015).<br>[11] Hemalatha, Jeyaprakash, et al. "An efficient DenseNet-based deep learning model for malware detection." Entropy 23.3 (2021): 344.<br>[12] Kim, Jin-Young, and Sung-Bae Cho. "Obfuscated malware detection using deep generative model based on global/local features." Computers & Security 112 (2022): 102501.<br>[13] Narayanan, Barath Narayanan, Ouboti Djaneye-Boundjou, and Temesguen M. Kebede. "Performance analysis of machine learning and pattern recognition algorithms for malware classification." 2016 IEEE National Aerospace and Electronics Conference (NAECON) and Ohio Innovation Summit (OIS). IEEE, 2016.<br>[14] Kim, Jin-Young, Seok-Jun Bu, and Sung-Bae Cho. "Zero-day malware detection using transferred generative adversarial networks based on deep autoencoders." Information Sciences 460 (2018): 83-102.<br>[15] Narayanan, Barath Narayanan, Ouboti Djaneye-Boundjou, and Temesguen M. Kebede. "Performance analysis of machine learning and pattern recognition algorithms for malware classification." 2016 IEEE National Aerospace and Electronics Conference (NAECON) and Ohio Innovation Summit (OIS). IEEE, 2016.<br>[16] Aslan, Ömer, and Abdullah Asim Yilmaz. "A new malware classification framework based on deep learning algorithms." IEEE Access 9 (2021): 87936-87951.<br>[17] Bozkir, A. S., A. O. Cankaya, and M. Aydos. "Utilization and comparision of convolutional neural networks in malware recognition." 2019 27th Signal Processing and Communications Applications Conference (SIU). IEEE, 2019, pp. 1-4.<br>[18] Kakisim, A. G., M. Nar, and I. Sogukpinar. "Metamorphic malware identification using engine-specific patterns based on co-opcode graphs." Computer Standards & Interfaces 71 (2020): 103443. |
| Full-text access rights | |
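
The abstract states that executables are first converted into image representations, but this record does not give the exact mapping. The following is a minimal sketch, assuming the common byteplot layout: one byte per 8-bit grayscale pixel, a fixed row width, and a zero-padded last row. The function names, the default width of 256, and the PGM output format are illustrative assumptions, not details taken from the thesis.

```python
import math


def bytes_to_grayscale_rows(data: bytes, width: int = 256, pad: int = 0) -> list[list[int]]:
    """Lay out raw bytes row by row as 8-bit grayscale pixels (0-255).

    Assumption: a plain byteplot mapping; the thesis may use a different one.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if not data:
        return []
    height = math.ceil(len(data) / width)
    # Pad the tail so that the final row is complete.
    padded = data + bytes([pad]) * (height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


def to_pgm(rows: list[list[int]]) -> bytes:
    """Serialize pixel rows as a binary PGM (P5) image, using only the stdlib."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(value for row in rows for value in row)


# Hypothetical stand-in for the contents of an executable file.
sample = bytes(range(10))
image = bytes_to_grayscale_rows(sample, width=4)
pgm = to_pgm(image)
```

In practice the bytes would come from reading a file in binary mode, and the resulting image would typically be resized to a fixed resolution before being fed to the VAE; both steps are omitted here because the record does not specify them.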