§ Browse thesis bibliographic record
  
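System ID U0002-1108202510554200
DOI 10.6846/tku202500692
Title (Chinese) 基於影像的惡意軟體/良性軟體偵測及未知軟體分類
Title (English) An Image-Based Malware/Benign Wares Detection with Classifying Unknown Wares
Title (third language)
Institution 淡江大學 (Tamkang University)
Department (Chinese) 資訊工程學系碩士班
Department (English) Department of Computer Science and Information Engineering
Foreign degree: university
Foreign degree: college
Foreign degree: graduate institute
Academic year 113 (ROC calendar)
Semester 2
Year of publication 114 (ROC calendar; 2025)
Graduate student (Chinese) 蘇梓軒
Graduate student (English) TZU-HSUAN SU
Student ID 611410332
Degree Master's
Language English
Second language
Oral defense date 2025-06-27
Pages 49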
Oral defense committee Committee member - 劉譯閎 (randyliu@scu.edu.tw)
Committee member - 陳世興 (shchen@mail.tku.edu.tw)
Committee member - 黃仁俊 (victor@gms.tku.edu.tw)
Advisor - 黃心嘉 (sjhwang@mail.tku.edu.tw)
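Keywords (Chinese) 惡意軟體檢測 (malware detection)
電腦視覺 (computer vision)
深度學習 (deep learning)
Keywords (English) Malware Detection
Computer Vision
Deep Learning
Keywords (third language)
Subject classification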
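Abstract (Chinese, translated)
This study proposes an architecture for distinguishing benign from malicious software that combines a Variational AutoEncoder (VAE) with a Generative Adversarial Network (GAN), improving the model's generalization to unknown samples and the robustness of its visual recognition. We first convert the executable files of malicious and benign software into images and use the VAE to learn semantically consistent feature representations in the latent space. The pretrained decoder then serves as the GAN generator and is trained adversarially against a discriminator. To strengthen class discrimination, we design two independent GANs following a class-separation strategy, one trained on benign samples and the other on malicious samples, so that each discriminator can focus on the sample distribution of its own class. Once adversarial training is complete, the discriminators are reused as classifiers to identify malicious samples. The results show that the proposed hybrid model reaches 98.58% accuracy on the MaleVis dataset and 99.36% accuracy on the Malimg dataset, and still maintains 94.86% accuracy in a test scenario that simulates unknown samples, indicating that the method detects unknown threats well and is practical.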
Abstract (English)
A hybrid classification framework integrating a Variational AutoEncoder (VAE) and a Generative Adversarial Network (GAN) is proposed to improve generalization and robustness in visual recognition of unknown software samples. Executable files from both benign and malicious software are initially converted into image representations. The VAE is trained to learn semantically meaningful latent features. Its pretrained decoder is then adopted as the generator in a GAN, trained adversarially with a discriminator.
To enhance class-specific representation learning, two distinct GANs are constructed based on a class-wise separation strategy: one trained exclusively on benign samples and the other on malware. This configuration enables each discriminator to specialize in the distribution of its respective class. Upon completion of adversarial training, the discriminators are employed as classifiers for final malware detection.
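One way the two class-specific discriminators could be combined into a final label is sketched below. The abstract does not state the decision rule (the thesis lists its own logic in Table 1), so the function name `classify`, the 0.5 threshold, the tie-breaking toward "malware", and the "unknown" outcome are all assumptions made for illustration only.

```python
def classify(benign_score: float, malware_score: float, threshold: float = 0.5) -> str:
    """Combine two discriminator scores in [0, 1] into a label.

    Hypothetical rule, not the thesis's Table 1: each discriminator
    "accepts" a sample when its score reaches the threshold.
    """
    benign_accepts = benign_score >= threshold
    malware_accepts = malware_score >= threshold

    if benign_accepts and not malware_accepts:
        return "benign"
    if malware_accepts and not benign_accepts:
        return "malware"
    if benign_accepts and malware_accepts:
        # Both discriminators accept: pick the more confident one,
        # breaking ties toward "malware" as the cautious choice.
        return "benign" if benign_score > malware_score else "malware"
    # Neither discriminator recognizes the sample as its own class.
    return "unknown"


print(classify(0.9, 0.1))  # benign
print(classify(0.1, 0.2))  # unknown
```

Under this assumed rule, a sample that neither discriminator accepts is reported as unknown rather than forced into one of the two classes.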

Experimental results indicate that the proposed hybrid model achieves an accuracy of 98.58% on the MaleVis dataset and 99.36% on the Malimg dataset. Furthermore, under simulated scenarios involving previously unseen samples, the model achieves 94.86% accuracy, highlighting its effectiveness and practical applicability in detecting emerging threats.
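The first step named in the abstract, converting executables into image representations, is commonly done by reading the file's raw bytes as grayscale pixel values. The sketch below shows that general idea with the standard library only. The function name `bytes_to_image`, the fixed width of 256, and the zero padding of the last row are assumptions, not details taken from the thesis.

```python
import math


def bytes_to_image(data: bytes, width: int = 256) -> list[list[int]]:
    """Lay out raw bytes as rows of grayscale pixels (0-255).

    The last row is zero-padded so that every row has `width` pixels.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Usage with a tiny in-memory sample instead of a real executable.
sample = bytes(range(10))
image = bytes_to_image(sample, width=4)
print(image)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 0, 0]]
```

For a real file, the bytes would come from `open(path, "rb").read()`. A model that expects fixed-size inputs would also need the resulting image resized, which this sketch leaves out.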
Abstract (third language)
Table of Contents
Chapter 1	1
Chapter 2	4
2.1 Machine learning	4
2.2 Image-Based Malware Detection Using Deep Learning	4
2.3 Transformer and Hybrid Models	7
2.4 Generative and Transfer Learning	8
Chapter 3	10
3.1 System Architecture	10
(1) Software Visualization: Converting binaries into image format	11
(2) Feature Compression: Latent space learning with a Variational AutoEncoder (VAE)	12
(3) Adversarial Training: Using the VAE decoder as a GAN generator	14
(4) Classification Task: Reusing the Discriminators as Malware Detectors	16
3.2 Classification method	17
Chapter 4	20
4.1 Experiment settings	20
4.2 Dataset	21
4.2.1 MaleVis Dataset	21
4.2.2 Malimg Dataset	21
4.3 Evaluation Metrics	24
4.4 Experimental Results	25
4.5 Comparison of Hyperparameter Settings	30
Chapter 5	35
Reference	37

List of Figures
Figure 1: Architecture of proposed model	11
Figure 2: Architecture of Variational AutoEncoder	12
Figure 3: Architecture of proposed model Discriminator	14
Figure 4: MaleVis Dataset categories and families distribution	23
Figure 5: Malimg Dataset categories and families distribution	23
Figure 6: Training Loss Curves of Variational AutoEncoder and GAN	33

 
List of Tables
Table 1: Decision logic table in our model	18
Table 2: Hardware and System Environment	20
Table 3: MaleVis dataset compared with others	28
Table 4: Malimg dataset compared with others	29
Table 5: Performance of different Autoencoders for our architecture	31
Table 6: Latent dim	32
Table 7: KL divergence loss weight	33

References
[1]	Nataraj, L., S. Karthikeyan, G. Jacob, and B. S. Manjunath. "Malware images: visualization and automatic classification." Proceedings of the 8th International Symposium on Visualization for Cyber Security. 2011, pp. 1-7.
[2]	Mohammed, Tajuddin Manhar, et al. "Malware detection using frequency domain-based image visualization and deep learning." arXiv preprint arXiv:2101.10578 (2021).
[3]	Salas, Marcelo Palma, and Paulo Lício de Geus. "Deep learning applied to imbalanced malware datasets classification." Journal of Internet Services and Applications 15.1 (2024): 342-359.
[4]	Wong, Wei Kitt, Filbert H. Juwono, and Catur Apriono. "Vision-based malware detection: A transfer learning approach using optimal ecoc-svm configuration." IEEE Access 9 (2021): 159262-159270.
[5]	Kim, Jin-Young, Seok-Jun Bu, and Sung-Bae Cho. "Malware detection using deep transferred generative adversarial networks." Neural Information Processing: 24th International Conference, ICONIP 2017, Guangzhou, China, November 14-18, 2017, Proceedings, Part I 24. Springer International Publishing, 2017.
[6]	Shaik, Ayesha, et al. "Comparative analysis of imbalanced malware byteplot image classification using transfer learning." arXiv preprint arXiv:2310.02742 (2023).
[7]	Nair, Sarath Jayan, and Sreelakshmi R. Syam. "Comparing Transformers and CNN Approaches for Malware Detection: A Comprehensive Analysis." 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT). IEEE, 2024.
[8]	Masum, Mohammad, et al. "Ransomware classification and detection with machine learning algorithms." 2022 IEEE 12th annual computing and communication workshop and conference (CCWC). IEEE, 2022.
[9]	Han, KyoungSoo, BooJoong Kang, and Eul Gyu Im. "Malware analysis using visualized image matrices." The Scientific World Journal 2014.1 (2014): 132713.
[10]	Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015).
[11]	Hemalatha, Jeyaprakash, et al. "An efficient densenet-based deep learning model for malware detection." Entropy 23.3 (2021): 344.
[12]	Kim, Jin-Young, and Sung-Bae Cho. "Obfuscated malware detection using deep generative model based on global/local features." Computers & Security 112 (2022): 102501.
[13]	Narayanan, Barath Narayanan, Ouboti Djaneye-Boundjou, and Temesguen M. Kebede. "Performance analysis of machine learning and pattern recognition algorithms for malware classification." 2016 IEEE national aerospace and electronics conference (NAECON) and ohio innovation summit (OIS). IEEE, 2016.
[14]	Kim, Jin-Young, Seok-Jun Bu, and Sung-Bae Cho. "Zero-day malware detection using transferred generative adversarial networks based on deep autoencoders." Information Sciences 460 (2018): 83-102.
[15]	Narayanan, Barath Narayanan, Ouboti Djaneye-Boundjou, and Temesguen M. Kebede. "Performance analysis of machine learning and pattern recognition algorithms for malware classification." 2016 IEEE national aerospace and electronics conference (NAECON) and ohio innovation summit (OIS). IEEE, 2016. 
[16]	Aslan, Ömer, and Abdullah Asim Yilmaz. "A new malware classification framework based on deep learning algorithms." IEEE Access 9 (2021): 87936-87951.
[17]	Bozkir, A. S., A. O. Cankaya, and M. Aydos. "Utilization and comparision of convolutional neural networks in malware recognition." 2019 27th signal processing and communications applications conference (SIU). IEEE, 2019, pp. 1-4.
[18]	Kakisim, A. G., M. Nar, and I. Sogukpinar. "Metamorphic malware identification using engine-specific patterns based on co-opcode graphs." Computer Standards & Interfaces 71 (2020): 103443.
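Full-text access rights
National Central Library
Does not agree to grant the National Central Library a royalty-free license
On campus
Print thesis available on campus immediately
Agrees to make the electronic full text available on campus
Electronic thesis available on campus immediately
Off campus
Agrees to license the thesis to database vendors
Electronic thesis available off campus immediately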
