§ Thesis Bibliographic Record
  
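System ID U0002-2307201918004900
DOI 10.6846/TKU.2019.00736
Title (Chinese) 利用深度混合模型辨識Android惡意應用程式
Title (English) A Deep Learning Hybrid Model Detecting Android Malwares
Title (third language)
Institution Tamkang University
Department (Chinese) 資訊工程學系碩士班
Department (English) Department of Computer Science and Information Engineering
Foreign degree: university
Foreign degree: college
Foreign degree: graduate institute
Academic year 107 (ROC calendar)
Semester 2
Publication year 108 (ROC calendar; 2019 CE)
Author (Chinese) 鍾豪
Author (English) Hao Chung
Student ID 606410420
Degree Master's
Language Traditional Chinese
Second language
Oral defense date 2019-06-11
Pages 25
Oral defense committee Advisor - 黃心嘉
Member - 顏嵩銘
Member - 黃仁俊
Member - 黃心嘉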
Keywords (Chinese) Android
惡意應用程式
Android惡意應用程式辨識
深度學習
Keywords (English) Android
malware apps
malware app detection
deep learning
Keywords (third language)
Subject classification
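Abstract (translated from Chinese)
Given the ever-growing number of Android malware apps, it is important to develop methods that identify them with deep learning on top of static analysis. Compared with dynamic analysis, static analysis has the advantage of requiring fewer computing resources and less time. As malware apps evolve and new Android versions keep appearing, new features must be added to maintain a good detection rate. For many existing deep learning detection methods based on static analysis, however, adding features means retraining everything, which is very time-consuming. To solve this problem, this study proposes a deep neural network model that is flexible, adaptable, and efficient. The model comprises two main neural networks: an initial network and an integration network. The initial network extracts different types of features separately by category, which allows partial adjustment, while the integration network identifies malware apps effectively. Flexibility means that the new network can be tuned efficiently to add new features; adaptability means that the detection rate is maintained by periodically updating the weights; efficiency means that only the feature-extraction sub-models need to be adjusted. Our hybrid model, which uses two feature types, API method calls and permissions, has research value and practical use, as experiments show that its accuracy reaches 98.15%.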
Abstract (English)
Because the number of Android malware apps keeps growing, a deep learning approach that detects malware apps through static analysis is necessary. Although some malware detectors rely on dynamic analysis, detection with static analysis needs fewer computing resources and less time. As malware apps evolve and new versions of the Android operating system are released, new features should be added to increase accuracy. However, to add those features, most proposed deep learning detectors have to be retrained from scratch. To overcome this problem, a flexible, adaptable, and efficient deep neural network hybrid model is proposed. The hybrid model contains two neural networks: an initial network and a final network. The initial network flexibly extracts multiple feature sets, while the final network is efficient and good at malware app detection. Flexibility means that the initial network can be adjusted to add new features. Adaptability means that the weights of the neural network can easily be updated periodically to maintain the detection rate. Efficiency means that the detection rate is maintained by retraining only part of the neural networks rather than all of them. Our hybrid model, which uses API method call and permission feature sets, has research value and is practical, because its accuracy rate is 98.15%.
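The two-network structure described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the class names `SubModel` and `HybridModel`, the layer sizes, the random untrained weights, and the extra feature set `"intents"` are all hypothetical, and plain dense layers stand in for whatever sub-models the thesis actually uses. The sketch only shows the structural idea that a new feature set adds a new sub-model and widens the final network, while existing sub-models are left unchanged.

```python
import math
import random


def dense(inputs, weights, biases):
    """One fully connected ReLU layer; weights[j] belongs to output unit j."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


class SubModel:
    """Hypothetical per-feature-set extractor (a stand-in, not the thesis's sub-model)."""

    def __init__(self, n_inputs, n_outputs, rng):
        self.weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                        for _ in range(n_outputs)]
        self.biases = [0.0] * n_outputs

    def forward(self, features):
        return dense(features, self.weights, self.biases)


class HybridModel:
    """Initial network = one sub-model per feature set; final network = one sigmoid unit."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.sub_models = {}      # feature-set name -> SubModel
        self.final_weights = []   # one weight per concatenated sub-model output
        self.final_bias = 0.0

    def add_feature_set(self, name, n_inputs, n_outputs=4):
        # Existing sub-models are not modified; only the final layer grows.
        self.sub_models[name] = SubModel(n_inputs, n_outputs, self.rng)
        self.final_weights += [self.rng.uniform(-0.5, 0.5) for _ in range(n_outputs)]

    def predict(self, sample):
        merged = []
        for name, sub_model in self.sub_models.items():
            merged += sub_model.forward(sample[name])
        z = sum(w * h for w, h in zip(self.final_weights, merged)) + self.final_bias
        return 1.0 / (1.0 + math.exp(-z))  # untrained malware score in (0, 1)


model = HybridModel()
model.add_feature_set("api_calls", n_inputs=6)
model.add_feature_set("permissions", n_inputs=3)
sample = {"api_calls": [1, 0, 1, 1, 0, 0], "permissions": [1, 0, 1]}
score_before = model.predict(sample)

# Later a new feature set is added; "intents" is a made-up example name.
frozen = [row[:] for row in model.sub_models["api_calls"].weights]
model.add_feature_set("intents", n_inputs=2)
sample["intents"] = [0, 1]
score_after = model.predict(sample)
```

No training loop is included. In a real setting, only the new sub-model and the final network would presumably be fitted after `add_feature_set`, which is the partial retraining the abstract refers to.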
Abstract (third language)
Table of contents
Table of Contents

Chapter 1 Introduction
Chapter 2 Review
Chapter 3 A Hybrid Model Scheme
 3-1 Feature Extraction Stage
 3-2 Model Training Stage
 3-3 Detection Stage
Chapter 4 Experiment Results
 4-1 Datasets
 4-2 Experimental Environment
 4-3 Hybrid Model Malware Detection Performance
 4-4 Effectiveness of Hybrid Model
 4-5 Discussions
Chapter 5 Conclusions
References

List of Figures

Fig. 1. Allix et al.’s Experiment Result
Fig. 2. Deep Learning Methodology
Fig. 3. Multimodal Architecture
Fig. 4. Hybrid Model Architecture
Fig. 5. The Architecture of CNN Sub-model

List of Tables

Table 1: Hyper-parameter of Hybrid Model Sub-model
Table 2: Hybrid Model Evaluation Metrics
Table 3: Performance Metrics for the Model Using Single Feature Set
Table 4: Performance Metrics for Our Hybrid Model Using Different Feature Sets
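Full-text access permissions
On campus
Print copy available 2 years after submission of the authorization form
Author consents to on-campus release of the electronic full text
On-campus electronic copy available 2 years after submission of the authorization form
Off campus
Authorization granted
Off-campus electronic copy available 2 years after submission of the authorization form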
