Tamkang University Chueh Sheng Memorial Library (TKU Library)

System ID U0002-0901202016032900
Thesis title (Chinese) 偵測軟體漏洞的自動化方法:基於長短期記憶雙向殘差神經網絡
Thesis title (English) An automatic methodology of detecting vulnerabilities in software using Bi-directional long short-term memory residual neural network
Institution Tamkang University
Department (Chinese) 資訊工程學系碩士班
Department (English) Department of Computer Science and Information Engineering
Academic year 108
Semester 1
Year of publication 109
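Author name (Chinese) 郭勝騎
Author name (English) Sheng-Chi Kuo
Student ID 606410446
Degree Master's
Language English
Oral defense date 2020-01-15
Number of pages 20
Oral defense committee Advisor: 汪柏
Keywords (Chinese) static analysis  deep learning  code auditing  software analysis
Keywords (English) Static analysis  deep learning  Software security  zero-day  software analysis
Subject classification Applied Sciences; Computer Science and Information Engineering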
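Chinese abstract A zero-day attack exploits an undisclosed vulnerability that hackers can use to adversely affect computer programs.
In May 2017, the zero-day ransomware WannaCry caused a global catastrophe, from the intrusion into the U.K. National Health [...]
WannaCry spread through EternalBlue, a zero-day exploit that the U.S. National Security Agency (NSA) developed for older Windows systems.
From 2010 to 2015, about 80,000 vulnerabilities were newly registered in CVE (Common Vulnerabilities and Exposures) [...]
In recent years, many research teams have worked on automating vulnerability detection. For example, NeuFuzz [5] uses deep learning to improve the efficiency of fuzz testing, and VulDeePecker (Vulnerability Deep Pecker) [...]
In this paper, we implemented a prototype of our framework based on an existing bidirectional LSTM for static analysis, [...]
LAVA-M and four real-world applications. The experimental results show that our framework can find [...]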
English abstract A zero-day attack exploits an undisclosed vulnerability to adversely affect computer programs.
In May 2017, the zero-day ransomware WannaCry caused a worldwide catastrophe, from knocking U.K. National Health Service hospitals offline to shutting down a Honda Motor Company plant in Japan [1], and inflicted extensive economic damage around the world.
WannaCry propagated through EternalBlue, a zero-day exploit developed by the United States National Security Agency (NSA) for older Windows systems [2].
Zero-day attacks continue to emerge, and this threat has made the detection of zero-day software vulnerabilities a critical problem.
As hacking techniques advance, the number of vulnerabilities has been increasing exponentially.
About 80,000 vulnerabilities were newly registered in CVE (Common Vulnerabilities and Exposures) from 2010 to 2015, and the number is still increasing [3]. Traditional approaches to software vulnerability detection each have their own shortcomings. Existing static analysis solutions, such as pattern recognition, often depend on expert experience to define features manually.
Manual analysis is not only error-prone but also time-consuming. Other existing solutions, based on symbolic execution, often suffer from path explosion, which makes them difficult to apply to large projects.
In this context, a new approach to automating vulnerability detection with both high efficiency and high accuracy has become a matter of urgency.
In recent years, a number of research teams have committed themselves to automating vulnerability detection. Examples include NeuFuzz [5], which uses deep learning to improve the efficiency of fuzz testing, and VulDeePecker (Vulnerability Deep Pecker), a deep-learning-based vulnerability detection method that eliminates the need to define features manually.
In this paper, we implemented a prototype of our framework based on an existing bidirectional LSTM for static analysis and evaluated it on two different test suites: LAVA-M and four real-world applications. The experimental results showed that our framework can find more vulnerabilities than existing work. We found 8 new security bugs in these applications, 6 of which have been assigned CVE IDs. Index Terms: Software security, deep learning, zero-day, static analysis, software analysis.
Table of contents Chapter 1 Introduction - 1 -
1.1 Background Introduction - 1 -
1.2 Field of our work - 4 -
1.3 Related work in static source code analysis - 4 -
1.4 Challenges - 5 -
1.5 Our Contribution - 8 -
Chapter 2 Methods - 9 -
2.1 Training data preparation - 9 -
2.2 Data pre-processing - 11 -
2.3 Model training - 16 -
Chapter 3 Conclusion - 17 -
3.1 Comparison in result - 17 -
3.2 Limitations and Future Work - 17 -
References - 19 -

List of Figures
Figure 1 - 7 -
Figure 2 - 11 -
Figure 3 - 12 -
Figure 4 - 12 -
Figure 5 - 13 -
Figure 6 - 13 -
Figure 7 - 14 -
Figure 8 - 16 -
Figure 9 - 17 -
References [1] Honda halts Japan car plant after WannaCry virus hits computer network, June 2017. http://www.reuters.com/article/us-honda-cyberattack-idUSKBN19C0EI.
[2] Ellen Nakashima and Craig Timberg, NSA officials worried about the day its potent hacking tool would get loose. Then it did, Washington Post, 2017.
[3] U.S. National Vulnerability Database. http://cve.mitre.org/cve/
[4] Zhen Li, Deqing Zou, Shouhuai Xu, Hai Jin, Yawei Zhu, and Zhaoxuan Chen, SySeVR: A Framework for Using Deep Learning to Detect Software Vulnerabilities.
[5] Yunchao Wang, Zehui Wu, Qiang Wei, and Qingxian Wang, NeuFuzz: Efficient Fuzzing With Deep Neural Network.
[6] Zhen Li, Deqing Zou, Shouhuai Xu, Xinyu Ou, Hai Jin, Sujuan Wang, Zhijun Deng, and Yuyi Zhong, VulDeePecker: A Deep Learning-Based System for Vulnerability Detection.
[7] S. Ren, K. He, R. Girshick, and J. Sun, Faster R-CNN: Towards real-time object detection with region proposal networks, in Advances in Neural Information Processing Systems, 2015, pp. 91-99.
[8] A. Shrivastava, A. Gupta, and R. B. Girshick, Training region-based object detectors with online hard example mining, in 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016, pp. 761-769.
[9] Z. Xu, T. Kremenek, and J. Zhang, A memory model for static analysis of C programs, in Proc. 4th Int. Conf. Leveraging Applications of Formal Methods, Verification, and Validation, 2010, pp. 535-548.
[10] Serena E. Ponta, Henrik Plate, Antonino Sabetta, Michele Bezzi, and Cedric Dangremont, A Manually-Curated Dataset of Fixes to Vulnerabilities of Open-Source Software, 2019. https://arxiv.org/pdf/1902.02595.pdf
[11] NIST, Juliet Test Suite v1.3, 2017. https://semate.nist.gov/SRD/testsuite.php
[12] https://cwe.mitre.org/about/index.html
[13] Thanassis Avgerinos, Sang Kil Cha, Brent Lim Tze Hao, and David Brumley, AEG: Automatic Exploit Generation, Carnegie Mellon University, Pittsburgh, PA.
[14] N. Nethercote, Dynamic Binary Analysis and Instrumentation or Building Tools is Easy, Ph.D. dissertation, Trinity College, 2004.
[15] GitHub. https://github.com/
[16] Zhen Li, Deqing Zou, Shouhuai Xu, Xinyu Ou, Hai Jin, Sujuan Wang, Zhijun Deng, and Yuyi Zhong, VulDeePecker: A Deep Learning-Based System for Vulnerability Detection.
[17] Brendan Dolan-Gavitt, Patrick Hulin, Engin Kirda, Tim Leek, Andrea Mambretti, Wil Robertson, Frederick Ulrich, and Ryan Whelan, LAVA: Large-scale Automated Vulnerability Addition.
[18] Mike Schuster and Kuldip K. Paliwal, Bidirectional Recurrent Neural Networks.
[19] LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation.
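  • The author grants royalty-free permission for library patrons to reproduce the print copy for academic purposes; publicly available from 2020-02-24.
  • The author authorizes the electronic full-text browsing and printing service; publicly available from 2020-02-24.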