§ Thesis Bibliographic Record
  
System ID U0002-0901202016032900
DOI 10.6846/TKU.2020.00195
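Thesis title (Chinese) 偵測軟體漏洞的自動化方法:基於長短期記憶雙向殘差神經網絡
Thesis title (English) An automatic methodology of detecting vulnerabilities in software using Bi-directional long short-term memory residual neural network
Thesis title (third language)
Institution Tamkang University (淡江大學)
Department (Chinese) 資訊工程學系碩士班
Department (English) Department of Computer Science and Information Engineering
Foreign degree: university
Foreign degree: college
Foreign degree: graduate institute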
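Academic year 108
Semester 1
Publication year 109
Graduate student (Chinese) 郭勝騎
Graduate student (English) Sheng-Chi Kuo
Student ID 606410446
Degree Master's
Language English
Second language
Oral defense date 2020-01-15
Number of pages 20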
Oral defense committee Advisor - 汪柏 (sinolegend@gmail.com)
Member - 洪文斌 (Horng@mail.tku.edu.tw)
Member - 彭文建 (Pchw8598@mail.chilee.tku.edu.tw)
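Keywords (Chinese) 靜態分析 (static analysis)
深度學習 (deep learning)
代碼審計 (code auditing)
軟體分析 (software analysis)
Keywords (English) Static analysis
deep learning
Software security
zero-day
software analysis
Keywords (third language)
Subject classification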
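Chinese abstract (translated)
A zero-day attack exploits an undisclosed vulnerability that hackers can use to adversely affect computer programs.
In May 2017, the zero-day ransomware WannaCry caused a global disaster, from knocking U.K. National Health Service hospitals offline to shutting down a Honda Motor Company plant in Japan [1], and caused countless economic losses worldwide.
WannaCry spread through EternalBlue, a zero-day exploit developed by the U.S. National Security Agency (NSA) for older Windows systems.
Zero-day attacks continue to appear, and this cyber threat has drawn attention to the detection of zero-day software vulnerabilities, which has become an urgent problem that must be solved.
As hacking techniques develop, the number of vulnerabilities grows exponentially.
From 2010 to 2015, about 80,000 vulnerabilities were newly registered in CVE (Common Vulnerabilities and Exposures), and the number keeps increasing [3]. For software vulnerability detection, each traditional solution has its own shortcomings. In static analysis based on pattern recognition, feature extraction usually depends on expert experience to define features manually.
Manual analysis is not only error-prone but also time-consuming, while other existing solutions based on symbolic execution often run into the path explosion problem, which makes them difficult to apply to large projects.
In this context, a new method for automated vulnerability detection that is both efficient and accurate has become a pressing need.
In recent years, many research teams have worked on automating vulnerability detection. Examples include NeuFuzz [5], which uses deep learning to improve the efficiency of fuzz testing, and VulDeePecker (Vulnerability Deep Pecker), a deep-learning-based vulnerability detection method that removes the need to define features manually.
In this paper, we implemented a prototype of our framework based on an existing bidirectional LSTM for static analysis and evaluated it on two different test suites: LAVA-M and four real-world applications. The experimental results show that our framework can find more vulnerabilities than existing methods.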
English abstract
A zero-day attack exploits an undisclosed vulnerability that hackers can use to adversely affect computer programs.
In May 2017, the zero-day ransomware WannaCry caused a worldwide catastrophe, from knocking U.K. National Health Service hospitals offline to shutting down a Honda Motor Company plant in Japan [1], and inflicted extensive economic damage around the world.
WannaCry propagated through EternalBlue, a zero-day exploit developed by the United States National Security Agency (NSA) for older Windows systems [2].
Zero-day attacks continue to emerge, and this ongoing cyber threat has made the detection of zero-day software vulnerabilities a critical problem.
As hacking techniques advance, the number of vulnerabilities has been increasing exponentially.
About 80,000 vulnerabilities were newly registered in CVE (Common Vulnerabilities and Exposures) from 2010 to 2015, and the number keeps increasing [3]. For software vulnerability detection, traditional solutions each have their own shortcomings. Existing static-analysis solutions such as pattern recognition often depend on expert experience to define features manually.
Manual analysis is not only error-prone but also time-consuming. Other existing solutions based on symbolic execution often suffer from the path explosion problem, which makes them difficult to apply to large projects.
In this context, a new approach to automated vulnerability detection that is both efficient and accurate has become a matter of urgency.
In recent years, a number of research teams have committed themselves to automating vulnerability detection. Examples include NeuFuzz [5], which uses deep learning to improve the efficiency of fuzz testing, and VulDeePecker (Vulnerability Deep Pecker), a vulnerability detection method based on deep learning that eliminates the need to define features manually.
In this paper, we implemented a prototype of our framework based on an existing bidirectional LSTM for static analysis and evaluated it on two different test suites: LAVA-M and four real-world applications. The experimental results show that our framework can find more vulnerabilities than existing work. We found 8 new security bugs in these applications, 6 of which have been assigned CVE IDs.
Index Terms: Software security, deep learning, zero-day, static analysis, software analysis.
Third-language abstract
Table of contents
Chapter 1 Introduction
1.1 Background Introduction
1.2 Field of our work
1.3 Related works in static source code analysis
1.4 Challenges
1.5 Our Contribution
Chapter 2 Methods
2.1 Training data preparation
2.2 Data pre-processing
2.3 Model training
Chapter 3 Conclusion
3.1 Comparison in result
3.2 Limitation and Future Works
References

List of Figures
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
References
[1] Honda halts Japan car plant after WannaCry virus hits computer network, Reuters, June 2017. http://www.reuters.com/article/us-honda-cyberattack-idUSKBN19C0EI
[2] Ellen Nakashima and Craig Timberg, NSA officials worried about the day its potent hacking tool would get loose. Then it did, Washington Post, 2017.
[3] U.S. National Vulnerability Database. http://cve.mitre.org/cve/
[4] SySeVR: A Framework for Using Deep Learning to Detect Software Vulnerabilities Zhen Li, Deqing Zou, Shouhuai Xu, Hai Jin, Yawei Zhu, Zhaoxuan Chen
[5] NeuFuzz: Efficient Fuzzing With Deep Neural Network Yunchao Wang ; Zehui Wu ; Qiang Wei ; Qingxian Wang
[6] VulDeePecker: A Deep Learning-Based System for Vulnerability Detection, Zhen Li, Deqing Zou, Shouhuai Xu, Xinyu Ou, Hai Jin, Sujuan Wang, Zhijun Deng, and Yuyi Zhong, Services Computing Technology and System Lab, Big Data Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology
[7] S. Ren, K. He, R. Girshick, and J. Sun, Faster R-CNN: Towards real-time object detection with region proposal networks, in Advances in Neural Information Processing Systems, 2015, pp. 91-99.
[8] A. Shrivastava, A. Gupta, and R. B. Girshick, Training region-based object detectors with online hard example mining, in 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016, pp. 761-769.
[9] Z. Xu, T. Kremenek, and J. Zhang, A memory model for static analysis of C programs, in Proc. 4th Int. Conf. Leveraging Applications of Formal Methods, Verification, and Validation, pp. 535-548, 2010.
[10] Serena E. Ponta, Henrik Plate, Antonino Sabetta, Michele Bezzi, and Cedric Dangremont, A Manually-Curated Dataset of Fixes to Vulnerabilities of Open-Source Software, 2019. https://arxiv.org/pdf/1902.02595.pdf
[11] NIST, Juliet Test Suite v1.3, 2017. https://semate.nist.gov/SRD/testsuite.php
[12] https://cwe.mitre.org/about/index.html
[13] AEG: Automatic Exploit Generation, Thanassis Avgerinos, Sang Kil Cha, Brent Lim Tze Hao, and David Brumley, Carnegie Mellon University, Pittsburgh, PA
[14] N. Nethercote, Dynamic Binary Analysis and Instrumentation or Building Tools is Easy, Ph.D. dissertation, Trinity College, 2004.
[15] GitHub. https://github.com/
[16] VulDeePecker: A Deep Learning-Based System for Vulnerability Detection Zhen Li, Deqing Zou, Shouhuai Xu, Xinyu Ou, Hai Jin, Sujuan Wang, Zhijun Deng, Yuyi Zhong
[17] LAVA: Large-scale Automated Vulnerability Addition, Brendan Dolan-Gavitt, Patrick Hulin, Engin Kirda, Tim Leek, Andrea Mambretti, Wil Robertson, Frederick Ulrich, Ryan Whelan
[18] Bidirectional Recurrent Neural Networks, Mike Schuster and Kuldip K. Paliwal, IEEE Transactions on Signal Processing, 1997
[19] LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
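Full-text access permissions
On campus
Print copy of the thesis available on campus immediately
Full text of the electronic thesis authorized for on-campus access
Electronic thesis available on campus immediately
Off campus
Authorization granted
Electronic thesis available off campus immediately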
