Electronic Theses and Dissertations Service

§ Browse Thesis Bibliographic Record

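The electronic full text of this thesis is publicly available off campus from 2019-09-27.
The print copy of this thesis is publicly available from 2019-09-27.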

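System ID	U0002-2609201910324400
DOI	10.6846/TKU.2019.00903
Title (Chinese)	使用深度學習對檔案類型辨識之研究
Title (English)	Research on Using Deep Learning to File Type Identification
Title (third language)
Institution	Tamkang University (淡江大學)
Department (Chinese)	資訊工程學系碩士在職專班
Department (English)	Department of Computer Science and Information Engineering
Foreign degree: university name
Foreign degree: college name
Foreign degree: graduate institute name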
Academic year	107 (ROC calendar; 2018-2019)
Semester	2
Publication year	108 (ROC calendar; 2019)
Author (Chinese)	蔡扶哲
Author (English)	Fu-Che Tsai
Student ID	706410163
Degree	Master's
Language	Traditional Chinese
Second language
Oral defense date	2019-06-29
Pages	29
Oral defense committee	Advisor - 陳建彰; Committee member - 楊權輝; Committee member - 洪文斌
Keywords (Chinese)	檔案辨識; 檔案位元組; 檔案片段
Keywords (English)	File Identification; File Bytes; File Fragment
Keywords (third language)
Subject classification
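Abstract (Chinese, translated)	This study examines how to identify the types of computer files within a computer system environment. It reviews how early computers identified files, compares the identification methods used in earlier research, and applies deep learning experiments to develop a new file identification technique. Using deep learning, the study compares a multilayer neural network model with a convolutional neural network model. Five file types were collected at random, 11,000 files per type, for a total of 55,000 files; byte data was extracted from the files and fragments of the file content were analyzed. The results show an overall accuracy of 98% for the multilayer neural network model and 99% for the convolutional neural network model. Both methods identify file types by analyzing fragments of file content, and the approach is expected to be extended in the future to the study and analysis of other files.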
Abstract (English)	The purpose of this study is to explore how to identify the types of computer files in a computer system environment. It reviews how early computers identified files and, building on the identification methods of past research, applies deep learning experiments to develop a new file identification technique. This study adopts deep learning and compares a multi-layer neural network model with a convolutional neural network model. It randomly collects 5 file types, 11,000 files of each, for a total of 55,000 files, extracts the byte data of the files, and analyzes fragments of the file content. The results show that the overall accuracy of the multi-layer neural network model reaches 98% and that of the convolutional neural network model reaches 99%. Both methods identify file types by analyzing file content fragments, and the approach is expected to be extended to the research and analysis of other files in the future.
Abstract (third language)
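Table of contents	Front matter
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation and Purpose
1.2 Definition of Terms
Chapter 2 Literature Review
2.1 Related Research on File Types
2.2 Research on File Identification Methods
Chapter 3 Research Method
3.1 Research Framework
3.2 File Types Studied
3.3 Experimental Models
3.4 Data Processing
Chapter 4 Results
4.1 Analysis of Training Results for the MLP and CNN Experimental Models
4.2 File Byte Length and Prediction Results for Each File Type
Chapter 5 Conclusions and Recommendations
5.1 Conclusions
5.2 Limitations and Recommendations
References
Chinese-language sources
English-language sources
English paper
List of Figures
Figure 3-1 Research framework of this study
Figure 3-2 MLP structure used in this study
Figure 3-3 CNN structure used in this study
Figure 3-4 relu activation function
Figure 3-5 sigmoid activation function
Figure 4-1 MLP and CNN training curves
Figure 4-2 Confusion matrices for the MLP and CNN experiments
List of Tables
Table 4-1 Confusion matrix accuracy statistics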
References	Chinese-language sources
[1] Wiki – the relu definition source https://zh.wikipedia.org/wiki/線性整流函數
[2] Wiki – the sigmoid definition source https://zh.wikipedia.org/wiki/S函數
[3] Wiki – the Keras definition source https://zh.wikipedia.org/wiki/Keras
English-language sources
[4] M.C. Amirani, M. Toorani, A. Beheshti, "A new approach to content-based file type detection," in 13th IEEE Symposium on Computers and Communications (ISCC'08), Marrakech, Morocco, 2008, pp. 1103–1108.
[5] D.J. Hickok, D.R. Lesniak, M.C. Rowe, "File type detection technology," in 38th Midwest Instruction and Computing Symposium (MICS'05), Eau Claire, Wisconsin, 2005.
[6] Q. Chen, Q. Liao, Z.L. Jiang, J. Fang, S. Yiu, G. Xi, R. Li, Z. Yi, X. Wang, L.C.K. Hui, D. Liu, E. Zhang, "File Fragment Classification Using Grayscale Image Conversion and Deep Learning in Digital Forensics," in 2018 IEEE Security and Privacy Workshops (SPW), San Francisco, CA, USA, 2018.
[7] W. Li, K. Wang, S.J. Stolfo, B. Herzog, "Fileprints: Identifying file types by n-gram Analysis," Proceedings of the 6th IEEE Systems, Man and Cybernetics
[8] M. McDaniel, "Automatic File Type Detection Algorithm," Masters Thesis, James Madison University, 2001.
[9] V. Roussev and S.L. Garfinkel, "File fragment classification - the case for specialized approaches," in International IEEE Workshop on Systematic Approaches To Digital Forensic Engineering, 2009, pp. 3-14.
[10] Wiki – the MD5 definition source https://zh.wikipedia.org/wiki/MD5
[11] FILExt, The file extension source http://www.filext.com
[12] LeNet – the CNN definition source http://deeplearning.net/tutorial/lenet.html
[13] Billatnapier – the Magic number source https://billatnapier.wordpress.com/2013/04/22/magic-numbers-in-files/
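Full-text access rights	On campus: print thesis available immediately; electronic full text authorized for on-campus access; electronic thesis available on campus immediately. Off campus: electronic thesis authorized for off-campus access, available immediately.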
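
The abstract describes extracting byte data from each file and classifying fragments of its content. A minimal sketch of that preprocessing step follows, under stated assumptions: the fragment length of 512 bytes, the scaling of bytes to [0, 1], the zero-padding, and the helper names `read_fragment`, `to_features`, and `random_fragment` are all hypothetical illustrations and are not taken from the thesis. The resulting fixed-length vector is the kind of input an MLP or CNN classifier could consume; the models themselves are outside the scope of this sketch.

```python
import os
import random

# Hypothetical fragment length; the record does not state the value used in the thesis.
FRAGMENT_LEN = 512


def read_fragment(path, length=FRAGMENT_LEN, offset=0):
    """Read up to `length` raw bytes from `path`, starting at byte `offset`."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        return fh.read(length)


def to_features(fragment, length=FRAGMENT_LEN):
    """Scale each byte to [0, 1] and zero-pad short fragments to a fixed length."""
    values = [b / 255.0 for b in fragment[:length]]
    return values + [0.0] * (length - len(values))


def random_fragment(path, length=FRAGMENT_LEN, rng=random):
    """Read a fragment at a random offset, so the file header is not always included."""
    size = os.path.getsize(path)
    offset = rng.randint(0, max(0, size - length))
    return read_fragment(path, length, offset)
```

Calling `to_features(read_fragment(path))` on each collected file would yield one fixed-length feature vector per file, to be paired with a file-type label for training. Sampling at a random offset with `random_fragment` is one possible way to test whether a classifier relies on content rather than on header magic numbers.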

