Electronic Theses and Dissertations Service

§ Browse Thesis Bibliographic Record

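The electronic full text of this thesis is publicly available off campus from 2024-08-01.
The print copy of this thesis is publicly available from 2024-08-01.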

System ID	U0002-2907202410470600
DOI	10.6846/tku202400597
Title (Chinese)	基於OCR及深度學習之新聞分類系統
Title (English)	News Classification System Based on OCR and Deep Learning
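University	Tamkang University (淡江大學)
Department (Chinese)	資訊工程學系碩士在職專班 (In-service Master's Program, Department of Computer Science and Information Engineering)
Department (English)	Department of Computer Science and Information Engineering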
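Academic year	112 (ROC calendar; 2023-2024)
Semester	2
Year of publication	113 (ROC calendar; 2024)
Author (Chinese)	李承偉
Author (English)	Cheng-Wei Lee
Student ID	711410059
Degree	Master's
Language	Traditional Chinese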
Oral defense date	2024-07-03
Number of pages	44
Examination committee	Advisor: 石貴平 (kpshih@mail.tku.edu.tw); Committee members: 廖文華, 蒯思齊, 張志勇
Keywords (Chinese)	神經網路 (neural network); 自然語言處理 (natural language processing); 損失函數 (loss function)
Keywords (English)	Deep learning; Artificial neural network; BERT model
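Abstract (Chinese)	This thesis studies text recognition based on deep learning, with a focus on applying artificial neural network algorithms. In this setting, text recognition converts printed or handwritten text into a format that a computer can interpret, so that the text can be retrieved or analyzed. To apply deep learning to text recognition, the thesis uses the BERT (Bidirectional Encoder Representations from Transformers) model and the LSTM (Long Short-Term Memory) model, which are well suited to training in the field of natural language processing (NLP). The overall workflow is as follows: prepare the dataset and load it into the model; convert the text data into a format the model accepts, which may involve splitting the text into words or subwords; build the deep learning model architecture; train the model with the TensorFlow or PyTorch deep learning framework, defining the loss function, the optimizer, and the number of training iterations; and finally evaluate the trained model with metrics that may include accuracy and precision. Once all models have been evaluated, the model is deployed to an application, possibly packaged as an API service and integrated into a website or application.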
Abstract (English)	Artificial intelligence is now widely used, not only in technology but across many other industries. Most companies have begun to adopt automation and machine learning algorithms, and artificial intelligence is set to become a future trend. The research topic of this thesis is deep learning and optical character recognition (OCR), and the work is carried out using artificial neural networks. In deep learning research, optical character recognition converts printed or handwritten text into a format that computers can recognize and understand. To apply deep learning to character recognition, I train a BERT model, using the TensorFlow or PyTorch deep learning framework for model training. I then use deep learning to calibrate the AI model and deploy the result to an application.
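The Chinese abstract names accuracy and precision as possible evaluation metrics, and the table of contents lists confusion matrices together with recall and F1 scores. As a minimal sketch of how such metrics can be derived from a confusion matrix, the following pure-Python example may help; the function names, the three-class layout, and the counts are hypothetical and are not taken from the thesis.

```python
from typing import Dict, List


def accuracy(confusion: List[List[int]]) -> float:
    """Overall accuracy: correctly classified samples / all samples."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total if total else 0.0


def per_class_metrics(confusion: List[List[int]]) -> List[Dict[str, float]]:
    """Precision, recall, and F1 per class.

    Assumes rows are true classes and columns are predicted classes.
    """
    n = len(confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]                              # true positives
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, but wrong
        fn = sum(confusion[k]) - tp                       # true k, but missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append({"precision": precision, "recall": recall, "f1": f1})
    return results


# Hypothetical confusion matrix for three news categories (illustrative only).
example = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]

if __name__ == "__main__":
    print(f"accuracy = {accuracy(example):.3f}")
    for label, m in enumerate(per_class_metrics(example)):
        print(label, {name: round(value, 3) for name, value in m.items()})
```

Libraries such as scikit-learn provide equivalent reports; the hand-written version above is only meant to make the arithmetic behind the metrics explicit.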
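Table of contents	
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Research Background
  1.2 Research Motivation
  1.3 Research Objectives
  1.4 Thesis Organization
Chapter 2 Literature Review
  2.1 OCR (Optical Character Recognition)
  2.2 Fundamentals of OCR Systems
  2.3 Fundamentals of Text Detection
  2.4 Scene Text Detection
  2.5 Text Segmentation and Recognition
  2.6 Model Training Applications
Chapter 3 Research Method Design
  3.1 Data Collection and Preprocessing
  3.2 CNN Feature Extraction
  3.3 Model Design and Selection
  3.4 Model Training, Optimization, and Evaluation
Chapter 4 Research Results
  4.1 Research Background and Objectives
  4.2 OCR Experimental Results
  4.3 Model Architecture
  4.4 Training and Testing
  4.5 Research Results
  4.6 Comparison of Model Differences
Chapter 5 Conclusion and Future Work
  5.1 Conclusion
  5.2 Future Work
References
List of Figures
  Figure 2.1 Grayscale formula
  Figure 2.2 LeNet model architecture
  Figure 2.3 LSTM architecture
  Figure 2.4 LSTM memory cell architecture
  Figure 2.5 LSTM forget gate
  Figure 2.6 LSTM input gate
  Figure 2.7 LSTM output gate
  Figure 3.1 Code for loading the VGG16 model
  Figure 3.2 Code for adjusting image pixels
  Figure 3.3 Code for normalizing images
  Figure 3.4 Code for extracting features
  Figure 3.5 Code for converting text to integer sequences
  Figure 3.6 Code for converting integer labels to binary
  Figure 3.7 Code for the LSTM output layer
  Figure 3.8 Code for model prediction
  Figure 3.9 Code for computing performance metrics
  Figure 3.10 Confusion matrix
  Figure 3.11 Code for the classification report
  Figure 4.1 Transformer architecture parameters of the BERT model
List of Tables
  Table 4.1 LSTM model confusion matrix
  Table 4.2 BERT model confusion matrix
  Table 4.3 LSTM model accuracy, recall, and F1 score by class
  Table 4.4 BERT model accuracy, recall, and F1 score by class
  Table 4.5 Comparison of differences in accuracy, recall, and F1 score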
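Full-text access permissions	
National Central Library: the author grants the National Central Library a royalty-free license; the bibliographic record and the electronic full text are made public on the Internet immediately after the authorization form is submitted.
On campus: the print thesis is available immediately; the author agrees to license the electronic full text for worldwide access; the electronic thesis is available immediately.
Off campus: the author agrees to license the thesis to database vendors; the electronic thesis is available immediately.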
