| System ID | U0002-1808202108231000 |
|---|---|
| DOI | 10.6846/TKU.2021.00440 |
| Thesis title (Chinese) | 利用深度學習演算法分析中文自然語言利用遞迴神經網路 (RNN) 分析淡江大學的學則 |
| Thesis title (English) | Chinese Natural Language Processing Model Based on Deep Learning Tamkang University Academic Policies Analysis using a RNN |
| Thesis title (third language) | |
| University | Tamkang University |
| Department (Chinese) | 數學學系數學與數據科學碩士班 |
| 系所名稱(英文) | Master's Program, Department of Mathematics |
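| Foreign degree: university | |
| Foreign degree: college | |
| Foreign degree: graduate institute | |
| Academic year | 109 (ROC calendar; 2020-2021) |
| Semester | 2 |
| Publication year | 110 (ROC calendar; 2021) |
| Author (Chinese) | 李魁修 |
| Author (English) | Kuei-Hsiu Lee |
| Student ID | 606190105 |
| Degree | Master's |
| Language | Traditional Chinese |
| Second language | |
| Oral defense date | 2021-07-21 |
| Number of pages | 23 |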
| Oral defense committee |
Advisor - 楊定揮
Committee member - 鄭凱仁
Committee member - 林建仲 |
| Keywords (Chinese) |
深度學習; 神經網路; 自然語言; 淡江大學學則 |
| Keywords (English) |
Deep Learning; Neural Network; Natural Language; Tamkang University Academic Policies |
| Keywords (third language) | |
| Subject classification | |
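| Abstract (translated from the Chinese) |
Online information is now very convenient, and answers to questions large and small can be found on the Internet. Some questions, however, cannot be answered by keyword matching: the data may be so large that the answer is never found, or a closely similar answer may exist that the search engine fails to return. We therefore set out to build a small-scale search engine that finds answers similar to the question the user has in mind. Using the Tamkang University Academic Policies as the database, we develop a small search engine for it: typing a question returns the answer the database provides. This thesis addresses multi-text analysis with neural networks in deep learning. It introduces several techniques for machine processing of language, covering how to extract key words from a text, how words relate to one another, and how to group those words into important topics, and applies them to the Tamkang University Academic Policies. The final prediction model is not perfect: the trend plots of the loss function and prediction accuracy show that accuracy is not very high, and the answers the machine gives are not the ones I had in mind. Possible causes are that the search engine's database is not large, or that the rules under the different categories are not sufficiently independent of one another, so that a question may be assigned to another category. How to resolve these problems is left to future research. Creating a new search engine makes it easier to find what we need, and the research process can also be applied to other texts or websites, so that this work can be used in many settings. The language processing also uses word segmentation, and building statistical models on this neural network is an important research topic. |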
| Abstract (English) |
Information on the Internet is now very convenient, and answers to many questions can be found online. However, some questions cannot be answered by keyword matching. Perhaps the data is so large that the answer is never found, or there may be similar answers that search engines fail to return. So we want to build a small-scale search engine that finds answers similar to the questions in our minds. We use the Tamkang University Academic Policies as a database and develop a small search engine for it: type in a question to find the answer that the database gives you. This thesis focuses on multi-text analysis with neural networks in deep learning. It introduces some techniques for machine processing of language, covering how to extract key words from a text, the relationships between words, and how to group these words into important topics, and applies them to the academic rules of Tamkang University. The final prediction model is not perfect. From the trend graphs of the loss function and the prediction accuracy, we can see that the accuracy is not very high, and the answers given by the machine are not what I had in mind. It may be that the database of this search engine is not large, or that the academic rules under the various categories are not sufficiently independent of each other, so that a question may be assigned to another category. How to debug these problems is left for future research. Creating a new search engine lets us find what we need more easily, and the research process can also be applied to other texts or websites, so that this research can be used in many settings. Word segmentation is also used in the language processing, and statistical modeling on this neural network is an important research topic. |
| Abstract (third language) | |
| Table of contents |
Contents
1. Introduction
1.1 Research background
1.2 Research objectives
1.3 Research questions
1.4 Research value
2. Research methods
2.1 Machine learning and deep learning
2.2 Artificial Neural Network (ANN) and Recurrent Neural Network (RNN)
2.3 Long Short-Term Memory (LSTM) model
2.4 Research design
2.5 Research materials and data collection procedure
2.6 Data analysis
3. Results
4. Discussion
4.1 Conclusions
4.2 Discussion of key findings
4.3 Limitations
4.4 Suggestions for future research
Appendix: complete source code
References
List of tables
Table 1 Questions and predicted results
List of figures
Figure 1 Diagram of an RNN
Figure 2 Diagram of an LSTM
Figure 3 Program design architecture
Figure 4 Loss function trend
Figure 5 Accuracy trend
|
| References |
[1] https://aclanthology.org/O03-1014.pdf
[2] 施威銘, 人深度學習是機器學習中的一個子領域..., 2019.
[3] https://en.wikipedia.org/wiki/Artificial_neural_network
[4] https://iter01.com/583035.html
[5] https://brohrer.mcknote.com/zh-Hant/how_machine_learning_works/how_rnns_lstm_work.html
[6] https://zhuanlan.zhihu.com/p/30844905
[7] 施威銘, 人的記憶有分為長期記憶和短期記憶..., 2020.
[8] https://en.wikipedia.org/wiki/Long_short-term_memory
[9] https://blog.csdn.net/Geek_of_CSDN/article/details/86559464 |
| Full-text access rights | |
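
The abstract describes matching a typed question to the most similar entry in a small database of academic rules, rather than relying on exact keyword matches. The record does not include the thesis code, so the following is only a minimal sketch of that idea under stated assumptions: it swaps in character-bigram cosine similarity in place of the thesis's word segmentation and RNN/LSTM model, and the rule texts, category names, and function names (`char_bigrams`, `cosine`, `best_match`) are hypothetical illustrations, not taken from the Tamkang University Academic Policies.

```python
import math
from collections import Counter


def char_bigrams(text):
    """Split text into overlapping character bigrams, ignoring whitespace.

    Assumption: a crude stand-in for real Chinese word segmentation.
    """
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]


def cosine(a, b):
    """Cosine similarity between two Counter vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(question, rules):
    """Return (category, score) for the rule text most similar to the question."""
    q = Counter(char_bigrams(question))
    scored = [
        (cosine(q, Counter(char_bigrams(text))), category)
        for category, text in rules.items()
    ]
    score, category = max(scored)
    return category, score


# Hypothetical toy rules for illustration only; NOT the actual academic policies.
RULES = {
    "enrollment": "學生應於規定期限內辦理註冊",  # students must register by the deadline
    "leave": "學生因故得申請休學",              # students may apply for a leave of absence
    "graduation": "學生修滿規定學分者准予畢業",  # students with enough credits may graduate
}

# Question: "How do I apply for a leave of absence?"
category, score = best_match("如何申請休學", RULES)
print(category, round(score, 3))
```

A score of 0.0 means the question shares no bigrams with any rule, so the returned category is then arbitrary and should be treated as "no answer". A learned classifier such as the LSTM model listed in the table of contents would replace `best_match`; this sketch only shows the question-to-category lookup that such a model is meant to perform.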