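System ID | U0002-2901200711215200 |
---|---|
DOI | 10.6846/TKU.2007.00942 |
Thesis title (Chinese) | 以互動式中文回饋機制建置的自學系統 |
Thesis title (English) | A Chinese Interactive Feedback Mechanism for a Self-Learning System |
Thesis title (third language) | |
University | 淡江大學 (Tamkang University) |
Department (Chinese) | 資訊工程學系博士班 |
Department (English) | Department of Computer Science and Information Engineering |
Foreign degree: university name | |
Foreign degree: college name | |
Foreign degree: graduate institute name | |
Academic year | 95 |
Semester | 1 |
Publication year | 96 |
Student (Chinese) | 簡志宇 |
Student (English) | Chih-Yu Jian |
Student ID | 890190118 |
Degree | Doctoral |
Language | English |
Second language | |
Oral defense date | 2007-01-04 |
Number of pages | 74 |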
Oral defense committee |
Advisor - 陳瑞發
Committee member - 莊淇銘
Committee member - 林偉川
Committee member - 王英宏
Committee member - 葛煥昭
Committee member - 陳瑞發 |
Keywords (Chinese) |
自然語言 切詞方法 文法 互動回饋 語彙資料庫 |
Keywords (English) |
natural language; segmentation method; grammar; interactive feedback; lexical database |
Keywords (third language) | |
Subject classification | |
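Abstract (Chinese) |
With the widespread use of the Internet, online interactive self-learning systems let users learn according to their own preferences. However, it remains difficult for a computer system to understand the natural language people use, and Chinese is especially hard. First, Chinese text has no delimiters between words. Second, the sentence patterns used in spoken language do not necessarily follow formal grammar rules, which makes it hard to parse the input and understand what the user really means. Third, the application must be restricted to a specific domain; otherwise the analysis becomes too complex. This thesis proposes a self-learning system with an interactive feedback mechanism that parses and understands the user's natural-language input, determines what the user wants to learn, and provides customized responses to different users. |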
Abstract (English) |
Given the popularity of the Internet, automatic interactive feedback systems for self-learning websites are becoming increasingly desirable. However, computers still have difficulty understanding natural languages, especially Chinese. First, written Chinese has no spaces to delimit lexical entries, so segmentation is harder than in English. Second, Chinese lacks a complete grammar, which makes parsing more difficult and complicated. Building an automated Chinese feedback system for a specific application domain could solve these problems. This thesis proposes an interactive feedback mechanism for a self-learning website that can parse, understand, and respond to Chinese sentences. The mechanism uses a lexical database specific to the application at hand. In this way, a self-learning website built for a particular application domain can choose the proper response in a user-friendly, accurate, and timely manner. |
Abstract (third language) | |
Table of contents |
Contents
Chapter 1 Introduction
Chapter 2 Review of Related Works
2.1 Segmentation methods
2.1.1 Method of Regular Segmentation
2.1.2 Method of Statistically Segmenting Sentences
2.1.3 Mixed Method of Segmenting Sentence
2.1.4 Segmenting Method of Genetic Algorithms (GAs)
2.2 Grammar Analysis
2.2.1 Context Free Grammar (CFG)
2.2.2 Slot and Filler
2.2.3 Link Grammar Technology
2.3 Memory-based Parsing System
2.4 Bayesian Network
Chapter 3 Overview of the Proposed Architecture
Chapter 4 Segmentation System
4.1 Separation
4.2 Corpora-comparing system
4.3 Unknown Word Judgment System
4.4 The Data Structure of the Segmentation Tree Node
4.5 Context-proofreading System
4.6 Weighted-calculation System
4.7 Process of Segmentation System
Chapter 5 Syntactic Analysis System
5.1 Word-based Link Grammar
5.2 Fault-tolerance Mechanism
5.3 Process of Syntactic Analysis System
Chapter 6 Semantic Analysis System
6.1 Memory-based parsing system
6.1.1 Concept Sequence Layer
6.1.2 Semantic Concept Hierarchy
6.1.3 Instance Layer
6.2 Learning Mechanism of Semantics
6.2.1 Generalization
6.2.2 Specialization
6.3 Semantic Network
6.4 Process of Semantic Analysis System
Chapter 7 Response System
7.1 Searching goal
7.1.1 Existence of Searching goal
7.1.2 Knowledge domain of Searching goal
7.2 Knowledge domain
7.3 Teaching Material Difficulty
Chapter 8 Conclusion and Future Work
References
Table of Figures
Fig. 1 CFG grammar analysis
Fig. 2 Words and its plug
Fig. 3 Words and connectors in the dictionary
Fig. 4 The simplified form of Fig. 2 & 3
Fig. 5 Part of knowledge base used for processing: “The Shining Path”
Fig. 6 An example of BN
Fig. 7 Flowchart of feedback system
Fig. 8 The architecture of the segmentation system
Fig. 9 Separation system’s flow chart
Fig. 10 Unknown word judgment system process
Fig. 11 segmentation tree structure
Fig. 12 node data structure
Fig. 13 The flow chart of the keyword in context comparing system
Fig. 14 List of separating from the first layer to the fourth layer
Fig. 15 Flowchart of the syntactic analysis system
Fig. 16 Fault-tolerance processing with omitting of the preposition
Fig. 17 Process of syntactic analysis
Fig. 18 Result of the syntactic analysis
Fig. 19 Flowchart of the semantic analysis system
Fig. 20 Structure of the concept sequence layer
Fig. 21 Structure of semantic concept hierarchy
Fig. 22 Example of semantic verification
Fig. 23 Generalization
Fig. 24 Specialization
Fig. 25 An example of semantic network
Fig. 26 Semantic network of example sentence
Fig. 27 Belief network of existence
Fig. 28 Trained topology of BN and keyword groups
Fig. 29 Belief network of knowledge domain
Fig. 30 Bayesian Network of Knowledge domain
Fig. 31 Bayesian Network of Teaching Material Difficulty
Table of Tables
Table 1: The words and linking requirements in a dictionary
Table 2: corpora data structure
Table 3: The segmentation table of sentence
Table 4: The linking rules of each part of speech |
Full-text access rights |