§ Browse Thesis Bibliographic Record
  
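System ID: U0002-0607202109022000
DOI: 10.6846/TKU.2021.00146
Thesis title (Chinese): 利用回推結合關聯式分類器改進中文意見探勘結果
Thesis title (English): Use Push Back and Associative Classifiers to Improve Chinese Opinion Survey Results
Thesis title (third language):
University: Tamkang University (淡江大學)
Department (Chinese): 資訊工程學系碩士班
Department (English): Department of Computer Science and Information Engineering
Foreign degree university:
Foreign degree college:
Foreign degree institute:
Academic year: 109 (ROC calendar; 2020-2021)
Semester: 2
Year of publication: 110 (ROC calendar; 2021)
Graduate student (Chinese): 廖威博
Graduate student (English): Wei-Bo Liao
Student ID: 608410139
Degree: Master's
Language: Traditional Chinese
Second language:
Oral defense date: 2021-06-18
Number of pages: 59
Oral defense committee: Advisor - 蔣璿東
Committee member - 王鄭慈
Committee member - 陳柏榮
Keywords (Chinese): 中文意見探勘
意見探勘
中文意見探勘系統
回推
Keywords (English): Chinese opinion mining
opinion mining
Chinese opinion mining system
push back
Keywords (third language):
Subject classification: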
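Abstract (translated from Chinese)
With the rapid development of the Internet and technology, most people who want good value for money look up reviews online before deciding which product or which company's service to buy. One study indicates that nearly 81% of people read online reviews before making that decision. For a company, it is therefore crucial to run word-of-mouth analysis on customer review articles and to monitor users' opinions in a timely manner, so that it can quickly respond to and offset negative comments about itself. Collecting these opinions manually, however, is very time-consuming and tends to capture only one-sided comments. For this reason, our laboratory developed an aspect-level Chinese opinion mining system that collects relevant word-of-mouth opinions quickly and more comprehensively. When the opinion word in an analyzed complete sentence is a verb, however, the system performs poorly, with an accuracy of only 25.16%. The reason is that, in this case, the complete sentence produced by the 4-Tuple representation cannot accurately express the user's opinion: when the opinion word is a verb, two Topics, Features, or items may be needed. Without affecting the existing functions of the Chinese opinion mining system, we add extraction of the Topic, Feature, or item that follows the verb, so that a complete sentence is composed of seven opinion elements; we call this the 7-Tuple representation. In this study, we therefore apply push back separately to the complete sentences produced by the 4-Tuple representation, the complete sentences produced by the 7-Tuple representation, and the articles produced by the 7-Tuple representation, in order to supplement missing aspects and sub-aspects, and we combine this with associative classifiers (AC) in experiments to compare and verify whether the results after push back are better.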
Abstract (English)
Owing to the rapid development of the Internet and technology, most people look up reviews on the Internet before deciding which product or which company's service to buy. According to one study, nearly 81% of people read reviews on the Internet before making that decision. However, collecting these opinions manually is very time-consuming, and it tends to capture only one-sided comments. For this reason, we have developed an aspect-level Chinese opinion mining system that collects relevant Internet word-of-mouth (IWOM) opinions quickly and comprehensively. However, the accuracy of the system is only 25.16% when the opinion word in the analyzed complete sentence is a verb. The reason is that, when the opinion word is a verb, the complete sentence generated by the 4-Tuple representation cannot accurately express the users' opinions. Therefore, in this study, we apply push back to the complete sentences generated by the 4-Tuple representation, the complete sentences generated by the 7-Tuple representation, and the articles generated by the 7-Tuple representation, respectively, to supplement the missing aspects and sub-aspects. We then combine this with AC in experiments to compare and verify whether the results after push back are better.
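Abstract (third language)
Table of Contents
Chapter 1	Introduction	1
1-1 Research Motivation and Objectives	1
1-2 Research Framework	5
Chapter 2	Literature Review	6
2-1 Related Work on Chinese Opinion Mining Systems	6
2-2 The Chinese Opinion Mining System Developed by Our Laboratory	9
2-3 Associative Classification	12
Chapter 3	Problem Statement and Research Method	15
3-1 Problem Statement	15
3-2 Research Method	19
Chapter 4	Experimental Results and Effectiveness	21
4-1 Ranking and Evaluation Methods	21
4-2 Comparison of Sentences Before and After Push Back with the 4-Tuple Representation	23
4-3 Comparison of Sentences Before and After Push Back with the 7-Tuple Representation	26
4-4 Comparison of Articles Before and After Push Back	30
Chapter 5	Conclusion	36
References	39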
 
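List of Figures
Figure 1-1 Accuracy after classification	3
Figure 1-2 Data for the 7-Tuple representation	4
Figure 2-1 Architecture of the Chinese opinion mining system	11
Figure 2-2 Association Classifier	14
Figure 3-1 Original number of reviews for each aspect of 中華	18
Figure 3-2 Number of reviews for each aspect of 中華 after push back	18
Figure 3-3 Manual inspection interface	20
Figure 4-1 Training data statistics before and after push back	25
Figure 4-2 Test data statistics before and after push back	25
Figure 4-3 Classification results where the figures after push back are worse than before	25
Figure 4-4 Training data statistics before and after push back	27
Figure 4-5 Test data statistics before and after push back	27
Figure 4-6 Classification results for YES.60 on the training data after push back	28
Figure 4-7 Classification results for NO.90 on the training data after push back	29
Figure 4-8 Classification results for YES.60 on the test data after push back	29
Figure 4-9 Classification results for NO.90 on the test data after push back	30
Figure 4-10 Training data statistics for articles before and after push back	32
Figure 4-11 Training data statistics for articles before and after push back	32
Figure 4-12 Classification results for YES.60 on the training data after push back	34
Figure 4-13 Classification results for NO.90 on the training data after push back	34
Figure 4-14 Classification results for YES.60 on the test data after push back	35
Figure 4-15 Classification results for NO.90 on the test data after push back	35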
 
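List of Tables
Table 1-1 Data sources	2
Table 2-1 Definitions of opinion elements	11
Table 4-1 Illustration of the appendix table format	23
Table 5-1 Initial test data	37
Table 5-2 Results on the test data before push back	37
Table 5-3 Results on the test data after push back	37
Table 5-4 Initial test data (articles)	38
Table 5-5 Results on the test data before push back (articles)	38
Table 5-6 Results on the test data after push back (articles)	38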
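References
[1]	陳一帆, "分類規則排序對中文意見探勘結果之研究" [A study of the effect of classification rule ordering on Chinese opinion mining results], Master's thesis, 淡江大學資訊工程學系資訊碩士專班 (Tamkang University), 2021.
[2]	蔡伊玲, "使用上下文關聯性改善中文意見探勘系統的效能" [Using contextual relevance to improve the performance of a Chinese opinion mining system], Master's thesis, 淡江大學資訊工程學系資訊網路與多媒體碩士班 (Tamkang University), 2017.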
[3]	Z. Li, M. Zhang, S. Ma, B. Zhou, and Y. Sun, "Automatic Extraction for Product Feature Words from Comments on the Web," in Information Retrieval Technology, vol. 5839, G. Lee, D. Song, C.-Y. Lin, A. Aizawa, K. Kuriyama, M. Yoshioka, et al., Eds. Springer Berlin / Heidelberg, 2009, pp. 112-123.
[4]	M. Chen and T. Yao, "Combining dependency parsing with shallow semantic analysis for Chinese opinion-element relation identification," in Universal Communication Symposium (IUCS), 2010 4th International, 2010, pp. 299-305.
[5]	Z. Li, M. Zhang, S. Ma, B. Zhou, and Y. Sun (2009). Automatic Extraction for Product Feature Words from Comments on the Web. Information Retrieval Technology, vol. 5839, G. Lee, D. Song, C.-Y. Lin, A. Aizawa, K. Kuriyama, M. Yoshioka, et al., Eds. Springer Berlin / Heidelberg, pp. 112-123.
[6]	C. Zhang, D. Zeng, J. Li, F.-Y. Wang, and W. Zuo (2009). Sentiment analysis of Chinese documents: From sentence to document level. J. Am. Soc. Inf. Sci. Technol., vol. 60, pp. 2474-2487.
[7]	M. Chen and T. Yao (2010). Combining dependency parsing with shallow semantic analysis for Chinese opinion-element relation identification. Universal Communication Symposium (IUCS), 2010 4th International, pp. 299-305.
[8]	P. Ting-Chun and S. Chia-Chun (2010). Using Chinese part-of-speech patterns for sentiment phrase identification and opinion extraction in user generated reviews. Digital Information Management (ICDIM), 2010 Fifth International Conference on, pp. 120-127.
[9]	V. Hatzivassiloglou and K. R. McKeown (1997). Predicting the semantic orientation of adjectives. 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics, pp. 174-181.
[10]	G. Qiu, B. Liu, J. Bu, and C. Chen (2009). Expanding domain sentiment lexicon through double propagation. 21st international joint conference on Artificial intelligence, pp. 1199-1204.
[11]	M. Hu and B. Liu (2004). Mining and summarizing customer reviews. 10th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 168-177.
[12]	S.-M. Kim and E. Hovy (2004). Determining the sentiment of opinions. 20th international conference on Computational Linguistics, pp. 1367.
[13]	S. Bin and C. Kuiyu (2006). Mining Chinese Reviews. Data Mining Workshops, 2006. ICDM Workshops 2006. 6th IEEE International Conference on, pp. 585-589.
[14]	Y. Qiang, S. Wen, and L. Yijun (2006). Sentiment Classification for Movie Reviews in Chinese by Improved Semantic Oriented Approach. System Sciences, HICSS '06. Proceedings of the 39th Annual Hawaii International Conference on, pp. 53b-53b.
[15]	L. Zhuang, F. Jing, and X.-Y. Zhu (2006). Movie review mining and summarization. 15th ACM international conference on Information and knowledge management, pp. 43-50.
[16]	X. Ding, B. Liu, and P. S. Yu (2008). A holistic lexicon-based approach to opinion mining. 2008 International Conference on Web Search and Data Mining. pp. 231-240.
[17]	L. W. Ku, I. C. Liu, C. Y. Lee, K. Chen, and H. H. Chen (2008). Sentence-Level Opinion Analysis by CopeOpi in NTCIR-7. 7th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, Question Answering, and Cross-Lingual Information Access. pp. 260-267.
[18]	G. A. Miller. (1980). WordNet. Available: http://wordnet.princeton.edu/
[19]	P. J. Stone, D. C. Dunphy, and M. S. Smith (1966). The General Inquirer: A Computer Approach to Content Analysis. Oxford, England: M.I.T. Press.
[20]	A. Esuli and F. Sebastiani (2006). Sentiwordnet: A publicly available lexical resource for opinion mining. 5th Conference on Language Resources and Evaluation. pp. 417-422.
[21]	B. Ohana and B. Tierney (2009). Sentiment classification of reviews using SentiWordNet. 9th. IT & T Conference, p. 13.
[22]	L.-W. Ku, H.-W. Ho, and H.-H. Chen (2009). Opinion mining and relationship discovery using CopeOpi opinion analysis system. Journal of the American Society for Information Science and Technology, vol. 60, pp. 1486-1503.
[23]	Shih-Jung Wu, Rui-Dong Chiang, Zheng-Hong Ji (2016, Jul). Development of a Chinese opinion-mining system for application to Internet online forums. The Journal of Supercomputing, pp.1-15. (SCI, 23/51, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE).
[24]	Agrawal, R., Imieliński, T., & Swami, A. (1993). Mining association rules between sets of items in large databases. ACM SIGMOD Record, 22(2), pp. 207-216.
[25]	Thabtah, F. (2007). A review of associative classification mining. The Knowledge Engineering Review, 22(1), 37-65.
[26]	Liu, B., Hsu, W., & Ma, Y. (1998). Integrating classification and association rule mining. 4th International Conference on Knowledge Discovery and Data Mining, pp. 80-86.
[27]	Wenmin Li, Jiawei Han, & Jian Pei. (2001). CMAR: Accurate and efficient classification based on multiple class-association rules. 2001 IEEE International Conference on Data Mining, pp. 369-376.
[28]	E. Baralis, & P. Garza. (2002). A lazy approach to pruning classification rules. 2002 IEEE International Conference on Data Mining, 2002. Proceedings. pp. 35-42.
[29]	Wang, K., He, Y., & Cheung, D. W. (2001). Mining confident rules without support requirement. 10th International Conference on Information and Knowledge Management, pp. 89-96. 
[30]	Thabtah, F. A., Cowling, P., & Peng, Y. (2004). MMAC: A new multi-class, multi-label associative classification approach. Data Mining, 2004. ICDM'04. Fourth IEEE International Conference on, pp. 217-224.
[31]	Chen, C., Chiang, R., Lee, C., & Chen, C. (2012). Improving the performance of association classifiers by rule prioritization. Knowledge-Based Systems, 36, 59-67.
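Full-Text Access Rights
On campus
Print thesis available on campus immediately
Author consents to on-campus release of the electronic full text
Electronic thesis available on campus immediately
Off campus
Authorization granted
Electronic thesis available off campus immediately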
