§ Thesis Bibliographic Record
  
System ID U0002-2807201410061500
DOI 10.6846/TKU.2014.01159
Title (Chinese) 中文意見探勘系統之優化及應用研究
Title (English) Research of Optimization and Application for Chinese Opinion Mining System
Institution Tamkang University (淡江大學)
Department (Chinese) 資訊工程學系碩士在職專班
Department (English) Department of Computer Science and Information Engineering
Academic year 102 (ROC calendar; 2013-2014)
Semester 2
Publication year 103 (ROC calendar; 2014)
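Author (Chinese) 王百祿
Author (English) Pai-Lu Wang
Student ID 700410037
Degree Master's
Language Traditional Chinese
Second language English
Oral defense date 2014-06-19
Pages 121
Oral defense committee Advisor - 蔣璿東(081863@mail.tku.edu.tw)
Member - 蔣璿東(081863@mail.tku.edu.tw)
Member - 葛煥昭
Member - 王鄭慈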
Keywords (Chinese) 中文意見探勘系統
意見詞
Keywords (English) Chinese Opinion Mining System
Opinion Word
Abstract
With the rapid growth of social networks, the way people communicate with one another and with organizations has changed substantially, and consumers have become more willing to share online their experiences with, and evaluations of, the products they buy. This online consumer information is a valuable reference for both consumers and vendors. However, because the data are voluminous, fast-growing, and time-sensitive, analyzing them often requires considerable manpower and time. Our laboratory previously developed a Chinese opinion mining system to address this problem and to provide more accurate and objective opinion judgments. In practical use, however, the system lacked automatic scheduling and real-time monitoring of negative opinions, so it could not deliver timely analysis of the latest reviews. This research therefore adds these features: the system can automatically start collecting articles and analyze them with the existing lexicon to produce preliminary results promptly, and it can track and manage negative opinions in real time when a major event occurs. This research also corrects and reinforces the algorithms and report functions to improve the accuracy of article analysis and to avoid distorted report query results.
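Table of Contents
Chapter 1 Introduction
1.1. Research Motivation and Objectives
1.2. Thesis Organization
Chapter 2 Literature Review
2.1. Definition of Opinion Units
2.2. Feature Word Extraction and Identification
2.2.1. Manually Building a Feature Word Lexicon
2.2.2. Extracting Feature Words with Natural Language Techniques
2.3. Opinion Word Expansion
2.3.1. Expanding Opinion Words Using a Lexicon
2.3.2. Expanding Opinion Words Using a Corpus
2.3.3. Semi-Automatic Systems
2.4. Opinion Polarity Determination
2.4.1. Determining Opinion Word Orientation
2.4.1.1. Computing Opinion Orientation Statistically
2.4.1.2. Correspondence Between Feature Words and Opinion Words
2.4.2. Handling Negation Words and Conjunctions
2.5. Opinion Mining Systems
2.5.1. English Opinion Mining Systems
2.5.1.1. Opinion Observer
2.5.1.2. IBM WebFountain
2.5.1.3. RevMiner
2.5.2. Chinese Opinion Mining Systems
2.5.2.1. CopeOpi
2.5.2.2. Chien-Liang’s work
Chapter 3 Research Method
3.1. Problem and Requirements Statement
3.2. Design of the System Optimization Functions and Workflow
3.2.1. Architecture Design of the Optimized Functions
3.2.2. Architecture Design of the Optimized Workflow
3.3. Automated Workflow Design
3.3.1. Scheduled Automatic Job Design
3.3.2. Design of Automatic Collection and Analysis of Review Data from Designated Domain Forums
3.3.3. Design of the Automatic Negative Opinion Monitoring and Notification Workflow
3.4. Design of the Adjusted Opinion Element Extraction and Expansion Algorithms
3.5. Design of the Adjusted Result Analysis and Statistical Charts
3.5.1. Monitoring Result Query
3.5.2. Negative Article Query
3.5.3. Negative Poster Query
3.5.4. New Category Query Criterion for Report Analysis
Chapter 4 Research Discussion
4.1. Development Tools
4.2. Environment Setup
4.3. System Function Interfaces
4.3.1. Automatic Schedule Settings Interface
4.3.2. Monitoring Specification Settings Interface
4.3.3. Opinion Element Addition Management Interface
4.3.3.1. "Opinion Word Tagging" Algorithm
4.3.3.2. "Word and Character Segmentation" Algorithm
4.3.3.3. "OP+OP" and "OP不OP" Algorithms
4.3.3.4. "OP+'了'" Algorithm
4.3.3.5. "OP+Noun" Algorithm
4.3.4. Negative Article Analysis Interface
4.3.4.1. Automatic Monitoring Result Query Interface
4.3.4.2. Negative Article Query Interface
4.3.4.3. Negative Article Poster Query Interface
4.3.5. Report Analysis Interface
Chapter 5 Conclusion
References
Appendix: English Paper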

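List of Figures
Figure 1 Eight types of co-occurrence patterns
Figure 2 Feature word and opinion word pairing matrix
Figure 3 Opinion word expansion diagram
Figure 4 Comparison of semi-automatic and manual annotation in the automobile domain
Figure 5 Comparison of semi-automatic and manual annotation in the game domain
Figure 6 Feature-Opinion mapping diagram
Figure 7 Opinion Observer comparison screen
Figure 8 Manual annotation system screen
Figure 9 WebFountain GUI product comparison after opinion analysis
Figure 10 WebFountain lets users select products and sources
Figure 11 RevMiner feature-based classification on a phone (Common view)
Figure 12 Special view
Figure 13 Cloud view
Figure 14 Categories view
Figure 15 CopeOpi user selection screen
Figure 16 Trends over time
Figure 17 Articles containing the topic
Figure 18 Selecting related movies and features and viewing positive and negative review ratings
Figure 19 System architecture diagram
Figure 20 Schedule setting flowchart
Figure 21 Automatic collection and analysis of review data from designated domain forums
Figure 22 Flowchart of automatic negative opinion monitoring and notification
Figure 23 Execution diagram of the adjusted opinion element extraction and expansion
Figure 24 Monitoring result query flowchart
Figure 25 Negative article query flowchart
Figure 26 Negative poster query flowchart
Figure 27 Report analysis flowchart after modification
Figure 28 System startup console interface
Figure 29 System login home page
Figure 30 System function interface
Figure 31 Automatic schedule settings interface
Figure 32 Schedule time settings interface
Figure 33 Article auto-update job log query interface
Figure 34 Monitoring specification settings main interface
Figure 35 Monitoring notification mail content
Figure 36 Monitoring specification condition settings interface
Figure 37 Monitoring specification creation interface
Figure 38 Monitoring specification condition selection interface
Figure 39 Opinion word tagging algorithm editing interface
Figure 40 Opinion word and Feature association editing interface
Figure 41 Editing interface for finding opinion elements by word and character segmentation
Figure 42 OP+OP result editing interface
Figure 43 OP+"了" editing interface
Figure 44 OP+Noun editing interface
Figure 45 Completion message interface for opinion element addition management
Figure 46 Automatic monitoring result query main interface
Figure 47 Monitoring result query detail interface
Figure 48 Article browsing interface
Figure 49 Negative article query main interface
Figure 50 Negative article poster query main interface
Figure 51 Topic evaluation analysis interface 1
Figure 52 Topic evaluation analysis interface 2
Figure 53 Topic evaluation analysis: bar chart result
Figure 54 Topic evaluation analysis: line chart result
Figure 55 Topic evaluation analysis: pie chart result
Figure 56 Radar chart analysis interface after adjustment
Figure 57 Radar chart analysis result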

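List of Tables
Table 1 Opinion elements
Table 2 Feature table of movie elements
Table 3 Parts of speech of feature words
Table 4 Definitions of relations between opinion words and feature words
Table 5 Propagation rule table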
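Full-Text Access Rights
On campus
Print thesis publicly available 5 years after submission of the authorization form
Electronic full text authorized for public access on campus
On-campus electronic thesis publicly available 5 years after submission of the authorization form
Off campus
Authorization granted
Off-campus electronic thesis publicly available 5 years after submission of the authorization form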
