Tamkang University Chueh Sheng Memorial Library (TKU Library)


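(Electronic full text may be downloaded only from Tamkang IP addresses)
System ID U0002-0507201015124500
Chinese title 利用多層次類別優先度之規則排序以改善關聯式分類效能
English title Improving the performance of Associative Classification by using the Multi-level Class Priority of Rule Ranking
Institution Tamkang University
Department (Chinese) 資訊工程學系碩士班
Department (English) Department of Computer Science and Information Engineering
Academic year 98 (ROC calendar)
Semester 2
Publication year 99 (ROC calendar)
Author (Chinese name) 邱信淵
Author (English name) Hsin-Yuan Chiou
Student ID 697410099
Degree Master's
Language Chinese
Second language English
Oral defense date 2010-06-15
Pages 55
Oral defense committee Advisor: 蔣定安
Member: 蔣定安
Member: 葛煥昭
Member: 王鄭慈
Chinese keywords 關聯式法則  多層次類別優先  規則排序  規則產生
English keywords Associative Rule  Multi-level class priority  Rule Ranking  Rule Generation
Subject classification Applied Science > Information Engineering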
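Chinese abstract (translated) In rule ranking [1][2], conventional associative classification (AC) sorts rules first by confidence from high to low, then by support from high to low, and finally by rule length from short to long. Short rules are more general, so they are usually ranked ahead of long rules in order to classify more documents. To classify special documents accurately, this thesis adopts the Lazy algorithm, which sorts by confidence and support from high to low and then places longer rules first.
The core of this thesis is the rule-ordering problem. In addition to adopting the ordering proposed by the Lazy method [3] as the general ordering principle, it adds the multi-level class priority proposed here, examines its effect on classification accuracy and performance, and compares the results with the Lazy algorithm.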
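The two orderings described in the abstract can be sketched as sort keys. This is a minimal illustration, not the thesis's implementation: the `Rule` class, its field names, and the sample rules are hypothetical, and ties beyond the three stated criteria are left to Python's stable sort.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical class-association rule: antecedent terms -> class label."""
    items: frozenset
    label: str
    confidence: float
    support: float


def conventional_key(rule):
    # Confidence high to low, support high to low, then shorter rules first.
    return (-rule.confidence, -rule.support, len(rule.items))


def lazy_key(rule):
    # Same first two criteria, but longer (more specific) rules first.
    return (-rule.confidence, -rule.support, -len(rule.items))


# Made-up rules; the two "crude" rules tie on confidence and support.
rules = [
    Rule(frozenset({"oil"}), "crude", 0.90, 0.20),
    Rule(frozenset({"oil", "barrel"}), "crude", 0.90, 0.20),
    Rule(frozenset({"wheat"}), "grain", 0.95, 0.10),
]

conventional = sorted(rules, key=conventional_key)
lazy = sorted(rules, key=lazy_key)
```

Both orderings put the high-confidence "grain" rule first. They differ only in how the two tied "crude" rules are ordered: the conventional key places the one-term rule ahead of the two-term rule, and the Lazy-style key reverses them.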
English abstract In associative classification (AC), rule ranking [1][2] generally orders rules first by confidence from highest to lowest, then by support from highest to lowest, and finally by rule length from shortest to longest. Short rules are more general, so they are ranked ahead of long rules in order to make more documents classifiable. In the discourse-based experiments of this study, certain categories were found to share common characteristics, although those categories were not always mutually associated. Placing the rules of one category ahead of the rules of another can yield a considerable improvement.
The core of this thesis is the rule-ranking problem. In addition to adopting the ranking method of the Lazy approach [3] as the general ranking principle, it proposes a multi-level class priority and examines its effect on classification performance. The experiments showed that adding multi-level class priority to rule ranking achieves better classification performance than the general ranking principles.
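One way to picture class priority in rule ranking is as an extra sort key. The sketch below is an assumption-laden illustration: the record does not say how the thesis combines class priority with the Lazy ordering, so treating the priority level as the primary key is a guess, and `Rule`, `CLASS_PRIORITY`, `priority_key`, `classify`, and the sample data are all hypothetical.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical class-association rule: antecedent terms -> class label."""
    items: frozenset
    label: str
    confidence: float
    support: float


# Hypothetical priority levels: a lower number means the class's rules rank earlier.
CLASS_PRIORITY = {"grain": 0, "crude": 1}


def priority_key(rule, priority=CLASS_PRIORITY):
    # Assumption: class priority is the primary key; the Lazy-style ordering
    # (confidence, support, longer rules first) breaks ties within a level.
    level = priority.get(rule.label, len(priority))
    return (level, -rule.confidence, -rule.support, -len(rule.items))


def classify(doc_terms, ranked_rules, default=None):
    # The first rule whose antecedent is contained in the document decides the class.
    for rule in ranked_rules:
        if rule.items <= doc_terms:
            return rule.label
    return default


# Made-up rules and document.
rules = [
    Rule(frozenset({"wheat"}), "grain", 0.80, 0.10),
    Rule(frozenset({"oil"}), "crude", 0.90, 0.20),
]
ranked = sorted(rules, key=priority_key)
doc = frozenset({"wheat", "oil", "export"})
predicted = classify(doc, ranked)
```

Under these assumptions the document is labeled "grain" even though the "crude" rule has higher confidence, because every "grain" rule is ranked ahead of every "crude" rule.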
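Table of contents
Contents IV
List of Figures VII
List of Tables VIII
Chapter 1 Introduction 1
1.1 Preface 1
1.2 Research Motivation 2
1.3 Thesis Organization 6
Chapter 2 Related Literature and Research Review 7
2.1 Associative Classification Flowchart 7
2.2 Associative Classification 8
2.2.1 Lazy 12
2.2.2 CBA 16
2.2.3 CMAR 19
2.3 Evaluation Measures 21
Chapter 3 Research Method 23
3.1 Problem Description 23
3.2 Experimental Procedure 26
Chapter 4 Experimental Results 30
4.1 Data Source 30
4.2 Experimental Steps 34
4.3 Experimental Results 35
4.4 Analysis of Experimental Results 39
Chapter 5 Conclusions and Future Work 41
5.1 Conclusions 41
5.2 Future Work 42
References 43
Appendix 1: English Paper 45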
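List of Figures
Figure 2.1 Classification flowchart of an associative classifier 7
Figure 2.2 Database coverage algorithm 12
Figure 2.3 L3 rule pruning algorithm 15
Figure 2.4 CBA-RG algorithm 16
Figure 2.5 CBA-CB naive (called M1) algorithm 17
Figure 2.6 Selecting rules based on database coverage 20
Figure 3.1 Flowchart of multi-level class priority classification on training documents 26
Figure 4.1 Reuters document tags 31
Figure 4.2 Reuters document example 32
Figure 4.3 Trend chart of multi-level class priority classification 38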

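List of Tables
Table 2.1 Differences between association rule mining and associative classification 9
Table 2.2 Distribution of document counts 21
Table 3.1 Class priority at each level 23
Table 3.2 Examples of correct and incorrect classifications by each rule on training documents 24
Table 4.1 Number of documents per category 33
Table 4.2 Number of training and test documents 34
Table 4.3 Comparison of total classified documents and accuracy 35
Table 4.4 Lazy classification results per class 35
Table 4.5 Multi-level class priority classification results per class 36
Table 4.6 Classification status of each class in the Lazy classifier 37
Table 4.7 Classification status of each class in the multi-level class priority classifier 37
Table 4.8 Comparison of the two classifiers 39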
References
[1] B. Liu, W. Hsu, and Y. Ma, "Integrating classification and association rule mining," Knowledge Discovery and Data Mining, 1998, pp. 80-86.
[2] F. Thabtah, "A review of associative classification mining," Knowl. Eng. Rev., vol. 22, 2007, pp. 37-65.
[3] E. Baralis and P. Garza, "A lazy approach to pruning classification rules," IEEE International Conference on Data Mining, Dec. 2002, p. 35.
[4] P. Clark and R. Boswell, "Rule induction with CN2: Some recent improvements," presented at EWSL-91: Proceedings of the European Working Session on Learning, 1991.
[5] Han-Sheng Hsiung, "Improving the accuracy of text categorization by using association rule with class priority," Master's thesis, Tamkang University, Jun. 2009, pp. 1-54.
[6] M. L. Antonie and O. R. Zaiane, "Text document categorization by term association," Proceedings of the 2002 IEEE International Conference on Data Mining.
[7] R. Agrawal and R. Srikant, "Fast algorithms for mining association rules," presented at Proc. 20th Int. Conf. Very Large Data Bases, VLDB.
[8] R. J. Quinlan and Mike, "FOIL: A midterm report," presented at Machine Learning: ECML-93, European Conference on Machine Learning, Proceedings.
[9] W. Li, J. Han, and J. Pei, "CMAR: Accurate and efficient classification based on multiple class-association rules," ICDM-01, San Jose, CA, Nov. 2001, pp. 369-376.
[10] Yuen-Hsien Tseng, "Effectiveness issues in automatic text categorization," Bulletin of the Library Association of China, vol. 68, Jun. 2002, pp. 62-83.
[11] University of California Irvine Knowledge Discovery in Databases Archive, http://kdd.ics.uci.edu/
[12] The Reuters-21578 text categorization test collection,
http://www.daviddlewis.com/resources/testcollections/reuters21578.
[13] SQL Server Integration Services, "English word-breaking system,"
http://msdn.microsoft.com/zh-tw/library/ms141026.aspx.
[14] Cho-Ming Lee, "Classifying Chinese Text Documents by Association Rule," Master's thesis, Tamkang University, Jun. 2006, pp. 1-66.
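Thesis usage rights
  • The author grants a royalty-free license for the print copy to be reproduced by in-library readers for academic purposes, public from 2011-07-07.
  • The author authorizes the electronic full-text browsing and printing service, public from 2011-07-07.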

