§ Browse Thesis Bibliographic Record
  
System ID U0002-0507201015124500
DOI 10.6846/TKU.2010.00129
Thesis title (Chinese) 利用多層次類別優先度之規則排序以改善關聯式分類效能
Thesis title (English) Improving the performance of Associative Classification by using the Multi-level Class Priority of Rule Ranking
Thesis title (third language)
University Tamkang University
Department (Chinese) 資訊工程學系碩士班
Department (English) Department of Computer Science and Information Engineering
Foreign degree: university name
Foreign degree: college name
Foreign degree: graduate institute name
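Academic year 98 (ROC calendar, 2009-2010)
Semester 2
Year of publication 99 (ROC calendar, 2010)
Student (Chinese) 邱信淵
Student (English) Hsin-Yuan Chiou
Student ID 697410099
Degree Master's
Language Traditional Chinese
Second language English
Oral defense date 2010-06-15
Number of pages 55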
Oral defense committee Advisor - 蔣定安(chiang@cs.tku.edu.tw)
Member - 蔣定安(chiang@cs.tku.edu.tw)
Member - 葛煥昭(keh@cs.tku.edu.tw)
Member - 王鄭慈(ctwang@tea.ntue.edu.tw)
Keywords (Chinese) 關聯式法則
多層次類別優先
規則排序
規則產生
Keywords (English) Associative Rule
Multi-level class priority
Rule Ranking
Rule Generation
Keywords (third language)
Subject classification
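Abstract (Chinese)
In rule ranking [1][2], conventional associative classification (AC) sorts rules first by confidence from high to low, then by support from high to low, and finally by rule length from short to long. Short rules are more general, so they are usually ranked ahead of long rules so that more documents can be classified. To classify special documents accurately, this thesis adopts the Lazy algorithm: rules are still sorted by confidence and support from high to low, but longer rules are ranked first.
   The core of this thesis is the rule-ordering problem. In addition to adopting the ordering proposed by the Lazy method [3] as the general ranking principle, it adds the proposed multi-level class priority, examines its effect on classification accuracy and performance, and compares the results against the Lazy algorithm.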
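The two orderings described in this abstract can be sketched as sort keys. This is a minimal illustration, not the thesis's implementation: the `Rule` fields, the function names `general_key` and `lazy_key`, and the toy rules with their confidence and support values are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A class association rule (hypothetical representation)."""
    items: tuple        # antecedent terms
    label: str          # predicted class
    confidence: float
    support: float


def general_key(rule):
    # Higher confidence first, then higher support, then shorter rules first.
    return (-rule.confidence, -rule.support, len(rule.items))


def lazy_key(rule):
    # Same first two criteria, but longer (more specific) rules first.
    return (-rule.confidence, -rule.support, -len(rule.items))


# Toy rules with made-up values, only to show where the orderings differ.
rules = [
    Rule(("oil",), "crude", 0.90, 0.10),
    Rule(("oil", "barrel"), "crude", 0.90, 0.10),
    Rule(("wheat",), "grain", 0.95, 0.05),
    Rule(("rate",), "interest", 0.90, 0.20),
]

general_order = sorted(rules, key=general_key)
lazy_order = sorted(rules, key=lazy_key)
```

The two orders differ only where confidence and support tie: the general ordering places `("oil",)` before `("oil", "barrel")`, while the Lazy-style ordering reverses that pair.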
Abstract (English)
In general, rule ranking in associative classification (AC) [1][2] orders rules first by confidence from highest to lowest, then by support from highest to lowest, and finally by rule length from shortest to longest. Short rules are more general, so they are ranked above long rules in order to make more documents classifiable. Through the discourse-based experiments in this study, it was found that certain categories share common characteristics, although they are not always mutually associated. Placing the rules of one category ahead of the rules of another can yield a considerable improvement.
   The core of this paper is the issue of rule ranking. Besides adopting the ranking proposed by the Lazy method [3] as the general principle, it proposes a multi-level class priority and explores its impact on classification performance. The experiments showed that adding multi-level class priority to rule ranking achieves better classification performance than the general ranking principles.
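One possible reading of class priority in rule ranking is sketched below. The abstract does not specify how the priority levels are defined or where they enter the sort, so everything here is an assumption for illustration: the `CLASS_LEVEL` mapping, treating the level as the primary sort key ahead of the Lazy-style criteria, the first-matching-rule classifier, and the toy rules and values.

```python
from collections import namedtuple

# Hypothetical rule representation: antecedent terms, class, confidence, support.
Rule = namedtuple("Rule", "items label confidence support")

# Assumed priority levels: a lower number means that class's rules are tried first.
CLASS_LEVEL = {"grain": 0, "wheat": 1, "corn": 1}


def lazy_key(rule):
    # Confidence and support descending, longer rules first.
    return (-rule.confidence, -rule.support, -len(rule.items))


def priority_key(rule, class_level=CLASS_LEVEL):
    # Assumption: class level is the primary key; classes without a level go last.
    level = class_level.get(rule.label, max(class_level.values()) + 1)
    return (level,) + lazy_key(rule)


def classify(document_terms, ranked_rules, default=None):
    # Assumption: the first rule whose terms all occur in the document decides.
    for rule in ranked_rules:
        if set(rule.items) <= document_terms:
            return rule.label
    return default


# Toy rules with made-up values.
rules = [
    Rule(("wheat",), "wheat", 0.80, 0.10),
    Rule(("crop",), "grain", 0.70, 0.20),
]
doc = {"wheat", "crop"}

by_lazy = classify(doc, sorted(rules, key=lazy_key))
by_priority = classify(doc, sorted(rules, key=priority_key))
```

In this toy case the same document is assigned `"wheat"` under the Lazy-style order and `"grain"` once the assumed class levels are applied, which shows only that moving one category's rules ahead of another's can change the outcome, not that it improves it.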
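Abstract (third language)
Table of contents
Contents
Contents	IV
List of Figures	VII
List of Tables	VIII
Chapter 1 Introduction	1
1.1 Preface	1
1.2 Research Motivation	2
1.3 Thesis Organization	6
Chapter 2 Related Literature and Research	7
2.1 Associative Classification Flowchart	7
2.2 Associative Classification	8
2.2.1 Lazy	12
2.2.2 CBA	16
2.2.3 CMAR	19
2.3 Evaluation Measures	21
Chapter 3 Research Method	23
3.1 Problem Description	23
3.2 Experimental Procedure	26
Chapter 4 Experimental Results	30
4.1 Data Sources	30
4.2 Experimental Steps	34
4.3 Experimental Results	35
4.4 Analysis of Experimental Results	39
Chapter 5 Conclusions and Future Work	41
5.1 Conclusions	41
5.2 Future Work	42
References	43
Appendix 1: English Paper	45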
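List of Figures
Figure 2.1 Classification flowchart of the associative classifier	7
Figure 2.2 Database coverage algorithm	12
Figure 2.3 L3 rule pruning algorithm	15
Figure 2.4 CBA-RG algorithm	16
Figure 2.5 CBA-CB naive (called M1) algorithm	17
Figure 2.6 Selecting rules based on database coverage	20
Figure 3.1 Classification flowchart of multi-level class priority on training documents	26
Figure 4.1 Reuters document tags	31
Figure 4.2 Reuters document example	32
Figure 4.3 Classification trend chart for multi-level class priority	38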

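List of Tables
Table 2.1 Differences between association rule mining and associative classification	9
Table 2.2 Distribution of document counts	21
Table 3.1 Class priority at each level	23
Table 3.2 Examples of correct and incorrect classifications by each rule on the training documents	24
Table 4.1 Number of documents in each category	33
Table 4.2 Numbers of training and test documents	34
Table 4.3 Comparison of total classified documents and accuracy	35
Table 4.4 Lazy classification results by category	35
Table 4.5 Multi-level class priority classification results by category	36
Table 4.6 Classification status of each category in the Lazy classifier	37
Table 4.7 Classification status of each category in the multi-level class priority classifier	37
Table 4.8 Comparison of the two classifiers	39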
References
[1] B. Liu, W. Hsu, and Y. Ma. "Integrating classification and association rule mining." Knowledge Discovery and Data Mining, 1998, pp. 80-86.
[2] F. Thabtah. "A review of associative classification mining." Knowl. Eng. Rev., vol. 22, 2007, pp. 37-65.
[3] E. Baralis and P. Garza. "A lazy approach to pruning classification rules." Data Mining, IEEE International Conference on, Dec. 2002, p. 35.
[4] P. Clark and R. Boswell. "Rule induction with CN2: Some recent improvements." Presented at EWSL-91: Proceedings of the European Working Session on Learning, 1991.
[5] Han-Sheng Hsiung. "Improving the accuracy of text categorization by using association rule with class priority." Master's thesis, Tamkang University, Jun. 2009, pp. 1-54.
[6] M. L. Antonie and O. R. Zaiane. "Text document categorization by term association." Proceedings of the 2002 IEEE International Conference on Data Mining.
[7] R. Agrawal and R. Srikant. "Fast algorithms for mining association rules." Presented at Proc. 20th Int. Conf. Very Large Data Bases, VLDB.
[8] J. R. Quinlan and R. M. Cameron-Jones. "FOIL: A midterm report." Presented at Machine Learning: ECML-93, European Conference on Machine Learning, Proceedings.
[9] W. Li, J. Han, and J. Pei. "CMAR: Accurate and efficient classification based on multiple class-association rules." ICDM-01, San Jose, CA, Nov. 2001, pp. 369-376.
[10] Yuen-Hsien Tseng. "Effectiveness issues in automatic text categorization." Bulletin of the Library Association of China, vol. 68, Jun. 2002, pp. 62-83.
[11] University of California Irvine Knowledge Discovery in Databases Archive. http://kdd.ics.uci.edu/
[12] The Reuters-21578 text categorization test collection. http://www.daviddlewis.com/resources/testcollections/reuters21578
[13] SQL Server Integration Services, "English word-breaking system." http://msdn.microsoft.com/zh-tw/library/ms141026.aspx
[14] Cho-Ming Lee. "Classifying Chinese Text Documents by Association Rule." Master's thesis, Tamkang University, Jun. 2006, pp. 1-66.
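Full-text access rights
On campus
Print thesis made public 1 year after submission of the authorization form
Authorizes public on-campus access to the electronic full text
On-campus electronic thesis made public 1 year after submission of the authorization form
Off campus
Authorization granted
Off-campus electronic thesis made public 1 year after submission of the authorization form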
