Electronic Theses and Dissertations Service

§ Browse Thesis Bibliographic Record

The electronic full text of this thesis is open for off-campus use from 2014-07-21.
The print copy of this thesis is open for use from 2014-07-21.

System ID	U0002-1507200900510200
DOI	10.6846/TKU.2009.00485
Thesis title (Chinese)	以非對稱權重矩陣改善順序型分類器之績效評估指標
Thesis title (English)	Improvement of Performance Index for Ordinal Classifiers with Asymmetrically Weighted Cost Matrix
Institution	Tamkang University (淡江大學)
Department (Chinese)	Master's Program, Department of Statistics (統計學系碩士班)
Department (English)	Department of Statistics
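Academic year	97 (ROC calendar; 2008-2009)
Semester	2
Year of publication	98 (ROC calendar; 2009)
Student (Chinese)	洪惠萍
Student (English)	Hui-Ping Hung
Student ID	696650075
Degree	Master's
Language	Traditional Chinese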
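Oral defense date	2009-06-18
Number of pages	49
Oral defense committee	Advisor: 陳景祥; Committee member: 歐士田; Committee member: 陳怡如
Keywords (Chinese)	資料探勘 (data mining); 分類器 (classifier); 判別分析 (discriminant analysis); 績效評估指標 (performance index); 順序型類別資料 (ordinal categorical data); 類神經網路 (artificial neural networks); 決策樹 (decision tree)
Keywords (English)	data mining; classifier; discriminant analysis; artificial neural networks; decision tree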
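Abstract (translated from Chinese)	Classification into ordinal categories is a very common problem in practice. Many researchers have proposed classifiers for ordinal categorical data, including widely used statistical models and data mining methods that have been frequently cited in recent years. However, the statistical models behind most classifiers can be fitted only when their assumptions hold, for example that the data satisfy homogeneity, normality, and independence. The performance index used to evaluate classifiers also bears on the final choice of classifier: an inappropriate index may lead to selecting a classifier that performs poorly. This thesis recommends evaluating ordinal classifiers with the weighted kappa coefficient, weighted by an asymmetric weight matrix computed from the actual misclassification costs. This approach both approximates real conditions and accounts for the agreement between predicted and actual classes. The thesis also compares linear discriminant analysis, a standard statistical method, with ten classification methods from data mining in order to find the classifiers with better classification performance.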
Abstract (English)	Many conventional methods exist for classifying ordered data into specific classes, including statistical methods and data mining methods. However, statistical methods rest on the assumptions of normality, independence, and homogeneity. This thesis compares classifiers built with linear discriminant analysis against ten data mining methods and finds that the former performs poorly. Moreover, the performance index used to compare classifiers affects the quality of the final decision: an improper performance index may lead to the wrong choice of classifier. The performance index proposed in this thesis improves on weighted kappa by incorporating an asymmetrically weighted cost matrix calculated from the misclassification costs. The results show that the proposed performance index is more credible.
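The weighted kappa idea described in the abstracts can be illustrated with a minimal sketch. It implements Cohen's (1968) weighted kappa in its disagreement-weight form, kappa_w = 1 - (sum of w_ij * o_ij) / (sum of w_ij * e_ij), and simply allows the weight matrix to be asymmetric. This is an assumption about the general approach, not the thesis's exact index (the record does not give the definition of kappa_new); the function name, the confusion matrix, and the cost values below are hypothetical.

```python
def weighted_kappa(confusion, cost):
    """Cohen's weighted kappa with disagreement weights.

    confusion[i][j]: number of cases with actual class i predicted as class j.
    cost[i][j]: cost of predicting class j when the actual class is i.
                Zero on the diagonal; it does not have to be symmetric.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_tot = [sum(confusion[i]) for i in range(k)]
    col_tot = [sum(confusion[i][j] for i in range(k)) for j in range(k)]

    # Observed weighted disagreement (average cost per case).
    observed = sum(cost[i][j] * confusion[i][j]
                   for i in range(k) for j in range(k)) / total
    # Weighted disagreement expected by chance, from the marginal totals.
    expected = sum(cost[i][j] * row_tot[i] * col_tot[j]
                   for i in range(k) for j in range(k)) / total ** 2
    return 1.0 - observed / expected


# Hypothetical 3-class ordinal example (rows: actual, columns: predicted).
confusion = [[40, 8, 2],
             [6, 30, 4],
             [1, 5, 24]]
# Hypothetical asymmetric costs: under-predicting a high class costs more
# than over-predicting a low class by the same number of steps.
cost = [[0, 1, 2],
        [2, 0, 1],
        [4, 2, 0]]

if __name__ == "__main__":
    print(weighted_kappa(confusion, cost))
```

With a 0/1 cost matrix the function reduces to Cohen's unweighted kappa, and transposing an asymmetric cost matrix generally changes the value, which is why the direction of a misclassification matters under this kind of index.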
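Table of contents	Contents
Chapter 1 Introduction
1.1 Research Motivation and Objectives
1.2 Research Procedure
1.3 Thesis Structure
Chapter 2 Literature Review
2.1 Artificial Neural Networks
2.1.1 Introduction to Artificial Neural Networks
2.1.2 Back-Propagation Neural Networks and Their Algorithm
2.2 Decision Tree
2.2.1 Introduction to Decision Trees
2.2.2 C4.5 and Its Algorithm
2.3 Linear Discriminant Analysis
2.4 Other WEKA Classifiers
2.5 Performance Indices for Classifiers
Chapter 3 Research Methods
3.1 Unimodal Model
3.2 Performance Indices for Classifiers
Chapter 4 Simulation and Case Analysis
4.1 Simulated Data
4.1.1 n=10000
4.1.2 n=5000
4.1.3 n=3000
4.2 Real Data
4.2.1 Car Data
4.2.2 Nursery Data
4.2.3 LEV Data
4.2.4 SWD Data
Chapter 5 Conclusions and Suggestions
5.1 Conclusions
5.2 Suggestions
References
List of Tables
Table 4.1 kappa_new, kappa_old, and loss ratio for n=10000
Table 4.2 kappa_new, kappa_old, and loss ratio for n=5000
Table 4.3 kappa_new, kappa_old, and loss ratio for n=3000
Table 4.4 kappa_new, kappa_old, and loss ratio for the Car data
Table 4.5 kappa_new, kappa_old, and loss ratio for the Nursery data
Table 4.6 kappa_new, kappa_old, and loss ratio for the LEV data
Table 4.7 kappa_new, kappa_old, and loss ratio for the SWD data
List of Figures
Figure 1.1 Research framework
Figure 2.1 Flow of message passing between processing units
Figure 2.2 Back-propagation neural network architecture
Figure 2.3 RBF neural network architecture
Figure 4.1 Schematic of class assignment for the simulated data
Figure 4.2 Performance index values of each classifier for n=10000
Figure 4.3 Performance index values of each classifier for n=5000
Figure 4.4 Performance index values of each classifier for n=3000
Figure 4.5 Performance index values of each classifier for the Car data
Figure 4.6 Performance index values of each classifier for the Nursery data
Figure 4.7 Performance index values of each classifier for the LEV data
Figure 4.8 Performance index values of each classifier for the SWD data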
References	English sources:
1. Ben-David, A. and Frank, E. (2009), Accuracy of machine learning models versus "hand crafted" expert systems – A credit scoring case study, Expert Systems with Applications, 36, 5264-5271.
2. Ben-David, A., Sterling, L. and Tran, T. (2009), Adding monotonicity to learning algorithms may impair their accuracy, Expert Systems with Applications, 36, 6627-6634.
3. Berry, M. J. A. and Linoff, G. (1997), Data Mining Techniques: For Marketing, Sales, and Customer Support, Wiley Computer, New York, NY.
4. Bishop, C. M. (1995), Neural Networks for Pattern Recognition, Oxford: Clarendon Press.
5. Bohanec, M. and Rajkovic, V. (1990), Expert system for decision making, Sistemica, 1, 145-157.
6. Cohen, J. (1960), A coefficient of agreement for nominal scales, Edu. and Psych. Meas., 20, 37-46.
7. Cohen, J. (1968), Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit, Psych. Bull., 70, 213-220.
8. Domingos, P. and Pazzani, M. (1997), On the optimality of the simple Bayesian classifier under zero-one loss, Machine Learning, 29, 103-130.
9. Frank, E., Hall, M. and Pfahringer, B. (2003), Locally Weighted Naive Bayes, 19th Conference in Uncertainty in Artificial Intelligence, 249-256.
10. Graham, P. and Jackson, R. (1993), The analysis of ordinal agreement data: Beyond weighted kappa, Journal of Clinical Epidemiology, 46, 1055-1062.
11. Holte, R. C. (1993), Very simple classification rules perform well on most commonly used datasets, Machine Learning, 11, 63-91.
12. John, G. H. and Langley, P. (1995), Estimating Continuous Distributions in Bayesian Classifiers, Eleventh Conference on Uncertainty in Artificial Intelligence, San Mateo, 338-345.
13. Landwehr, N., Hall, M. and Frank, E. (2005), Logistic Model Trees, Machine Learning, 59, 161-205.
14. Olave, M., Rajkovic, V. and Bohanec, M. (1989), An application for admission in public school systems, Expert Systems in Public Administration, 145-160.
15. Pinto da Costa, J. F., Alonso, H. and Cardoso, J. S. (2008), The unimodal model for the classification of ordinal data, Neural Networks, 21, 78-91.
16. Press, W., Flannery, B., Teukolsky, S. and Vetterling, W. (1992), Numerical Recipes in C: The Art of Scientific Computing, 2nd edition, Cambridge: Cambridge University Press.
17. Quinlan, J. R. (1993), C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers.
18. Venables, W. N. and Ripley, B. D. (2002), Modern Applied Statistics with S, 4th edition, New York: Springer.
19. Witten, I. H. and Frank, E. (2005), Data Mining: Practical Machine Learning Tools and Techniques, 2nd edition, San Francisco, Calif.: Elsevier/Morgan Kaufmann.
Online sources:
WEKA, http://www.cs.waikato.ac.nz/~ml/weka/index.html
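Full-text access rights	On campus: the print thesis is released 5 years after submission of the authorization form; campus-wide release of the electronic full text is authorized; the on-campus electronic thesis is released 5 years after submission of the authorization form. Off campus: release of the electronic thesis off campus is authorized, 5 years after submission of the authorization form.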

