Tamkang University Chueh Sheng Memorial Library (TKU Library)
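System ID U0002-0808201218191200
Title (Chinese) 具一致性的模糊資料探勘方法之研究
Title (English) A Study on Fuzzy Coherent Data-Mining Techniques
Institution 淡江大學 (Tamkang University)
Department (Chinese) 資訊工程學系碩士班
Department (English) Department of Computer Science and Information Engineering
Academic year 100 (ROC calendar; 2011-2012)
Semester 2
Year of publication 101 (ROC calendar; 2012)
Author (Chinese name) 李艾芳
Author (English name) Ai-Fang Li
Student ID 697410172
Degree Master's
Language English
Oral defense date 2012-07-05
Number of pages 72
Oral defense committee Advisor: 陳俊豪
Member: 王鄭慈
Member: 蔣璿東
Member: 陳俊豪
Keywords (Chinese) 模糊規則  隸屬函數  一致性規則  領域衍生資料探勘  高一致性效益頻率項目集
Keywords (English) Fuzzy rule  membership functions  coherent rule  domain-driven data mining  high coherent utility fuzzy itemsets
Subject classification Applied Sciences: Information Engineering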
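Abstract (translated from Chinese) In the real world, transaction data usually contain quantitative values. Many approaches based on fuzzy theory have therefore been proposed for mining fuzzy association rules from quantitative data. In addition, because each item has its own utility (profit), utility itemset mining has also become popular in recent years. These approaches share two problems: first, the minimum support is difficult to set; second, the mined rules reveal only common-sense information and so have little business value. In this thesis we propose two algorithms with the properties of propositional logic, fuzzy coherent rule (FCR) mining and high coherent utility fuzzy itemset (HCUFI) mining, to overcome these problems.
The first algorithm first transforms the quantitative data into fuzzy sets. It then generates candidate fuzzy coherent rules from the resulting fuzzy sets. Finally, it computes the contingency table of each candidate rule and uses it to check whether the rule satisfies the four criteria of propositional logic. If so, the rule is a fuzzy coherent rule.
In the second algorithm, because items have different utilities, we first propose a domain-driven fuzzy data-mining framework. Based on this framework, we further propose a high coherent utility fuzzy itemset mining algorithm to increase the business value of the mined patterns. It first transforms the quantitative transaction data into fuzzy sets. It then uses an external utility table to compute the utility of each fuzzy itemset; if this utility is greater than or equal to the minimum utility ratio, the itemset is treated as a high utility fuzzy itemset. Finally, it computes the contingency table of each high utility fuzzy itemset and checks whether it satisfies the four criteria of propositional logic. If so, the itemset is a high coherent utility fuzzy itemset.
Experiments on the foodmart dataset show that both proposed algorithms are effective. The advantage of the first algorithm is that it automatically mines rules that conform to propositional logic and have business value, without requiring a minimum support to be set. The advantage of the second is that, through the domain-driven mining framework, it can mine actionable knowledge patterns that better match business needs.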
Abstract (English) In real-world applications, transactions usually contain quantitative values. Many fuzzy data mining approaches have been proposed for finding fuzzy association rules from given quantitative transactions. In addition, since each item has its own utility, utility itemset mining has become an active field in recent years. However, these approaches share two problems: first, an appropriate minimum support is not easy to set; second, the derived rules usually expose common-sense knowledge that may not be interesting from a business point of view. In this thesis, we therefore propose two algorithms, Fuzzy Coherent Rules (FCR) mining and High Coherent Utility Fuzzy Itemsets (HCUFI) mining, which use the properties of propositional logic to overcome these problems.
The first algorithm first transforms quantitative transactions into fuzzy sets. The generated fuzzy sets are then combined to generate candidate fuzzy coherent rules. Finally, a contingency table is calculated for every candidate fuzzy coherent rule and used to check whether the candidate satisfies the four criteria. If it does, it is a fuzzy coherent rule.
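The steps of the first algorithm can be sketched roughly as follows. This is an illustrative sketch, not the thesis's implementation: the abstract does not state the four criteria, so the code assumes the coherent-rule conditions commonly used in logic-based pattern discovery (each of the two agreeing cells of the contingency table exceeds each of the two disagreeing cells). The triangular membership function, its parameters, the use of `min` for fuzzy conjunction, `1 - mu` for the complement, and the sample quantities are likewise assumptions.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet a, c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def contingency(mu_x, mu_y):
    """Fuzzy contingency table for a candidate rule X -> Y.

    mu_x, mu_y: per-transaction membership degrees of X and Y.
    Assumption: min models fuzzy AND, 1 - mu models the complement.
    """
    pairs = list(zip(mu_x, mu_y))
    q1 = sum(min(x, y) for x, y in pairs)          # X and Y
    q2 = sum(min(x, 1 - y) for x, y in pairs)      # X and not Y
    q3 = sum(min(1 - x, y) for x, y in pairs)      # not X and Y
    q4 = sum(min(1 - x, 1 - y) for x, y in pairs)  # not X and not Y
    return q1, q2, q3, q4


def is_coherent(q1, q2, q3, q4):
    """Assumed four criteria: agreeing cells dominate disagreeing cells."""
    return q1 > q2 and q1 > q3 and q4 > q2 and q4 > q3


# Hypothetical purchase quantities in six transactions.
bread = [5, 6, 7, 1, 11, 6]
milk = [6, 5, 6, 2, 10, 7]

# Hypothetical "Middle" region for both items.
mu_bread = [triangular(q, 1, 6, 11) for q in bread]
mu_milk = [triangular(q, 1, 6, 11) for q in milk]

table = contingency(mu_bread, mu_milk)
print(table, is_coherent(*table))
```

Note that no minimum support appears anywhere in the check: a candidate is accepted or rejected purely from the relative sizes of the four cells.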
In the second algorithm, because each item has its own utility, we first propose a domain-driven fuzzy data-mining framework. Based on this framework, we further propose a high coherent utility fuzzy itemset mining algorithm for increasing the business merit of the patterns. It first transforms quantitative transactions into fuzzy sets. The utility of each fuzzy itemset is then calculated according to the given external utility table. If the value is greater than or equal to the minimum utility ratio, the itemset is considered a high utility fuzzy itemset (HUFI). Finally, contingency tables are calculated and used to check whether the HUFIs satisfy the four criteria. If one does, it is a High Coherent Utility Fuzzy Itemset (HCUFI).
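The utility-filtering step of the second algorithm might look like the sketch below. The abstract does not give the utility formula, so everything here is an assumption made for illustration: the utility of a fuzzy itemset in one transaction is taken as its membership degree (minimum over its fuzzy regions) times the profit of its items, and the ratio is taken relative to the total utility of the database. The names `EXTERNAL_UTILITY`, `itemset_utility`, `database_utility`, and `high_utility_itemsets`, and the sample data, are hypothetical. The final coherence check on the surviving itemsets is not shown.

```python
# Hypothetical external utility table: profit per unit of each item.
EXTERNAL_UTILITY = {"milk": 3.0, "bread": 2.0}


def itemset_utility(transactions, memberships, itemset):
    """Membership-weighted utility of a fuzzy itemset over all transactions.

    transactions: list of {item: quantity}
    memberships:  list of {(item, region): degree}, one dict per transaction
    itemset:      tuple of (item, region) pairs
    """
    total = 0.0
    for quantities, degrees in zip(transactions, memberships):
        # Degree to which the transaction supports every region in the itemset.
        degree = min(degrees.get(region, 0.0) for region in itemset)
        # Profit contributed by the itemset's items in this transaction.
        profit = sum(
            quantities.get(item, 0) * EXTERNAL_UTILITY[item]
            for item, _ in itemset
        )
        total += degree * profit
    return total


def database_utility(transactions):
    """Total profit of all items in all transactions."""
    return sum(
        quantity * EXTERNAL_UTILITY[item]
        for quantities in transactions
        for item, quantity in quantities.items()
    )


def high_utility_itemsets(transactions, memberships, candidates, min_ratio):
    """Keep candidates whose utility ratio reaches the minimum utility ratio."""
    total = database_utility(transactions)
    if total == 0:
        return []
    return [
        c for c in candidates
        if itemset_utility(transactions, memberships, c) / total >= min_ratio
    ]


# Hypothetical data: two transactions and their fuzzy memberships.
transactions = [{"milk": 2, "bread": 1}, {"milk": 1}]
memberships = [
    {("milk", "High"): 1.0, ("bread", "High"): 0.5},
    {("milk", "High"): 0.5},
]
candidates = [
    (("milk", "High"),),
    (("milk", "High"), ("bread", "High")),
]
print(high_utility_itemsets(transactions, memberships, candidates, 0.5))
```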
Experiments on the foodmart dataset show the effectiveness of the two proposed approaches. The advantage of the first algorithm is that it can derive rules that satisfy propositional logic and are interesting from a business point of view, without setting a minimum support. The advantage of the second algorithm is that, based on the domain-driven fuzzy data-mining framework, it can derive more actionable knowledge patterns with business interestingness.
Table of contents CHAPTER 1 INTRODUCTION 1
1.1 PROBLEM DEFINITION AND MOTIVATION 1
1.2 CONTRIBUTIONS 4
1.3 READER'S GUIDE 5
CHAPTER 2 REVIEW OF RELATED WORKS 6
2.1 REVIEW OF FUZZY SET CONCEPTS 6
2.2 REVIEW OF BINARY AND FUZZY DATA MINING APPROACHES 8
2.3 THE ISSUE OF MINIMUM SUPPORT THRESHOLD 11
2.4 REVIEW OF UTILITY OF FUZZY ITEMSET MINING APPROACHES 14
2.5 REVIEW OF DOMAIN-DRIVEN DATA MINING 15
CHAPTER 3 THE FUZZY COHERENT RULES MINING ALGORITHM 19
3.1 NOTATIONS 19
3.2 THE PROPOSED MINING ALGORITHM 20
3.3 AN EXAMPLE 22
CHAPTER 4 A HIGH COHERENT UTILITY FUZZY ITEMSETS MINING ALGORITHM 29
4.1 THE PROPOSED UI-AKD-BASED FUZZY MINING FRAMEWORK 29
4.2 NOTATIONS 31
4.3 THE PROPOSED HCUFI MINING ALGORITHM 32
4.4 AN EXAMPLE 35
CHAPTER 5 EXPERIMENTAL RESULTS 43
5.1 EXPERIMENTAL RESULTS FOR METHOD (Ⅰ) 43
5.1.1 Dataset Descriptions 43
5.1.2 Experimental Evaluations 44
5.2 EXPERIMENTAL RESULTS FOR METHOD (Ⅱ) 50
5.2.1 Dataset Descriptions 50
5.2.2 Experimental Evaluations 51
CHAPTER 6 CONCLUSIONS AND FUTURE WORKS 55
REFERENCES 57
APPENDIXES: ENGLISH PAPER 61

List of Figures
FIGURE 1: THE MEMBERSHIP FUNCTION USED IN THIS EXAMPLE 23
FIGURE 2: UI-AKD-BASED FUZZY MINING FRAMEWORK 30
FIGURE 3: THE MEMBERSHIP FUNCTION USED IN THIS EXAMPLE 36
FIGURE 4: THE MEMBERSHIP FUNCTION USED IN THIS EXAMPLE 44
FIGURE 5: THE MEMBERSHIP FUNCTION USED IN THIS EXAMPLE 50

List of Tables
TABLE 1: THE FOUR CONDITIONS FOR MAPPING RULES TO EQUIVALENCE 12
TABLE 2: THE CONTINGENCY TABLE OF A RULE 13
TABLE 3: SIX TRANSACTIONS IN THIS EXAMPLE 23
TABLE 4: THE FUZZY SETS TRANSFORMED FROM THE DATA IN TABLE 3 24
TABLE 5: THE COMPLEMENT FUZZY SETS TRANSFORMED FROM TABLE 4 25
TABLE 6: THE COUNT VALUE OF COUNT(BREAD.MIDDLE, MILK.MIDDLE) 27
TABLE 7: THE CONTINGENCY TABLE FOR (BREAD.MIDDLE, MILK.MIDDLE) 27
TABLE 8: THE DERIVED FUZZY COHERENT RULES 28
TABLE 9: SIX TRANSACTIONS IN THIS EXAMPLE 36
TABLE 10: THE EXTERNAL UTILITY (EU) USED IN THIS EXAMPLE 37
TABLE 11: THE FUZZY SETS TRANSFORMED FROM THE DATA IN TABLE 9 37
TABLE 12: THE COMPLEMENT FUZZY SETS TRANSFORMED FROM TABLE 11 38
TABLE 13: THE UFI VALUE OF EACH FUZZY REGIONS IN A 40
TABLE 14: THE UFI VALUE OF ALL ITEMSETS IN CUFI2 41
TABLE 15: THE CONTINGENCY TABLE FOR (MILK.HIGH, BREAD.HIGH) 42
TABLE 16: COMPARISON RESULTS BETWEEN FCR AND FAR 44
TABLE 17: THE STATISTICS ANALYSES BETWEEN FAR AND FCR 46
TABLE 18: THE DERIVED RULES BY FCR AND FAR IN FOOD CATEGORY 47
TABLE 19: COMPARISON RESULTS BETWEEN FCR AND FAR IN NON-CONSUMABLE CATEGORY 48
TABLE 20: COMPARISON RESULTS BETWEEN HCUFI AND FAR 51
TABLE 21: COMPARISON RESULTS BETWEEN HUFI AND HCUFI IN FOOD CATEGORY 52
TABLE 22: COMPARISON RESULTS BETWEEN HUFI AND HCUFI IN NON-CONSUMABLE CATEGORY 53
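Thesis usage rights
  • The author grants a royalty-free license for library patrons to reproduce the print copy for academic purposes, effective 2014-08-22.
  • The author authorizes browsing and printing of the electronic full text, effective 2014-08-22.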

