Tamkang University Chueh Sheng Memorial Library (TKU Library)


(The electronic full text can be downloaded only from Tamkang University IP addresses.)
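System ID: U0002-2806200601120400
Thesis title (Chinese): 有效率的複合式後項關聯式法則探勘演算法-以壽險業為例
Thesis title (English): An Efficient Algorithm for Association Rule with Disjunctive Consequent–The Case of the Insurance Industry
Institution: Tamkang University (淡江大學)
Department (Chinese): 資訊工程學系碩士班 (Master's Program, Department of Computer Science and Information Engineering)
Department (English): Department of Computer Science and Information Engineering
Academic year: 94 (ROC calendar)
Semester: 2
Year of publication: 95 (ROC calendar, i.e. 2006)
Student name (Chinese): 陳智揚
Student name (English): Zhi-Yang Chen
Email: 693190257@s93.tku.edu.tw
Student ID: 693190257
Degree: Master's
Language: Chinese
Oral defense date: 2006-06-13
Number of pages: 67
Oral defense committee: Advisor: 陳伯榮
Committee member: 蔣定安
Committee member: 王鄭慈
Keywords (Chinese): 關聯式法則 (association rule); 複合式物品項目 (composite item); 資料探勘 (data mining); 壽險 (life insurance)
Keywords (English): Association Rule; Disjunctive Consequent; Data Mining; Insurance
Subject classification: Applied Sciences, Information Engineering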
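Abstract (Chinese): Association rules are among the most frequently used techniques in data mining. For newly launched products, however, their use is constrained by the minimum support and confidence thresholds.

In general, the association rules A→B and A→C are considered useful only when the support and confidence of both exceed the minimum thresholds. In practice, though, low support may simply indicate that A is a product that was launched relatively recently. Moreover, when the confidences of A→B and A→C fall below the threshold, it does not follow that the confidence of A→B∨C will also fall below it.

This thesis therefore proposes an algorithm for mining association rules with a disjunctive (composite) consequent in order to discover such useful rules, and applies these rules to product bundling and marketing in the insurance industry. The empirical results show that when a main policy is sold together with particular riders, customers purchase the riders along with the main policy.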
Abstract (English): Association rule mining is one of the most frequently adopted techniques in data mining. In practice, however, its use for newly marketed products is limited by the minimum support and confidence thresholds.

When the association rules A→B and A→C cannot be discovered from a database, it does not follow that A→B∨C cannot be an association rule in the same database. In fact, when A is a newly marketed product, A→B∨C can be a very useful rule in some cases.

Therefore, we propose a new and very simple algorithm to discover rules of this type. Since the consequent of such a rule is a disjunctive composite item, we call these rules disjunctive consequent association rules. Moreover, when we applied our algorithm to insurance policies for cross-selling, the insurance company confirmed that the results were useful.
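The abstract's central observation can be illustrated with a small calculation. The sketch below is not the thesis's algorithm, which this record does not describe. It only computes the confidence of a rule whose consequent is a disjunction, assuming the usual definition: the share of transactions containing A that also contain at least one of the consequent items. The toy transactions, the function name `confidence_disjunctive`, and the 0.6 threshold are all hypothetical.

```python
from typing import FrozenSet, Iterable, List


def confidence_disjunctive(
    transactions: List[FrozenSet[str]],
    antecedent: str,
    consequents: Iterable[str],
) -> float:
    """Confidence of antecedent -> (c1 OR c2 OR ...).

    Among transactions that contain the antecedent, return the fraction
    that also contain at least one of the consequent items.
    """
    consequents = list(consequents)
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    hits = sum(1 for t in with_antecedent if any(c in t for c in consequents))
    return hits / len(with_antecedent)


# Hypothetical toy data: A plays the role of a newly marketed product.
transactions = [
    frozenset({"A", "B"}),
    frozenset({"A", "B"}),
    frozenset({"A", "C"}),
    frozenset({"A", "C"}),
    frozenset({"A"}),
    frozenset({"B"}),
    frozenset({"C"}),
    frozenset({"B", "C"}),
    frozenset({"D"}),
    frozenset({"D"}),
]

MIN_CONFIDENCE = 0.6  # assumed threshold, chosen only for illustration

conf_a_b = confidence_disjunctive(transactions, "A", ["B"])
conf_a_c = confidence_disjunctive(transactions, "A", ["C"])
conf_a_b_or_c = confidence_disjunctive(transactions, "A", ["B", "C"])

for label, value in [
    ("A -> B", conf_a_b),
    ("A -> C", conf_a_c),
    ("A -> B or C", conf_a_b_or_c),
]:
    verdict = "passes" if value >= MIN_CONFIDENCE else "fails"
    print(f"{label}: confidence {value:.2f} {verdict} the threshold")
```

In this toy data, five transactions contain A; two of them contain B, two contain C, and four contain B or C. Each single-consequent rule therefore falls below the assumed threshold, while the disjunctive rule exceeds it.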
Table of contents
Chapter 1 Introduction
1.1 Research Motivation
1.2 Organization
Chapter 2 Background
2.1 Association Rule Analysis
2.2 Market Basket Analysis
2.3 Apriori Algorithm
2.4 DLG Algorithm
2.5 DHP Algorithm
2.6 DIC Algorithm
2.7 Pincer-Search Algorithm
2.8 Sequential Pattern Analysis
Chapter 3 Composite Item Algorithm
3.1 Selecting Specific Items as the Antecedent
3.2 Disjunctive Consequent Algorithm
3.3 Improved Disjunctive Consequent Algorithm
Chapter 4 Experimental Results and Analysis
4.1 Traditional Association Rules
4.2 Experimental Results
4.3 Efficiency Analysis
Chapter 5 Conclusion
References
English Paper
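List of Figures

Figure 1.1 Factors that lower the support and confidence of items
Figure 1.2 Beverage classification hierarchy
Figure 2.1 Knowledge discovery process
Figure 2.2 Generating candidate items
Figure 2.3 Counting candidate item occurrences
Figure 2.4 Apriori algorithm process
Figure 2.5 DLG database D
Figure 2.6 L1 and Bit-Map in DLG
Figure 2.7 DLG directed graph
Figure 2.8 AprioriAll algorithm
Figure 2.9 Algorithm for joining Lk-1 to obtain Ck
Figure 2.10 Algorithm for finding maximal sequences
Figure 2.11 Example transaction data
Figure 2.12 Transaction data after sorting
Figure 2.13 Transaction data after sorting
Figure 2.14 Renumbering itemsets and removing unimportant items
Figure 2.15 All frequent sequences
Figure 2.16 All maximal sequences
Figure 3.1 Research method flowchart
Figure 3.2 Disjunctive consequent algorithm
Figure 3.3 Retrieving all transaction records related to product Z
Figure 3.4 Building rules and deleting candidate items that have already produced association rules
Figure 3.5 Building rules and deleting candidate items that satisfy the condition
Figure 3.6 Building rules and deleting candidate items that satisfy the condition
Figure 3.7 Candidate item generation algorithm
Figure 3.8 Retrieving all transaction records related to product Z
Figure 3.9 Building rules and deleting candidate items that have already produced association rules
Figure 3.10 Building rules and deleting candidate items that have already produced association rules
Figure 3.11 Building rules and deleting candidate items that satisfy the condition
Figure 4.1 IBM Intelligent Miner analysis results
Figure 4.2 Analysis results screen
Figure 4.3 Count comparison for main policy _QNM15: traditional method vs. Definition 3.1 method
Figure 4.4 Count comparison for main policy _QNM10: traditional method vs. Definition 3.1 method
Figure 4.5 Count comparison for main policy _QNM15: Definition 3.1 method vs. Definition 3.2 method
Figure 4.6 Count comparison for main policy _QNM10: Definition 3.1 method vs. Definition 3.2 method

List of Formulas

Formula 3.1
Formula 3.2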


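Thesis usage rights
  • The author grants a royalty-free license for library patrons to reproduce the print copy for academic purposes; publicly available from 2006-06-28.
  • The author grants a license for the electronic full-text browsing and printing service; publicly available from 2006-06-28.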

