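Electronic Theses and Dissertations Service

§ Browse Thesis Bibliographic Record

The electronic full text of this thesis has been available off campus since 2006-06-28.
The print copy of this thesis has been available since 2006-06-28.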

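System ID	U0002-2806200601120400
DOI	10.6846/TKU.2006.00873
Title (Chinese)	有效率的複合式後項關聯式法則探勘演算法-以壽險業為例
Title (English)	An Efficient Algorithm for Association Rule with Disjunctive Consequent–The Case of the Insurance Industry
Title (third language)
Institution	Tamkang University
Department (Chinese)	資訊工程學系碩士班
Department (English)	Department of Computer Science and Information Engineering
Foreign degree: university
Foreign degree: college
Foreign degree: graduate institute
Academic year	94 (ROC calendar)
Semester	2
Year of publication	95 (ROC calendar; 2006)
Student (Chinese)	陳智揚
Student (English)	Zhi-Yang Chen
Student ID	693190257
Degree	Master's
Language	Traditional Chinese
Second language
Oral defense date	2006-06-13
Pages	67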
Oral defense committee	Advisor: 陳伯榮; Member: 蔣定安; Member: 王鄭慈
Keywords (Chinese)	關聯式法則; 複合式物品項目; 資料探勘; 壽險
Keywords (English)	Association Rule; Disjunctive Consequent; Data Mining; Insurance
Keywords (third language)
Subject classification
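Abstract (Chinese, translated)	Association rules are among the most frequently used techniques in data mining. For newly launched products, however, their use is limited by the minimum support and confidence thresholds. In general, the rules A→B and A→C are considered useful only when the support and confidence of both exceed the minimum thresholds. In practice, though, low support may simply mean that A is a recently launched product. Moreover, when the confidences of A→B and A→C fall below the threshold, it does not follow that the confidence of A→B∨C also falls below it. This thesis therefore proposes a mining algorithm for association rules with disjunctive consequents to discover useful rules of this kind, and applies it to product bundling and marketing in the insurance industry. The empirical results show that when a main policy is sold together with specific riders, customers purchase the riders along with the main policy.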
Abstract (English)	The association rule is one of the most frequently adopted techniques in data mining. In practice, however, it is limited by the minimum support and confidence thresholds when applied to newly marketed products. When the association rules A→B and A→C cannot be discovered from a database, it does not follow that A→B∨C cannot be discovered from the same database. In fact, when A is a newly marketed product, A→B∨C can be a very useful rule in some cases. We therefore propose a new and very simple algorithm to discover rules of this type. Because the consequent of such a rule is a disjunctive composite item, we call these rules disjunctive consequent association rules. Moreover, when we applied the algorithm to insurance policies for cross-selling, the insurance company confirmed that the results were useful.
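The observation in the abstract, that A→B and A→C can each miss the confidence threshold while A→B∨C meets it, can be illustrated with a small sketch. The toy transactions, the 0.6 threshold, and the helper name `confidence` are assumptions made for illustration; this is not the thesis's mining algorithm, only the confidence calculation the abstract relies on.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def confidence(transactions: List[Transaction],
               antecedent: str,
               consequents: Iterable[str]) -> float:
    """Confidence of the rule antecedent -> (c1 OR c2 OR ...).

    A transaction counts as a hit when it contains the antecedent and
    at least one of the consequent items.
    """
    wanted = set(consequents)
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    hits = sum(1 for t in with_antecedent if t & wanted)
    return hits / len(with_antecedent)


# Hypothetical toy data: A is bought in 5 of 8 transactions.
transactions = [frozenset(t) for t in (
    {"A", "B"}, {"A", "B"}, {"A", "C"}, {"A", "C"}, {"A"},
    {"B", "C"}, {"D"}, {"B"},
)]

MIN_CONF = 0.6  # assumed minimum confidence threshold

conf_b = confidence(transactions, "A", ["B"])            # A -> B
conf_c = confidence(transactions, "A", ["C"])            # A -> C
conf_b_or_c = confidence(transactions, "A", ["B", "C"])  # A -> B or C

for label, value in (("A->B", conf_b), ("A->C", conf_c), ("A->B|C", conf_b_or_c)):
    verdict = "kept" if value >= MIN_CONF else "rejected"
    print(f"{label}: confidence={value:.2f} ({verdict})")
```

With these toy numbers, each single-consequent rule is rejected while the disjunctive rule is kept, which is the situation the abstract describes for a newly marketed product A.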
Abstract (third language)
Table of contents
Chapter 1 Introduction	1
1.1 Research Motivation	1
1.2 Thesis Organization	6
Chapter 2 Background	8
2.1 Association Rule Analysis	8
2.2 Market Basket Analysis	10
2.3 The Apriori Algorithm	13
2.4 The DLG Algorithm	17
2.5 The DHP Algorithm	21
2.6 The DIC Algorithm	23
2.7 The Pincer-Search Algorithm	24
2.8 Sequential Pattern Analysis	26
Chapter 3 Composite Item Algorithm	34
3.1 Selecting a Specific Item as the Antecedent	36
3.2 Disjunctive Consequent Algorithm	38
3.3 Improved Disjunctive Consequent Algorithm	43
Chapter 4 Experimental Results and Analysis	49
4.1 Conventional Association Rules	49
4.2 Experimental Results	51
4.3 Efficiency Analysis	53
Chapter 5 Conclusion	58
References	59
English Paper	62
List of Figures
Figure 1.1 Factors that lower the support and confidence of items	3
Figure 1.2 Beverage classification hierarchy	4
Figure 2.1 Knowledge discovery process	8
Figure 2.2 Generating candidate items	14
Figure 2.3 Counting candidate items	15
Figure 2.4 Steps of the Apriori algorithm	16
Figure 2.5 DLG database D	19
Figure 2.6 L1 and bit-map in DLG	19
Figure 2.7 DLG directed graph	20
Figure 2.8 The AprioriAll algorithm	29
Figure 2.9 Algorithm for joining Lk-1 to obtain Ck	29
Figure 2.10 Algorithm for finding maximal sequences	30
Figure 2.11 Transaction data for the worked example	30
Figure 2.12 Transaction data after sorting	31
Figure 2.13 Transaction data after sorting	32
Figure 2.14 Renumbering itemsets and removing unimportant items	32
Figure 2.15 All frequent sequences	33
Figure 2.16 All maximal sequences	33
Figure 3.1 Research method flowchart	34
Figure 3.2 Disjunctive consequent algorithm	39
Figure 3.3 Extracting all transaction records related to product Z	40
Figure 3.4 Building rules and deleting candidate items that have already produced association rules	41
Figure 3.5 Building rules and deleting candidate items that satisfy the condition	41
Figure 3.6 Building rules and deleting candidate items that satisfy the condition	42
Figure 3.7 Candidate item generation algorithm	45
Figure 3.8 Extracting all transaction records related to product Z	45
Figure 3.9 Building rules and deleting candidate items that have already produced association rules	46
Figure 3.10 Building rules and deleting candidate items that have already produced association rules	47
Figure 3.11 Building rules and deleting candidate items that satisfy the condition	48
Figure 4.1 IBM Intelligent Miner analysis results	50
Figure 4.2 Analysis results screen	51
Figure 4.3 Count comparison for main policy _QNM15: conventional method vs. Definition 3.1 method	54
Figure 4.4 Count comparison for main policy _QNM10: conventional method vs. Definition 3.1 method	55
Figure 4.5 Count comparison for main policy _QNM15: Definition 3.1 method vs. Definition 3.2 method	56
Figure 4.6 Count comparison for main policy _QNM10: Definition 3.1 method vs. Definition 3.2 method	56
List of Formulas
Formula 3.1	36
Formula 3.2	36
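Full-text access rights	On campus: print thesis available immediately; full text of the electronic thesis authorized for on-campus access; on-campus electronic thesis available immediately. Off campus: off-campus electronic thesis authorized for immediate access.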
