§ Browse Thesis Bibliographic Record
  
System ID U0002-2806200601120400
DOI 10.6846/TKU.2006.00873
Title (Chinese) 有效率的複合式後項關聯式法則探勘演算法-以壽險業為例
Title (English) An Efficient Algorithm for Association Rule with Disjunctive Consequent–The Case of the Insurance Industry
Title (Third Language)
Institution Tamkang University
Department (Chinese) 資訊工程學系碩士班
Department (English) Department of Computer Science and Information Engineering
Foreign Degree: University
Foreign Degree: College
Foreign Degree: Graduate Institute
Academic Year 94 (ROC calendar, 2005-2006)
Semester 2
Year of Publication 95 (ROC calendar, 2006)
Author (Chinese) 陳智揚
Author (English) Zhi-Yang Chen
Student ID 693190257
Degree Master's
Language Traditional Chinese
Second Language
Oral Defense Date 2006-06-13
Pages 67
Committee Advisor - 陳伯榮
Member - 蔣定安
Member - 王鄭慈
Keywords (Chinese) 關聯式法則
複合式物品項目
資料探勘
壽險
Keywords (English) Association Rule
Disjunctive Consequent
Data Mining
Insurance
Keywords (Third Language)
Subject Classification
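Chinese Abstract (English translation)
Association rules are among the most frequently used techniques in data mining. For newly launched products, however, their use is limited by the minimum support and confidence thresholds.

In general, the rules A→B and A→C are considered useful only when the support and confidence of both exceed the minimum thresholds. In practice, however, low support may simply indicate that A is a product launched relatively recently. Moreover, when the confidence of A→B and of A→C falls below the threshold, it does not follow that the confidence of A→B∨C also falls below it.

This thesis therefore proposes an algorithm for mining association rules with disjunctive consequents in order to discover such useful rules, and applies it to product bundling and marketing in the insurance industry. The empirical results show that when a main policy is sold together with specific riders, consumers purchase the riders along with the main policy.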
English Abstract
The association rule is one of the most frequently adopted techniques in data mining. For newly marketed products, however, its use is limited in practice by the minimum support and confidence thresholds.

When the association rules A→B and A→C cannot be discovered from a database, it does not follow that A→B∨C is not an association rule in the same database. In fact, when A is a newly marketed product, A→B∨C can be a very useful rule in some cases.

Therefore, we propose a new and very simple algorithm to discover rules of this type. Since the consequent of such a rule is a disjunctive composite item, we call these rules disjunctive consequent association rules. Moreover, when we applied our algorithm to insurance policies for cross-selling, the insurance company confirmed that the results were useful.
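The central observation of the abstract can be illustrated with a small calculation. The following is only a minimal sketch of the confidence of a rule with a disjunctive consequent, not the algorithm proposed in the thesis. The function name `confidence`, the toy transactions, and the threshold `MIN_CONFIDENCE` are all assumptions made for illustration.

```python
from typing import Iterable, List, Set


def confidence(transactions: List[Set[str]], antecedent: str,
               consequents: Iterable[str]) -> float:
    """Confidence of antecedent -> (c1 OR c2 OR ...).

    A transaction counts as a hit when it contains the antecedent and
    at least one of the consequent items.
    """
    options = set(consequents)
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    hits = sum(1 for t in with_antecedent if t & options)
    return hits / len(with_antecedent)


# Hypothetical toy data: each set is one customer's purchases.
transactions = [
    {"A", "B"},
    {"A", "B"},
    {"A", "C"},
    {"A", "C"},
    {"A"},
    {"B", "C"},
]

MIN_CONFIDENCE = 0.6  # illustrative threshold, not taken from the thesis

conf_a_b = confidence(transactions, "A", ["B"])            # 2 of 5
conf_a_c = confidence(transactions, "A", ["C"])            # 2 of 5
conf_a_b_or_c = confidence(transactions, "A", ["B", "C"])  # 4 of 5

for name, value in [("A->B", conf_a_b), ("A->C", conf_a_c),
                    ("A->B|C", conf_a_b_or_c)]:
    print(name, value, "kept" if value >= MIN_CONFIDENCE else "dropped")
```

In this toy data neither A→B nor A→C reaches the threshold on its own, while A→B∨C does, which is the situation the abstract describes.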
Third-Language Abstract
Table of Contents
Chapter 1 Introduction
1.1 Research Motivation
1.2 Organization
Chapter 2 Background
2.1 Association Rule Analysis
2.2 Market Basket Analysis
2.3 Apriori Algorithm
2.4 DLG Algorithm
2.5 DHP Algorithm
2.6 DIC Algorithm
2.7 Pincer-Search Algorithm
2.8 Sequential Pattern Analysis
Chapter 3 Composite Item Algorithm
3.1 Selecting Specific Items as the Antecedent
3.2 Disjunctive Consequent Algorithm
3.3 Improved Disjunctive Consequent Algorithm
Chapter 4 Experimental Results and Analysis
4.1 Conventional Association Rules
4.2 Experimental Results
4.3 Efficiency Analysis
Chapter 5 Conclusion
References
English Paper
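List of Figures

Figure 1.1 Factors that lower the support and confidence of items
Figure 1.2 Beverage classification hierarchy
Figure 2.1 Knowledge discovery process
Figure 2.2 Generating candidate itemsets
Figure 2.3 Counting candidate itemsets
Figure 2.4 The Apriori algorithm process
Figure 2.5 DLG database D
Figure 2.6 L1 and bit map in DLG
Figure 2.7 DLG directed graph
Figure 2.8 AprioriAll algorithm
Figure 2.9 Algorithm for joining Lk-1 to obtain Ck
Figure 2.10 Algorithm for finding maximal sequences
Figure 2.11 Example transaction data
Figure 2.12 Transaction data after sorting
Figure 2.13 Transaction data after sorting
Figure 2.14 Renumbering itemsets and removing unimportant items
Figure 2.15 All frequent sequences
Figure 2.16 All maximal sequences
Figure 3.1 Research method flowchart
Figure 3.2 Disjunctive consequent algorithm
Figure 3.3 Extracting all transaction records related to product Z
Figure 3.4 Building rules and deleting candidate items that have already produced association rules
Figure 3.5 Building rules and deleting candidate items that meet the condition
Figure 3.6 Building rules and deleting candidate items that meet the condition
Figure 3.7 Candidate generation algorithm
Figure 3.8 Extracting all transaction records related to product Z
Figure 3.9 Building rules and deleting candidate items that have already produced association rules
Figure 3.10 Building rules and deleting candidate items that have already produced association rules
Figure 3.11 Building rules and deleting candidate items that meet the condition
Figure 4.1 IBM Intelligent Miner analysis results
Figure 4.2 Analysis result screen
Figure 4.3 Comparison of counts for main policy _QNM15 using the conventional method and Definition 3.1
Figure 4.4 Comparison of counts for main policy _QNM10 using the conventional method and Definition 3.1
Figure 4.5 Comparison of counts for main policy _QNM15 using Definition 3.1 and Definition 3.2
Figure 4.6 Comparison of counts for main policy _QNM10 using Definition 3.1 and Definition 3.2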
 
List of Formulas

Formula 3.1
Formula 3.2
References
[1].R. Agrawal, T. Imielinski, A. Swami, “Mining Association Rules between Sets of Items in Large Databases,” Proceedings of the 1993 ACM SIGMOD Conference on Management of Data, Washington D.C., pp. 207-216, May 1993.

[2].R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A. I. Verkamo, “Fast Discovery of Association Rules,” Advances in Knowledge Discovery and Data Mining, U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, eds., AAAI/MIT Press, pp. 307-328, 1996.

[3].S. Brin, R. Motwani, J. Ullman, and S. Tsur, “Dynamic Itemset Counting and Implication Rules for Market Basket Data,” Proceedings of the 1997 SIGMOD Conference on Management of Data, pp. 255-264, 1997.

[4].J. Han, J. Pei, and Y. Yin, “Mining Frequent Patterns without Candidate Generation,” Proceedings of the 2000 ACM SIGMOD Conference on Management of Data, Dallas, Texas, USA, pp. 1-12, May 2000.

[5].J. Hipp, U. Güntzer, and G. Nakhaeizadeh, “Algorithms for Association Rule Mining – A General Survey and Comparison,” SIGKDD Explorations, Vol. 2, Issue 1, pp. 58-64, 2000.

[6].J. S. Park, M. S. Chen, and P. S. Yu, “Using a Hash-Based Method with Transaction Trimming for Mining Association Rules,” IEEE Transactions on Knowledge and Data Engineering, Vol. 9, No. 5, pp. 813-825, 1997.

[7].R. Agrawal and R. Srikant, “Mining Sequential Patterns,” Proceedings of the 11th International Conference on Data Engineering, Taipei, Taiwan, pp. 3-14, March 1995.

[8].F. Masseglia, F. Cathala, and P. Poncelet, “The PSP Approach for Mining Sequential Patterns,” Proceedings of 1998 2nd European Symposium on Principles of Data Mining and Knowledge Discovery, Vol. 1510, Nantes, France, pp. 176-184, Sep. 1998.

[9].R. Srikant and R. Agrawal, “Mining Sequential Patterns: Generalizations and Performance Improvements,” Proceedings of the 5th International Conference on Extending Database Technology, Avignon, France, pp. 3-17, 1996. (An extended version is the IBM Research Report RJ 9994)

[10].C. Bettini, X. S. Wang, and S. Jajodia, “Mining Temporal Relationships with Multiple Granularities in Time Sequences,” Data Engineering Bulletin, Vol. 21, pp. 32-38, 1998.

[11].H. Mannila, H. Toivonen, and A. I. Verkamo, "Discovering Frequent Episodes in Sequences," Proceedings of the First International Conference on Knowledge Discovery and Data Mining (KDD’95), pp. 210-215, Montreal, Canada, 1995.

[12].T. Oates, M. D. Schmill, D. Jensen, and P. R. Cohen, “A Family of Algorithms for Finding Temporal Structure in Data,” Proceedings of the 6th International Workshop on AI and Statistics, Fort Lauderdale, Florida, pp. 371-378, 1997.

[13].P. Rolland, “FlExPat: Flexible Extraction of Sequential Patterns,” Proceedings of the IEEE International Conference on Data Mining 2001, pp. 481-488, 2001.

[14].M. J. Zaki, “Fast Mining of Sequential Patterns in Very Large Databases,” Technical Report 668, The University of Rochester, New York, Nov. 1997.

[15].M. J. Zaki, “SPADE: An Efficient Algorithm for Mining Frequent Sequences,” Machine Learning Journal, Vol. 42, No. 1/2, pp. 31-60, 2001.

[16].M. J. Zaki, “Efficient enumeration of frequent sequences,” Proceedings of the 7th International Conference on Information and Knowledge Management, Washington, USA, pp. 68-75, Nov. 1998.

[17].Rakesh Agrawal & Ramakrishnan Srikant, “Mining Sequential Patterns,” Research Report, IBM Research Division.

[18].Jiawei Han & Micheline Kamber, “Data Mining: Concepts and Techniques,” Morgan Kaufmann Publishers, ISBN 1-55860-489-8.

[19].Neter, Kutner, Nachtsheim, and Wasserman, “Applied Linear Statistical Models, 4th Edition,” McGraw-Hill, ISBN 0-07-116616-5.

[20].J. Han and Y. Fu, “Mining Multiple-level Association Rules in Large Databases,” IEEE Trans. on Knowledge and Data Eng., Vol. 11, No. 5, pp. 798-805, 1999.

[21].A. Savasere, E. Omiecinski, S. Navathe, “An Efficient Algorithm for Mining Association Rules in Large Databases,” Proc. 21st VLDB, Zurich, Switzerland, pp. 432-444, 1995.

[22].H. Toivonen, “Sampling large databases for association rules,” In Proc. 22nd VLDB Conference, Bombay, India, pp. 134-145, Sept. 1996.

[23].J. Han, J. Pei, and Y. Yin, “Mining frequent patterns without candidate generation,” Proc. of the ACM SIGMOD Int’l Conf. on Management of Data, Dallas, Texas, USA, pp. 1-12, May 2000. 
 
[24].J.S. Park, M.S. Chen, and P.S. Yu, “Using a Hash-Based Method with Transaction Trimming for Mining Association Rules,” IEEE Trans. on Knowledge and Data Eng., vol. 9, no. 5, pp. 813-825, Sept./Oct. 1997. 

[25].Roberto J. Bayardo Jr., “Efficiently Mining Long Patterns from Databases,” Proc. of the ACM SIGMOD Int’l Conf. On Management of Data, pp. 85-93, Seattle, Washington, June 1998.
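Full-Text Access Rights
On campus
On-campus print copy available immediately
Author consents to on-campus release of the electronic full text
On-campus electronic copy available immediately
Off campus
Authorization granted
Off-campus electronic copy available immediately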
