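System ID | U0002-1707200515551500 |
---|---|
DOI | 10.6846/TKU.2005.00359 |
Title (Chinese) | 序列型樣之週期性間隔分析 |
Title (English) | The Periodical Intervals Analysis on Sequential Patterns |
Title (third language) | |
University | Tamkang University (淡江大學) |
Department (Chinese) | 資訊工程學系碩士班 |
Department (English) | Department of Computer Science and Information Engineering |
Foreign degree: university | |
Foreign degree: college | |
Foreign degree: graduate institute | |
Academic year | 93 |
Semester | 2 |
Publication year | 94 |
Author (Chinese) | 李宜靝 |
Author (English) | Yi-Tian Lee |
Student ID | 692190084 |
Degree | Master's |
Language | Traditional Chinese |
Second language | |
Oral defense date | 2005-06-17 |
Number of pages | 83 |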
Oral defense committee | Advisor - 蔣定安; Member - 王鄭慈; Member - 葛煥昭 |
Keywords (Chinese) | 序列型樣; 時間間隔; 週期分佈; 迴歸分析 |
Keywords (English) | Sequential Patterns; Time Interval; Periodical Distribution; Regression Analysis |
Keywords (third language) | |
Subject classification | |
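Abstract (Chinese) |
When analyzing large volumes of transaction data, we often use association rules to find which items sell together across all transactions, and sequential patterns to analyze the order in which customers make purchases. In a product recommendation system, however, sequential patterns reveal only the order in which products are bought, not the time interval between the purchases, so products linked by significant sequential patterns cannot be marketed at the most favorable time. This thesis proposes a set of algorithms for analyzing whether the time intervals of sequential patterns exhibit latent periodicity. We first propose the PDT/PDM algorithms to find the periodical distribution of a curve, then extend them to curves with increasing or decreasing trends as the LPDT/LPDM algorithms. Finally, based on the distribution characteristics of the time-interval curves of sequential patterns, we consolidate these methods into the PIM (Periodical Intervals Mining) algorithm, which mines the periodic fluctuations of the time intervals. By comparing the sales periods of the sequences among all products, we identify the best time to recommend each product, providing the most useful information for product marketing. |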
Abstract (English) |
When analyzing large volumes of transaction data, we often use association rule mining and sequential pattern mining to discover customers' buying behavior. Sequential patterns alone, however, do not reveal the time intervals between purchases of related items. In this paper, we develop a set of algorithms to analyze the periodical properties of time intervals over sequential patterns. First, we introduce the PDT/PDM algorithms to discover periodical distributions in common cases. Then, we extend them as the LPDT/LPDM algorithms to handle linear trend components in the curves. Finally, we combine those algorithms with the distribution properties of sequential patterns into the PIM (Periodical Intervals Mining) algorithm. In our experiments, we use the PIM algorithm to analyze the periodical distributions and, by comparing the periodical intervals, use them to identify the best choice of products from the sequential patterns. |
Abstract (third language) | |
Table of contents |
Chapter 1 Introduction
  1.1 Research Motivation
  1.2 Research Content
  1.3 Organization
Chapter 2 Background
  2.1 Association Rule Analysis
    2.1.1 Market Basket Analysis
    2.1.2 Introduction to Association Rules
    2.1.3 Concepts of Association Rule Analysis
    2.1.4 The Apriori Algorithm
    2.1.5 Example
  2.2 Sequential Pattern Analysis
    2.2.1 Introduction to Sequential Patterns
    2.2.2 Steps of Sequential Pattern Analysis
    2.2.3 Example
  2.3 Regression Analysis
    2.3.1 Simple Linear Regression
    2.3.2 Multiple Regression Model
    2.3.3 Example
Chapter 3 Research Method
  3.1 Periodical Distribution Curves
    3.1.1 Introduction
    3.1.2 The PDT and PDM Algorithms
  3.2 Decreasing Periodical Distribution Curves
    3.2.1 Introduction
    3.2.2 The LPDT and LPDM Algorithms
    3.2.3 Discussion of Polynomials
  3.3 Periodical Interval Distribution of Sequential Patterns
    3.3.1 Introduction
    3.3.2 The PIM Algorithm
Chapter 4 Experimental Analysis
  4.1 Data Preparation and Processing
    4.1.1 Introduction
    4.1.2 Data Preparation
    4.1.3 Mining Sequential Patterns
  4.2 Analyzing Periodical Distributions
    4.2.1 Mining Periodical Distributions
    4.2.2 Verifying Periodical Distributions
Chapter 5 Conclusion
References
English Paper |
References |
[1]. R. Agrawal, T. Imielinski, and A. Swami, “Mining Association Rules between Sets of Items in Large Databases,” Proceedings of the 1993 ACM SIGMOD Conference on Management of Data, Washington D.C., pp. 207-216, May 1993.
[2]. R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A. I. Verkamo, “Fast Discovery of Association Rules,” Advances in Knowledge Discovery and Data Mining, U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, eds., AAAI/MIT Press, pp. 307-328, 1996.
[3]. S. Brin, R. Motwani, J. Ullman, and S. Tsur, “Dynamic Itemset Counting and Implication Rules for Market Basket Data,” Proceedings of the 1997 SIGMOD Conference on Management of Data, pp. 255-264, 1997.
[4]. J. Han, J. Pei, and Y. Yin, “Mining Frequent Patterns without Candidate Generation,” Proceedings of the 2000 ACM SIGMOD Conference on Management of Data, Dallas, Texas, USA, pp. 1-12, May 2000.
[5]. J. Hipp, U. Güntzer, and G. Nakhaeizadeh, “Algorithms for Association Rule Mining – A General Survey and Comparison,” SIGKDD Explorations, Vol. 2, Issue 1, pp. 58-64, 2000.
[6]. J. S. Park, M. S. Chen, and P. S. Yu, “Using a Hash-Based Method with Transaction Trimming for Mining Association Rules,” IEEE Transactions on Knowledge and Data Engineering, Vol. 9, No. 5, pp. 813-825, 1997.
[7]. R. Agrawal and R. Srikant, “Mining Sequential Patterns,” Proceedings of the 11th International Conference on Data Engineering, Taipei, Taiwan, pp. 3-14, March 1995.
[8]. F. Masseglia, F. Cathala, and P. Poncelet, “The PSP Approach for Mining Sequential Patterns,” Proceedings of 1998 2nd European Symposium on Principles of Data Mining and Knowledge Discovery, Vol. 1510, Nantes, France, pp. 176-184, Sep. 1998.
[9]. R. Srikant and R. Agrawal, “Mining Sequential Patterns: Generalizations and Performance Improvements,” Proceedings of the 5th International Conference on Extending Database Technology, Avignon, France, pp. 3-17, 1996. (An extended version is the IBM Research Report RJ 9994)
[10]. C. Bettini, X. S. Wang, and S. Jajodia, “Mining Temporal Relationships with Multiple Granularities in Time Sequences,” Data Engineering Bulletin, Vol. 21, pp. 32-38, 1998.
[11]. H. Mannila, H. Toivonen, and A. I. Verkamo, “Discovering Frequent Episodes in Sequences,” Proceedings of the First International Conference on Knowledge Discovery and Data Mining (KDD’95), pp. 210-215, Montreal, Canada, 1995.
[12]. T. Oates, M. D. Schmill, D. Jensen, and P. R. Cohen, “A Family of Algorithms for Finding Temporal Structure in Data,” Proceedings of the 6th International Workshop on AI and Statistics, Fort Lauderdale, Florida, pp. 371-378, 1997.
[13]. J. Pei, J. Han, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu, “PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-projected Pattern Growth,” Proceedings of 2001 International Conference on Data Engineering, pp. 215-224.
[14]. P. Rolland, “FlExPat: Flexible Extraction of Sequential Patterns,” Proceedings of the IEEE International Conference on Data Mining 2001, pp. 481-488, 2001.
[15]. M. J. Zaki, “Fast Mining of Sequential Patterns in Very Large Databases,” Technical Report 668, The University of Rochester, New York, Nov. 1997.
[16]. M. J. Zaki, “SPADE: An Efficient Algorithm for Mining Frequent Sequences,” Machine Learning Journal, Vol. 42, No. 1/2, pp. 31-60, 2001.
[17]. M. J. Zaki, “Efficient Enumeration of Frequent Sequences,” Proceedings of the 7th International Conference on Information and Knowledge Management, Washington, USA, pp. 68-75, Nov. 1998.
[18]. Rakesh Agrawal & Ramakrishnan Srikant, “Mining Sequential Patterns,” Research Report, IBM Research Division.
[19]. Jiawei Han & Micheline Kamber, “Data Mining: Concepts and Techniques,” Morgan Kaufmann Publishers, ISBN 1-55860-489-8.
[20]. Neter, Kutner, Nachtsheim, and Wasserman, “Applied Linear Statistical Models, 4th Edition,” McGraw-Hill, ISBN 0-07-116616-5. |
Full-text access rights |