||Analysis on Repeat-Buying Patterns
||Department of Computer Science and Information Engineering
Sequential Pattern Mining
Temporal Data Mining
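||In a business environment, customers are the ones who ultimately pay at the end of the product value chain. Understanding customers, and studying customer value management and customer contribution, is therefore one of the most important parts of an enterprise's profitability analysis. Consumer goods markets are characterized by many product items, many customers, many transactions, continued purchasing at certain time intervals, small purchase amounts, and repeated purchases of the same product. Moreover, different customer groups differ in purchasing power and usage habits, so their purchase amounts and purchase intervals differ as well. Information along the time axis is therefore one of the key elements in analyzing customer purchasing behavior. This study uses temporal data mining techniques to build a mathematical model of repeat-buying sequential patterns, converting the order, interval, and frequency of the events in a sequential pattern into a discrete-valued mathematical structure. By analyzing the mathematical properties of the resulting function, we identify regularities and changes in actual purchasing behavior, including whether repeat-buying occurs, whether repeat-buying is periodic, how many times the same item is purchased repeatedly, and how long a particular purchasing behavior lasts. Constructing the mathematical model and combining it with industry know-how yields richer and more accurate knowledge about customers. Based on this knowledge of customer purchasing behavior, business operators can adopt more efficient, more timely, and more targeted marketing strategies to obtain the best returns.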
||Consumer markets share several characteristics, such as repeat-buying over the relevant time frame, a large number of customers, and a wealth of information detailing past customer purchases. Analyzing the characteristics of repeat-buying is necessary for a company to understand and adapt to the dynamics of customer behavior and to survive in a continuously changing environment. The aim of this research is to develop a methodology to detect the existence of repeat-buying behavior and to discover its potential period. A mathematical model that captures the characteristics of repeat-buying behavior is devised. Algorithms based on our previous work are then proposed to discover the periodicity and trends of purchases. Two fundamental repeat-buying types have been identified and analyzed; any repeat-buying scenario can be expressed as a combination of these two types. The proposed mathematical model, together with our work on repeat-buying modeling, forms a process for uncovering the characteristics of the repeat-buying phenomenon. Coupled with industry domain knowledge and marketers' expertise, the constructed model helps predict likely buying behavior, so that corresponding actions can be taken to maximize the enterprise's revenue.
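As a rough illustration of the kind of input such an analysis starts from, the sketch below counts the day gaps between a purchase of one item and the next purchase of another item by the same customer. It is a minimal sketch under stated assumptions, not the RBDF, GPD, RBSA, or RBM procedures of this dissertation: the function name `interval_counts`, the `(customer, date, item)` record layout, and the sample transactions are all hypothetical and made up for the example.

```python
from collections import Counter, defaultdict
from datetime import date


def interval_counts(transactions, first_item, second_item):
    """Count day gaps from a purchase of first_item to the same customer's
    next later purchase of second_item (hypothetical helper)."""
    by_customer = defaultdict(list)
    for customer, day, item in transactions:
        by_customer[customer].append((day, item))

    counts = Counter()
    for purchases in by_customer.values():
        purchases.sort()  # chronological order per customer
        pending = None    # date of the latest unmatched first_item purchase
        for day, item in purchases:
            if item == second_item and pending is not None and day > pending:
                counts[(day - pending).days] += 1
                pending = None
            if item == first_item:
                pending = day
    return counts


# Made-up sample data; the item codes are placeholders only.
transactions = [
    ("c1", date(2000, 1, 1), "3302"),
    ("c1", date(2000, 1, 31), "3103"),
    ("c2", date(2000, 2, 1), "3302"),
    ("c2", date(2000, 3, 2), "3103"),
    ("c2", date(2000, 3, 9), "3103"),
]

gaps = interval_counts(transactions, "3302", "3103")
print(gaps.most_common())
```

Passing the same item code for both arguments counts the gaps between consecutive purchases of a single item. A peak in the resulting counts would only hint at a candidate period and is no substitute for the periodicity detection described in the abstract.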
||Table of Contents
1. Introduction
1.1 Background
1.2 Motivation and Research Objectives
1.3 Organization of this Dissertation
2. Literature Review
2.1 Temporal Data Mining
2.2 Pattern Discovery
2.2.1 Sequential Patterns
2.2.2 Frequent Episode
2.2.3 Patterns with explicit time constraints
2.3 Periodic Mining
3. Repeat-Buying Analysis
3.1 Repeat-Buying Sequence
3.2 Repeat-Buying Distribution Function
3.3 General Periodicity Detection (GPD)
3.4 Repeat-Buying Sequence Analyzer (RBSA)
3.5 Repeat-Buying Modeler (RBM)
4. Experimental Results
4.1 Data Sources
4.2 Experiment Setup
4.3 Comparison between RBM and CMA
4.3.1 Comparing RBDF and TDF
4.3.2 Applying RBM on 2-rb-sequence
4.4 Repeat-Buying Phenomenon in Consumer Market
4.4.1 Repeat-Buying of physical goods
4.4.2 Repeat-Buying of digital services
4.5 Lessons learned from repeat-buying analysis
4.5.1 Histogram as a helper
4.5.2 Study on segmentation
4.5.3 RBM as a descriptive model
5. Conclusions
6. References
List of Figures
Fig. 1 Overview of Temporal Patterns
Fig. 2 The Apriori-based algorithm GSP takes the breadth-first approach (Source: Pei et al., 2004)
Fig. 3 Projection-based algorithm takes the depth-first approach (Source: Dong, 2009)
Fig. 4 An example of event sequence
Fig. 5 Types of episodes
Fig. 6 A sample of timed finite automata with granularities (Source: Bettini, Wang & Jajodia, 1996)
Fig. 7 Universal Formulation of Sequential Patterns (Source: Joshi, Karypis & Kumar, 2001)
Fig. 8 A time-interval sequential pattern (a, i1, b, i2, c)
Fig. 9 Pseudocode for computing TDF
Fig. 10 Two fundamental types of repeat-buying sequences
Fig. 11 The plot of f(x) = (sin(x) + 1.5)(24 - x)/18
Fig. 12 The pseudocode of GPD
Fig. 13 Vertical dotted lines drawn at an interval of 2π, solid lines drawn at an interval of 6.39
Fig. 14 The pseudocode of RBSA
Fig. 15 The plot of the sequence <108510> is a linearly decreasing function
Fig. 16 The domain is divided into two sub-domains at the point Xmax = 112
Fig. 17 The plot of the RBDF f1(x) defined on [1, 111]
Fig. 18 The plot of the RBDF f2(x) defined on [112, 360]
Fig. 19 The pseudocode of algorithm RBM
Fig. 20 There are 3889 transactions in which products 3103 and 3801 are purchased together
Fig. 21 The output of modified AprioriAll Algorithm
Fig. 22 Analysis result of <3301, 3304> based on CMA (Chiang, Wang, Chen & Chen, 2009)
Fig. 23 The plot of the model of <3301, 3304> established by RBSA
Fig. 24 There are 10,323 transactions in which customers purchase 3301 and 3304 simultaneously
Fig. 25 The counts of interval between 3302 and 3103 in year 2000
Fig. 26 The counts of interval between 3302 and 3103 in year 2001
Fig. 27 The plot of repeat-buying sequence <3302, 3103> in year 2000
Fig. 28 The plot of repeat-buying sequence <3302, 3103> in year 2001
Fig. 29 The plot of RBDF of <1004> obtained from 2007 database
Fig. 30 The repeat-buying models of <1004> obtained during 2007-2009
Fig. 31 The plot of RBDF of <67466> obtained from 2007 database
Fig. 32 The repeat-buying models of <67466> obtained during 2007-2009
Fig. 33 The plot of RBDF of <72175> obtained from 2007 database
Fig. 34 The repeat-buying models of <72175> obtained during 2007-2009
Fig. 35 The number of points of <58089> is too small to plot a curve
Fig. 36 The sequence <81624> obtained from 2009 database is similar to <58089>
Fig. 37 The plot of <2001> obtained from Company A's 2001 database
Fig. 38 The result of GPD applied to Group A
Fig. 39 The result of GPD applied to Group A
Fig. 40 Each year, the sales of <1004> slump steeply
Fig. 41 Stacked bar chart of <1004>'s acquisition channels
List of Tables
Table 1 Transactions conducted by five customers in two months
Table 2 Multi-dimensional sequence database
Table 3 Company A's Database Size
Table 4 The transactions conducted by Company B's members during 2007-2009
Table 5 The acquisition channels of <1004>
||Adomavicius, G. and Tuzhilin, A. (2005). Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6):734-749.
Agrawal, R., Imielinski, T., and Swami, A. N. (1993). Mining association rules between sets of items in large databases. Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, 207-216.
Agrawal, R. and Srikant, R. (1995). Mining sequential patterns. In Yu, P. S. and Chen, A. S. P., editors, Eleventh International Conference on Data Engineering, 3-14.
Ale, J. M. and Rossi, G. H. (2000). An approach to discovering temporal association rules. Proceedings of the 2000 ACM symposium on Applied computing (SAC '00), pages 294-300.
Allen, J. F. (1983). Maintaining knowledge about temporal intervals. Communications of the ACM, 26(11):832-843.
Antunes, C. M. and Oliveira, A. L. (2001). Temporal data mining: An overview. Workshop on Temporal Data Mining, 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'01).
Atallah, M., Gwadera, R., and Szpankowski, W. (2004). Detection of significant sets of episodes in event sequences. Proceedings of the Fourth IEEE International Conference on Data Mining (ICDM '04), 3-10.
Baeza-Yates, R. A. (1991). Searching subsequences. Theor. Comput. Sci., 78(2):363-376.
Baeza-Yates, R. and Ribeiro-Neto, B. (1999). Modern Information Retrieval. Addison Wesley, 1st edition.
Bay, S. D. and Pazzani, M. J. (2001). Detecting group differences: Mining contrast sets. Data Mining and Knowledge Discovery, 5(3):213-246.
Berberidis, C., Aref, W. G., Atallah, M., Vlahavas, I., and Elmagarmid, A. K. (2002). Multiple and partial periodicity mining in time series databases. In Proc. of the 15th Euro. Conf. on Artificial Intelligence.
Berry, M. J. A. and Linoff, G. S. (2004). Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management. Wiley Computer Publishing, 2nd edition.
Bettini, C., Wang, X. S., and Jajodia, S. (1996). Testing complex temporal relationships involving multiple granularities and its application to data mining (extended abstract). Proceedings of the fifteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems (PODS '96), pages 68-78, New York, NY, USA. ACM.
Bettini, C., Wang, X. S., Jajodia, S., and Lin, J. L. (1998). Discovering frequent event patterns with multiple granularities in time sequences. IEEE Transactions on Knowledge and Data Engineering, 10(2):222-237.
Box, G. E. P., Jenkins, G. M., and Reinsel, G. C. (2008). Time Series Analysis: Forecasting and Control (Wiley Series in Probability and Statistics). Wiley, 4th edition.
Burke, R. (2002). Hybrid recommender systems: Survey and experiments. User Modeling and User-Adapted Interaction, 12(4):331-370.
Casas-Garriga, G. (2003). Discovering unbounded episodes in sequential data. Proceedings of 7th Eur. Conf. on Principles and Practice of Knowledge Discovery in Databases (PKDD '03), 83-94.
Chandola, V., Banerjee, A., and Kumar, V. (2009). Anomaly detection: A survey. ACM Comput. Surv., 41(3):1-58.
Chang, C. Y., Chen, M. S., and Lee, C. H. (2002). Mining general temporal association rules for items with different exhibition periods. Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM '02), pages 59-66.
Chakrabarti, S., Sarawagi, S., and Dom, B. (1998). Mining surprising patterns using temporal description length. Proceedings of the 24th International Conference on Very Large Data Bases (VLDB '98), pages 606-617.
Chatfield, C. (2003). The Analysis of Time Series: An Introduction, Sixth Edition (Texts in Statistical Science). Chapman & Hall/CRC, 6th edition.
Chen, Y. L., Chen, J. M., and Tung, C. W. (2006). A data mining approach for retail knowledge discovery with consideration of the effect of shelf-space adjacency on sales. Decision Support Systems, 42(3):1503-1520.
Chen, Y. L., Chiang, M. C., and Ko, M. T. (2003). Discovering time-interval sequential patterns in sequence databases. Expert Systems with Applications, 25(3):343-354.
Chen, R. S. and Hu, Y. C. (2003). A novel method for discovering fuzzy sequential patterns using the simple fuzzy partition method. J. Am. Soc. Inf. Sci. Technol., 54(7):660-670.
Chen, Y. L. and Hu, Y. H. (2006). Constraint-based sequential pattern mining: The consideration of recency and compactness. Decision Support Systems, 42(2):1203-1215.
Chen, Y. L. and Huang, T. C. K. (2005). Discovering fuzzy time-interval sequential patterns in sequence databases. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 35(5):959-972.
Chen, M. S., Park, J. S., and Yu, P. S. (1998). Efficient data mining for path traversal patterns. Knowledge and Data Engineering, 10(2):209-221.
Chen, R.-S., Tzeng, G.-H., Chen, C. C., and Hu, Y.-C. (2001). Discovery of fuzzy sequential patterns for fuzzy partitions in quantitative attributes. Proc. of Joint 9th IFSA World Congress and 20th NAFIPS International Conference, pages 1317-1321.
Chiang, D.-A., Lee, S.L., Chen, C.C., and Wang, M.H. (2005). Mining interval sequential patterns. Int. J. Intell. Syst., 20(3):359-373.
Chiang, D. -A., Wang, C. T., Chen, S. P., and Chen, C. C. (2009). The cyclic model analysis on sequential patterns. IEEE Transactions on Knowledge and Data Engineering, 21(11):1617-1628.
Chiang, D.-A., Wang, Y.-H., and Chen, S.-P. (2010). Analysis on repeat-buying patterns. Knowledge-Based Systems, In Press, Accepted Manuscript.
Chiang, D. -A., Wang, Y. F., Lee, S. L., and Lin, C. J. (2003). Goal-oriented sequential pattern for network banking churn analysis. Expert Systems with Applications, 25(3):293-302.
Cong, S., Han, J., and Padua, D. (2005). Parallel mining of closed sequential patterns. Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining (KDD '05), 562-567.
Dasgupta, D. and Forrest, S. (1996). Novelty detection in time series data using ideas from immunology. Proceedings of the International Conference on Intelligence Systems.
Dean, J. and Ghemawat, S. (2008). MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107-113.
Dibb, S. and Simkin, L. (1996). The Market Segmentation Workbook: Target Marketing for Marketing Managers. Cengage Learning Business Press, 1st edition.
Ding, Y., Li, X., and Orlowska, M. E. (2006). Recency-based collaborative filtering. Proceedings of the 17th Australasian Database Conference, pages 99-107.
Dong, G. and Li, J. (1999). Efficient mining of emerging patterns: discovering trends and differences. In Proc. 5th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '99), pages 43-52.
Dong, G. (2009). Sequence Data Mining. Springer US.
Ehrenberg, A. S. C. (1968). The practical meaning and usefulness of the nbd/lsd theory of repeat-buying. Journal of the Royal Statistical Society. Series C (Applied Statistics), 17(1):17-32.
Elfeky, M. G., Aref, W. G., and Elmagarmid, A. K. (2005). Periodicity detection in time series databases. IEEE Transactions on Knowledge and Data Engineering, 17(7):875-887.
El-Sayed, M., Ruiz, C., and Rundensteiner, E. A. (2004). FS-miner: efficient and incremental mining of frequent sequence patterns in web logs. Proceedings of the 6th annual ACM international workshop on Web information and data management (WIDM '04), 128-135.
Fan, H. and Ramamohanarao, K. (2003). Efficiently mining interesting emerging patterns. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2762:189-201.
Fayyad, U. M., Piatetsky-Shapiro, G., and Smyth, P. (1996). The KDD process for extracting useful knowledge from volumes of data. Commun. ACM, 39(11):27-34.
Felfernig, A., Friedrich, G., and Schmidt-Thieme, L. (2007). Guest editors' introduction: Recommender systems. Intelligent Systems, 22(3):18-21.
Fiot, C., Laurent, A., and Teisseire, M. (2007). From crispness to fuzziness: Three algorithms for soft sequential pattern mining. IEEE Transactions on Fuzzy Systems, 15(6):1263-1277.
Freksa, C. (1992). Temporal reasoning based on semi-intervals. Artif. Intell., 54(1-2):199-227.
Garofalakis, M. N., Rastogi, R., and Shim, K. (1999). Spirit: Sequential pattern mining with regular expression constraints. In VLDB '99: Proceedings of the 25th International Conference on Very Large Data Bases, 223-234.
Haberman, B. and Averbuch, H. (2002). The case of base cases: why are they so difficult to recognize? student difficulties with recursion. Proceedings of the 7th annual conference on Innovation and technology in computer science education (ITiCSE '02), pages 84-88, New York, NY, USA. ACM.
Han, J., Dong, G., and Yin, Y. (1999). Efficient mining of partial periodic patterns in time series database. In Data Engineering, 1999. Proceedings., 15th International Conference on, pages 106-115.
Han, J. and Kamber, M. (2006). Data Mining: Concepts and Techniques (The Morgan Kaufmann Series in Data Management Systems). Morgan Kaufmann, 2nd edition.
Han, J., Pei, J., Mortazavi-Asl, B., Chen, Q., Dayal, U., and Hsu, M.C. (2000). Freespan: frequent pattern-projected sequential pattern mining. In KDD '00: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, 355-359.
Hand, D. J., Mannila, H., and Smyth, P. (2001). Principles of Data Mining (Adaptive Computation and Machine Learning). The MIT Press.
Hawkins, D. (1980). Identification of Outliers (Monographs on Statistics and Applied Probability). Springer, 1st edition.
Hirao, M., Inenaga, S., Shinohara, A., Takeda, M., and Arikawa, S. (2001). A practical algorithm to find the best episode patterns. Proceedings of the 4th International Conference on Discovery Science (DS '01), 435-440.
Hong, P. and Huang, T. S. (2002). Automatic temporal pattern extraction and association. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '02), volume 2, pages 2005-2008.
Hu, Y. C., Chen, R. S., Tzeng, G. H., and Shieh, J. H. (2003). A fuzzy data mining algorithm for finding sequential patterns. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 11(2):173-194.
Hu, Y.H., Huang, T. C., Yang, H.R., and Chen, Y.-L. (2009). On mining multi-time-interval sequential patterns. Data & Knowledge Engineering, 68(10):1112-1127.
Huang, C.-L. and Huang, W.-L. (2009). Handling sequential pattern decay: Developing a two-stage collaborative recommender system. Electronic Commerce Research and Applications, 8(3):117-129.
Kargupta, H., Han, J., Yu, P. S., Motwani, R., and Kumar, V., editors (2008). Next Generation of Data Mining (Chapman & Hall/CRC Data Mining and Knowledge Discovery Series). Chapman and Hall/CRC, 1st edition.
Keogh, E., Chakrabarti, K., Pazzani, M., and Mehrotra, S. (2001). Dimensionality reduction for fast similarity search in large time series databases. Knowledge and Information Systems, 3(3):263-286.
Keogh, E., Lonardi, S., and Chiu, B. Y. (2002). Finding surprising patterns in a time series database in linear time and space. Proc. of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '02), 550-556.
Koren, Y. (2009). Collaborative filtering with temporal dynamics. Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '09), pages 447-456, New York, NY, USA. ACM.
Korner, V. and Zimmermann, H. D. (2000). Management of Customer Relationship in Business Media-The Case of the Financial Industry. Proceedings of the 33rd Hawaii International Conference on System Sciences, volume 6, pages 1-10.
Kriegel, H. P., Borgwardt, K. M., Kroger, P., Pryakhin, A., Schubert, M., and Zimek, A. (2007). Future trends in data mining. Data Mining and Knowledge Discovery, 15(1):87-97.
Kum, H. C., Chang, J., and Wang, W. (2006). Sequential pattern mining in multi-databases via multiple alignment. Data Mining and Knowledge Discovery, 12(2-3):151-180.
Kum, H. C., Chang, J. H., and Wang, W. (2007). Benchmarking the effectiveness of sequential pattern mining methods. Data & Knowledge Engineering, 60(1):30-50.
Laxman, S., Sastry, P. S., and Unnikrishnan, K. P. (2005). Discovering frequent episodes and learning hidden markov models: A formal connection. IEEE Transactions on Knowledge and Data Engineering, 1505-1517.
Laxman, S. and Sastry, P. (2006). A survey of temporal data mining. Sadhana, 31(2):173-198.
Laxman, S., Sastry, P. S., and Unnikrishnan, K. P. (2007). A fast algorithm for finding frequent episodes in event streams. Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '07), 410-41.
Lee, C. H., Lin, C. R., and Chen, M. S. (2002). On mining general temporal association rules in a publication database. Proceedings IEEE International Conference on Data Mining (ICDM 2001), 337-344.
Lee, C. H., Lin, C. R., and Chen, M. S. (2005). Sliding window filtering: an efficient method for incremental mining on a time-variant database. Information Systems, 30(3):227-244.
Lee, M. L., Ling, T. W., Lu, H., and Ko, Y. T. (1999). Cleansing data for mining and warehousing. Proceedings of the 10th International Conference on Database and Expert Systems Applications (DEXA '99), 751-760.
Li, Y., Ning, P., Wang, X. S., and Jajodia, S. (2002). Discovering calendar-based temporal association rules. Proceedings of 8th International Symposium on Temporal Representation and Reasoning (TIME'01), pages 111-118.
Lin, M. (2004). Incremental update on sequential patterns in large databases by implicit merging and efficient counting. Information Systems, 29(5):385-404.
Lin, M. Y., Hsueh, S., and Chang, C. (2008). Mining closed sequential patterns with time constraints. Journal of Information Science and Engineering, 24(1):33.
Lin, F., Huang, K., and Chen, N. (2006). Integrating information retrieval and data mining to discover project team coordination patterns. Decision Support Systems, 42(2):745-758.
Lin, M. Y. and Lee, S. Y. (2004). Incremental update on sequential patterns in large databases by implicit merging and efficient counting. Information Systems, 29(5):385-404.
Lin, M. Y. and Lee, S. Y. (2005a). Efficient mining of sequential patterns with time constraints by delimited pattern growth. Knowledge and Information Systems, 7(4):499-514.
Lin, M. Y. and Lee, S. Y. (2005b). Fast discovery of sequential patterns through memory indexing and database partitioning. Journal of Information Science and Engineering, 21(1):109-128.
Liu, B., Ma, Y., and Lee, R. (2001). Analyzing the interestingness of association rules from the temporal dimension. IEEE International Conference on Data Mining (ICDM-2001), pages 377-384.
Luo, J. and Bridges, S. M. (2000). Mining fuzzy association rules and fuzzy frequency episodes for intrusion detection. International Journal of Intelligent Systems, 15(8):687-703.
Luo, J., Bridges, S. M., and Vaughn, R. B. (2001). Fuzzy frequent episodes for real-time intrusion detection. Proceedings of the 10th IEEE International Conference on Fuzzy Systems, pages 368-371.
Ma, S. and Hellerstein, J. L. (2001). Mining partially periodic event patterns with unknown periods. In Data Engineering, 2001. Proceedings. 17th International Conference, 205-214.
Maletic, J. I. and Marcus, A. (2000). Data cleansing: Beyond integrity analysis. Proceedings of the Conference on Information Quality, 200-209.
Mannila, H., Toivonen, H., and Verkamo, I. A. (1997). Discovery of frequent episodes in event sequences. Data Mining and Knowledge Discovery, 1(3):259-289.
Masseglia, F., Cathala, F., and Poncelet, P. (1998). The PSP approach for mining sequential patterns. Proc. Second European Symp. Principles Data Mining Knowledge Discovery (PKDD '98), 176-184.
Masseglia, F., Poncelet, P., and Teisseire, M. (2003). Incremental mining of sequential patterns in large databases. Data and Knowledge Engineering, 46(1):97-121.
Masseglia, F., Poncelet, P., and Teisseire, M. (2009). Efficient mining of sequential patterns with time constraints: Reducing the combinations. Expert Systems with Applications, 36(2):2677-2690.
Ozden, B., Ramaswamy, S., and Silberschatz, A. (1998). Cyclic association rules. In Data Engineering, 1998. Proceedings., 14th International Conference, 412-421.
Parthasarathy, S., Zaki, M. J., Ogihara, M., and Dwarkadas, S. (1999). Incremental and interactive sequence mining. Proceedings of the eighth international conference on Information and knowledge management (CIKM '99), pages 251-258, New York, NY, USA. ACM.
Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. (1999). Discovering frequent closed itemsets for association rules. Proceedings of the 7th International Conference on Database Theory (ICDT '99), 398-416.
Padmanabhan, B. and Tuzhilin, A. (2005). On characterization and discovery of minimal unexpected patterns in rule discovery. IEEE Transactions on Knowledge and Data Engineering, 18(2):202-216.
Piatetsky-Shapiro, G. (2007). Data mining and knowledge discovery 1996 to 2005: overcoming the hype and moving from university to business and analytics. Data Mining and Knowledge Discovery, 15(1):99-105.
Pei, J., Han, J., Mortazavi-Asl, B., and Zhu, H. (2000). Mining access patterns efficiently from web logs. Knowledge Discovery and Data Mining. Current Issues and New Applications, pages 396-407.
Pei, J., Han, J., Asl, B. M., Pinto, H., Chen, Q., Dayal, U., and Hsu, M. (2001). Prefixspan: Mining sequential patterns by prefix-projected growth. In Proceedings of the 17th International Conference on Data Engineering, 215-224.
Pei, J., Han, J., Mortazavi-Asl, B., Wang, J., Pinto, H., Chen, Q., Dayal, U., and Hsu, M.-C. (2004). Mining sequential patterns by pattern-growth: the prefixspan approach. IEEE Transactions on Knowledge and Data Engineering, 16(11):1424-1440.
Peng, W. C. and Liao, Z. X. (2009). Mining sequential patterns across multiple sequence databases. Data & Knowledge Engineering, 68(10):1014-1033.
Peppers, D., Rogers, M., and Sengupta, S. (1995). The one to one future. International Business Review, 4(4):541-543.
Pinto, H., Han, J., Pei, J., Wang, K., Chen, Q., and Dayal, U. (2001). Multi-dimensional sequential pattern mining. Proceedings of the tenth international conference on Information and knowledge management (CIKM '01), pages 81-88, New York, NY, USA. ACM.
Qin, M. and Hwang, K. (2004). Frequent episode rules for Internet anomaly detection. Proceedings of Third IEEE International Symposium on Network Computing and Applications, pages 161-168.
Ramakrishnan, R., Agrawal, R., Freytag, J.-C., Bollinger, T., Clifton, C. W., Dzeroski, S., Hipp, J., Keim, D., Kramer, S., Kriegel, H.-P., Leser, U., Liu, B., Mannila, H., Meo, R., Morishita, S., Ng, R., Pei, J., Raghavan, P., Spiliopoulou, M., Srivastava, J., and Torra, V. (2005). Data mining: The next generation. Perspectives Workshop: Data Mining: The Next Generation, number 04292 in Dagstuhl Seminar Proceedings, Dagstuhl, Germany. Internationales Begegnungs- und Forschungszentrum fur Informatik (IBFI), Schloss Dagstuhl, Germany.
Robertson, S. (2002). Threshold setting and performance optimization in adaptive filtering. Information Retrieval, 5(2):239-256.
Roddick, J. F. and Spiliopoulou, M. (2002). A survey of temporal knowledge discovery paradigms and methods. IEEE Transactions on Knowledge and Data Engineering, 14(4):750-767.
Rygielski, C. (2002). Data mining techniques for customer relationship management. Technology in Society, 24(4):483-502.
Salton, G. (1988). Automatic Text Processing: The Transformation Analysis and Retrieval of Information by Computer (Addison-Wesley series in computer science).
Schwartz, B. (2005). The Paradox of Choice: Why More Is Less. Harper Perennial.
Shenk, D. (1998). Data Smog: Surviving the Information Glut Revised and Updated Edition. HarperOne, revised and updated edition.
Sichel, H. S. (1982). Repeat-buying and the generalized inverse gaussian-poisson distribution. Journal of the Royal Statistical Society. Series C (Applied Statistics), 31(3):193-204.
Song, H. S., Kim, J. K., and Kim, S. H. (2001). Mining the change of customer behavior in an internet shopping mall. Expert Systems with Applications, 21(3):157-168.
Srikant, R. and Agrawal, R. (1996). Mining sequential patterns: Generalizations and performance improvements. Proceedings of the 5th International Conference on Extending Database Technology (EDBT '96), 3-17.
Suzuki, E. and Żytkow, J. M. (2005). Unified algorithm for undirected discovery of exception rules. Int. J. Intell. Syst., 20(7):673-691.
Swift, R. (2001). Accelerating Customer Relationships. Prentice Hall.
The Economist's special report on managing information: data, data everywhere. (2010, February 25). The Economist. Retrieved May 5, 2010 from http://www.economist.com/surveys/displaystory.cfm?story_id=15557443
Toroslu, I. H. (2003). Repetition support and mining cyclic patterns. Expert Systems with Applications, 25(3): 303-311.
Terlecki, P. and Walczak, K. (2007). On the relation between rough set reducts and jumping emerging patterns. Information Sciences, 177(1):74-83.
Tsai, C. and Shieh, Y. (2009). A change detection method for sequential patterns. Decision Support Systems, 46(2):501-511.
Tronicek, Z. (2001). Episode matching. Proceedings of the 12th Annual Symposium on Combinatorial Pattern Matching (CPM '01), 143-146.
Tsiptsis, K. and Chorianopoulos, A. (2010). Data Mining Techniques in CRM: Inside Customer Segmentation. Wiley.
Tzvetkov, P., Yan, X., and Han, J. (2005). TSP: Mining top-k closed sequential patterns. Knowledge and Information Systems, 7(4):438-457.
Wang, K. (1997). Discovering patterns from large and dynamic sequential data. Journal of Intelligent Information Systems, 9(1):33-56.
Wang, J. T. L., Chirn, G. W., Marr, T. G., Shapiro, B., Shasha, D., and Zhang, K. (1994). Combinatorial pattern discovery for scientific data: some preliminary results. Proceedings of the 1994 ACM SIGMOD international conference on Management of data (SIGMOD '94), 115-125.
Wang, J. and Han, J. (2004). BIDE: Efficient mining of frequent closed sequences. Proceedings of the 20th International Conference on Data Engineering (ICDE '04), 79-90.
Wang, J., Han, J., and Pei, J. (2003). Closet+: searching for the best strategies for mining frequent closed itemsets. Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '03), pages 236-245, New York, NY, USA. ACM.
Wedel, M. and Kamakura, W. A. (2000). Market Segmentation: Conceptual and Methodological Foundations (International Series in Quantitative Marketing). Kluwer Academic Publishers, 2nd edition.
Whitehead, B. A. and Hoyt, W. A. (1993). A function approximation approach to anomaly detection in propulsion system test data. Proceedings of 29th AIAA/SAE/ASME/ASEE Joint Propulsion Conference.
Wu, P. H., Peng, W. C., and Chen, M. S. (2001). Mining sequential alarm patterns in a telecommunication database. Proceedings of the VLDB 2001 International Workshop on Databases in Telecommunications II (DBTel '01), 37-51.
Xu, Y. and Yin, H. (2008). Novelty and topicality in interactive information retrieval. Journal of the American Society for Information Science and Technology, 59(2):201-215.
Yan, X., Han, J., and Afshar, R. (2003). Clospan: Mining closed sequential patterns in large datasets. Proc. 2003 Int. SIAM Conf. on Data Mining (SDM '03), 166-177.
Yang, J., Wang, W., and Yu, P. S. (2003). Mining asynchronous periodic patterns in time series data. IEEE Transactions on Knowledge and Data Engineering, 15(3):613-628.
Yang, J., Wang, W., and Yu, P. S. (2004). Mining surprising periodic patterns. Data Mining and Knowledge Discovery, 9(2):189-216.
Yun, C.H. and Chen, M.S. (2007). Mining mobile sequential patterns in a mobile commerce environment. IEEE Transactions on Systems, Man, and Cybernetics Part C: Applications and Reviews, 37(2):278-295.
Zaki, M. J. (2000). Sequence mining in categorical domains: incorporating constraints. Proceedings of the ninth international conference on Information and knowledge management (CIKM '00), 422-429.
Zaki, M. J. (2001). Spade: An efficient algorithm for mining frequent sequences. Machine Learning, 42(1):31-60.
Zhang, Y., Callan, J., and Minka, T. (2002). Novelty and redundancy detection in adaptive filtering. Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '02), 81-88.