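System ID | U0002-2406201021375700 |
---|---|
DOI | 10.6846/TKU.2010.00836 |
Thesis title (Chinese) | 資料探勘技術於WIFLY客戶行為模式之研究 |
Thesis title (English) | Analyzing Behaviors of WIFLY Customers Using Data Mining Techniques |
Thesis title (third language) | |
Institution | 淡江大學 (Tamkang University) |
Department (Chinese) | 資訊工程學系博士班 |
Department (English) | Department of Computer Science and Information Engineering |
Foreign degree university | |
Foreign degree college | |
Foreign degree graduate institute | |
Academic year | 98 (ROC calendar) |
Semester | 2 |
Publication year | 99 (ROC calendar) |
Graduate student (Chinese) | 林正榮 |
Graduate student (English) | Cheng-Jung Lin |
Student ID | 893190024 |
Degree | Doctoral |
Language | English |
Second language | |
Oral defense date | 2010-05-28 |
Pages | 118 |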
Oral defense committee | Advisor - 蔣定安; Member - 蔣定安; Member - 葛煥昭; Member - 郭經華; Member - 王亦凡; Member - 謝楠楨 |
Keywords (Chinese) |
顧客關係管理; 資料探勘; 群集; 決策樹; 流失(保留) |
Keywords (English) |
Customer Relationship Management; Data Mining; Clustering; Decision Tree; Churn |
Keywords (third language) | |
Subject classification | |
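Chinese abstract (English translation) |
With the evolution of communication technology and the rapid growth of the Internet, an Internet once dominated by personal computers and local area network technology has gradually shifted toward one dominated by communications, entertainment, and e-commerce. No longer constrained by wired connections, users connect through hotspots, which makes wireless access more convenient and immediate, and the variety of times and places at which users can connect gives Internet use greater practical and commercial value. WIFLY is built in Taipei City on a Wi-Fi wireless network infrastructure. Taipei received the top Intelligent Community award from the Intelligent Community Forum, part of the World Teleport Association (WTA), and was rated by the international certification body JiWire as the world's largest wireless broadband city, becoming in 2006 the world's largest public wireless broadband city. In recent years, the relationship between enterprises and customers has shifted from a product-oriented business model to one driven by individual customer behavior. As the number of Wi-Fi wireless broadband users grows, raising customer value, maintaining customer loyalty, preventing customer churn, and acquiring new customers have become key goals for network operators. Distinguishing behavioral differences among customers helps operators formulate strategies that prevent churn in advance, retain existing customers, and avoid wasting enterprise resources. This dissertation uses data mining techniques to analyze WIFLY customer behavior. It first applies data mining to segment customers by behavior, using differences in WIFLY usage to identify customer groups of significant value. Based on those results, it then performs decision tree analysis to examine the behavioral characteristics of WIFLY customers, giving the marketing and decision-making departments of the WIFLY operator a reference for drafting customer-retention and churn-prevention strategies. |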
English abstract |
As communication technology improves, the Internet continues to boom. Its focus has gradually shifted from personal computers and local area networks toward communications, recreation, e-commerce, and similar services. Wireless networks are not restricted to a wired link environment: while broadband speeds up Internet connectivity, radio-based wireless networks, known as Wi-Fi, untether it by linking users to the Internet through hotspots. Wireless networking is therefore not only more convenient and immediate but also of practical and commercial value. WIFLY is a Wi-Fi service infrastructure in Taipei City that has operated since December 20, 2005. The Intelligent Community Forum named Taipei City the recipient of its 2006 Intelligent Community of the Year award and also presented awards for Intelligent Building of the Year, Intelligent Community Technology, and Intelligent Community Visionary of the Year. With much of the related infrastructure completed, Taipei is on its way to becoming a stable wireless city. JiWire assessed Taipei as the world's largest wireless broadband network city. In recent years, customer relationship management between enterprises and customers has gradually shifted from a product orientation to a customer orientation. Competition among wireless network operators is intense. As the number of wireless Internet subscribers continues to grow, operators have to understand rising consumer demand and develop suitable strategies, which are critical to attracting repeat customers, consolidating customer loyalty and satisfaction, and continually acquiring new clients. Marketing strategy and CRM are therefore becoming very important. This dissertation applies data mining techniques to analyze WIFLY users' behavior, identifies WIFLY subscribers of significant value, and builds a prediction model for customer churn so that churn can be prevented.
The experimental evaluation results show that the customer churn model is effective and efficient. It can help enterprises predict customer churn, build customer loyalty, and maximize profitability. |
Abstract (third language) | |
Table of contents |
Table of Contents
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation of this Dissertation
1.2 Research Objectives of this Dissertation
1.3 Organization of this Dissertation
Chapter 2 Fundamental Theory
2.1 Data Mining Technology
2.2 Clustering Algorithm
2.2.1 Hierarchical Clustering Algorithms
2.2.2 Non-hierarchical Clustering Algorithms
2.3 Decision Tree Algorithm
2.3.1 ID3 (Iterative Dichotomiser 3)
2.3.2 CART (Classification and Regression Tree)
2.3.3 C4.5
2.3.4 CHAID (Chi-Square Automatic Interaction Detector)
2.4 WIFLY
2.5 Customer Relationship Management
Chapter 3 Methods (Material and Methods)
3.1 CRISP-DM
3.2 Procedure of Analysis
3.3 Data Preparation
3.4 Raw Data Set Description
Chapter 4 Implementation and Experimental Results
4.1 Clustering Analysis Result
4.2 Decision Tree Analysis Result
Chapter 5 Evaluation
Chapter 6 Conclusion and Future Work
Bibliography
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
List of Figures
Figure 2-1: Hierarchical Agglomerative Algorithm
Figure 2-2: Hierarchical Divisive Algorithm
Figure 2-3: K-means Algorithm
Figure 3-1: Data Mining Analysis Procedure
Figure 3-2: Monthly Statistics of WIFLY Connectivity
Figure 3-3: Monthly Statistics of WIFLY Subscribers
Figure 4-1: Statistics by Bivariate Method for Monthly Rental Subscribers
Figure 4-2: Attribute Overview of Target Customer
Figure 4-3: Attribute Overview of Churned Customer
Figure 4-4: The Results from Clustering Analysis
Figure 4-5: Overview of Attributes in Cluster [2]1
Figure 4-6: FREQUENCY
Figure 4-7: HIST_AVG_FREQUENCY
Figure 4-8: DURATION_HOUR
Figure 4-9: HIST_AVG_DURATION_HOUR
Figure 4-10: HIST_CON_GAP_MAX
Figure 4-11: HIST_AVG_FREQ_RATIO
Figure 4-12: HIST_OUT_IN_AP_RATIO
Figure 4-13: TOTAL_LOC_CNT_WORKDAY_2
Figure 4-14: Cluster [4]3, 4.87%
Figure 4-15: Cluster [6]5, 4.44%
Figure 4-16: Cluster [7]6, 2.69%
Figure 4-17: Decision Tree Analysis by Attribute of Gap_Final_To_End
Figure 4-18: Figure 4-20: Decision Tree Analysis by Attribute of Frequency
Figure 4-19: Figure 4-21: Decision Tree Analysis by Attribute of Drain_Flag
Figure 5-1: The Results from Decision Tree Analysis
List of Tables
Table 2-1: Computation of Clustering Distance Formula
Table 3-1: Analytical Attributes of Interval Time Used in Clustering
Table 3-2: Analytical Attributes Used in Clustering
Table 3-3: Analytical Attributes of Supplementary Used in Clustering
Table 3-4: Supplied Analytical Attributes Used in Decision Tree
Table 5-1: Rules Predicting Absence of Churn
Table 5-2: To Compare the Total Churn Rate with Five Rules Prediction Churn Rate During Three Months |
Full-text access rights |