§ Thesis and Dissertation Bibliographic Record
  
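System ID U0002-2406201021375700
DOI 10.6846/TKU.2010.00836
Title (Chinese) 資料探勘技術於WIFLY客戶行為模式之研究
Title (English) Analyzing Behaviors of WIFLY Customers Using Data Mining Techniques
Title (third language)
University Tamkang University (淡江大學)
Department (Chinese) 資訊工程學系博士班
Department (English) Department of Computer Science and Information Engineering
Foreign degree: university
Foreign degree: college
Foreign degree: graduate institute
Academic year 98 (ROC calendar; 2009-2010)
Semester 2
Year of publication 99 (ROC calendar; 2010)
Author (Chinese) 林正榮
Author (English) Cheng-Jung Lin
Student ID 893190024
Degree Doctoral
Language English
Second language
Oral defense date 2010-05-28
Number of pages 118
Oral defense committee Advisor - 蔣定安
Member - 蔣定安
Member - 葛煥昭
Member - 郭經華
Member - 王亦凡
Member - 謝楠楨
Keywords (Chinese) 顧客關係管理
資料探勘
群集
決策樹
流失(保留)
Keywords (English) Customer Relationship Management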
Data Mining
Clustering
Decision Tree
Churn
Keywords (third language)
Subject classification
Abstract (Chinese)
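With the evolution of communication technology and the rapid growth of the Internet, an Internet once dominated by personal computers and local area networks has gradually shifted to one dominated by communications, entertainment, and e-commerce. No longer confined to wired connections, users can connect through hotspots, which makes wireless access more convenient and immediate, and the variety of times and places at which people can connect gives the Internet greater practical and commercial value.

WIFLY is built in Taipei City on a Wi-Fi wireless network infrastructure. Taipei received the top Intelligent Community award from the Intelligent Community Forum, part of the World Teleport Association (WTA), and was rated by the international certification body JiWire as the world's largest wireless broadband city, becoming the world's largest public wireless broadband city in 2006.

In recent years, the relationship between enterprises and customers has gradually shifted from a product-development orientation to a business model oriented around individual customer behavior. As the number of Wi-Fi wireless broadband users grows, raising customer value, maintaining customer loyalty, preventing customer churn, and acquiring new customers have become key goals for network operators. Distinguishing behavioral differences among customers helps operators devise appropriate strategies to prevent churn in advance, retain existing customers, and avoid spending enterprise resources unnecessarily.

This dissertation applies data mining techniques to analyze WIFLY customer behavior. It first uses data mining to segment customers by behavior and, from differences in how WIFLY customers use the service, identifies customer groups of significant value. Based on those results, it then performs decision tree analysis to examine the behavioral characteristics of WIFLY customers, giving the marketing and decision-making departments of the WIFLY operator a reference for devising customer retention and churn prevention strategies.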
Abstract (English)
As communication technology improves, the Internet continues to boom. Its dominant form has gradually shifted from personal computers and local area networks to communications, entertainment, and e-commerce. Wireless networks are not restricted by the wired link environment: while broadband speeds up Internet connectivity, radio-based wireless networks, known as Wi-Fi, untether it by linking to the Internet through hotspots. Hence, wireless networking is not only more convenient and immediate but also has practical and commercial value.

WIFLY is a Wi-Fi service infrastructure in Taipei City that has been in operation since December 20, 2005. The Intelligent Community Forum named Taipei City the recipient of its 2006 Intelligent Community of the Year award and also presented awards for Intelligent Building of the Year, Intelligent Community Technology, and Intelligent Community Visionary of the Year. Taipei is on the way to becoming a stable wireless city following the completion of much of its related infrastructure. JiWire assessed Taipei as the world's largest wireless broadband city.
 
In recent years, the focus of customer relationship management between enterprises and customers has gradually shifted from products to people. Competition among wireless network operators is intense. As the number of wireless Internet subscribers continues to grow, operators have to understand rising consumer demand and develop suitable strategies, which are critical to attracting repeat customers, consolidating customer loyalty and satisfaction, and continually acquiring new clients. Marketing strategy and CRM are therefore becoming very important.

This dissertation applies data mining techniques to analyze WIFLY users' behavior and to identify WIFLY subscribers of significant value, and it builds a prediction model to predict and prevent customer churn in CRM. The experimental evaluation results show that the customer churn model is effective and efficient. It can help enterprises predict customer churn, build customer loyalty, and maximize profitability.
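The two-stage analysis outlined in the abstracts (segment customers by behavior, then derive churn rules with a decision tree) can be sketched roughly as follows. This is a minimal illustration and not the dissertation's implementation: the two features, the eight customer rows, and the churn labels are all hypothetical, plain k-means stands in for the clustering step, and a depth-1 tree chosen by Gini impurity stands in for the full decision tree.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on tuples of floats; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, targets):
    """Depth-1 decision tree: (score, feature, threshold) with lowest weighted Gini."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

# Hypothetical customers: (connections per month, days since last connection).
customers = [(30, 1), (28, 2), (25, 3), (27, 1), (3, 40), (2, 55), (4, 35), (1, 60)]
# Hypothetical churn flags (1 = churned); one low-usage customer has not churned.
churned = [0, 0, 0, 0, 1, 1, 0, 1]

segments = kmeans(customers, k=2)                           # stage 1: behavioral segments
score, feature, threshold = best_split(customers, churned)  # stage 2: churn rule
print(segments, feature, threshold, score)
```

On this toy data the clustering separates heavy from light users, while the split isolates the churned customers inside the light-usage group, which is the kind of rule a retention team could act on.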
Abstract (third language)
Contents of thesis
Table of Contents

Abstract	Ⅱ
Table of Contents	Ⅳ
List of Figures	Ⅵ
List of Tables	Ⅷ
Chapter 1 Introduction	1
1.1 Research Motivation of this Dissertation	4
1.2 Research Objectives of this Dissertation	5
1.3 Organization of this Dissertation	6
Chapter 2 Fundamental Theory	8
2.1 Data Mining Technology	8
2.2 Clustering Algorithm	9
2.2.1 Hierarchical Clustering Algorithms	11
2.2.2 Non-hierarchical Clustering Algorithms	14
2.3 Decision Tree Algorithm	16
2.3.1 ID3 (Iterative Dichotomiser 3)	17
2.3.2 CART (Classification and Regression Tree)	17
2.3.3 C4.5	19
2.3.4 CHAID (Chi-Square Automatic Interaction Detector)	20
2.4 WIFLY	21
2.5 Customer Relationship Management	22
Chapter 3 Methods (Material and Methods)	24
3.1 CRISP-DM	24
3.2 Procedure of Analysis	26
3.3 Data Preparation	29
3.4 Raw Data set Description	30
Chapter 4 Implement and Experimental Results	36
4.1 Clustering Analysis Result	36
4.2 Decision Tree Analysis Result	52
Chapter 5 Evaluation	55
Chapter 6 Conclusion and Future Work	58
Bibliography	59
Appendix A	67
Appendix B	74
Appendix C	80
Appendix D	99
Appendix E	110

List of Figures

Figure 2-1: Hierarchical Agglomerative Algorithm	12
Figure 2-2: Hierarchical Divisive Algorithm	14
Figure 2-3: K-means Algorithm	15
Figure 3-1: Data Mining Analysis Procedure	28
Figure 3-2: Monthly Statistics of WIFLY Connectivity	29
Figure 3-3: Monthly Statistics of WIFLY Subscribers	30
Figure 4-1: Statistics by Bivariate Method for Monthly Rental Subscribers	37
Figure 4-2: Attribute Overview of Target Customer	38
Figure 4-3: Attribute Overview of Churned Customer	39
Figure 4-4: The Results from Clustering Analysis	40
Figure 4-5: Overview of Attributes in Cluster [2]1	41
Figure 4-6: FREQUENCY 	42
Figure 4-7: HIST_AVG_FREQUENCY	42
Figure 4-8: DURATION_HOUR	43
Figure 4-9: HIST_AVG_DURATION_HOUR	43
Figure 4-10: HIST_CON_GAP_MAX	44
Figure 4-11: HIST_AVG_FREQ_RATIO	45
Figure 4-12: HIST_OUT_IN_AP_RATIO 	45
Figure 4-13: TOTAL_LOC_CNT_WORKDAY_2	46
Figure 4-14: Cluster [4]3, 4.87%	47
Figure 4-15: Cluster [6]5, 4.44%	49
Figure 4-16: Cluster [7]6, 2.69%	51
Figure 4-17: Decision Tree Analyze by Attribute of Gap_Final_To_End	52
Figure 4-18: Decision Tree Analyze by Attribute of Frequency	53
Figure 4-19: Decision Tree Analyze by Attribute of Drain_Flag	54
Figure 5-1: The Results from Decision Tree Analysis 	55

List of Tables

Table 2-1 Computation of Clustering Distance Formula	13
Table 3-1 Analytical Attributes of interval Time Used in Clustering	32
Table 3-2 Analytical Attributes Used in Clustering	33
Table 3-3 Analytical Attributes of Supplementary Used in Clustering	34
Table 3-4 Supplied Analytical Attributes Used in Decision Tree	35
Table 5-1: Rules Predicting Absence of Churn	56
Table 5-2: To Compare the Total Churn Rate with Five Rules Prediction Churn Rate During Three Months	57
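Full-text access rights
On campus
Print copy available on campus immediately
Author consents to on-campus release of the electronic full text
Electronic copy available on campus immediately
Off campus
Author consents to licensing to database vendors
Electronic copy available off campus immediately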
