Electronic Theses and Dissertations Service

§ Browse Thesis Bibliographic Record

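The electronic full text of this thesis has been publicly available off campus since 2010-08-05.
The print copy of this thesis has been publicly available since 2010-08-05.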

System ID	U0002-2707200916081000
DOI	10.6846/TKU.2009.01020
Title (Chinese)	設計一考慮時間因素之網頁推薦系統
Title (English)	The Design of a Webpage Recommender System which Incorporates Time Factor
Institution	Tamkang University
Department (Chinese)	資訊工程學系資訊網路與通訊碩士班
Department (English)	Master's Program in Networking and Communications, Department of Computer Science and Information En
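Academic year	97 (ROC calendar; 2008-2009)
Semester	2
Publication year	98 (ROC calendar; 2009)
Author (Chinese)	林谷鴻
Author (English)	Ku-Hong Lin
Student ID	695420124
Degree	Master's
Language	Traditional Chinese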
Oral defense date	2009-06-11
Number of pages	55
Committee	Advisor: 郭經華 (chkuo@mail.tku.edu.tw); Member: 郭經華 (chkuo@mail.tku.edu.tw); Member: 陳孟彰 (mcc@iis.sinica.edu.tw); Member: 楊接期 (yang@cl.ncu.edu.tw)
Keywords (Chinese)	網頁推薦系統; 機率模型; 網站使用探勘技術; 貝氏分類
Keywords (English)	Web page recommender system; probabilistic model; Web Usage Mining; Naive Bayes Model
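Abstract (translated from Chinese)	This thesis proposes, for website users, a web page recommender system built on a probabilistic model that incorporates a time factor. The system predicts which pages a user will click next and offers suggestions so that users can quickly find the pages they want without spending unnecessary time searching for them. The main contributions are to verify how the time factor affects web page recommendation and to add the time factor to a probabilistic model applied to a web page recommender system. While a user browses a site, the system provides accurate page suggestions in real time, shortening the browsing path to a particular page and thereby raising the user's interest in revisiting the site. Based on users' browsing-path behavior in each time sector, the system uses the time-aware probabilistic model to quantify the influence of each time sector on recommendation and to identify the time sector most helpful for predicting the current path. Because the system dynamically adjusts the weight of each time sector in real time as users browse, it predicts users' future browsing paths with good accuracy.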
Abstract (English)	This thesis incorporates a time factor into a probabilistic model for use in a web page recommender system. When a user browses a website, the system can immediately recommend related web pages, thereby simplifying and shortening the paths the user needs to take and making the site more enjoyable to browse. The system uses this time-factored probabilistic model, together with the user's browsing history, to determine the effect of each time sector on web page recommendation and to find the most helpful time sector for the current path. Because the system takes the user's browsing habits into account, adjusting each time sector's weight helps keep its predictions of the user's browsing paths accurate.
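The time-weighting idea described in the abstract can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the thesis's implementation: the class name `TimeWeightedRecommender`, the parameters `learning_rate` and `smoothing`, the use of first-order page-to-page transition counts, and the EM-style responsibility update are all hypothetical choices made here to show how per-time-sector models might be mixed and how sector weights might shift as browsing is observed.

```python
from collections import Counter, defaultdict


class TimeWeightedRecommender:
    """Mixture of per-time-sector next-page models (illustrative only)."""

    def __init__(self, num_sectors, learning_rate=0.2, smoothing=1e-3):
        self.num_sectors = num_sectors
        self.learning_rate = learning_rate  # hypothetical step size
        self.smoothing = smoothing          # hypothetical additive smoothing
        # counts[s][current_page][next_page] = transitions seen in sector s
        self.counts = [defaultdict(Counter) for _ in range(num_sectors)]
        # Start with equal weight on every time sector.
        self.weights = [1.0 / num_sectors] * num_sectors
        self.pages = set()

    def train(self, sector, session):
        """Record consecutive page pairs from one session of a time sector."""
        self.pages.update(session)
        for cur, nxt in zip(session, session[1:]):
            self.counts[sector][cur][nxt] += 1

    def sector_prob(self, sector, cur, nxt):
        """Smoothed estimate of P_s(next | current) for one time sector."""
        row = self.counts[sector].get(cur, Counter())
        total = sum(row.values()) + self.smoothing * len(self.pages)
        return (row[nxt] + self.smoothing) / total if total else 0.0

    def recommend(self, cur, k=3):
        """Rank candidate pages by the weighted mixture over all sectors."""
        scores = {
            page: sum(w * self.sector_prob(s, cur, page)
                      for s, w in enumerate(self.weights))
            for page in self.pages if page != cur
        }
        # Sort by descending score, breaking ties alphabetically.
        return sorted(scores, key=lambda p: (-scores[p], p))[:k]

    def observe(self, cur, nxt):
        """Move weight toward the sectors that best predicted this click."""
        self.pages.update((cur, nxt))
        likelihoods = [w * self.sector_prob(s, cur, nxt)
                       for s, w in enumerate(self.weights)]
        total = sum(likelihoods)
        if total == 0:
            return
        for s in range(self.num_sectors):
            responsibility = likelihoods[s] / total
            self.weights[s] = ((1 - self.learning_rate) * self.weights[s]
                               + self.learning_rate * responsibility)


# Toy data: sector 0 is an older period, sector 1 a more recent one.
rec = TimeWeightedRecommender(num_sectors=2)
for _ in range(2):
    rec.train(0, ["home", "news", "sports"])
    rec.train(1, ["home", "courses", "exams"])

rec.observe("home", "courses")  # the current user behaves like sector 1
print(rec.weights)
print(rec.recommend("home", k=2))
```

In this sketch the weights always sum to 1, because each update is a convex combination of the old weights and the normalized responsibilities. A sector whose history matches the observed click gains weight, which is one plausible reading of "dynamically adjusting the weight of each time sector."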
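Table of contents
Chapter 1 Introduction
  1.1 Research Motivation and Objectives
  1.2 Research Content
  1.3 Thesis Outline
Chapter 2 Background and Related Work
  2.1 Web Mining
    2.1.1 Web Content Mining
    2.1.2 Web Structure Mining
    2.1.3 Web Usage Mining
  2.2 Web Usage Mining Process
    2.2.1 Offline Processing
      2.2.1.1 Data Collection
      2.2.1.2 Data Preprocessing
      2.2.1.3 Behavior Pattern Mining
    2.2.2 Online Process
  2.3 Data Mining Techniques
    2.3.1 Association Rules
      2.3.1.1 The Apriori Algorithm
      2.3.1.2 Drawbacks of the Apriori Algorithm
    2.3.2 Time Series Analysis
    2.3.3 Data Clustering
    2.3.4 Data Classification
  2.4 Related Work
Chapter 3 Research Method
  3.1 Web Log Files
    3.1.1 Web Log File Format
    3.1.2 Web Log File Processing
  3.2 Probabilistic Model
  3.3 Time Influence
    3.3.1 Correlation
    3.3.2 Time Sector
    3.3.3 Time Weight Calculation
Chapter 4 Experimental Results and Discussion
  4.1 Experimental Environment and Procedure
  4.2 Comparison of Time Units
    4.2.1 Analysis by Accuracy
    4.2.2 Analysis by Correlation
  4.3 Experimental Results and Discussion
    4.3.1 Method Comparison (1)
    4.3.2 Method Comparison (2)
Chapter 5 Conclusions and Future Work
  5.1 Conclusions
  5.2 Future Work
References
Appendix
List of Figures
  Figure 2.1-1 Classification of web mining
  Figure 2.2-1 General architecture of web usage mining [6]
  Figure 2.2.1.2-1 Detailed preprocessing flow for web usage mining [6]
  Figure 2.3.1-1 The Apriori algorithm
  Figure 3.1.1-1 Excerpt of a web log file
  Figure 3.1.2-1 Browsing path of session U
  Figure 3.1.2-2 Classification of web pages by content [17]
  Figure 3.1.1-1 Illustration of no correlation
  Figure 3.1.1-2 Illustration of positive correlation
  Figure 3.1.1-1 Illustration of negative correlation
  Figure 3.3.3-1 Steps of the EM algorithm
  Figure 4.2-1 Comparison of time-sector accuracy (1)
  Figure 4.2-2 Comparison of time-sector accuracy (2)
  Figure 4.2-3 Comparison of time-sector accuracy (3)
  Figure 4.3.1-1 Comparison of time-sector accuracy
List of Tables
  Table 2.3.1-1 Meaning of the support and confidence values [9]
  Table 3.3.2-1 Accuracy of each time sector (association rules, 30-day time sectors)
  Table 4.1-1 Accuracy of each time sector (30 days)
  Table 4.2.2-1 Correlation coefficients for each time unit (association rules)
  Table 4.3.1-1 Accuracy of each time sector (probabilistic model)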
References
[1] M.-S. Chen, J. Han, and P. S. Yu, "Data Mining: An Overview from a Database Perspective," IEEE Transactions on Knowledge and Data Engineering, vol. 8, no. 6, pp. 866-883, 1996.
[2] O. Etzioni, "The world wide web: Quagmire or gold mine," Communications of the ACM, vol. 39, pp. 65-68, 1996.
[3] R. Cooley, B. Mobasher, and J. Srivastava, "Web Mining: Information and Pattern Discovery on the World Wide Web," in Proceedings of the International Conference on Tools with Artificial Intelligence, pp. 558-567, Newport Beach, CA, 1997.
[4] O. Zaiane, J. Han, Z. Li, S. Chee, and J. Chiang, "MultimediaMiner: a system prototype for multimedia data mining," in Proc. of ACM SIGMOD, 1998.
[5] J. Srivastava, R. Cooley, M. Deshpande, and P.-N. Tan, "Web Usage Mining: Discovery and Application of Usage Patterns from Web Data," 2000.
[6] B. Mobasher, R. Cooley, and J. Srivastava, "Automatic Personalization Based on Web Usage Mining," Communications of the ACM, vol. 43, pp. 142-151, August 2000.
[7] R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules," in Proceedings of the 20th VLDB Conference, pp. 487-499, 1994.
[8] J. S. Park, M. S. Chen, and P. S. Yu, "An Effective Hash Based Algorithm for Mining Association Rules," IEEE Transactions on Knowledge and Data Engineering, vol. 10, no. 2, pp. 209-221, 1998.
[9] 楊煜愷, "Mining and Analyzing User Browsing Behavior with a Complete Itemset Algorithm" (in Chinese), Master's thesis, 暨南大學, 2001.
[10] L. Kaufman and P. J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis, John Wiley and Sons, 1990.
[11] G. Gautam and B. B. Chaudhuri, "A Novel Genetic Algorithm for Automatic Clustering," Pattern Recognition Letters, vol. 25, pp. 173-187, 2004.
[12] J. MacQueen, "Some Methods for Classification and Analysis of Multivariate Observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281-297, 1967.
[13] 林傑斌, 劉明德, and 陳湘, "Data Mining and OLAP: Theory and Practice" (in Chinese), 2002.
[14] J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, 2001.
[15] M. Géry and H. Haddad, "Evaluation of web usage mining approaches for user's next request prediction," in Proceedings of the 5th ACM International Workshop on Web Information and Data Management, pp. 74-81, 2003.
[16] 陳仕昇, "A Study of Mining Web Browsing Rules with Repeatable Sequences" (in Chinese), Master's thesis, Graduate Institute of Information Management, National Central University, 2000.
[17] 楊昇宏, "Applying Data Mining to Find Web Browsing Patterns" (in Chinese), Master's thesis, Graduate Institute of Information Engineering, Feng Chia University, 2000.
[18] Y. J. Su, H. C. Jiau, and S. R. Tsai, "Using the moving average rule in a dynamic web recommendation system," International Journal of Intelligent Systems, vol. 22, no. 6, pp. 621-639, June 2007.
[19] R. Cooley, B. Mobasher, and J. Srivastava, "Data preparation for mining World Wide Web browsing patterns," Journal of Knowledge and Information Systems, vol. 1, no. 1, 1999.
[20] A. McCallum, R. Rosenfeld, T. M. Mitchell, and A. Y. Ng, "Improving Text Classification by Shrinkage in a Hierarchy of Classes," in Proceedings of the Fifteenth International Conference on Machine Learning, pp. 359-367, July 24-27, 1998.
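Full-text access rights	On campus: the print thesis is released 1 year after submission of the authorization form; the electronic full text is authorized for on-campus release; the on-campus electronic thesis is released 1 year after submission of the authorization form. Off campus: the electronic thesis is authorized for off-campus release 1 year after submission of the authorization form.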
