Electronic Theses and Dissertations Service

§ Browse Thesis Bibliographic Record

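The electronic full text of this thesis is available off campus from 2008-01-01.
The print copy of this thesis is publicly available from 2007-07-30.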

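System ID	U0002-2407200715073700
DOI	10.6846/TKU.2007.00736
Title (Chinese)	STPN網頁架構模型中的馬可夫模式分析
Title (English)	Markov Analysis for STPN Web Structure Model
Title (third language)
Institution	Tamkang University (淡江大學)
Department (Chinese)	資訊工程學系碩士班
Department (English)	Department of Computer Science and Information Engineering
Foreign degree: university
Foreign degree: college
Foreign degree: graduate institute
Academic year	95 (ROC calendar; 2006-2007)
Semester	2
Publication year	96 (ROC calendar; 2007)
Author (Chinese)	王秉弘
Author (English)	Bin-Hong Wang
Student ID	694190728
Degree	Master's
Language	Traditional Chinese
Second language
Oral defense date	2007-06-15
Pages	58
Committee	Advisor: 陳伯榮 (pozung@mail.tku.edu.tw); Member: 趙景明; Member: 徐郁輝
Keywords (Chinese)	馬可夫鏈; 網頁使用模式探勘
Keywords (English)	Markov Chain; Web Usage Mining
Keywords (third language)
Subject classification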
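Abstract (translated from Chinese)	Web mining methods fall mainly into three categories: web usage mining, web structure mining, and web content mining. Web usage mining in turn consists of three steps: preprocessing, pattern discovery, and pattern analysis. In the preprocessing step, this thesis first constructs an STPN web structure model and uses it to fill in the missing parts of the user browsing paths recorded in the website log files. In the pattern discovery step, it counts how often users follow links to each page and how long they browse, then uses the Markov matrix to obtain the limiting probabilities, together with each page's browsing rate and its share of total browsing time. Finally, the estimated values are compared with the statistics actually gathered from the website log files, so that site administrators can adjust the web structure or page content in good time, making browsing more convenient for users and making the best use of the site's load capacity.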
Abstract (English)	Web mining approaches are classified into three types: web usage mining, web structure mining, and web content mining. Web usage mining consists of three phases: data preprocessing, pattern discovery, and pattern analysis. In this thesis, we use the STPN web structure model to perform path completion, and we use the link graph to compute a transition matrix containing the one-step transition probabilities between the states of the Markov model. Finally, we compare the probability of linking to each page obtained from the Markov model with the probability that we predict. The results help administrators adjust the web structure and content to make user navigation more convenient.
Abstract (third language)
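Table of contents
Chapter 1 Introduction 1
  1.1 Research Background and Motivation 1
  1.2 Research Objectives 2
  1.3 Thesis Organization 3
Chapter 2 Background 4
  2.1 Web Mining and Related Work 4
  2.2 Stochastic Timed Petri Nets (STPN) 13
  2.3 Building the STPN Incidence Matrix to Assist Path Completion 15
  2.4 Markov Chains 20
  2.5 Website Log Files 22
Chapter 3 Research Method 26
  3.1 Description of the Study Subject 27
  3.2 Transition Probabilities of the Markov Chain 30
  3.3 Page Browsing Rate 33
  3.4 Proportion of Page Browsing Time 36
Chapter 4 Statistical Analysis 40
  4.1 Data Preprocessing 40
  4.2 Comparing Browsing Rate and Proportion of Page Browsing Time 42
  4.3 Analysis of the Distribution of Average Page Browsing Time 46
  4.4 Adjusting the Time Parameters 48
Chapter 5 Conclusion 49
  5.1 Conclusion 49
  5.2 Future Work 50
  5.3 Research Limitations 50
References 51
Appendix: English Paper 54
List of Figures
  Figure 2.1 Classification of web mining 4
  Figure 2.2 Srivastava's web usage mining process 8
  Figure 2.3 Preprocessing flow 11
  Figure 2.4 STPN representation of the website structure 17
  Figure 2.5 Website structure graph (directed graph) 18
  Figure 2.6 Website log file excerpt 23
  Figure 3.1 Original website structure 30
  Figure 3.2 Website structure 31
  Figure 4.1 Website log file excerpt 40
  Figure 4.2 User browsing paths for Figure 4.1 41
List of Tables
  Table 2.1 Main content of the web pages 16
  Table 2.2 STPN incidence matrix for the page content range of Figure 2.4 17
  Table 2.3 Incidence matrix [Cij], 7×10 17
  Table 2.4 W3C extended log file format 24
  Table 3.1 Main content of the web pages 28
  Table 3.2 Mapping of URLs to page codes 28
  Table 3.3 STPN incidence matrix 29
  Table 3.4 Page link counts (estimated) 32
  Table 3.5 Page browsing rates after two-step transitions 35
  Table 3.6 Page browsing rates after long-run transitions (estimated) 36
  Table 3.7 Normalized page browsing rates (estimated) 37
  Table 3.8 Average page browsing time (estimated) 38
  Table 3.9 Proportion of page browsing time (estimated) 39
  Table 4.1 Mapping of URLs to page codes 40
  Table 4.2 Page link counts (actual) 42
  Table 4.3 Average page browsing time (actual) 42
  Table 4.4 Link probability matrix 43
  Table 4.5 Page browsing rates after long-run transitions (actual) 43
  Table 4.6 Normalized page browsing rates (actual) 44
  Table 4.7 Proportion of page browsing time (actual) 44
  Table 4.8 Comparison of page browsing rates 45
  Table 4.9 Comparison of browsing time proportions 45
  Table 4.10 Exponential distribution 47
  Table 4.11 Link counts from page I to page C and the exponential distribution 47
  Table 4.12 Adjusted proportion of page browsing time 48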
References
[1] Achim Kraiss and Gerhard Weikum, "Integrated document caching and prefetching in storage hierarchies based on Markov-chain predictions", The VLDB Journal, Vol. 7, pp. 141-162, 1998.
[2] Magdalini Eirinaki and Michalis Vazirgiannis, "Web mining for web personalization", ACM Transactions on Internet Technology, Vol. 3, No. 1, pp. 1-27, 2003.
[3] Robert Cooley, "The Use of Web Structure and Content to Identify Subjectively Interesting Web Usage Patterns", ACM Transactions on Internet Technology, Vol. 3, No. 2, pp. 93-116, May 2003.
[4] 陳清祥, "Building a Web Structure Model Using Stochastic Timed Petri Nets" (in Chinese), Master's thesis, Department of Computer Science and Information Engineering, Tamkang University, 2005.
[5] 季振忠, "Using Stochastic Timed Petri Nets to Assist Data Preprocessing in Web Mining" (in Chinese), Master's thesis, Department of Computer Science and Information Engineering, Tamkang University.
[6] Sanjay Kumar Madria, Sourav S. Bhowmick, W.K. Ng, and E.P. Lim, "Research Issues in Web Data Mining", In Proceedings of the First International Conference on Data Warehousing and Knowledge Discovery, Vol. 1676, pp. 303-312, 1999.
[7] M. Spiliopoulou, "Data mining for the Web", In Proceedings of Principles of Data Mining and Knowledge Discovery, PKDD'99, pp. 588-589, 1999.
[8] Jaideep Srivastava, Robert Cooley, Mukund Deshpande, and Pang-Ning Tan, "Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data", SIGKDD Explorations, Vol. 1, Issue 2, pp. 12-23, Jan. 2000.
[9] Robert Cooley, Bamshad Mobasher, and Jaideep Srivastava, "Data Preparation for Mining World Wide Web Browsing Patterns", Journal of Knowledge and Information Systems, 1999.
[10] M.S. Chen, J.S. Park, and P.S. Yu, "Data mining for path traversal patterns in a Web environment", In Proc. of the 16th International Conference on Distributed Computing Systems, pp. 385-392, 1996.
[11] Federico Michele Facca and Pier Luca Lanzi, "Mining Interesting Knowledge from Weblogs: A Survey", Journal of Data and Knowledge Engineering, Vol. 53, Issue 3, pp. 225-241, 2005.
[12] Birgit Hay, Geert Wets, and Koen Vanhoof, "Clustering Navigation patterns on a website using a sequence alignment method", In Intelligent Techniques for Web Personalization: IJCAI 2001, 17th International Joint Conference on Artificial Intelligence, August 4, Seattle, Wash., USA, pp. 1-6, 2001.
[13] Shigeru Oyanagi, Kazuto Kubota, and Akihiko Nakase, "Application of matrix clustering to web log analysis and access prediction", In WEBKDD 2001 - Mining Web Log Data Across All Customers Touch Points, Third International Workshop, 2001.
[14] Tadao Murata, "Petri Nets: Properties, Analysis and Applications", Proceedings of the IEEE, Vol. 77, No. 4, 1989.
[15] Robert Cooley, "The Use of Web Structure and Content to Identify Subjectively Interesting Web Usage Patterns", ACM Transactions on Internet Technology, Vol. 3, No. 2, pp. 93-116, May 2003.
[16] 陳伯榮, 楊士央, and 何仁中, "Applying Stochastic Timed Petri Nets to Enhance Web User Behavior Mining" (in Chinese), 2004 Symposium on Digital Life and Internet Technologies, National Cheng Kung University, Kuang-Fu Campus, 2004.
[17] Pierre Bremaud, "Markov Chains", Springer, 2001.
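Full-text access rights	On campus: print thesis available immediately; author consents to campus-wide access to the electronic full text; on-campus electronic thesis embargoed until 2008-01-01; on-campus bibliographic record available immediately. Off campus: author consents to licensing to database vendors; off-campus electronic thesis embargoed until 2008-01-01.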
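
Illustrative sketch (not taken from the thesis): the abstract describes counting link traversals, forming a Markov transition matrix, and deriving limiting probabilities and browsing-time shares. The record does not give the thesis's actual procedure or data, so the Python below is only one plausible reading. The three-page site, its link counts, its mean browsing times, and all function names are hypothetical, and power iteration is an assumed way of obtaining the limiting probabilities.

```python
def transition_matrix(link_counts):
    """Row-normalize link counts into one-step transition probabilities."""
    matrix = []
    for row in link_counts:
        total = sum(row)
        if total == 0:
            raise ValueError("every page needs at least one outgoing link")
        matrix.append([count / total for count in row])
    return matrix


def limiting_probabilities(P, tol=1e-12, max_iter=10_000):
    """Approximate the limiting distribution by repeatedly applying pi <- pi P.

    Assumes the chain is irreducible and aperiodic, so the limit exists.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("no convergence; the chain may be periodic or reducible")


def time_proportions(pi, mean_times):
    """Weight each page's limiting probability by its mean browsing time."""
    weighted = [p * t for p, t in zip(pi, mean_times)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical data: link_counts[i][j] = traversals from page i to page j.
link_counts = [
    [0, 30, 10],
    [20, 0, 20],
    [10, 10, 0],
]
mean_times = [10.0, 20.0, 30.0]  # hypothetical mean seconds spent per page

P = transition_matrix(link_counts)
pi = limiting_probabilities(P)
shares = time_proportions(pi, mean_times)

for page, (rate, share) in enumerate(zip(pi, shares)):
    print(f"page {page}: browsing rate {rate:.4f}, time share {share:.4f}")
```

Under these assumptions, running the same functions once on link counts predicted from the site structure and once on counts taken from the logs would give the two sets of figures that the abstract says are compared.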


For questions, please contact:
Library Digital Information Section, (02)2621-5656 ext. 2487, or by email