§ Thesis Bibliographic Record
  
System ID U0002-2308202112005700
DOI 10.6846/TKU.2021.00612
Title (Chinese) 基於PrefixSpan演算法的時間序列特徵模式社群網路探勘
Title (English) Social network mining of time sequence pattern base on PrefixSpan algorithm
Title (third language)
Institution Tamkang University
Department (Chinese) 資訊工程學系碩士班
Department (English) Department of Computer Science and Information Engineering
Foreign degree: university
Foreign degree: college
Foreign degree: institute
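Academic year 109 (ROC calendar; 2020-2021)
Semester 2
Publication year 110 (ROC calendar; 2021)
Graduate student (Chinese) 翁愷澤
Graduate student (English) Kai-Ze Weng
Student ID 607410205
Degree Master's
Language Traditional Chinese
Second language
Oral defense date 2021-07-20
Pages 32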
Oral defense committee Advisor - 王英宏 (inhon@mail.tku.edu.tw)
Member - 陳以錚 (ycchen@mgt.ncu.edu.tw)
Member - 惠霖 (121678@mail.tku.edu.tw)
Keywords (Chinese) 資料探勘 (data mining)
時序性資料 (time-series data)
社群網路 (social network)
特徵模式探勘 (feature pattern mining)
Keywords (English) Data Mining
Time Sequence Data
Social Network
Sequence Pattern Mining
Keywords (third language)
Subject classification
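Abstract (Chinese, translated)
Time-series data is a common and voluminous data type: new data is generated and recorded at every moment. Social networks are online virtual communities that have emerged in recent years with the rapid growth of the Internet, and they have become an almost indispensable part of daily life; the network of relationships between people has in turn become an important research topic. The connections and interactions among users within a community form a large and complex social network, which records every user's activity status, posts, interactions with other users, and the time at which each activity occurs. The focus of this study is how to filter and process these large, noisy collections of social-network time-series data effectively and quickly. The proposed approach combines data analysis with social network datasets: a large volume of time-series data is first filtered to remove redundant information, each record is then tagged with its start and end time points, and the PrefixSpan algorithm is used to find frequently occurring temporal sequences of user activity, whose output is then recorded.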
Abstract (English)
Time-sequence data is a common and enormous data type: new data sequences are generated and recorded every second. Social networks are virtual online communities that have developed in recent years and, as a result, have become part of our daily lives. The connections between people have become another important research topic. The interactions and connections between users form a huge and complicated social network, which records every user's activity status, posts, interactions with other users, and the times at which those interactions occur. Faced with such enormous and messy social-network sequence datasets, how can we filter and process them efficiently? We propose an approach that takes a large amount of time-sequence data, filters out extraneous information, and marks the start and end time points of each record. We then use the PrefixSpan algorithm to find frequent user time-sequence patterns and record them for further use.
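The pipeline outlined in the abstract (tag each activity with its start and end time points, then run PrefixSpan over the resulting sequences) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names `tag_endpoints` and `prefixspan`, the "+"/"-" endpoint tokens, the toy `sessions` data, and the simplification that each sequence element is a single item rather than an itemset are all hypothetical.

```python
from collections import defaultdict


def tag_endpoints(intervals):
    """Flatten (label, start, end) activity intervals into a time-ordered token list.

    Each interval yields "label+" at its start time and "label-" at its end time.
    At equal timestamps, start tokens sort before end tokens (an arbitrary choice).
    """
    points = []
    for label, start, end in intervals:
        points.append((start, 0, label + "+"))
        points.append((end, 1, label + "-"))
    points.sort()
    return [token for _, _, token in points]


def prefixspan(sequences, min_support):
    """Return {pattern tuple: support} for every pattern found in >= min_support sequences.

    Simplified PrefixSpan: every sequence element is a single item (assumption).
    """
    results = {}

    def mine(prefix, projected):
        # projected holds one (sequence index, suffix offset) pair per supporting sequence.
        first_pos = defaultdict(dict)
        for seq_id, offset in projected:
            seq = sequences[seq_id]
            for pos in range(offset, len(seq)):
                # Remember only the earliest occurrence of each item in this suffix.
                first_pos[seq[pos]].setdefault(seq_id, pos)
        for item in sorted(first_pos):
            occurrences = first_pos[item]
            if len(occurrences) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(occurrences)
                # Project onto the suffixes that follow the matched item, then grow the prefix.
                mine(pattern, [(sid, pos + 1) for sid, pos in occurrences.items()])

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical toy data: one list of (activity, start, end) intervals per user.
sessions = [
    [("post", 1, 3), ("chat", 2, 5)],
    [("post", 0, 4), ("chat", 1, 6)],
    [("chat", 0, 2), ("post", 3, 4)],
]
sequences = [tag_endpoints(s) for s in sessions]
patterns = prefixspan(sequences, min_support=3)
for pattern, support in sorted(patterns.items()):
    print(pattern, support)
```

With this toy data and `min_support=3`, the pattern `("post+", "post-")` is supported by all three users, while `("post+", "chat-")` is not, because the third user finishes chatting before posting. Storing projections as (sequence index, offset) pairs avoids copying suffixes, which is the usual pseudo-projection trick for PrefixSpan.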
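Abstract (third language)
Table of contents
Chinese abstract I
English abstract II
List of figures V
List of tables VI
Chapter 1. Introduction 1
1.1. Foreword 1
1.2. Thesis organization 2
Chapter 2. Literature review 3
2.1. Data mining 3
2.2. Temporal patterns 6
2.3. Social networks 8
Chapter 3. System architecture 9
Chapter 4. Experimental design and results 18
4.1. Experimental data settings 18
4.2. Experimental results 20
Chapter 5. Conclusion 22
References 23
Appendix: English paper 28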

List of figures:
Figure 1. Allen’s 13 Temporal Pattern 7
Figure 2. System architectures 10
Figure 3. Program Describe 14
Figure 4. Algorithm of Pre-Processing 15
Figure 5. Algorithm of Add Time Tag 16
Figure 6. Algorithm of PrefixSpan 17
Figure 7. User Data After Filter 19
Figure 8. Pre-PrefixSpan result 20
Figure 9. Result After PrefixSpan 21
Figure 10. Final Result of Frequent User List 21

List of tables:
Table 1. Example of PrefixSpan 5
Table 2. Program Describe 11
Table 3. User sample Datasets 19
References
[1] R. Agrawal and R. Srikant, "Fast algorithms for mining association rules," in Proc. 20th int. conf. very large data bases, VLDB, 1994, vol. 1215: Citeseer, pp. 487-499.
[2] R. Agrawal and R. Srikant, "Mining sequential patterns," in Proceedings of the eleventh international conference on data engineering, 1995: IEEE, pp. 3-14.
[3] J. F. Allen, "Maintaining knowledge about temporal intervals," Communications of the ACM, vol. 26, no. 11, pp. 832-843, 1983.
[4] J. Chen, "An updown directed acyclic graph approach for sequential pattern mining," IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 7, pp. 913-928, 2009.
[5] Y.-C. Chen, J.-C. Jiang, W.-C. Peng, and S.-Y. Lee, "An efficient algorithm for mining time interval-based patterns in large database," in Proceedings of the 19th ACM international conference on Information and knowledge management, 2010, pp. 49-58.
[6] Y.-C. Chen, W.-C. Peng, and S.-Y. Lee, "Mining temporal patterns in time interval-based data," IEEE Transactions on Knowledge and Data Engineering, vol. 27, no. 12, pp. 3318-3331, 2015.
[7] J. Han, G. Dong, and Y. Yin, "Efficient mining of partial periodic patterns in time series database," in Proceedings 15th International Conference on Data Engineering (Cat. No. 99CB36337), 1999: IEEE, pp. 106-115.
[8] J. Han, J. Pei, B. Mortazavi-Asl, Q. Chen, U. Dayal, and M.-C. Hsu, "FreeSpan: frequent pattern-projected sequential pattern mining," in Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, 2000, pp. 355-359.
[9] J. Han et al., "Prefixspan: Mining sequential patterns efficiently by prefix-projected pattern growth," in proceedings of the 17th international conference on data engineering, 2001: Citeseer, pp. 215-224.
[10] J. Han, J. Pei, and Y. Yin, "Mining frequent patterns without candidate generation," ACM sigmod record, vol. 29, no. 2, pp. 1-12, 2000.
[11] F. Höppner and F. Klawonn, "Finding informative rules in interval sequences," in International Symposium on Intelligent Data Analysis, 2001: Springer, pp. 125-134.
[12] P.-S. Kam and A. W.-C. Fu, "Discovering temporal patterns for interval-based events," in International Conference on Data Warehousing and Knowledge discovery, 2000: Springer, pp. 317-326.
[13] S. Laxman, P. Sastry, and K. Unnikrishnan, "Discovering frequent generalized episodes when events persist for different durations," IEEE Transactions on Knowledge and Data Engineering, vol. 19, no. 9, pp. 1188-1201, 2007.
[14] M. Leleu, C. Rigotti, J.-F. Boulicaut, and G. Euvrard, "Constraint-based mining of sequential patterns over datasets with consecutive repetitions," in European Conference on Principles of Data Mining and Knowledge Discovery, 2003: Springer, pp. 303-314.
[15] Y. Li, J. Bailey, L. Kulik, and J. Pei, "Mining probabilistic frequent spatio-temporal sequential patterns with gap constraints from uncertain databases," in 2013 IEEE 13th International Conference on Data Mining, 2013: IEEE, pp. 448-457.
[16] M.-Y. Lin and S.-Y. Lee, "Fast discovery of sequential patterns by memory indexing," in International Conference on Data Warehousing and Knowledge Discovery, 2002: Springer, pp. 150-160.
[17] H. Mannila, H. Toivonen, and A. I. Verkamo, "Discovery of frequent episodes in event sequences," Data mining and knowledge discovery, vol. 1, no. 3, pp. 259-289, 1997.
[18] F. Masseglia, F. Cathala, and P. Poncelet, "The PSP approach for mining sequential patterns," in European Symposium on Principles of Data Mining and Knowledge Discovery, 1998: Springer, pp. 176-184.
[19] F. Mörchen and D. Fradkin, "Robust mining of time intervals with semi-interval partial order patterns," in Proceedings of the 2010 SIAM international conference on data mining, 2010: SIAM, pp. 315-326.
[20] F. Mörchen and A. Ultsch, "Efficient mining of understandable patterns from multivariate interval time series," Data mining and knowledge discovery, vol. 15, no. 2, pp. 181-215, 2007.
[21] M. Muzammal and R. Raman, "Mining sequential patterns from probabilistic databases," in Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2011: Springer, pp. 210-221.
[22] P. Papapetrou, G. Kollios, S. Sclaroff, and D. Gunopulos, "Discovering frequent arrangements of temporal intervals," in Fifth IEEE International Conference on Data Mining (ICDM'05), 2005: IEEE, p. 8 pp.
[23] D. Patel, W. Hsu, and M. L. Lee, "Mining relationships among interval-based events for classification," in Proceedings of the 2008 ACM SIGMOD international conference on Management of data, 2008, pp. 393-404.
[24] J. Pei et al., "Mining sequential patterns by pattern-growth: The prefixspan approach," IEEE Transactions on knowledge and data engineering, vol. 16, no. 11, pp. 1424-1440, 2004.
[25] R. Srikant and R. Agrawal, "Mining sequential patterns: Generalizations and performance improvements," in International conference on extending database technology, 1996: Springer, pp. 1-17.
[26] S. J. Van Schaik, D. Olteanu, and R. Fink, "Enframe: A platform for processing probabilistic data," arXiv preprint arXiv:1309.0373, 2013.
[27] R. Villafane, K. A. Hua, D. Tran, and B. Maulik, "Knowledge discovery from series of interval events," Journal of Intelligent Information Systems, vol. 15, no. 1, pp. 71-89, 2000.
[28] A. K. Wong, D. Zhuang, G. C. Li, and E.-S. A. Lee, "Discovery of delta closed patterns and noninduced patterns from sequences," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 8, pp. 1408-1421, 2011.
[29] S.-Y. Wu and Y.-L. Chen, "Mining nonambiguous temporal patterns for interval-based events," IEEE transactions on knowledge and data engineering, vol. 19, no. 6, pp. 742-758, 2007.
[30] J. Yang, W. Wang, and P. S. Yu, "Infominer: mining surprising periodic patterns," in Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, 2001, pp. 395-400.
[31] J. Yang, W. Wang, P. S. Yu, and J. Han, "Mining long sequential patterns in a noisy environment," in Proceedings of the 2002 ACM SIGMOD international conference on Management of data, 2002, pp. 406-417.
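Full-text access rights
On campus
Print copy available on campus immediately
Author consents to on-campus release of the electronic full text
Electronic copy available on campus immediately
Off campus
Authorization granted
Electronic copy available off campus immediately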
