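System ID | U0002-2407201721532400 |
---|---|
DOI | 10.6846/TKU.2017.00860 |
Thesis title (Chinese) | 基於時間間隔特徵樣式的分類方法 |
Thesis title (English) | A Classification Method Based on Interval Pattern Mining |
Thesis title (third language) | |
University | Tamkang University |
Department (Chinese) | 資訊工程學系碩士班 |
Department (English) | Department of Computer Science and Information Engineering |
Foreign degree university | |
Foreign degree college | |
Foreign degree graduate institute | |
Academic year (ROC calendar) | 105 |
Semester | 2 |
Publication year (ROC calendar) | 106 |
Graduate student (Chinese) | 王藝筌 |
Graduate student (English) | Yi-Chuan Wang |
Student ID | 604410166 |
Degree | Master's |
Language | English |
Second language | |
Oral defense date | 2017-07-19 |
Number of pages | 29 |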
Oral defense committee | Advisor - 陳以錚; Committee member - 施國琛; Committee member - 惠霖 |
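Keywords (Chinese) | Data mining; Sequential pattern mining; Time-interval pattern mining; Association-rule classifier |
Keywords (English) | Data mining; Interval-based Mining; Sequential Pattern Mining; Classification; Association Classification |
Keywords (third language) | |
Subject classification | |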
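Chinese abstract |
To mine and understand the information hidden in large volumes of data, association-rule-based methods are the best known and most widely used, and their most familiar application is classification: discovering distinctive relationships in the data in order to make predictions. Most studies focus on classification rules derived from sequential pattern mining over time-point data. In real applications, however, the temporal correlation and ordering that arise when events interact cannot be ignored, for example the usage times of electrical appliances or the onset times of a patient's symptoms. Because sequence data whose events are ordered occurs so widely in everyday life, we pay particular attention to this problem. In this study, we use an endpoint representation to express the relationships between events, adopt P-TPMiner (Probabilistic Temporal Pattern Miner) to find all frequent temporal sequences, and integrate pattern mining with a classification model to define a mechanism for computing classification rules, which predicts the class to which a sequence belongs. The experimental results show that this sequence classification method is efficient and scalable, and that its predictions are reliably accurate. |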
English abstract |
Most classification methods built on sequential pattern mining revolve around time point-based event data, and few studies use discovered temporal patterns for classification. However, in many real-world applications, events are related to one another in time. In this paper, these relationships are represented using a coincidence representation that extends Allen’s interval algebra. Moreover, we employ an efficient pattern mining algorithm, P-TPMiner (Probabilistic Temporal Pattern Miner), to discover frequent time interval-based patterns. Exploiting the discovered temporal patterns, we propose a classification method based on interval patterns. Experimental results show that the method is efficient and scalable, and that it achieves high accuracy on both synthetic and real datasets. |
Third-language abstract | |
Table of contents |
TABLE OF CONTENTS
CHINESE ABSTRACT
ABSTRACT
CHAPTER 1 INTRODUCTION
CHAPTER 2 RELATED WORK
CHAPTER 3 PRELIMINARIES
CHAPTER 4 PROBABILISTIC TEMPORAL PATTERN MINER
CHAPTER 5 METHOD
CHAPTER 6 EXPERIMENTAL RESULTS
CHAPTER 7 CONCLUSION
REFERENCES
LIST OF FIGURES
FIGURE 1: ILLUSTRATION OF TEMPORAL INTERVALS
FIGURE 2: ARCHITECTURE OF THE INTEGRATED PROCESSES
FIGURE 3: EXAMPLE EVENT SEQUENCE WITH FOUR EVENTS
FIGURE 4: P-TPMINER ALGORITHM TO DISCOVER THE OPP
FIGURE 5: FUNCTION P-TPSPAN
FIGURE 6: THE STRUCTURE OF THE CLASSIFICATION PROCEDURES
FIGURE 7: ALGORITHM TO ASSIGN CLASS LABEL BASED ON MAJORITY_VOTE
FIGURE 8: ALGORITHM TO ASSIGN CLASS LABEL BASED ON BEST_SELECTION
FIGURE 9: ALGORITHM TO ASSIGN CLASS LABEL BASED ON HYBRID
FIGURE 10: EXAMPLES OF GRADING THE RAW DATA
LIST OF TABLES
TABLE 1: ENDPOINT AND ENDTIME REPRESENTATIONS OF ALLEN’S 13 TEMPORAL RELATIONS
TABLE 2: THE CLASSIFYING CONDITION EXAMPLES OF ONE UNKNOWN SEQUENCE
TABLE 3: EXAMPLES OF THE DATABASE WHICH IS REPRESENTED BY ENDPOINT REPRESENTATION
TABLE 4: TESTING ACCURACY |
Full-text access permissions |