§ Browse Thesis Bibliographic Record
  
System ID U0002-1908201413162700
DOI 10.6846/TKU.2014.00741
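Title (Chinese) 基於時間序列探勘之適性化數位學習元件管理暨檢索機制
Title (English) An Adaptive Learning Object Management and Search Mechanism based on Time-Series Mining
Title (third language)
Institution Tamkang University
Department (Chinese) 資訊工程學系博士班
Department (English) Department of Computer Science and Information Engineering
Foreign degree: university
Foreign degree: college
Foreign degree: graduate institute
Academic year 102 (ROC calendar; 2013-2014)
Semester 2
Publication year 103 (ROC calendar; 2014)
Author (Chinese) 嚴昱文
Author (English) Yu-Wen Yen
Student ID 897410105
Degree Doctoral
Language English
Second language
Oral defense date 2014-06-23
Pages 57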
Committee Advisor - 趙榮耀
Member - 洪啟舜
Member - 施國琛
Member - 許輝煌
Member - 顏淑惠
Member - 趙榮耀
Keywords (Chinese) 使用者生成資料
資料探勘
資訊檢索
時間序列
社群網路分析
數位學習
Keywords (English) User-generated data
Data mining
Information retrieval
Time-series
Social network analysis
E-learning
Keywords (third language)
Subject classification
Chinese abstract (English translation)
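In recent years, the rapid development of information technology has turned the World Wide Web into a platform for interaction. Although the participants in these interactions, namely users and their related events, differ from one another in every respect, a large and complex volume of information can certainly be expected. This phenomenon makes information harder to manage, retrieve, and reuse, and it also lowers the value of the information itself. In this thesis, we propose effective methods for managing user-generated data and the information derived from it, and we attempt to use that experience to implement user-centric services.
This thesis focuses on the meaningful management and reuse of user-generated data, especially in support of e-learning activities. First, we propose a state machine for managing user-generated data; its main purpose is to explicitly record the relations among such data and among the information derived from it. To improve the accuracy of the data model, we then propose, on top of the state machine design, a time-series mining algorithm that processes the interactions among data within a specific time window. Finally, on this basis, we implement a database management system and a data retrieval service to reduce the complexity users face when searching for e-learning resources. We collected data created by 500 users over the past five years on the social media they use (e.g., Facebook and Twitter) and used it to evaluate performance and feasibility. The experimental results confirm that the proposed data processing method and retrieval service can effectively cope with the complexity of information retrieval in e-learning activities.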
English abstract
Recent advances in information technology have made the World Wide Web the main platform for interactions, in which participants (users and corresponding events) are triggered. Although the participants vary from scenario to scenario, a considerable amount of data is generated. This phenomenon complicates information retrieval, management, and reuse, and at the same time reduces the value of the data. In this thesis, we attempt to achieve efficient management of user-generated data and its derivative contexts in support of human users.
This thesis concentrates on the meaningful reuse of user-generated data, especially for learning purposes, through an efficient and purpose-built data management process. First, an intelligent state machine, which is central to the user-generated data processing scenario, was developed to identify relations between data and its derivative contexts, especially for data that is accessed frequently and in a timely manner. To improve the accuracy of data correlation modeling, a temporal mining algorithm is then defined. This algorithm highlights the event in which a data item is accessed and further examines its attributes relative to other correlated items. Last but not least, we present a conceptual scenario of human-centric search to demonstrate the proposed approach. Performance and feasibility are shown by experiments conducted on data collected from open social networks (e.g., Facebook and Twitter) over the past few years, covering around 500 users and 8,000,000 items of content they shared.
Third-language abstract
Table of contents
TABLE OF CONTENTS
CHAPTER I.    Introduction	1
1.1 Background	2
1.2 Motivation and Contributions	4
1.3 Thesis Organization	5
CHAPTER II.    Literature Review	7
2.1 Design and Applications of State Machine	8
2.2 Social Data Analysis and Extraction	10
2.3 Temporal Information Mining	12
2.4 Summary	14
CHAPTER III.  Intelligent State Machine	15
3.1 Definition	16
3.2 Formulation of Intelligent State Machine	17
3.3 Execution of ISM	22
3.4 Exception Control in ISM	25
3.5 Quantification of Connections	32
3.5.1 Adding Temporal Information	32
3.5.2 Considering Usage (Co-usage) Information	33
CHAPTER IV.  ISM-based Search	35
4.1 Facilitating the Search Process	36
4.1.1 The Weight Function	36
4.1.2 The Rank Function	38
4.2 Query Revision and Suggestion	39
CHAPTER V.   The Experiments	43
5.1 The Data Set	44
5.2 Accuracy of ISM-based Data Management System	45
5.3 Reuse Rate of ISM-based System	48
5.4 Performance of ISM-based Search Support	50
CHAPTER VI.  Conclusions	52
6.1 Summary of Thesis	53
6.2 Future Work	54
Bibliography	55

LIST OF FIGURES
Figure 1. Concept of data management	16
Figure 2. Basic Elements of ISM	18
Figure 3. A transition with empty event	25
Figure 4. A decision transition with non-empty event	25
Figure 5. Illustration of search scenario	39
Figure 6. A P-R performance comparison between ISM-empowered search system and Google Customized-based search system	51

LIST OF TABLES
Table 1. Algorithm of query revision	40
Table 2. Average and standard deviation on the accuracy of implemented classifiers (raw dataset)	45
Table 3. Average and standard deviation on the accuracy of implemented classifiers (pre-processed dataset)	46
Table 4. Average accuracy of classifier-in-parallel in an ISM-based system (pre-processed dataset)	46
Table 5. User feedbacks of applied search service	48
References
Aarts, F.; Jonsson, B.; Uijen, J. (2010) “Generating Models of Infinite-State Communication Protocols Using Regular Inference with Abstraction,” Testing Software and Systems, 6435, 188-204
Bose, I.; Mahapatra, R.K. (2001) “Business data mining – a machine learning perspective,” Information and Management, 39, 3, 211-225
Carpineto, C.; Osinski, S.; Romano, G.; Weiss, D. (2009) “A survey of Web clustering engines,” ACM Computing Surveys, 41, 3, 17
Cavalli, A.; Gervy, C.; Prokopenko, S. (2003) “New approaches for passive testing using an Extended Finite State Machine specification,” Information and Software Technology, 45, 12, 837-852
Chen, Y.; Dong, G.; Han, J.; Wah, B.W.; Wang, J. (2002) “Multi-dimensional regression analysis of time-series data streams,” Proceedings of the 28th international conference on Very Large Data Bases, 323-334
Cheng, K.T.; Krishnakumar, A.S. (1996) “Automatic generation of functional vectors using the extended finite state machine model,” ACM Transactions on Design Automation of Electronic Systems, 1, 1, 57-79
Culotta, A.; Bekkerman, R.; McCallum, A. (2004) “Extracting social networks and contact information from email and the Web,” In Proceedings of CEAS-1
Erickson, T.; Kellogg, W.A. (2000) “Social translucence: an approach to designing systems that support social processes,” ACM Transactions on Computer-Human Interaction, 7, 1, 59-83
Esling, P.; Agon, C. (2012) “Time-series data mining,” ACM Computing Surveys, 45, 1, 12
Fagni, T.; Perego, R.; Silvestri, F.; Orlando, S. (2006) “Boosting the performance of Web search engines: Caching and prefetching query results by exploiting historical usage data,” ACM Transactions on Information Systems, 24, 1, 51-78
Faloutsos, C.; McCurley, K.S.; Tomkins, A. (2004) “Fast discovery of connection subgraphs,” In Proc. ACM SIGKDD 2004
Gaber, M.M.; Zaslavsky, A.; Krishnaswamy, S. (2005) “Mining data streams: a review,” ACM SIGMOD Record, 34, 2, 18-26
Glynn Mangold, W.; Faulds, D.J. (2009) “Social media: The new hybrid element of the promotion mix,” Business Horizons, 52, 4, 357-365
Guralnik, V.; Srivastava, J. (1999) “Event detection from time series data,” Proceedings of the fifth ACM SIGKDD International conference on Knowledge discovery and data mining, 33-42
Harada, M.; Sato, S.; Kazama, K. (2004) “Finding authoritative people from the web,” In Proc. Joint Conference on Digital Libraries
Hoheisel, A.; Alt, M. (2007) “Petri Nets,” Workflows for e-Science, 190-207
Hong, J.E.; Bae, D.H. (2000) “Software modeling and analysis using a hierarchical object-oriented Petri net,” Information Sciences, 130, 1-4, 133-164
Jensen, K.; Kristensen, L.M.; Wells, L. (2007) “Coloured Petri Nets and CPN Tools for modelling and validation of concurrent systems,” International Journal on Software Tools for Technology Transfer, 9, 3, 213-254
Joachims, T.; Granka, L.; Pan, B.; Hembrooke, H.; Gay, G. (2005) “Accurately interpreting clickthrough data as implicit feedback,” Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 154-161
Karsai, M.; Kivela, M.; Pan, R.K.; Kaski, K.; Kertesz, J.; Barabasi, A.-L.; Saramaki, J. (2011) “Small but slow world: How network topology and burstiness slow down spreading,” Physical Review E, 83, 2
Keogh, E.; Kasetty, S. (2003) “On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration,” Data Mining and Knowledge Discovery, 7, 4, 349-371
Kozłowski, T.; Dagless, E.; Saul, J.; Adamski, M.; Szajna, J. (1995) “Parallel controller synthesis using Petri nets,” IEE Proceedings – Computers and Digital Techniques, 142, 4, 263-271
Lee, K.; Agrawal, A.; Choudhary, A. (2013) “Real-time disease surveillance using Twitter data: demonstration on flu and cancer,” Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1474-1477
Li, L.; Hadjiicostis, C.N.; Sreenivas, R.S. (2008) “Designs of Bi-similar Petri Net Controllers With Fault Tolerance Capabilities,” IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, 38, 1, 207-217
Liu, B.; Liu, Y.K. (2002) “Expected value of fuzzy variable and fuzzy expected value models,” IEEE Transactions on Fuzzy Systems, 10, 4, 445-450
Mandal, S.N.; Choudhury, J.P.; Chaudhuri, S.R.B.; De, D. (2008) “Soft Computing Approach in Prediction of A Time Series Data,” Journal of Theoretical & Applied Information Technology, 4, 12, 1131-1141
Mika, P. (2005) “Ontologies are us: A unified model of social networks and semantics,” In Proc. ISWC2005
Milanovic, N.; Malek, M. (2004) “Current solutions for Web service composition,” IEEE Internet Computing, 8, 6, 51-59
Mitra, S.; Pal, S.K.; Mitra, P. (2002) “Data mining in soft computing framework: a survey,” IEEE Transactions on Neural Networks, 13, 1, 3-14
Pais, R.; Gomes, L.; Paulo Barros, J. (2011) “From UML State Machines to Petri nets: History Attribute Translation Strategies,” The 37th Annual Conference on IEEE Industrial Electronics Society, 3776-3781
Rocchio, J. (1971) “Relevance Feedback in Information Retrieval,” The SMART Retrieval System: Experiments in Automatic Document Processing, 313-323
Roya, M.; Chang, R.; Qi, X. (2007) “Learning From Relevance Feedback Sessions Using A K-Nearest-Neighbor-Based Semantic Repository,” Proc. of IEEE International Conference on Multimedia and Expo, 1994-1997
Salimifard, K.; Wright, M. (2001) “Petri net-based modeling of workflow systems: An overview,” European Journal of Operational Research, 134, 3, 664-676
Schadt, E.E.; Linderman, M.D.; Sorenson, J.; Lee, L.; Nolan, G.P. (2010) “Computational solutions to large-scale data management and analysis,” Nature Reviews Genetics, 11, 647-657
Shtykh, R.Y.; Jin, Q. (2011) “A human-centric integrated approach to web information search and sharing,” Human-centric Computing and Information Sciences, 1:2
Steyvers, M.; Tenenbaum, J.B. (2005) “The Large-Scale Structure of Semantic Networks: Statistical Analyses and a Model of Semantic Growth,” Cognitive Science, 29, 1, 41-78
Tay, F.E.H.; Cao, L. (2001) “Application of support vector machines in financial time series forecasting,” Omega, 29, 4, 309-317
Thelwall, M. (2001) “A web crawler design for data mining,” Journal of Information Science, 27, 5, 319-325
van der Aalst, W.M.P.; Song, M. (2004) “Mining Social Networks: Uncovering Interaction Patterns in Business Processes,” Business Process Management, LNCS 3080, 244-260
Yen, N.Y.; Shih, T.K.; Jin, Q. (2013) “LONET: An Interactive Search Network for Intelligent Path Generation,” ACM Transactions on Intelligent Systems and Technology, 4, 2, 30
Zhang, J.; Chang, C.K.; Chung, J.Y.; Kim, S.W. (2004) “WS-Net: a Petri-net based specification model for Web services,” IEEE International Conference on Web Services, 420-427
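Full-text access rights
On campus
Print copy available on campus immediately
Author consents to on-campus release of the electronic full text
Electronic copy available on campus immediately
Off campus
Authorization granted
Electronic copy available off campus immediately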
