§ 瀏覽學位論文書目資料
  
系統識別號 U0002-0307201819225200
DOI 10.6846/TKU.2018.00083
論文名稱(中文) 基於頻繁項目集之增量協同過濾推薦
論文名稱(英文) Incremental Collaborative Filtering based on Frequent Itemset mining
第三語言論文名稱
校院名稱 淡江大學
系所名稱(中文) 資訊工程學系全英語碩士班
系所名稱(英文) Master's Program, Department of Computer Science and Information Engineering (English-taught program)
外國學位學校名稱
外國學位學院名稱
外國學位研究所名稱
學年度 106
學期 2
出版年 107
研究生(中文) 劉融潞
研究生(英文) Jung-Lu Liu
學號 605780013
學位類別 碩士
語言別 英文
第二語言別
口試日期 2018-06-29
論文頁數 35頁
口試委員 指導教授 - 王英宏(inhon@mail.tku.edu.tw)
委員 - 陳以錚(ycchen@mgt.ncu.edu.tw)
委員 - 惠霖(121678@mail.tku.edu.tw)
委員 - 王英宏(inhon@mail.tku.edu.tw)
關鍵字(中) 增量式頻繁項目集探勘
規範序列樹
協同過濾
推薦系統
關鍵字(英) Incremental Frequent Pattern Mining
Canonical Order Tree
Collaborative Filtering
第三語言關鍵字
學科別分類
中文摘要
近年來,推薦系統的技術越來越蓬勃發展,透過協同過濾的方式,根據用戶的興趣或購買行為,對物品的「評分」或「偏好」,向用戶進行推薦,此技術也廣泛應用在各個領域, 例如:電影、書籍、美食…等。傳統的協同過濾是透過用戶相似的偏好,去預測你個人的偏好,進一步把其他跟你相似的人所喜愛的物品推薦給你,達到個人化的推薦效果。然而,對於新進的用戶,該如何進行有效的推薦,是一個值得思索的問題。因此本研究運用使用者對電影的喜好,透過樹的建構,將每筆資料依序插入樹中。當找出頻繁項目集後,再利用協同過濾,找出相似的使用者,進一步去做推薦。本文也將對於新使用者的加入,透過增量式挖掘,將採用規範序列樹Canonical-order tree (Can Tree)方法。將資料庫的資料都儲存於Can Tree中再以FP-growth的方式對Can Tree進行探勘,找出頻繁項目集。接著利用協同過濾的方式,計算使用者之間的相似性,找出要對使用者推薦的項目集清單,最後使用預測函數進行推薦。
英文摘要
In recent years, the technique of recommending systems has been booming. Through collaborative filtering, users are given "ratings" or "preferences" for items based on their interest or shopping behavior. This technique is also widely used in various parts such as movies, books or search queries.
Traditional collaborative filtering is based on the user's similar preferences, to predict your personal preferences, and recommend you the preferences as others like you, to achieve personalized recommendations. However, it is a worthwhile issue to consider how to make effective recommendation for new users. Therefore, in this study, we used the user's preference for the movie. Through the construction of the tree, each data will be inserted into the tree. After we find out the frequent itemsets, then we combined with collaborative filtering to find similar users, and further to do the recommendation. For the incremental data, we use the method called Canonical-order tree (Can Tree). It stores the data of the transaction database in the Can Tree, and then explore the Can Tree by FP-growth and find the frequent itemsets.
Then, we use collaborative filtering to calculate the similarity between users, and find a list of items to recommend to users. Finally, using the prediction function to make a recommendation.
第三語言摘要
論文目次
Table of Contents

Chinese Abstract………………………………………………………I
Abstract…………………………………………………………………II
Table of Contents……………………………………………………IV
List of Figures…………………………………………………………VI
List of Tables…………………………………………………………VII
Chapter 1………………………………………………………………1
Introduction……………………………………………………………1
Chapter 2…………………………………………………………5
Related Work……………………………………………………………5
2.1 Frequent Pattern Growth(FP-growth) algorithm……………………………………………………………5
2.2 AFPIM(Adjusting FP-tree for Incremental Mining)……………5
2.3 CATS Tree(Compressed and Arranged Transaction Sequences Tree)……………………………………………………………………6
2.4 Canonical-order tree (CanTree)……………7
2.5 Collaborative Filtering……………7
2.5.1 User-Based CF……………8
2.5.2 Item-Based CF……………8
2.6 Hybrid Work……………9
Chapter 3……………10
Preliminaries……………10
3.1 Definition……………10
Chapter 4……………11
Proposed Method……………11
4.1 Research Framework……………11
4.2 Build User/Item Matrix……………	12
4.3 Extract like/unlike patterns……………12
4.3.1 like patterns……………13
4.3.2 Unlike patterns……………14
4.4 Build Canonical Order Tree……………14
4.4.1 Mine Canonical Order Tree……………15
4.4.2 Incremental mining……………16
4.5 Similarity Cacluation……………16
4.5.1 Jaccard Coefficient……………17
4.5.2 User-based Collaborative Filtering(Pearson)……………18
4.5.3 Hybrid……………18
4.6 Predict……………18
Chapter 5……………20
Experiments……………20
5.1 Experiments evolution……………20
5.1.1 Data set……………20
5.1.2 Evaluation metrics……………20
5.2 Experimental: MAE, RMSE……………21
5.2.1 MAE and RMSE in different min support……………21
5.2.2 MAE in different data size with different α in Hybrid……………24
5.2.3 MAE and RMSE in different similarity calculation…24
5.2.4 MAE and RMSE in different training ratio…25
5.2.5 Precision, Recall and F-measure……………27
Chapter 6 Conclusion……………32
Reference……………33



List of Figures
Figure 1: The concept of user-based and item-based filtering……………2
Figure 2: Research framework……………11
Figure 3: Canonical Order Tree Established……………15
Figure 4: Mine Canonical Order Tree……………15
Figure 5: MAE using different min support(100K)……………22
Figure 6: RMSE using different min support(100K)……………22
Figure 7: MAE using different min support(1M)……………23
Figure 8: RMSE using different min support(1M)……………23
Figure 9: MAE in different data size with different α in Hybrid method……………24
Figure 10: MAE in different training ratio from 60% - 90%(100K)……………25
Figure 11: RMSE in different training ratio from 60% - 90%(100K)……………26
Figure 12: MAE in different training ratio from 60% - 90%(1M)……………26
Figure 13: RMSE in different training ratio from 60% - 90%(1M)……………27
Figure 14: Classification Accuracy Measure related figure	……………27
Figure 15: Top-k Precision(100K)……………29
Figure 16: Top-k Precision(1M)……………29
Figure 17: Top-k Recall(100K)……………30
Figure 18: Top-k Recall(1M)……………30
Figure 19: Top-k F-measure(100K)……………31
Figure 20: Top-k F-measure(1M)……………31
List of Tables
Table 1: User/ Item Matrix……………12
Table 2: Like/ Unlike Patterns Matrix……………13
Table 3:Like Pattern (a)Rating Score (b)Symbol……………13
Table 4: Unlike Patterns (a)Rating Score(b)Symbol……………14
Table 5: Update database……………16
Table 6: Jaccard Similarity Calculation……………17
Table 7: The Characteristics of MovieLens dataset……………20
Table 8: MAE and RMSE in different similarity calculation	……………24
參考文獻
[1] G. Shani, and A. Gunawardana, “Evaluating recommendation systems.” Recommender Systems Handbook, pp.257-297, 2011
[2] http://3png.com/a-11029303.html
[3] https://ibaotu.com/sucai/208159.html
[4] Q. Li, and B.M Kim, “An approach for combining content-based and collaborative filters”, Proceedings of the Sixth International Workshop on Information Retrieval with Asian Languages, pp. 17-24 July 2003.
[5] J. Han, J. Pei, Y. Yin. “Mining Frequent Patterns without Candidate Generation.” ACM-SIGMOD, pp. 1-12, 2000.
[6] K. Wang, L. Tang, J. Han, and J. Liu “Top down FP Growth for Association Rule Mining.” Proceedings of Pacific-Asia Conference, PAKDD, pp. 334-340, 2002.
[7] G. Grahne, J. Zhu. “Fast Algorithms for Frequent Itemset Mining Using FP-Trees.” Transactions on Knowledge and Data Engineering IEEE, pp. 1-16, 2005.
[8] J. L Koh, S. F Shieh. “An Efficient Approach for Maintaining Association Rules Based on Adjusting FP-tree Structures”. Proceedings of the Database Systems for Advanced Applications, pp. 417-424, 2004.
[9] W. Cheung and O. R. Zaïane. “Incremental Mining of Frequent Patterns without Candidate Generation or Support Constraint.” Proceedings of the Seventh International Database Engineering and Applications Symposium (IDEAS’03) IEEE, 2003.
[10] C. K.-S. Leung, Q. I. Khan, Z. Li, T. Hoque “CanTree: a canonical-order tree for incremental frequent-pattern mining.” Knowledge Information System, 2007.
[11] G. Grahne, J. Zhu. “Fast Algorithms for Frequent Itemset Mining Using FP-Trees.” Transactions on Knowledge and Data Engineering IEEE, pp. 1-16, 2005.
[12] M. D. Ekstrand, J. T. Riedl, and J. A. Konstan. “Collaborative filtering recommender systems.” Foundations and Trends in Human-Computer Interaction, pp. 81–173, 2010.
[13] B. Sarwar, G. Karypis, J. Konstan, J. Riedl: “Item-based Collaborative Filtering Recommendation Algorithms.” ACM, 2001.
[14] M.A Ghazanfar, and A.P Bennett: “A scalable, accurate hybrid recommender system”. In Knowledge Discovery and Data Mining, WKDD’10. Third International Conference on IEEE, pp. 94-98, 2010.
[15] J. Kelleher and D. Bridge. “Rectree centroid: An accurate, scalable collaborative recommender.” Proceeding of the Fourteenth Irish Conference on Artificial Intelligence and Cognitive Science, pp. 89–94, 2003.
[16] C. F. Ahmed, S. K. Tanbeer, B.S. Jeong, and Y.K. Lee, “Efficient tree structures for high utility pattern mining in incremental databases” Knowledge and Data Engineering, IEEE, pp. 1708–1721, 2009.
[17] J.S. Breese, D. Heckerman, and C. Kadie, “Empirical Analysis of Predictive Algorithms for Collaborative Filtering,” Proc. 14th Conf. Uncertainty in Artificial Intelligence, 1998.
[18] http://movielens.org
[19] J. Herlocker, J.A Konstan, L. Terveen, J. Riedl, “Evaluating collaborative filtering recommender systems”. ACM Transactions on Information Systems, 2004.
[20] T. Hofmann, “Collaborative filtering via Gaussian probabilistic latent semantic analysis.” Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2003.
[21] H.h Do, “Schema Matching and Mapping-Based Data Integration”, Architecture, Approaches and Evaluation, 2007.
論文全文使用權限
校內
校內紙本論文立即公開
同意電子論文全文授權校園內公開
校內電子論文立即公開
校外
同意授權
校外電子論文立即公開

如有問題,歡迎洽詢!
圖書館數位資訊組 (02)2621-5656 轉 2487 或 來信