System ID | U0002-2408202017485400 |
---|---|
DOI | 10.6846/TKU.2020.00710 |
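Thesis Title (Chinese) | 基於授課內容與學生學習偏好的課程推薦系統 |
Thesis Title (English) | A Course Recommendation System Based on Course Content and The Learning Preferences of Students |
Thesis Title (Third Language) | |
University | Tamkang University (淡江大學) |
Department (Chinese) | 電機工程學系碩士班 |
Department (English) | Department of Electrical and Computer Engineering |
Foreign Degree School Name | |
Foreign Degree College Name | |
Foreign Degree Graduate Institute Name | |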
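Academic Year | 108 (ROC calendar, 2019-2020) |
Semester | 2 |
Year of Publication | 109 (ROC calendar, 2020) |
Graduate Student (Chinese) | 魏齊佑 |
Graduate Student (English) | Qi-You Wei |
Student ID | 607450110 |
Degree | Master's |
Language | Traditional Chinese |
Second Language | English |
Oral Defense Date | 2020-07-11 |
Number of Pages | 57 pages |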
Oral Defense Committee | Advisor - 衛信文 (hwwei@mail.tku.edu.tw); Committee Member - 李維聰 (wtlee@mail.tku.edu.tw); Committee Member - 朱國志 (kuochih.chu@gmail.com) |
Keywords (Chinese) |
推薦系統; 協同過濾 |
Keywords (English) |
Recommendation System; Collaborative Filtering |
Keywords (Third Language) | |
Subject Classification | |
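Chinese Abstract (translated) |
As technology has evolved, information circulates rapidly on the Internet, bringing major breakthroughs in commerce, media, academia, and other fields. In commerce, virtual platforms have closed the distance between buyers and sellers; media outlets deliver information to people faster online; and academic exchange has benefited greatly, since people no longer need to visit a library to find material. Information is now easy to obtain, but choosing among it has become a major difficulty, which gave rise to recommendation systems. Most recommendation systems recommend on the basis of personal preference or popularity, but the result is often not the ideal answer. The same problem appears in education: under the current trend of self-directed and borderless learning, how students can find courses that suit them among the many on offer is the research motivation of this thesis. This thesis therefore aims to build a recommendation system for course learning that identifies the courses related to a user and recommends them to users who are looking for courses. Courses were chosen as the data to analyze for three main reasons. First, students are often unable to choose among cross-domain courses because they are unfamiliar with the field. Second, teaching methods differ, so under the same course topic, different teachers' instruction may develop different abilities in students. Third, students' learning-style preferences and interests may suit them to different courses. The recommendation system is intended to help students find suitable courses. The thesis first uses a pretest questionnaire to understand the relationships among student traits, learning styles, and teaching methods, and then designs a formal questionnaire to collect student data and students' ratings of different courses on various aspects. Next, KNN and SVD are used to assess relevance and similarity, and collaborative filtering is used to recommend courses to users. Finally, MAE and RMSE are used to evaluate the prediction accuracy of KNN and SVD. As future work, course discussions from university student platforms such as Facebook, Dcard, and PTT could be added: the positive or negative sentiment of comments about a course could be taken into account, and semantic recognition could further improve the precision of the recommendation system. |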
English Abstract |
With the evolution of technology, information circulates rapidly on the Internet, and there have been great breakthroughs in commerce, media, academia, and other fields. In business, for example, virtual platforms have shortened the distance between buyers and sellers; emerging media technology delivers information to people quickly; and knowledge is easy to share in the academic field. People no longer need to go to a library to find the data or information they want. It is now very easy to obtain a large amount of data; however, finding the information that people really need has become a difficult problem. A recommendation system is one efficient solution to this problem. Most recommendation systems recommend items based on personal preference or popularity, but the result is often not the ideal answer. The same problem appears in recommendation systems for education and learning. Therefore, how to help students find suitable courses among the many available, under the current trend of self-education (autodidacticism) and borderless learning, is the research motivation of this thesis. To address the issue, this thesis builds a course recommendation system that uses correlation analysis to recommend relevant courses to the students who need them. There are three further reasons for building a course recommendation system. First, students may be unable to choose a suitable cross-domain course because they are not familiar with the domain. Second, different teachers bring different teaching skills, so for the same course topic and content, students may learn different things, and digest more of the topic, under different teaching styles. Third, students' learning preferences make them suited to different courses. Therefore, it is important to build a course recommendation system for students.
In this thesis, we first design a pretest questionnaire to gather data on students (for example, major, year, and learning preference) and on courses (for example, course topic and teaching techniques). Based on an analysis of the pretest data, we then design a formal questionnaire to gather students' ratings of various courses on various aspects. After that, the relevance and similarity of the gathered data are analyzed, and the recommendation system is designed using collaborative filtering with KNN and SVD. Finally, the prediction errors of KNN and SVD are evaluated and compared using MAE and RMSE. For future work, students' course discussions and other evaluations (e.g., comments and critiques) on college student discussion platforms such as Facebook, Dcard, and PTT could also be incorporated into the recommendation system. Semantic analysis of those comments may improve the accuracy of the recommendation system. |
Third-Language Abstract | |
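Table of Contents |
Front matter: Acknowledgments; Chinese Abstract; English Abstract; Table of Contents; List of Figures; List of Tables
Chapter 1 Introduction: 1.1 Preface; 1.2 Motivation and Objectives; 1.3 Thesis Organization
Chapter 2 Background and Related Work: 2.1 Classification of Recommendation Systems; 2.2 Content Based Filtering; 2.3 Collaborative Filtering (CF); 2.3.1 Memory Based Collaborative Filtering; 2.3.2 Model Based Collaborative Filtering; 2.4 Hybrid Model
Chapter 3 Collaborative Filtering Based on Student Interests and Course Content Ratings: 3.1 Data Collection and Analysis; 3.2 K-Nearest Neighbor (KNN); 3.2.1 Similarity Computation; 3.2.2 Range of K and Its Relation to Prediction; 3.3 Singular Value Decomposition (SVD); 3.4 Evaluation Methods; 3.5 Cross-Validation; 3.5.1 Holdout CV; 3.5.2 Leave-One-Out CV; 3.5.3 K-Fold CV
Chapter 4 Simulation and Results: 4.1 Dataset Preprocessing; 4.1.1 Data Filtering and Selection of Predictions; 4.1.2 Cosine Similarity and Pearson Correlation Coefficient Analysis; 4.1.3 The Surprise Package; 4.2 KNN Implementation; 4.3 SVD Matrix Factorization Recommendation; 4.4 Numerical Comparison of SVD and KNN
Chapter 5 Contributions and Future Work: 5.1 Main Contributions; 5.2 Future Work
Back matter: References; Appendix 1; Appendix 2
List of Figures: Figure 2.1 Recommendation system; Figure 2.2 Content-based filtering workflow; Figure 2.3 User-user similarity recommendation; Figure 2.4 Item-item similarity recommendation; Figure 2.5 KNN illustration; Figure 3.1 Analysis of individual learning style and teaching method; Figure 3.2 Analysis of teaching method and in-class attentiveness; Figure 3.3 Course classification; Figure 3.4 Matrix factorization; Figure 3.5 Stochastic gradient descent steps; Figure 3.6 Holdout CV; Figure 3.7 Leave-One-Out CV; Figure 3.8 3-Fold CV; Figure 4.1 Rating proportions by college and student gender; Figure 4.2 RMSE results; Figure 4.3 MAE results; Figure 4.4 RMSE values; Figure 4.5 MAE values
List of Tables: Table 3.1 Analysis result 1; Table 3.2 Analysis result 2; Table 3.3 Analysis result 3; Table 3.4 Coefficients of Equation (4); Table 3.5 Coefficients of Equation (5); Table 3.6 Coefficients of Equation (8); Table 4.1 Cosine similarity central groups; Table 4.2 Cosine similarity analysis; Table 4.3 Pearson correlation coefficients; Table 4.4 Significant items by Pearson correlation coefficient; Table 4.5 Dataset; Table 4.6 RMSE cross-validation results; Table 4.7 MAE cross-validation results; Table 4.8 KNN prediction results; Table 4.9 SVD prediction results; Table 4.10 Comparison of KNN and SVD results |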
Full-Text Access Rights |
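The abstract in this record states that the prediction errors of KNN and SVD are compared using MAE and RMSE. As a minimal sketch of those two metrics only, the following computes both on a handful of made-up ratings. The function names `mae` and `rmse` and the sample values are illustrative assumptions, not code or data from the thesis.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between two equal-length rating lists."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length rating lists."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical course ratings on a 1-5 scale (not data from the thesis).
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.0, 3.0, 4.0, 4.0]

print("MAE: ", mae(actual, predicted))
print("RMSE:", rmse(actual, predicted))
```

RMSE squares each error before averaging, so a single large miss (here the last rating) raises it more than it raises MAE. This is one reason the two metrics are often reported together.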