§ Browse Thesis Bibliographic Record
  
System ID U0002-2909202009564200
DOI 10.6846/TKU.2020.00882
Title (Chinese) 基於增強式學習在用戶偏好的推薦系統
Title (English) Recommendation System Based on Reinforcement Learning in User Preference
Title (third language)
University Tamkang University
Department (Chinese) 資訊工程學系碩士班
Department (English) Department of Computer Science and Information Engineering
Foreign university name
Foreign college name
Foreign graduate institute name
Academic year 108 (2019-2020)
Semester 2
Publication year 109 (2020)
Graduate student (Chinese) 曾博彥
Graduate student (English) Po-Yen Tseng
Student ID 607410338
Degree Master's
Language Traditional Chinese
Second language English
Oral defense date 2020-07-10
Number of pages 43
Thesis committee Advisor - 黃連進
Committee member - 廖文華
Committee member - 張志勇
Keywords (Chinese) 推薦系統
增強式學習
奇異值分解
個人化
個人偏好
Keywords (English) Recommendation system
Reinforcement learning
Singular value decomposition
Personalization
Personal preference
Keywords (third language)
Subject classification
Abstract (Chinese)
In recent years, demand for and applications of recommendation systems have grown steadily. On every kind of platform, whether online shopping, video and streaming services, social media, or even online news, users face an ever-growing range of choices: there may be only a few hundred thousand users, while the number of items can run into the hundreds of millions. Faced with such a vast amount of selectable information, reducing selection time and improving the accuracy of recommending items that customers will like has become an important problem. In this thesis, we work on a movie platform and use collaborative filtering, content-based methods, and matrix factorization to analyze users' habits from their interests, past behavior, and movie ratings; the system then recommends items liked by users similar to the target user and identifies the movies that user is likely to enjoy, achieving personalized recommendation.
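To make the collaborative-filtering idea above concrete, the following Python sketch scores a target user's unrated movies by the similarity-weighted ratings of other users. It is illustrative only, not the system described in this thesis; the toy rating matrix and all names are assumptions for demonstration.

```python
import numpy as np

# Toy rating matrix (assumed data): rows = users, columns = movies, 0 = not yet rated.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 0, 2],
    [1, 0, 5, 4],
], dtype=float)

def cosine_sim(a, b):
    """Cosine similarity between two users' rating vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

target = 0  # user we recommend for
sims = [cosine_sim(ratings[target], ratings[u]) for u in range(len(ratings))]

# Predict each unrated movie as a similarity-weighted average of other users' ratings.
for movie in np.where(ratings[target] == 0)[0]:
    raters = [u for u in range(len(ratings)) if u != target and ratings[u, movie] > 0]
    if not raters:
        continue
    score = (sum(sims[u] * ratings[u, movie] for u in raters)
             / (sum(sims[u] for u in raters) + 1e-9))
    print(f"movie {movie}: predicted rating {score:.2f}")
```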
    This thesis adopts a recommendation method based on reinforcement learning (RL). Traditional recommendation methods are item-based and user-based, but a user's behavior consists of more than movie ratings: ratings can be influenced by many hidden, secondary factors such as the director, the actors, and the plot. Compared with the traditional approach of simply comparing a user's rating similarity with that of other users and recommending accordingly, using reinforcement learning to let the model learn and predict a user's future behavior brings better and longer-term benefits to the system.
    Singular value decomposition (SVD), a mature matrix factorization (MF) technique, is used for preprocessing. Matrix factorization decomposes the user-item rating matrix into lower-dimensional matrices for computation and tries to find the latent feature matrices linking users and movies. On platforms such as e-commerce sites, movie platforms, or e-book platforms, the user-item matrix is usually extremely sparse, because there are far more items than users and each user has rated only a small fraction of the items. Matrix factorization therefore decomposes the rating matrix and the item matrix, projects them into a lower-dimensional space, and identifies the main factors that influence ratings from the movies a user has already rated. Moreover, in movie recommendation, movies are correlated with one another; as the number of movies grows, the amount of information they carry does not grow linearly, which effectively reduces computational complexity and training time.
    A topic model (TM) is then applied to the feature matrices obtained from SVD for feature analysis, so that latent feature vectors in which several features are mixed together can be given clear label categories. Finally, reinforcement learning is used to simulate users' behavior patterns, letting the model imitate the behaviors and rating habits a user is likely to have and capture the user's temporal dynamics, thereby providing better prediction accuracy. This approach effectively addresses data sparsity, personalization, and cold-start problems.
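As a rough illustration of the SVD preprocessing step described above, the Python sketch below factorizes a small, sparse user-item rating matrix into low-rank user and item factor matrices and reconstructs predicted scores. It is a generic truncated-SVD example under assumed toy data and an assumed factor count k, not the thesis' actual pipeline.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

# Toy user-item rating matrix (assumed data): rows = users, columns = movies, 0 = unrated.
ratings = csr_matrix(np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [0.0, 1.0, 5.0, 4.0],
]))

k = 2                                   # number of latent factors, must be < min(users, items)
U, sigma, Vt = svds(ratings, k=k)       # U: users x k, Vt: k x items
predicted = U @ np.diag(sigma) @ Vt     # low-rank reconstruction serves as predicted scores

print(np.round(predicted, 2))           # includes scores for previously unrated cells
```

Keeping k small both compresses the sparse rating matrix into a dense latent space and, as the abstract notes, limits the growth of computation as the number of movies increases.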
Abstract (English)
In recent years, the demand for and application of recommendation systems have increased. On every kind of platform, such as online shopping, audio-visual platforms, social software, or even online news, users face more and more choices: there may be only hundreds of thousands of users but hundreds of millions of products. Therefore, faced with this huge amount of optional information, how to reduce selection time and improve the accuracy of recommending things that customers will like has become an important subject. In this paper, working on a movie platform, we analyze users' habits based on their interests, past behavior, and movie ratings through collaborative filtering, content-based methods, and matrix factorization. The system then recommends items liked by people similar to the user and finds the movies the user likes, achieving a personalized recommendation effect.
    In this paper, a recommendation method based on Reinforcement Learning (RL) is used. Traditional recommendation methods are item-based and user-based, but a user's behavior pattern is not limited to movie ratings: it may be affected by many hidden, secondary factors such as directors, actors, and plot. Compared with the traditional approach of simply comparing the user's rating similarity with other users and recommending on that basis, using reinforcement learning to let the model learn and predict the user's future behavior and habits can bring better and longer-term benefits to the system. Singular Value Decomposition (SVD), a mature Matrix Factorization (MF) technique, is used for preprocessing. Matrix factorization decomposes the user-item matrix into lower-dimensional matrices for computation and tries to find the latent feature matrices between users and movies. On e-commerce platforms, movie platforms, or e-book platforms, the user-item matrix is usually very sparse, because there are many more items than users and each user may have rated only a few of the many items. Therefore, matrix factorization decomposes the rating matrix and the item matrix, projects them into a lower-dimensional space, and finds the main factors that affect ratings from the movies the user has rated. In addition, in the field of movie recommendation, movies are correlated with one another; when the number of movies increases, the amount of information they contain does not increase linearly, which effectively reduces computational complexity and training time. A Topic Model (TM) is combined with the feature matrices obtained from SVD for feature analysis, so that feature vectors in which multiple latent features are mixed together can be given a clear label classification. Finally, reinforcement learning is used to simulate the user's behavior pattern, letting the model imitate the user's likely behaviors and rating habits and understand the user's temporal dynamics, in order to provide better prediction accuracy. This method can effectively deal with data sparsity, personalization, cold start, and related issues.
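The abstract frames recommendation as a reinforcement-learning loop: the agent recommends a movie (action), observes the user's rating (reward), and updates its estimates. The sketch below illustrates only that general loop with an epsilon-greedy bandit and a simulated user; the reward function, exploration rate, and item set are assumptions, and the thesis' actual model (which also uses SVD features and a topic model) is not reproduced here.

```python
import random

random.seed(0)

N_MOVIES = 5
EPSILON = 0.1                      # exploration rate (assumed)
q_values = [0.0] * N_MOVIES        # estimated value of recommending each movie
counts = [0] * N_MOVIES            # how often each movie has been recommended

def simulated_user_rating(movie: int) -> float:
    """Stand-in for a real user: a hidden preference per movie plus noise."""
    hidden_preference = [3.0, 4.5, 2.0, 1.0, 3.5]
    return hidden_preference[movie] + random.gauss(0.0, 0.5)

for step in range(1000):
    # Explore occasionally; otherwise recommend the movie with the best current estimate.
    if random.random() < EPSILON:
        movie = random.randrange(N_MOVIES)
    else:
        movie = max(range(N_MOVIES), key=lambda m: q_values[m])
    reward = simulated_user_rating(movie)   # the observed rating acts as the reward
    counts[movie] += 1
    q_values[movie] += (reward - q_values[movie]) / counts[movie]  # incremental mean update

print([round(q, 2) for q in q_values])      # estimates for recommended movies approach the hidden preferences
```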
Abstract (third language)
Table of contents
Contents
Contents	VII
List of Figures	IX
List of Tables	X
Chapter 1: Introduction	1
Chapter 2: Related Work	5
Chapter 3: Background	8
3-1 Content-Based Techniques	8
3-2 Popularity-Based Techniques	8
3-3 Collaborative Filtering	9
3-4 Matrix Factorization	10
3-5 Topic Model	11
3-6 Reinforcement Learning	11
Chapter 4: System Architecture	13
4-1 Environment and Problem Description	13
4-2 System Architecture	15
Chapter 5: Experimental Analysis	26
Chapter 6: Conclusion	28
References	29
Appendix: English Thesis	31
 
List of Figures
Figure 1: The proposed recommendation system	14
Figure 2: System flow architecture	15
Figure 3: SVD matrix factorization (1)	17
Figure 4: SVD matrix factorization (2)	17
Figure 5: Topic Model	19
Figure 6: Fields of artificial intelligence	20
Figure 7: Modeling the interaction between the brain and the environment	21
Figure 8: Reinforcement learning model	21
Figure 9: Users' dynamic temporal interests	22
Figure 10: User rating bias	23
Figure 11: Movie rating bias	24
Figure 12: Hybrid recommendation list	25
Figure 13: RMSE comparison of the algorithm models	26
Figure 14: Comparison of RF and RF-Topic Model	27

List of Tables
Table 1: Comparison of related work	7
Full-text access rights
On campus
The printed thesis is available on campus immediately
Full-text electronic thesis authorized for release on campus
The electronic thesis is available on campus immediately
Off campus
Authorization granted
The electronic thesis is available off campus immediately
