System ID | U0002-2707202022442200 |
---|---|
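DOI | 10.6846/TKU.2020.00792 |
Thesis title (Chinese) | 以深度學習方法進行網路謠言偵測 |
Thesis title (English) | Applying Deep Learning to Internet Rumor Detection |
Thesis title (third language) | |
University | Tamkang University (淡江大學) |
Department (Chinese) | 數位商務與經濟碩士學位學程 |
Department (English) | Master's Program in Digital Business and Economics |
Foreign degree: university | |
Foreign degree: college | |
Foreign degree: graduate institute | |
Academic year | 108 (ROC calendar) |
Semester | 2 |
Year of publication | 109 (ROC calendar) |
Graduate student (Chinese) | 張耿豪 |
Graduate student (English) | Geng-Hao Zhang |
Student ID | 607880027 |
Degree | Master's |
Language | Traditional Chinese |
Second language | |
Oral defense date | 2020-07-02 |
Number of pages | 49 |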
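Oral defense committee | Advisor - 張昭憲; Committee member - 壽大衛; Committee member - 魏世杰 |
Keywords (Chinese) | 網路謠言; 機器學習; 深度學習; 網路社群 |
Keywords (English) | Internet Rumor; Machine Learning; Deep Learning; Online Community |
Keywords (third language) | |
Subject classification | |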
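Chinese abstract (translated) |
The harm done by Internet rumors is plain to see: they have a large negative impact on society, the economy, and politics, and both the authorities and social media platforms devote considerable effort to curbing their spread. This thesis therefore develops an effective method for detecting Internet rumors, in order to limit their spread and keep online communities functioning normally. First, we analyze how Internet rumors propagate and find that rumor discussion threads are short but very numerous, so rumors clearly spread by breadth rather than by depth. Second, we analyze the personal information of posters and the features of posts on Twitter, and derive 15 detection attributes. Next, to accommodate different learning methods, we transform the source dataset into one-dimensional and two-dimensional datasets. Finally, combined with text analysis of the posts, we run experiments with several deep learning and non-deep-learning methods. To validate the proposed approach, we use a real rumor dataset. The results show that when only the source posts are considered and the dataset is built by conventional text parsing, better detection results are obtained with either deep or non-deep learning methods. When the source posts and their reply threads are considered together, detection performance drops noticeably and non-deep-learning methods clearly outperform deep learning methods, which suggests that reply threads may interfere with rumor identification and do not improve detection. Using the attribute set proposed in this study together with textual features, and modeling with Multi-Tasking Learning, yields the best and most stable results. These findings provide a deeper understanding of Internet rumor detection; they can give the relevant agencies an effective basis for decision-making and support the future development of related methods. |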
English abstract |
The damage caused by Internet rumors is obvious to everyone. They have had a large adverse effect on society, the economy, and politics. Government departments and social media platforms have long been dedicated to stopping their spread. In light of this, this thesis develops an effective Internet rumor detection method to stop the spread of rumors and maintain good order on social media. First, we analyze the spreading pattern of rumors on social media and find that rumor threads are not long but are large in number, so we conclude that rumor spreading is related to “breadth” rather than “depth”. Second, based on the information in tweets, we design 15 attributes for detecting rumors. Moreover, to suit our machine learning and deep learning methods, we process our dataset into 1-D and 2-D datasets. Finally, we run experiments using the above together with the tweet text. The results show that if we use only the source tweets, together with their text, we get better results. If we add the reaction tweets, the results become worse, which means that reaction tweets hinder rumor detection and do not help our outcome. Using our attributes together with text, and building the model with Multi-task Learning, we get the best outcome. Our research provides a better understanding of Internet rumor detection. It not only gives the departments concerned a basis for making effective decisions, but also helps them devise new approaches to Internet rumor detection in the future. |
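Third-language abstract | |
Table of contents |
Contents
Chapter 1 Introduction
Chapter 2 Related Techniques and Literature Review
2.1 Internet Rumors
2.2 Rumor Detection
2.3 Modeling Methods: Non-Deep-Learning Methods
2.4 Modeling Methods for Rumor Detection: Deep Learning Methods
2.4.1 Multilayer Perceptron (MLP)
2.4.2 Recurrent Neural Network (RNN)
2.4.3 Long Short Term Memory (LSTM)
2.4.4 Multi-tasking learning
Chapter 3 Internet Rumor Detection Method
3.1 Analysis of Rumor Propagation Patterns
3.2 Design of Detection Attributes
3.3 Rumor Detection Method
3.3.1 Building the Datasets
3.3.2 Rumor Detection Workflow
Chapter 4 Experimental Results
4.1 Data Source
4.2 Evaluation Metrics
4.3 Experimental Results of Rumor Detection
4.3.1 Results on the Non-Text Feature Datasets
4.3.2 Results on the Text Feature Datasets
4.4 Hybrid Datasets
Chapter 5 Conclusions and Future Work
References
List of Figures
Figure 2-1: Nested cross-validation
Figure 2-2: Decision tree example
Figure 2-3: ReLU activation function
Figure 2-4: Sigmoid activation function
Figure 2-5: Common architecture of a multilayer perceptron
Figure 2-6: RNN architecture
Figure 2-7: LSTM structure
Figure 2-8: Multi-Tasking Learning architecture
Figure 3-1: Illustration of a discussion thread on Twitter
Figure 3-2: Total number of tweets on Twitter for each event in the PHEME dataset
Figure 3-3: Length statistics of all rumor discussion threads in the PHEME database
Figure 3-4: Length statistics of all non-rumor discussion threads in the PHEME database
Figure 3-5: Length statistics of all rumor discussion threads in the Ferguson event
Figure 3-6: Length statistics of all non-rumor discussion threads in the Ferguson event
Figure 3-7: A discussion thread of length greater than 10 in the Ferguson event
Figure 3-8: Two-dimensional representation format for handwritten digit recognition
Figure 3-9: Illustration of multiple discussion threads on Twitter
Figure 3-10: 1-D text feature dataset
Figure 3-11: 2-D text feature dataset
Figure 3-12: Rumor detection workflow of this study
Figure 3-13: Splitting the dataset
List of Tables
Table 2-1: Related research on rumor detection
Table 3-1: Explanatory example of discussion threads
Table 3-2: Proportion of rumor and non-rumor discussion threads in each event
Table 3-3: Rumor detection attribute set proposed in this study
Table 3-4: Example 1D data table generated for source tweets (s1...sn)
Table 3-5: 2D dataset generated for a discussion thread (with Sm as the source tweet)
Table 4-1: Statistics of the PHEME rumor dataset
Table 4-2: Confusion matrix
Table 4-3: Detection results on the 1-D non-text dataset (using only the 15 detection attributes)
Table 4-4: Detection results on the 2-D non-text dataset (using only the 15 detection attributes)
Table 4-5: Detection results on the 1-D text dataset
Table 4-6: Detection results on the two-dimensional (2D) text feature dataset
Table 4-7: Experiments on the 1-D and 2-D datasets using Multi-task Learning |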
Full-text access rights |
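The abstract describes transforming the source data into 1-D and 2-D datasets so that different learning methods can be applied. The record does not give the exact layout, so the following is only a minimal sketch under stated assumptions: each tweet is already encoded as a numeric attribute vector, the 1-D dataset holds one row per source tweet, and the 2-D dataset holds one fixed-size matrix per discussion thread (source tweet first, then replies, zero-padded or truncated). The names `Thread`, `to_1d`, `to_2d`, and `max_len` are hypothetical, and the example uses 3 attributes instead of the thesis's 15 to stay short.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Thread:
    """Hypothetical container: one source tweet plus its reply tweets,
    each already encoded as a numeric attribute vector."""
    source: List[float]
    replies: List[List[float]] = field(default_factory=list)


def to_1d(threads: List[Thread]) -> List[List[float]]:
    """1-D dataset (assumed layout): one row per thread, source tweet only."""
    return [list(t.source) for t in threads]


def to_2d(threads: List[Thread], max_len: int, n_attrs: int) -> List[List[List[float]]]:
    """2-D dataset (assumed layout): one max_len x n_attrs matrix per thread.

    Row 0 is the source tweet, later rows are replies in order.
    Long threads are truncated and short ones are zero-padded; this
    padding scheme is an assumption, not taken from the thesis.
    """
    dataset = []
    for t in threads:
        rows = [list(t.source)] + [list(r) for r in t.replies]
        rows = rows[:max_len]
        rows += [[0.0] * n_attrs for _ in range(max_len - len(rows))]
        dataset.append(rows)
    return dataset


# Toy data with 3 made-up attributes per tweet.
threads = [
    Thread(source=[1.0, 0.0, 120.0], replies=[[0.0, 1.0, 15.0]]),
    Thread(source=[0.0, 1.0, 8.0]),
]
one_d = to_1d(threads)
two_d = to_2d(threads, max_len=3, n_attrs=3)
```

A flat row per source tweet suits classifiers that take fixed-length vectors, while the per-thread matrix preserves reply order for sequence models such as an RNN or LSTM.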