Tamkang University Chueh Sheng Memorial Library (TKU Library)
(Downloading the electronic full text is restricted to Tamkang University IP addresses)
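System ID: U0002-1902202013093900
Title (Chinese): 應用深度學習與自然語言處理於仇恨言論之自動偵測
Title (English): Apply deep learning and natural language processing to hate speech automatic detection
University: Tamkang University (淡江大學)
Department (Chinese): 資訊管理學系碩士班
Department (English): Department of Information Management
Academic year: 108 (ROC calendar; 2019-2020)
Semester: 1
Year of publication: 109 (ROC calendar; 2020)
Author (Chinese): 蔡坤利
Author (English): Kun-Li Tsai
Student ID: 606630373
Degree: Master's
Language: Chinese
Oral defense date: 2020-01-03
Number of pages: 64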
Oral defense committee: Advisor - 鄭啟斌
Member - 陳穆臻
Member - 徐煥智
Member - 鄭啟斌
Keywords (Chinese): BERT  自然語言處理  仇恨言論  深度學習
Keywords (English): BERT  NLP  hate speech  deep learning
Subject classification:
Abstract (Chinese): 隨著網路的發展,社群媒體使用人數也逐年攀升,網路仇恨言論的問題也伴隨著發生,這個問題的影響不僅僅存在於網路,甚至影響網路使用者的身心狀況。儘管社群媒體管理方已投入大量人力與金錢試圖解決這個問題,然而仍被使用者認為成效不彰。而本研究透過深度學習與自然語言處理,使用了兩個不同的資料集,皆為內含標記仇恨言論的Twitter推文,使用了兩個深度學習模型:BERT模型與Bi-LSTM模型,透過深度學習的方式去預測Twitter推文是否為仇恨言論。本研究結果顯示,使用BERT模型進行仇恨言論偵測的成效較優於使用Bi-LSTM模型,本研究也發現,資料集內仇恨言論所佔的比例,將會影響到使用深度學習模型預測的結果。
Abstract (English): As the Internet has grown, the number of social media users has risen year by year, and online hate speech has grown with it. Its effects are not confined to the Internet; it also harms the physical and mental well-being of Internet users. Although social media operators have invested substantial manpower and money in addressing the problem, users still regard these efforts as ineffective. This study applies deep learning and natural language processing to two datasets, both consisting of Twitter tweets labeled for hate speech. Two deep learning models, BERT and Bi-LSTM, are trained to predict whether a tweet is hate speech. The results show that BERT detects hate speech better than Bi-LSTM. The study also finds that the proportion of hate speech in a dataset affects the predictions of the deep learning models.
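The abstract's final finding, that the share of hate speech in a dataset affects the prediction results, can be illustrated with a small sketch. The Python below is not taken from the thesis: the function names (`confusion_counts`, `majority_baseline`) and the synthetic label counts are hypothetical. It builds a 2-class confusion matrix for a trivial baseline that always predicts "not hate" and reports accuracy next to recall on the hate class, which suggests why accuracy alone can look strong when hate speech is rare.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn); `positive` is the label used for hate speech."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all examples that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp, fp, fn, tn):
    """Fraction of true hate-speech examples that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def majority_baseline(hate_share, n=1000):
    """Score a model that always predicts 'not hate' (label 0).

    `hate_share` is the assumed fraction of hate speech in a synthetic
    dataset of `n` examples; no real tweets are involved.
    """
    n_hate = round(hate_share * n)
    y_true = [1] * n_hate + [0] * (n - n_hate)
    y_pred = [0] * n
    counts = confusion_counts(y_true, y_pred)
    return accuracy(*counts), recall(*counts)


if __name__ == "__main__":
    for share in (0.05, 0.25, 0.50):
        acc, rec = majority_baseline(share)
        print(f"hate share {share:.0%}: accuracy {acc:.2f}, hate recall {rec:.2f}")
```

With a 5% hate share this baseline reaches 0.95 accuracy while detecting no hate speech at all, so comparisons across datasets with different class proportions are easier to interpret when per-class figures from the confusion matrix are read alongside accuracy.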
Table of contents: Contents
Chapter 1 Introduction
Chapter 2 Literature Review
2.1 Hate Speech
2.2 Features Used in Hate Speech Detection
2.2.1 Surface Features
2.2.2 Word Generalization
2.2.3 Sentiment Analysis
2.2.4 Lexical Resources
2.2.5 Linguistic Features
2.2.6 Knowledge-Based Features
2.2.7 Meta-Information
2.2.8 Non-Textual Hate Speech
2.3 Roles
2.4 Predicting Social Events
2.5 Classification Methods
2.6 Evaluation Methods
2.7 Datasets
Chapter 3 Research Method and System Architecture
3.1 Preface
3.2 System Architecture and Workflow
3.3 Datasets
3.3.1 HatebaseTwitter Dataset
3.3.2 3000_tweets_hate_goldlabel Dataset
3.4 Data Preprocessing
3.5 Word Vectors
3.6 Building the Deep Learning Models
3.6.1 Processing the Dataset Text
3.6.2 Deep Learning Model Training
3.6.3 Deep Learning Model Evaluation
3.7 Experimental Environment
Chapter 4 Data Analysis and Experimental Results
4.1 Data Partitioning
4.1.1 HatebaseTwitter Dataset
4.1.2 3000_tweets_hate_goldlabel Dataset
4.2 Experimental Settings and Description
4.3 Experimental Results and Analysis
4.3.1 HatebaseTwitter Dataset
4.3.2 3000_tweets_hate_goldlabel Dataset
4.3.3 Overall Results
Chapter 5 Conclusion
5.1 Conclusion
5.2 Research Limitations
5.3 Future Research Directions
References

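List of figures
Figure 1 Research workflow
Figure 2 Continuous Bag-of-Words (CBOW) and Skip-gram (Mikolov et al., 2013)
Figure 3 Example knowledge base of gender and LGBT stereotypes
Figure 4 Bi-LSTM model
Figure 5 Input of the BERT language model (Devlin, Chang, Lee, & Toutanova, 2018)
Figure 6 Confusion matrix
Figure 7 Study of automatic hate speech detection using deep learning and natural language processing
Figure 8 Excerpt of the HatebaseTwitter dataset
Figure 9 Excerpt of the 3000_tweets_hate_goldlabel dataset
Figure 10 Visualization of the word vector space
Figure 11 Flowchart of the BERT language model
Figure 12 Flowchart of dataset text processing
Figure 13 BERT fine-tuning scenarios (Devlin et al., 2018)
Figure 14 Bi-LSTM model architecture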

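List of tables
Table 1 Number of social media users worldwide (Digital 2019: Global Digital Overview, 2019)
Table 2 Proportion of U.S. teenagers who have experienced cyberbullying online
Table 3 Definitions of similar terms in other studies
Table 4 Definitions of hate speech by the European Union and social media platforms
Table 5 Hate speech categories and example targets (Silva et al., 2016)
Table 6 Social media used in the "Computer Science and Engineering" category (Fortuna & Nunes, 2018)
Table 7 Datasets and corpora used for hate speech detection (Fortuna & Nunes, 2018)
Table 8 Confusion matrix for 2-class classification
Table 9 Confusion matrix for 3-class classification
Table 10 Training and test set sizes for the 3-class HatebaseTwitter dataset
Table 11 Training and test set sizes for the 2-class HatebaseTwitter dataset
Table 12 Training and test set sizes for the Malaysia subset of 3000_tweets_hate_goldlabel
Table 13 Training and test set sizes for the United States subset of 3000_tweets_hate_goldlabel
Table 14 Training and test set sizes for the Australia subset of 3000_tweets_hate_goldlabel
Table 15 Training and test set sizes for the full 3000_tweets_hate_goldlabel dataset
Table 16 Accuracy and loss for 3-class classification on the HatebaseTwitter dataset
Table 17 Confusion matrix for BERT, 3-class classification on the HatebaseTwitter dataset
Table 18 Confusion matrix for Bi-LSTM, 3-class classification on the HatebaseTwitter dataset
Table 19 Accuracy and loss for 2-class classification on the HatebaseTwitter dataset
Table 20 Confusion matrix for BERT, 2-class classification on the HatebaseTwitter dataset
Table 21 Confusion matrix for Bi-LSTM, 2-class classification on the HatebaseTwitter dataset
Table 22 Accuracy and loss on the Malaysia subset of 3000_tweets_hate_goldlabel
Table 23 Confusion matrix for BERT on the Malaysia subset of 3000_tweets_hate_goldlabel
Table 24 Confusion matrix for Bi-LSTM on the Malaysia subset of 3000_tweets_hate_goldlabel
Table 25 Accuracy and loss on the Australia subset of 3000_tweets_hate_goldlabel
Table 26 Confusion matrix for BERT on the Australia subset of 3000_tweets_hate_goldlabel
Table 27 Confusion matrix for Bi-LSTM on the Australia subset of 3000_tweets_hate_goldlabel
Table 28 Accuracy and loss on the United States subset of 3000_tweets_hate_goldlabel
Table 29 Confusion matrix for BERT on the United States subset of 3000_tweets_hate_goldlabel
Table 30 Confusion matrix for Bi-LSTM on the United States subset of 3000_tweets_hate_goldlabel
Table 31 Accuracy and loss on the 3000_tweets_hate_goldlabel dataset
Table 32 Confusion matrix for BERT on the 3000_tweets_hate_goldlabel dataset
Table 33 Confusion matrix for Bi-LSTM on the 3000_tweets_hate_goldlabel dataset
Table 34 Overall results on the HatebaseTwitter dataset
Table 35 Overall results on the 3000_tweets_hate_goldlabel dataset
Table 36 Proportion of hate speech in each dataset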
References: [1] Agarwal, S., & Sureka, A. (2015). Using kNN and SVM based one-class classifier for detecting online radicalization on Twitter. Paper presented at the International Conference on Distributed Computing and Internet Technology.
[2] Badjatiya, P., Gupta, S., Gupta, M., & Varma, V. (2017). Deep learning for hate speech detection in tweets. Paper presented at the Proceedings of the 26th International Conference on World Wide Web Companion.
[3] Bengio, Y., Ducharme, R., Vincent, P., & Jauvin, C. (2003). A neural probabilistic language model. Journal of Machine Learning Research, 3(Feb), 1137-1155.
[4] Burnap, P., Rana, O. F., Avis, N., Williams, M., Housley, W., Edwards, A., . . . Sloan, L. (2015). Detecting tension in online communities with computational Twitter analysis. Technological Forecasting and Social Change, 95, 96-108.
[5] Burnap, P., & Williams, M. L. (2014). Hate speech, machine classification and statistical modelling of information flows on Twitter: Interpretation and communication for policy decision making.
[6] Burnap, P., & Williams, M. L. (2015). Cyber hate speech on Twitter: An application of machine classification and statistical modeling for policy and decision making. Policy & Internet, 7(2), 223-242.
[7] Chau, M., & Xu, J. (2007). Mining communities and their relationships in blogs: A study of online hate groups. International Journal of Human-Computer Studies, 65(1), 57-70.
[8] Chen, Y. (2011). Detecting offensive language in social medias for protection of adolescent online safety.
[9] Chen, Y., Zhou, Y., Zhu, S., & Xu, H. (2012). Detecting offensive language in social media to protect adolescent online safety. Paper presented at the 2012 International Conference on Privacy, Security, Risk and Trust and 2012 International Conference on Social Computing.
[10] Dadvar, M., Trieschnigg, D., Ordelman, R., & de Jong, F. (2013). Improving cyberbullying detection with user context. Paper presented at the European Conference on Information Retrieval.
[11] Davidson, T., Bhattacharya, D., & Weber, I. (2019). Racial Bias in Hate Speech and Abusive Language Detection Datasets. arXiv preprint arXiv:1905.12516.
[12] Davidson, T., Warmsley, D., Macy, M., & Weber, I. (2017). Automated hate speech detection and the problem of offensive language. Paper presented at the Eleventh International AAAI Conference on Web and Social Media.
[13] Del Vigna, F., Cimino, A., Dell’Orletta, F., Petrocchi, M., & Tesconi, M. (2017). Hate me, hate me not: Hate speech detection on Facebook.
[14] Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
[15] Dinakar, K., Jones, B., Havasi, C., Lieberman, H., & Picard, R. (2012). Common sense reasoning for detection, prevention, and mitigation of cyberbullying. ACM Transactions on Interactive Intelligent Systems (TiiS), 2(3), 18.
[16] Djuric, N., Zhou, J., Morris, R., Grbovic, M., Radosavljevic, V., & Bhamidipati, N. (2015). Hate speech detection with comment embeddings. Paper presented at the Proceedings of the 24th international conference on world wide web.
[17] Fortuna, P., & Nunes, S. (2018). A survey on automatic detection of hate speech in text. ACM Computing Surveys (CSUR), 51(4), 85.
[18] Gambäck, B., & Sikdar, U. K. (2017). Using convolutional neural networks to classify hate-speech. Paper presented at the Proceedings of the first workshop on abusive language online.
[19] Gitari, N. D., Zuping, Z., Damien, H., & Long, J. (2015). A lexicon-based approach for hate speech detection. International Journal of Multimedia and Ubiquitous Engineering, 10(4), 215-230.
[20] Guermazi, R., Hammami, M., & Hamadou, A. B. (2007). Using a semi-automatic keyword dictionary for improving violent Web site filtering. Paper presented at the 2007 Third International IEEE Conference on Signal-Image Technologies and Internet-Based System.
[21] Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
[22] Hosseinmardi, H., Mattson, S. A., Rafiq, R. I., Han, R., Lv, Q., & Mishra, S. (2015). Detection of cyberbullying incidents on the instagram social network. arXiv preprint arXiv:1503.03909.
[23] Ji, S., Yun, H., Yanardag, P., Matsushima, S., & Vishwanathan, S. (2015). WordRank: Learning word embeddings via robust ranking. arXiv preprint arXiv:1506.02761.
[24] Kwok, I., & Wang, Y. (2013). Locate the hate: Detecting tweets against blacks. Paper presented at the Twenty-Seventh AAAI Conference on Artificial Intelligence.
[25] Lomas, N. (2017). Facebook, Google, Twitter commit to hate speech action in Germany. Last accessed: July.
[26] MacAvaney, S., Yao, H.-R., Yang, E., Russell, K., Goharian, N., & Frieder, O. (2019). Hate speech detection: Challenges and solutions. PLoS ONE, 14(8).
[27] McNamee, L. G., Peterson, B. L., & Peña, J. (2010). A call to educate, participate, invoke and indict: Understanding the communication of online hate groups. Communication Monographs, 77(2), 257-280.
[28] Mehdad, Y., & Tetreault, J. (2016). Do characters abuse more than words? Paper presented at the Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue.
[29] Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
[30] Mishna, F., Cook, C., Saini, M., Wu, M.-J., & MacFadden, R. (2011). Interventions to prevent and reduce cyber abuse of youth: A systematic review. Research on Social Work Practice, 21(1), 5-14.
[31] Nobata, C., Tetreault, J., Thomas, A., Mehdad, Y., & Chang, Y. (2016). Abusive language detection in online user content. Paper presented at the Proceedings of the 25th international conference on world wide web.
[32] Nockleby, J. T. (2000). Hate speech. Encyclopedia of the American constitution, 3, 1277-1279.
[33] Park, J. H., & Fung, P. (2017). One-step and two-step classification for abusive language detection on twitter. arXiv preprint arXiv:1706.01206.
[34] Patchin, J. W., & Hinduja, S. (2012). Cyberbullying prevention and response: Expert perspectives. Routledge.
[35] Pavlopoulos, J., Malakasiotis, P., & Androutsopoulos, I. (2017). Deeper attention to abusive user content moderation. Paper presented at the Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.
[36] Pennington, J., Socher, R., & Manning, C. (2014). GloVe: Global vectors for word representation. Paper presented at the Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP).
[37] Razavi, A. H., Inkpen, D., Uritsky, S., & Matwin, S. (2010). Offensive language detection using multi-level classification. Paper presented at the Canadian Conference on Artificial Intelligence.
[38] Schmidt, A., & Wiegand, M. (2017). A survey on hate speech detection using natural language processing. Paper presented at the Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media.
[39] Silva, L., Mondal, M., Correa, D., Benevenuto, F., & Weber, I. (2016). Analyzing the targets of hate in online social media. Paper presented at the Tenth International AAAI Conference on Web and Social Media.
[40] Sood, S., Antin, J., & Churchill, E. (2012). Profanity use in online communities. Paper presented at the Proceedings of the SIGCHI Conference on Human Factors in Computing Systems.
[41] Spertus, E. (1997). Smokey: Automatic recognition of hostile messages. Paper presented at the AAAI/IAAI.
[42] Thompson, N. (2016). Anti-discriminatory practice: Equality, diversity and social justice. Macmillan International Higher Education.
[43] Van Hee, C., Lefever, E., Verhoeven, B., Mennes, J., Desmet, B., De Pauw, G., . . . Hoste, V. (2015). Detection and fine-grained classification of cyberbullying events. Paper presented at the Proceedings of the international conference recent advances in natural language processing.
[44] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., . . . Polosukhin, I. (2017). Attention is all you need. Paper presented at the Advances in neural information processing systems.
[45] Wang, X., Gerber, M. S., & Brown, D. E. (2012). Automatic crime prediction using events extracted from twitter posts. Paper presented at the International conference on social computing, behavioral-cultural modeling, and prediction.
[46] Warner, W., & Hirschberg, J. (2012). Detecting hate speech on the world wide web. Paper presented at the Proceedings of the second workshop on language in social media.
[47] Waseem, Z. (2016). Are you a racist or am I seeing things? Annotator influence on hate speech detection on Twitter. Paper presented at the Proceedings of the First Workshop on NLP and Computational Social Science.
[48] Waseem, Z., & Hovy, D. (2016). Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter. Paper presented at the Proceedings of the NAACL Student Research Workshop.
[49] Xiang, G., Fan, B., Wang, L., Hong, J., & Rose, C. (2012). Detecting offensive tweets via topical feature discovery over a large scale twitter corpus. Paper presented at the Proceedings of the 21st ACM international conference on Information and knowledge management.
[50] Xu, J.-M., Jun, K.-S., Zhu, X., & Bellmore, A. (2012). Learning from bullying traces in social media. Paper presented at the Proceedings of the 2012 conference of the North American chapter of the association for computational linguistics: Human language technologies.
[51] Yuan, S., Wu, X., & Xiang, Y. (2016). A Two Phase Deep Learning Model for Identifying Discrimination from Tweets. Paper presented at the EDBT.
[52] Zampieri, M., Malmasi, S., Nakov, P., Rosenthal, S., Farra, N., & Kumar, R. (2019). Predicting the Type and Target of Offensive Posts in Social Media. arXiv preprint arXiv:1902.09666.
[53] Zhang, Z., & Luo, L. (2018). Hate speech detection: A solved problem? The challenging case of long tail on Twitter. Semantic Web (Preprint), 1-21.
[54] Zhong, H., Li, H., Squicciarini, A. C., Rajtmajer, S. M., Griffin, C., Miller, D. J., & Caragea, C. (2016). Content-Driven Detection of Cyberbullying on the Instagram Social Network. Paper presented at the IJCAI.
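[55] 李洋, & 董红斌. (2018). 基于 CNN 和 BiLSTM 网络特征融合的文本情感分析 [Text sentiment analysis based on feature fusion of CNN and BiLSTM networks]. 计算机应用 [Journal of Computer Applications], 38(11), 3075-3080.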
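Thesis usage permissions
  • The author grants a royalty-free license for library patrons to reproduce the print copy for academic purposes; available from 2020-02-20.
  • The author authorizes browsing and printing of the electronic full text; available from 2020-02-20.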

