Tamkang University Chueh Sheng Memorial Library (TKU Library)


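System ID U0002-3006201901284900
Title (Chinese) 利用遺傳演算法以中文新聞與技術指標為基礎的股票趨勢預測之研究
Title (English) A Study on Stock Trend Prediction based on Chinese News and Technical Indicators Using Genetic Algorithms
Institution Tamkang University
Department (Chinese) 資訊工程學系碩士在職專班
Department (English) Department of Computer Science and Information Engineering
Academic year 107 (ROC calendar; 2018-2019)
Semester 2
Publication year 108 (ROC calendar; 2019)
Author name (Chinese) 石平
Author name (English) Ping Shih
Student ID 706410072
Degree Master's
Language English
Oral defense date 2019-06-27
Number of pages 56
Oral defense committee Advisor - 陳俊豪
Member - 林威成
Member - 陳洳瑾
Keywords (Chinese) 中文文字探勘  基因演算法  交易策略  技術分析  股價漲跌分析
Keywords (English) Genetic Algorithms  Chinese Text Mining  Trading Strategy  Technical Analysis  Stock Trend Prediction
Subject classification Applied Sciences; Information Engineering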
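Abstract (translated from Chinese)   Stock market prediction has long been an attractive topic and has motivated numerous academic studies. Prior research shows that using financial news to predict the effects of related events, gauge investor sentiment, and make investment decisions accordingly is a reasonable and practicable approach. This study attempts to use Chinese financial news to predict stock price trends, and combines news factors with technical analysis indicators to derive a trading strategy. The experimental results show that this trading strategy outperforms a simple buy-and-hold strategy, and that Chinese financial news has a certain degree of predictive power for the stock market.
  We also examined the benefit of applying the 2-word combination feature extraction method to Chinese. The experimental results confirm that, owing to differences in the grammatical structure of Chinese and in text preprocessing, the method does not perform as well on Chinese as it does on English text.
  During the experiments we found that parameter settings play an important role in feature selection. We therefore introduced a genetic algorithm to improve the performance of the trading strategy, using it to find the best balance between the uniqueness and the predictive power of Chinese terms. We also added technical analysis indicators to find the best buy points when news text mining and technical indicators are combined.
  The experimental results show that the proposed trading strategy outperforms not only the buy-and-hold strategy but also the news-based trading strategy before optimization.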
Abstract (English)   Stock market prediction is an attractive topic that has inspired countless studies. The existing literature has shown that it is viable to use financial news articles to forecast the effect of certain events, gauge investor sentiment, and react accordingly. In this study, we used Chinese financial news in an attempt to predict stock price movement and to derive a trading strategy based on news factors and technical indicators. The results show that our proposed news-based trading strategy outperforms a simple buy-and-hold strategy, which suggests that Chinese financial news has a reasonable amount of predictive power over stock price movement.
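As a rough illustration of the comparison reported in this paragraph, the sketch below computes the return of buy-and-hold and of a strategy that holds the stock only on days flagged by a signal. The price series, the signal list, and both function names are made up for illustration; the abstract does not spell out the thesis's actual trading rules, so this is not the author's implementation.

```python
from typing import Sequence


def buy_and_hold_return(prices: Sequence[float]) -> float:
    """Return from buying at the first price and selling at the last."""
    return prices[-1] / prices[0] - 1.0


def signal_strategy_return(prices: Sequence[float], signals: Sequence[int]) -> float:
    """Compound return from holding the stock only on flagged days.

    signals[t] == 1 means "hold from day t to day t + 1" (an assumed
    convention); any other value means stay in cash for that day.
    """
    growth = 1.0
    for t in range(len(prices) - 1):
        if signals[t] == 1:
            growth *= prices[t + 1] / prices[t]
    return growth - 1.0


# Hypothetical data: four daily closes and three daily hold/skip signals.
prices = [100.0, 110.0, 99.0, 108.9]
signals = [1, 0, 1]

baseline = buy_and_hold_return(prices)
strategy = signal_strategy_return(prices, signals)
print(f"buy-and-hold: {baseline:.3f}, signal strategy: {strategy:.3f}")
```

In this toy series the signal skips the one losing day, so the signal strategy ends ahead of buy-and-hold; real comparisons would also need to account for transaction costs.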
We also examined the use of 2-word combination feature extraction on Chinese text. Our experiment shows that, compared with English, Chinese does not benefit as much from this technique because of differences in syntactic structure and in text preprocessing.
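A minimal sketch of 2-word combination extraction follows, assuming the term means an ordered pair of tokens that co-occur within a small window, so that the two words need not be adjacent. The function name, the `max_gap` parameter, and the sample tokens are hypothetical, and the input is assumed to be already word-segmented, which Chinese text requires before any such pairing.

```python
from collections import Counter
from typing import List, Tuple


def two_word_combinations(tokens: List[str], max_gap: int = 2) -> List[Tuple[str, str]]:
    """Pair each token with the tokens that follow it within a window.

    max_gap is the largest number of tokens allowed between the two
    words; max_gap=0 reduces to ordinary bigrams.
    """
    pairs = []
    for i, first in enumerate(tokens):
        last = min(i + 2 + max_gap, len(tokens))  # exclusive upper bound
        for j in range(i + 1, last):
            pairs.append((first, tokens[j]))
    return pairs


# Hypothetical segmented headline: "revenue", "hits", "new high".
tokens = ["營收", "創", "新高"]
feature_counts = Counter(two_word_combinations(tokens, max_gap=1))
print(feature_counts)
```

With `max_gap=1` the pair ("營收", "新高") is captured even though another token sits between the two words, which a plain bigram would miss.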
While conducting our experiments, we found that hyperparameter settings play an important part in feature selection. Hence, we adopted a Genetic Algorithm (GA) to improve feature selection on our input dataset by optimizing the balance between the uniqueness and the predictive power of features. We also included technical indicators in the GA in order to find the optimal trading timing when technical indicators and financial news articles work in tandem.
The results show that our proposed algorithm performs better than both the simple buy-and-hold strategy and our original stock trend prediction algorithm.
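The hyperparameter search described in the abstract can be pictured with a generic genetic algorithm such as the sketch below. Everything specific here is an assumption: the chromosome is taken to be two thresholds in [0, 1] (one for uniqueness, one for predictive power), and `toy_fitness` is a stand-in objective, since the abstract does not state the thesis's encoding or fitness function.

```python
import random
from typing import Callable, List, Tuple

Chromosome = List[float]


def tournament(population: List[Chromosome],
               fitness: Callable[[Chromosome], float],
               rng: random.Random) -> Chromosome:
    """Pick two distinct individuals at random and return the fitter one."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def run_ga(fitness: Callable[[Chromosome], float], n_genes: int = 2,
           pop_size: int = 20, generations: int = 40,
           mutation_rate: float = 0.2, seed: int = 0) -> Tuple[Chromosome, float]:
    """Maximize `fitness` over vectors in [0, 1]^n_genes."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            mother = tournament(population, fitness, rng)
            father = tournament(population, fitness, rng)
            # Uniform crossover: each gene comes from either parent.
            child = [m if rng.random() < 0.5 else f for m, f in zip(mother, father)]
            # Gaussian mutation, clamped back into [0, 1].
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best, fitness(best)


def toy_fitness(chromosome: Chromosome) -> float:
    """Stand-in objective that peaks at thresholds (0.3, 0.7)."""
    uniqueness_threshold, predictive_threshold = chromosome
    return -((uniqueness_threshold - 0.3) ** 2 + (predictive_threshold - 0.7) ** 2)


best, score = run_ga(toy_fitness)
print(best, score)
```

In a trading setting the fitness would more plausibly be a back-tested performance measure of the strategy built from the selected features, with extra genes for technical-indicator parameters; the toy objective only keeps the example self-contained.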
Table of contents CONTENTS
CHAPTER 1 Introduction
1.1 Motivation
1.2 Contribution
1.3 Dissertation Overview
CHAPTER 2 Literature Review and General Context
2.1 Literature Review
2.1.1 Market Predictability
2.1.2 Source of Corpus
2.1.3 Textual Features
2.2 General Context
2.2.1 Text Mining
2.2.2 Text Pre-processing
2.2.3 Feature Extraction
2.2.4 Feature Selection
2.2.5 Machine Learning Algorithms
2.2.6 Genetic Algorithm (GA)
2.3 Summary
CHAPTER 3 Proposed Approaches
3.1 Proposed Stock Trend Prediction Approach Based on Traditional Chinese News and Technical Indicators
3.1.1 Stock Trend Prediction (STP) Framework
3.1.2 Pseudo Code of the STP Algorithm
3.1.3 Details of the STP Algorithm
3.2 GA-Based Stock Trend Prediction Approach
3.2.1 GA-Based STP (GASTP) Framework
3.2.2 Pseudo Code of the GASTP Algorithm
3.2.3 Components of the GASTP Algorithm
CHAPTER 4 Experimental Result
4.1 Data Description
4.2 Experimental Result for STP
4.3 Experimental Result for GASTP
CHAPTER 5 Conclusion and Future Work
References
Appendix A: Traditional Chinese Taiwan Stock Market Exchange User-Defined Dictionary

LIST OF FIGURES
Figure 1: Proposed Stock Trend Prediction framework.
Figure 2: Pseudo code of the STP algorithm.
Figure 3: GA-based STP (GASTP) framework.
Figure 4: Pseudo code of the GASTP algorithm.
Figure 5: Encoding scheme of GASTP.
Figure 6: 2317.TW stock trend.
Figure 7: 2327.TW stock trend.
Figure 8: 2330.TW stock trend.
Figure 9: 2474.TW stock trend.
Figure 10: 2497.TW stock trend.
Figure 11: 3008.TW stock trend.

LIST OF TABLES
Table 1: TWSE trading volume by type of investors.
Table 2: sources and length of financial corpus.
Table 3: Summary of training data.
Table 4: Summary of testing data.
Table 5: Experimental result of STP algorithm with SVM (with/ without technical indicator).
Table 6: Features captured by 2-word combinations.
Table 7: Experimental result of STP algorithm with NB (with/ without technical indicator).
Table 8: Obtained hyperparameters by GASTP.
Table 9: Top 20th selected features from GASTP.
Table 10: Results of GASTP (without Technical Indicators)
Table 11: Results of GASTP (with Technical Indicators).
Table 12: Result of GASTP with NB (without technical indicators).
Table 13: Result of GASTP with NB (with technical indicators).
Table 14: Compared result of the two classifiers.
Table 15: Compared result of STP and GASTP with Naive Bayes.
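Thesis usage rights
  • The author grants a royalty-free license for the print copy to be reproduced by library readers for academic purposes; publicly available from 2024-07-23.
  • The author authorizes browsing and printing of the electronic full text; publicly available from 2024-07-23.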

