§ Thesis Bibliographic Record
System ID U0002-3006201901284900
DOI 10.6846/TKU.2019.01003
Title (Chinese) 利用遺傳演算法以中文新聞與技術指標為基礎的股票趨勢預測之研究
Title (English) A Study on Stock Trend Prediction based on Chinese News and Technical Indicators Using Genetic Algorithms
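Institution Tamkang University
Department (Chinese) In-service Master's Program, Department of Computer Science and Information Engineering (資訊工程學系碩士在職專班)
Department (English) Department of Computer Science and Information Engineering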
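Academic year 107 (ROC calendar; 2018-2019)
Semester 2
Publication year 108 (ROC calendar; 2019)
Author (Chinese) 石平
Author (English) Ping Shih
Student ID 706410072
Degree Master's
Language English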
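Defense date 2019-06-27
Pages 56
Oral defense committee Advisor - 陳俊豪
Member - 林威成
Member - 陳洳瑾
Keywords (Chinese) Chinese Text Mining (中文文字探勘)
Genetic Algorithm (基因演算法)
Trading Strategy (交易策略)
Technical Analysis (技術分析)
Stock Price Movement Analysis (股價漲跌分析)
Keywords (English) Genetic Algorithms
Chinese Text Mining
Trading Strategy
Technical Analysis
Stock Trend Prediction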
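Abstract (translated from Chinese)
Stock market prediction has long been an attractive topic and has motivated a large body of academic research. Prior studies show that using financial news to predict the effects of related events, gauge investor sentiment, and make investment decisions accordingly is a reasonable and practicable approach. This study attempts to predict stock price trends from Chinese financial news and derives a trading strategy that combines news factors with technical analysis indicators. The experimental results show that this trading strategy outperforms a simple buy-and-hold strategy, and that Chinese financial news has a certain degree of predictive power for the stock market.
  We also examine the benefit of applying the 2-word combination feature extraction method to Chinese. The experiments confirm that, owing to the syntactic structure of Chinese text and differences in text preprocessing, the method does not perform as well on Chinese as it does on English text.
  During the experiments we found that parameter settings play an important role in feature selection. We therefore introduced a genetic algorithm to improve the performance of the trading strategy, using it to find the best balance between the uniqueness and the predictive power of Chinese terms. We also added technical analysis indicators to find the best buy points when news text mining and technical indicators are combined.
  The experimental results show that the proposed trading strategy outperforms not only the buy-and-hold strategy but also the news-based trading strategy before optimization.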
Abstract (English)
Stock market prediction is an attractive topic that has inspired countless studies. Using financial news articles to forecast the effect of certain events, understand investors' emotions, and react accordingly has been shown to be viable in the existing literature. In this study, we used Chinese financial news in an attempt to predict stock price movement and to derive a trading strategy based on news factors and technical indicators. The results show that our proposed news-based trading strategy outperforms a simple buy-and-hold strategy, indicating that Chinese financial news possesses a reasonable amount of predictive power over stock price movement.
	We also examined the use of 2-word combination feature extraction on Chinese text. Our experiments show that, compared with English, Chinese does not benefit as much from this technique because of its syntactic structure and text preprocessing method.
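As a rough illustration of the feature type mentioned above, the following sketch assumes that a "2-word combination" is an ordered pair of words that may have a few other words between them, so that a gap of zero reduces to ordinary bigrams. This record does not give the thesis's exact definition, so the function name `two_word_combinations`, the `max_gap` parameter, and the sample tokens are all hypothetical. The input is assumed to be already segmented into words, a step Chinese text needs before any pairing.

```python
def two_word_combinations(tokens, max_gap=2):
    """Return ordered token pairs separated by at most `max_gap` other tokens.

    Assumption: this is one plausible reading of "2-word combination";
    with max_gap=0 the result is the list of ordinary bigrams.
    """
    pairs = []
    for i, left in enumerate(tokens):
        # The right-hand word may sit up to max_gap + 1 positions after the left one.
        last = min(i + max_gap + 2, len(tokens))
        for j in range(i + 1, last):
            pairs.append((left, tokens[j]))
    return pairs


# Made-up, pre-segmented headline used only for illustration.
tokens = ["台積電", "營收", "創", "新高"]
for pair in two_word_combinations(tokens, max_gap=1):
    print(pair)
```

Widening `max_gap` grows the feature set quickly, which is one reason a later feature selection step matters for this kind of representation.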
	While conducting our experiments, we discovered that hyperparameter settings play an important part in feature selection. Hence, we adopted a Genetic Algorithm approach to improve feature selection on our input dataset by optimizing the balance between uniqueness and predictive power. We also included some technical indicators in the Genetic Algorithm in order to examine the optimal trading timing when technical indicators and financial news articles work in tandem.
	The results show that our proposed algorithm performs better than both the simple buy-and-hold strategy and our original stock trend prediction algorithm.
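The idea of tuning hyperparameters with a genetic algorithm can be sketched as follows. This is a minimal, generic real-coded GA and not the GASTP algorithm itself: the two genes (loosely standing for a uniqueness weight and a predictive-power weight), the operators, and `toy_fitness` are assumptions chosen for illustration. In the thesis the fitness would presumably come from evaluating the trading strategy, which this record does not describe in detail.

```python
import random


def toy_fitness(chromosome):
    # Hypothetical stand-in for a strategy's performance; it peaks at (0.3, 0.7).
    uniqueness, predictive = chromosome
    return -((uniqueness - 0.3) ** 2 + (predictive - 0.7) ** 2)


def run_ga(fitness, n_genes=2, pop_size=20, generations=40,
           crossover_rate=0.8, mutation_rate=0.2, seed=0):
    """Maximize `fitness` over chromosomes whose genes lie in [0, 1]."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        next_population = [best[:]]  # elitism: keep the best chromosome so far
        while len(next_population) < pop_size:
            # Tournament selection: the fitter of three random candidates wins.
            parent_a = max(rng.sample(population, 3), key=fitness)
            parent_b = max(rng.sample(population, 3), key=fitness)
            child = parent_a[:]
            if rng.random() < crossover_rate:
                # Arithmetic crossover: blend the two parents gene by gene.
                w = rng.random()
                child = [w * a + (1 - w) * b for a, b in zip(parent_a, parent_b)]
            # Gaussian mutation on some genes, clipped back into [0, 1].
            child = [
                min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                if rng.random() < mutation_rate else g
                for g in child
            ]
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
    return best


print(run_ga(toy_fitness))
```

Because the best chromosome is copied unchanged into each new generation, the best fitness never decreases from one generation to the next. Replacing `toy_fitness` with a back-tested return would be the natural next step, at a much higher cost per evaluation.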
Table of Contents
CONTENTS
CHAPTER 1	Introduction
1.1	Motivation
1.2	Contribution
1.3	Dissertation Overview
CHAPTER 2	Literature Review and General Context
2.1	Literature Review
2.1.1	Market Predictability
2.1.2	Source of Corpus
2.1.3	Textual Features
2.2	General Context
2.2.1	Text Mining
2.2.2	Text Pre-processing
2.2.3	Feature Extraction
2.2.4	Feature Selection
2.2.5	Machine Learning Algorithms
2.2.6	Genetic Algorithm (GA)
2.3	Summary
CHAPTER 3	Proposed Approaches
3.1	Proposed Stock Trend Prediction Approach Based on Traditional Chinese News and Technical Indicators
3.1.1	Stock Trend Prediction (STP) Framework
3.1.2	Pseudo Code of the STP Algorithm
3.1.3	Details of the STP Algorithm
3.2	GA-Based Stock Trend Prediction Approach
3.2.1	GA-Based STP (GASTP) Framework
3.2.2	Pseudo Code of the GASTP Algorithm
3.2.3	Components of the GASTP Algorithm
CHAPTER 4	Experimental Result
4.1	Data Description
4.2	Experimental Result for STP
4.3	Experimental Result for GASTP
CHAPTER 5	Conclusion and Future Work
References
Appendix A: Traditional Chinese Taiwan Stock Market Exchange User-Defined Dictionary

LIST OF FIGURES
Figure 1: Proposed Stock Trend Prediction framework.
Figure 2: Pseudo code of the STP algorithm.
Figure 3: GA-based STP (GASTP) framework.
Figure 4: Pseudo code of the GASTP algorithm.
Figure 5: Encoding scheme of GASTP.
Figure 6: 2317.TW stock trend.
Figure 7: 2327.TW stock trend.
Figure 8: 2330.TW stock trend.
Figure 9: 2474.TW stock trend.
Figure 10: 2497.TW stock trend.
Figure 11: 3008.TW stock trend.

LIST OF TABLES
Table 1: TWSE trading volume by type of investors.
Table 2: Sources and length of financial corpus.
Table 3: Summary of training data.
Table 4: Summary of testing data.
Table 5: Experimental result of STP algorithm with SVM (with/without technical indicator).
Table 6: Features captured by 2-word combinations.
Table 7: Experimental result of STP algorithm with NB (with/without technical indicator).
Table 8: Obtained hyperparameters by GASTP.
Table 9: Top 20 selected features from GASTP.
Table 10: Results of GASTP (without Technical Indicators).
Table 11: Results of GASTP (with Technical Indicators).
Table 12: Result of GASTP with NB (without technical indicators).
Table 13: Result of GASTP with NB (with technical indicators).
Table 14: Compared result of the two classifiers.
Table 15: Compared result of STP and GASTP with Naive Bayes.
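Full-text access permissions
On campus
Print thesis to be made public 5 years after submission of the authorization form
Consent given to make the electronic full text publicly available on campus
On-campus electronic thesis to be made public 5 years after submission of the authorization form
Off campus
Authorization granted
Off-campus electronic thesis to be made public 5 years after submission of the authorization form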
