
Exploring the connection between sentiment analysis and stock market

Student: Chenjie Chen


Abstract
News is an important source of market insight for investors. Sentiment analysis of news has become a useful tool for quantifying market sentiment and has shown a potential correlation with stock prices. This study explores the relationship between sentiment in Chinese and English news and stock price volatility, as well as the potential predictive power of that sentiment. To this end, we collected Chinese news data from 2009 to 2023 and English news data from 2009 to 2020, covering two major stock markets: the Shanghai Stock Exchange (SSE) and the Standard and Poor’s 500 (S&P 500) Index. To account for language differences, we constructed separate sentiment analysis methods for Chinese and English news. We applied a variety of sentiment analysis methods, from lexicon-based techniques to machine learning models, including simple tools such as VADER and TextBlob and more complex models such as BERT. We also introduced machine learning models, such as Long Short-Term Memory (LSTM) and Random Forest (RF), to explore the potential relationship between sentiment and stock prices. The study examines the performance and potential predictive power of news sentiment analysis in different language contexts, and also assesses the impact of global events (such as the COVID-19 pandemic) on sentiment and stock prices. Preliminary results show that the relatively complex BERT model does not guarantee a high correlation with stock prices, and that simple models may perform better in sentiment analysis. The sentiment of both Chinese and English news is moderately correlated with long-term stock price changes, and sentiment analysis also has some positive effect on prediction. When it comes to predicting trends, however, the relatively complex FinBERT remains relatively accurate, while the performance of the simple models drops significantly.


Master's final work