Authors:
(1) Deborah Miori, Mathematical Institute, University of Oxford, Oxford, UK and Oxford-Man Institute of Quantitative Finance, Oxford, UK (Corresponding author: Deborah Miori, [email protected]);
(2) Constantin Petrov, Fidelity Investments, London, UK.
5 Conclusions
Starting from a curated selection of economic articles sourced from The Wall Street Journal, our research introduces an innovative and dynamic approach to dissecting news content. We use GPT-3.5 to extract the most salient entities within each article, which become the building blocks of a proposed series of graphs. These graphs track the co-occurrence of such entities among news on a weekly basis, and allow us to investigate how the topics discussed relate to one another over time. Network analysis techniques and fuzzy community detection are then used to design a comprehensive framework, which systematically unveils interpretable topics and surrounding narratives within news.
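As a rough illustration of the graph construction summarised above, the following minimal sketch counts how often pairs of entities appear in the same article within a given week. The function name `weekly_cooccurrence`, the week labels, and the toy entity lists are hypothetical and chosen only for this example; the entity extraction step itself (performed with GPT-3.5 in the study) is assumed to have already produced the lists.

```python
from collections import Counter, defaultdict
from itertools import combinations


def weekly_cooccurrence(articles):
    """Build one weighted co-occurrence edge list per week.

    `articles` is an iterable of (week_label, entities) pairs, where
    `entities` is the list of entities extracted from a single article.
    Returns {week_label: Counter({(entity_a, entity_b): weight})}.
    """
    graphs = defaultdict(Counter)
    for week, entities in articles:
        # Deduplicate and sort so each undirected edge has one canonical key.
        unique = sorted(set(entities))
        for u, v in combinations(unique, 2):
            # Edge weight = number of articles in which both entities appear.
            graphs[week][(u, v)] += 1
    return dict(graphs)


# Toy input (hypothetical): three articles spread over two weeks.
articles = [
    ("2023-W01", ["Fed", "inflation", "interest rates"]),
    ("2023-W01", ["Fed", "inflation", "labor market"]),
    ("2023-W02", ["oil", "OPEC", "inflation"]),
]
graphs = weekly_cooccurrence(articles)
```

Each weekly `Counter` can then be loaded into a graph library of choice as a weighted undirected graph, on which community detection and other network measures can be computed.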
The importance of the proposed investigations is highlighted by the results of the logistic regression models, with which we test whether there is a statistically significant connection between the features and structure of news and moments of dislocation within financial markets. As expected (and desired), lower sentiment within news is more likely to be associated with weeks of market dislocation. However, multiple features computed from our graph construct are also found to be significant, especially those related to the entropy of discussions and the consequent likelihood of contagion of sentiment, in both the contemporaneous and predictive scenarios. This suggests that the interconnectedness of news topics and the structure therein are meaningful aspects to analyse further within financial research, for which our proposed study aims to serve as a first baseline. Improving entity recognition, extending the corpus of news, and designing generalisation studies are examples of possible advances to pursue in this research branch.
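This section does not restate how the entropy of discussions is defined, so the sketch below should not be read as the study's feature. It only illustrates the general idea with plain Shannon entropy, under the assumption that a week's attention is summarised by a list of non-negative weights (for example, the total edge weight attached to each topic). The function name `shannon_entropy` and the toy weights are hypothetical.

```python
import math


def shannon_entropy(weights):
    """Shannon entropy (in bits) of the distribution implied by `weights`.

    Assumption: `weights` are non-negative counts; they are normalised
    to probabilities here. Empty or all-zero input yields 0.0.
    """
    total = sum(weights)
    if total <= 0:
        return 0.0
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


# Hypothetical weeks: attention focused on one topic vs. spread over four.
concentrated = shannon_entropy([97, 1, 1, 1])
dispersed = shannon_entropy([25, 25, 25, 25])
```

Under this reading, a week in which discussion is spread evenly over many topics has higher entropy than a week dominated by a single topic, which is the kind of structural signal that could enter a regression as an explanatory feature.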
As a final remark, we wish to point to the problem of network alignment, which is especially important in network biology [26]. Many related studies try to find a measure of similarity between proteins in different species, since similar protein structures often imply the same biological results. With a parallel approach, one could investigate more deeply whether equivalent structures among news (regardless of the actual “label” of the topic) indeed result in similar market reactions.
Acknowledgements
Deborah Miori’s research was supported by the EPSRC CDT in Mathematics of Random Systems (EPSRC Grant EP/S023925/1).
References
[1] Nadine Strauß, Rens Vliegenthart, and Piet Verhoeven. Intraday News Trading: The Reciprocal Relationships Between the Stock Market and Economic News. Communication Research, 45(7):1054–1077, 2018. PMID: 30443092.
[2] Fabrizio Lillo, Salvatore Miccichè, Michele Tumminello, Jyrki Piilo, and Rosario N. Mantegna. How news affects the trading behaviour of different categories of investors in a financial market. Quantitative Finance, 15(2):213–229, 2015.
[3] Eric C. So and Sean Wang. News-driven return reversals: Liquidity provision ahead of earnings announcements. Journal of Financial Economics, 114(1):20–35, 2014.
[4] Haizhou Qu and Dimitar Kazakov. Quantifying correlation between financial news and stocks. In 2016 IEEE Symposium Series on Computational Intelligence (SSCI), pages 1–6, 2016.
[5] Ryohei Hisano, Didier Sornette, Takayuki Mizuno, Takaaki Ohnishi, and Tsutomu Watanabe. High quality topic extraction from business news explains abnormal financial market volatility. CARF F-Series CARF-F-299, Center for Advanced Research in Finance, Faculty of Economics, The University of Tokyo, October 2012.
[6] Shaheen Syed and Marco Spruit. Full-text or abstract? Examining topic coherence scores using latent Dirichlet allocation. In 2017 IEEE International Conference on Data Science and Advanced Analytics (DSAA), pages 165–174, 2017.
[7] Weisi Chen, Fethi Rabhi, Wenqi Liao, and Islam Al-Qudah. Leveraging state-of-the-art topic modeling for news impact analysis on financial markets: A comparative study. Electronics, 12(12), 2023.
[8] Dimo Angelov. Top2Vec: Distributed representations of topics, 2020.
[9] Maarten Grootendorst. BERTopic: Neural topic modeling with a class-based TF-IDF procedure, 2022.
[10] Xinli Yu, Zheng Chen, Yuan Ling, Shujing Dong, Zongyi Liu, and Yanbin Lu. Temporal data meets LLM – explainable financial time series forecasting, 2023.
[11] Udit Gupta. GPT-InvestAR: Enhancing stock investment strategies through annual report analysis with large language models, 2023.
[12] Rick Steinert and Saskia Altmann. Linking microblogging sentiments to stock price movement: An application of GPT-4, 2023.
[13] Paolo Pasquariello. Financial market dislocations. The Review of Financial Studies, 27(6):1868–1914, 2014.
[14] OpenAI. GPT-4 technical report, 2023.
[15] Partha Pratim Ray. ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope. Internet of Things and Cyber-Physical Systems, 3:121–154, 2023.
[16] C.J. Hutto and Eric Gilbert. VADER: A parsimonious rule-based model for sentiment analysis of social media text. 01 2015.
[17] Mark E. J. Newman. Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23):8577–8582, 2006.
[18] Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008, Oct 2008.
[19] Shihua Zhang, Rui-Sheng Wang, and Xiang-Sun Zhang. Uncovering fuzzy community structure in complex networks. Physical Review E, 76(4):046103, 2007.
[20] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space, 2013.
[21] Aditya Grover and Jure Leskovec. node2vec: Scalable feature learning for networks, 2016.
[22] Eghbal Rahimikia, Stefan Zohren, and Ser-Huang Poon. Realised Volatility Forecasting: Machine Learning via Financial Word Embedding, 08 2021.
[23] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality. In C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems, volume 26. Curran Associates, Inc., 2013.
[24] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer. SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16:321–357, jun 2002.
[25] Yi-Jiao Zhang, Kai-Cheng Yang, and Filippo Radicchi. Systematic comparison of graph embedding methods in practical tasks. Phys. Rev. E, 104:044315, Oct 2021.
[26] Walter Nelson, Marinka Zitnik, Bo Wang, Jure Leskovec, Anna Goldenberg, and Roded Sharan. To Embed or Not: Network Embedding as a Paradigm in Computational Biology. Frontiers in Genetics, 10, 2019.
This paper is available on arXiv under the CC0 1.0 DEED license.