Exploiting social media with higher-order Factorization Machines: statistical arbitrage on high-frequency data of the S&P 500 |
| |
Authors: | Julian Knoll, Johannes Stübinger, Michael Grottke |
| |
Affiliation: | 1. Technische Hochschule Nürnberg Georg Simon Ohm, D-90489 Nürnberg, Germany; 2. Department of Statistics and Econometrics, University of Erlangen–Nürnberg, D-90403 Nürnberg, Germany; 3. GfK SE, D-90419 Nürnberg, Germany |
| |
Abstract: | Over the past 15 years, a number of studies have used text mining to predict stock market data. Two recent publications employed support vector machines and second-order Factorization Machines, respectively, to this end. However, these approaches either completely neglect interactions between the features extracted from the text, or they only account for second-order interactions. In this paper, we apply higher-order Factorization Machines, for which efficient training algorithms have only been available since 2016. As Factorization Machines require hyperparameters to be specified, we also introduce a novel adaptive-order algorithm for automatically determining them. Our study is the first to use social media data for predicting minute-by-minute stock returns, namely those of the S&P 500 constituents. We show that, unlike a trading strategy employing support vector machines, Factorization-Machine-based strategies attain positive returns after transaction costs for the years 2014 and 2015. In particular, the approach applying the adaptive-order algorithm outperforms classical approaches with respect to a multitude of criteria, and it features very favorable characteristics. |
| |
Keywords: | Finance; Factorization Machine; Social media; Statistical arbitrage; High-frequency trading; Machine learning |
|
|
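The abstract distinguishes models that ignore feature interactions, models with second-order interactions, and higher-order Factorization Machines. As a rough illustration only, the sketch below evaluates the standard higher-order Factorization Machine prediction by brute force: a bias, a linear term, and, for each interaction order, a sum over all feature subsets of that size, each weighted by a sum of products of latent factors. The function name `fm_predict`, the `factors` dictionary layout, and the toy numbers are assumptions made for this example. It is neither the efficient training algorithm nor the adaptive-order algorithm mentioned in the abstract, and its cost grows combinatorially with the order.

```python
from itertools import combinations

import numpy as np


def fm_predict(x, w0, w, factors):
    """Brute-force prediction of a higher-order Factorization Machine.

    x       : feature vector of length n
    w0      : global bias
    w       : linear weights, length n
    factors : dict mapping an interaction order (2, 3, ...) to a latent
              factor matrix of shape (n, k); hypothetical layout for this sketch
    """
    x = np.asarray(x, dtype=float)
    y = w0 + float(np.dot(w, x))
    for order, V in factors.items():
        # Enumerate every subset of `order` distinct features.
        for idx in combinations(range(len(x)), order):
            idx = list(idx)
            # Interaction weight: sum over latent dimensions of the
            # product of the selected features' factors.
            weight = np.prod(V[idx, :], axis=0).sum()
            y += weight * np.prod(x[idx])
    return float(y)


# Toy example with three features and interactions up to order 3.
x = [1.0, 2.0, 3.0]
w = np.array([1.0, 0.0, -1.0])
factors = {2: np.ones((3, 1)), 3: 2.0 * np.ones((3, 1))}
print(fm_predict(x, 0.5, w, factors))
```

Passing an empty `factors` dictionary reduces the model to a purely linear one without interactions, and keeping only the key `2` gives a second-order Factorization Machine.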