Graph Auto-Encoders for Financial Clustering

26 Nov 2021 · Edward Turner

Deep learning has shown remarkable results on Euclidean data (e.g. audio, images, text); however, this type of data is limited in the amount of relational information it can hold. In mathematics, more general relational data can be modelled as a graph, with Euclidean data retained as associated node or edge features. Due to the ubiquity of graph data, and its ability to hold multiple dimensions of information, graph deep learning has become a fast-emerging field. We look at applying and optimising graph deep learning on a finance graph to produce more informed clusters of companies. Clusters produced from multiple streams of data can be highly useful in quantitative finance: not only can the clusters be tailored to the specific task, but combining multiple streams allows for cross-source pattern recognition that would otherwise have gone unnoticed. This can provide financial institutions with an edge over competitors, which is crucial in the heavily optimised world of trading. In this paper we combine two data sources: news co-occurrence and stock price. We optimise our model to achieve an average testing precision of 78% and find a clear improvement in clustering capabilities when both data sources are used: measured against ground-truth Bloomberg clusters, cluster purity rises from 32% with vertex data alone and 42% with edge data alone to 64% when both are used. The framework we provide utilises unsupervised learning, which we view as key for future work due to the volume of unlabelled data in financial markets.
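The purity figures quoted above compare predicted clusters against a ground-truth grouping. As a minimal sketch, assuming the standard definition of purity (each predicted cluster is credited with its most common ground-truth label, and the credited counts are divided by the number of items), the metric could be computed as follows. The abstract does not state the exact formulation used in the paper, and the function name `cluster_purity` and the toy labels are hypothetical.

```python
from collections import Counter


def cluster_purity(predicted, ground_truth):
    """Standard cluster purity (assumed definition, not taken from the paper).

    predicted    -- one predicted cluster id per item
    ground_truth -- one reference label per item, in the same order
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same length")
    if not predicted:
        raise ValueError("at least one item is required")

    # Collect the ground-truth labels that fall into each predicted cluster.
    members = {}
    for cluster_id, label in zip(predicted, ground_truth):
        members.setdefault(cluster_id, []).append(label)

    # Credit each cluster with the size of its majority ground-truth label.
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(predicted)


# Toy example with made-up sector labels (illustrative only).
toy_predicted = [0, 0, 0, 1, 1, 1]
toy_truth = ["tech", "tech", "energy", "energy", "energy", "tech"]
toy_purity = cluster_purity(toy_predicted, toy_truth)  # 4 of 6 items match their cluster's majority label
```

Purity rewards clusterings with many small clusters (one cluster per item scores 1.0), so it is most informative when the number of predicted clusters is held comparable to the number of ground-truth groups.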
