Research on intelligent financial statement analysis and anomaly identification technology based on machine learning

Abstract

To assess a company's financial status accurately and identify potential anomalies, this paper first applies unsupervised classification based on Support Vector Machines to financial transaction data, automatically separating the data into normal and abnormal categories. Histograms are then combined with LightGBM to fuse data from multiple sources quickly. The most suitable first layer is selected from among different algorithms, and the outputs of these algorithms are combined with anomaly features common across industries as inputs to the second identification layer, which uses LightGBM. With this two-layer structure, the model accounts for both industry-specific characteristics and common anomaly features. Empirical results show that, for intelligent financial statement generation, the sensitivity of the proposed model reaches 99.99% at a specificity of 41.25%, and its accuracy is as high as 0.98 when handling private financial information, macroeconomic information, and market information. For financial transaction anomaly identification, the anomalous weeks are identified as weeks 24, 29, 34, and 36, and the fusion of multi-source data effectively identifies large-amount transactions, fluctuating transactions, and other suspicious abnormal transactions.

Keywords: support vector machine; unsupervised classification; anomaly features; multi-source data; intelligent financial statement
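
The abstract reports sensitivity at a stated specificity. As a reading aid, the following minimal sketch computes both quantities from binary labels using the standard confusion-matrix definitions. It is not the evaluation code of this paper: the function name `sensitivity_specificity` and the label vectors are hypothetical, chosen for illustration only, with 1 denoting an abnormal record and 0 a normal one.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels.

    Assumption: 1 marks an abnormal record, 0 a normal one.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # abnormal, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # abnormal, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal, flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels, for illustration only (not data from this paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Sensitivity is the share of abnormal records that are flagged, and specificity is the share of normal records that are passed. Lowering the decision threshold of a detector typically raises the first at the cost of the second, which is why the two figures are reported together.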