Use of machine learning to detect misreporting
Performance and interpretability of algorithm-based identification of erroneous financial reporting
Incorrect and, in particular, fraudulent financial reporting is comparatively rare. When it does occur, however, it brings considerable financial losses for investors, liability risks for auditors, and reputational damage for the responsible enforcement authorities. Errors often remain undetected for years, so the extent of the damage can grow substantially over time.
Over the past two decades in particular, the development of models to identify erroneous reporting has made significant progress, incorporating a wide range of algorithms and increasing amounts of data and data types.
Despite this progress, two critical challenges remain. First, existing models regularly produce a high proportion of false positives, which creates a considerable additional audit burden, particularly for auditors and enforcement agencies. Second, many models remain difficult to interpret: black-box models often make relatively good predictions (high sensitivity) but offer few starting points for further investigation.
The project therefore investigates the potential and limitations of algorithm-based models for identifying errors or fraud in financial reporting, with a special focus on reducing the false positive rate and improving the interpretability of the results.
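The weight of the false-positive problem follows largely from how rare misreporting is, and a short calculation can make this concrete. The Python sketch below is purely illustrative: the function name `screening_metrics` and all input values (number of firms, prevalence, sensitivity, false positive rate) are hypothetical assumptions chosen for the example, not figures from the project.

```python
def screening_metrics(n_firms, prevalence, sensitivity, false_positive_rate):
    """Expected outcomes when a detection model screens n_firms reports.

    All inputs are hypothetical illustration values, not empirical results.
    """
    misreporting = n_firms * prevalence           # firms that actually misreport
    clean = n_firms - misreporting                # firms that report correctly
    true_pos = misreporting * sensitivity         # misreporting firms flagged
    false_pos = clean * false_positive_rate       # clean firms flagged in error
    precision = true_pos / (true_pos + false_pos) # share of flags that are real
    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "precision": precision,
    }


# Assumed scenario: 10,000 firms, 1% misreport, the model catches 80% of
# those cases and wrongly flags 10% of the correctly reporting firms.
result = screening_metrics(10_000, 0.01, 0.80, 0.10)
print(result)
```

Under these assumed values, roughly 80 genuine cases are flagged alongside about 990 false alarms, so fewer than one in ten flagged firms actually misreports. Even a model with good sensitivity can therefore generate a heavy follow-up workload when the event is rare.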