
Identifying financial statement fraud with decision rules obtained from Modified Random Forest

Byungdae An (Korea University Business School, Seoul, Republic of Korea)
Yongmoo Suh (Korea University Business School, Seoul, Republic of Korea)

Data Technologies and Applications

ISSN: 2514-9288

Article publication date: 14 May 2020

Issue publication date: 2 June 2020


Abstract

Purpose

Financial statement fraud (FSF) committed by a company suggests that the company's current condition may not be healthy. Detecting FSF is therefore important, because such companies tend to conceal unfavorable information, which causes great losses to various stakeholders. The objective of this paper is to propose a novel approach to building a classification model for identifying FSF that achieves high classification performance and from which human-readable rules can be extracted to explain why a company is likely to commit FSF.

Design/methodology/approach

Having prepared multiple sub-datasets to cope with the class imbalance problem, we build a set of decision trees for each sub-dataset; select a subset of that set as the model for the sub-dataset by removing every tree whose accuracy is below the average accuracy of all trees in the set; and then select, from among these models, the one with the best accuracy. We call the resulting model MRF (Modified Random Forest). Given a new instance, we extract rules from the MRF model to explain why the company corresponding to that instance is or is not likely to commit FSF.
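The procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the sub-datasets are prepared, which tree learner is used, on which data accuracy is measured, or how rules are extracted. The sketch assumes random undersampling of the majority class, one-split trees (stumps) trained on bootstrap samples, accuracy measured on a separate validation set, and rules read off the trees that agree with the forest's vote. All function names are hypothetical.

```python
import random

def make_balanced_subsets(X, y, n_subsets, rng):
    """ASSUMPTION: each sub-dataset pairs all minority rows (label 1)
    with an equally sized random sample of majority rows (label 0)."""
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    subsets = []
    for _ in range(n_subsets):
        chosen = minority + rng.sample(majority, len(minority))
        subsets.append(([X[i] for i in chosen], [y[i] for i in chosen]))
    return subsets

def stump_predict(stump, row):
    feature, threshold, label_if_le = stump
    return label_if_le if row[feature] <= threshold else 1 - label_if_le

def fit_stump(X, y, rng):
    """ASSUMPTION: a one-split tree on a bootstrap sample and one random feature."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    feature = rng.randrange(len(X[0]))
    best_hits, best_stump = -1, None
    for threshold in sorted({X[i][feature] for i in idx}):
        for label_if_le in (0, 1):
            stump = (feature, threshold, label_if_le)
            hits = sum(stump_predict(stump, X[i]) == y[i] for i in idx)
            if hits > best_hits:
                best_hits, best_stump = hits, stump
    return best_stump

def forest_predict(trees, row):
    """Majority vote; ties go to class 0."""
    votes = sum(stump_predict(t, row) for t in trees)
    return 1 if 2 * votes > len(trees) else 0

def count_hits(predict, X, y):
    return sum(predict(row) == label for row, label in zip(X, y))

def build_mrf(X, y, X_val, y_val, n_subsets=5, n_trees=25, seed=0):
    rng = random.Random(seed)
    best_model, best_acc = None, -1.0
    for X_sub, y_sub in make_balanced_subsets(X, y, n_subsets, rng):
        trees = [fit_stump(X_sub, y_sub, rng) for _ in range(n_trees)]
        hits = [count_hits(lambda r, t=t: stump_predict(t, r), X_val, y_val)
                for t in trees]
        # Keep trees whose accuracy is at least the average accuracy.
        # Integer comparison (h * n >= total) avoids floating-point rounding.
        total = sum(hits)
        kept = [t for t, h in zip(trees, hits) if h * len(trees) >= total]
        acc = count_hits(lambda r: forest_predict(kept, r), X_val, y_val) / len(X_val)
        if acc > best_acc:  # keep the best pruned forest across sub-datasets
            best_model, best_acc = kept, acc
    return best_model, best_acc

def rules_for(trees, row):
    """ASSUMPTION: explain a prediction with the rules of the agreeing trees."""
    pred = forest_predict(trees, row)
    rules = []
    for stump in trees:
        if stump_predict(stump, row) == pred:
            feature, threshold, _ = stump
            op = "<=" if row[feature] <= threshold else ">"
            rules.append(f"x[{feature}] {op} {threshold} -> class {pred}")
    return pred, rules
```

Calling `build_mrf(X, y, X_val, y_val)` returns the pruned forest for the best sub-dataset together with its validation accuracy, and `rules_for(model, row)` returns the predicted class for one company along with the threshold rules that support it. In a realistic setting the stumps would be replaced by full decision trees, so each extracted rule would be a conjunction of conditions along a root-to-leaf path.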

Findings

Experimental results show that the MRF classifier outperformed the benchmark models. The results also revealed that all the profit-related variables are among the most important indicators of FSF, and they identified two new variables related to gross profit that had not been considered in previous studies on FSF.

Originality/value

This study proposes a method of building a classification model that shows outstanding performance and provides decision rules that can be used to explain the classification results. In addition, the paper suggests a new way to resolve the class imbalance problem.


Acknowledgements

This research is partially supported by the Korea University Business School Research Grant.

Citation

An, B. and Suh, Y. (2020), "Identifying financial statement fraud with decision rules obtained from Modified Random Forest", Data Technologies and Applications, Vol. 54 No. 2, pp. 235-255. https://doi.org/10.1108/DTA-11-2019-0208

Publisher


Emerald Publishing Limited

Copyright © 2020, Emerald Publishing Limited
