TY - GEN
T1 - Balance sheet outlier detection using a graph similarity algorithm
AU - Yang, Steve
AU - Cogill, Randy
PY - 2013
Y1 - 2013
AB - Graph similarity measurement has been used in many applications, such as computational biology, text mining, pattern recognition, and computer vision. In this paper, we apply similarity measurement on graphs to measure structural differences in financial statements. Unconventional financial statement structures may potentially reveal deceptive intention of hiding certain information while making technically 'correct' financial statements. Furthermore, unconventional financial statements may also lead to investment opportunities if legitimacy is not questioned. We construct an algorithm based on the metric of string edit distance as an approximation of graph similarity, and apply the Levenshtein algorithm with modified string edit costs to measure string edit distance. We demonstrate the effectiveness of this algorithm in capturing the sensitive changes of balance sheet structures by applying the algorithm in two experiments. The first experiment shows the algorithm is sensitive to all three basic edits (namely deletion, insertion and substitution) on a particular balance sheet, and the second experiment shows more than 90% clustering accuracy on real balance sheets.
KW - Balance sheet
KW - Graph similarity metric
KW - Hierarchical clustering
KW - Outliers detection
KW - String edit distance
KW - XBRL
UR - http://www.scopus.com/inward/record.url?scp=84885985902&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84885985902&partnerID=8YFLogxK
U2 - 10.1109/CIFEr.2013.6611709
DO - 10.1109/CIFEr.2013.6611709
M3 - Conference contribution
AN - SCOPUS:84885985902
SN - 9781467359214
T3 - Proceedings of the 2013 IEEE Conference on Computational Intelligence for Financial Engineering and Economics, CIFEr 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013
SP - 135
EP - 142
BT - Proceedings of the 2013 IEEE Conference on Computational Intelligence for Financial Engineering and Economics, CIFEr 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013
T2 - 2013 IEEE Conference on Computational Intelligence for Financial Engineering and Economics, CIFEr 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013
Y2 - 16 April 2013 through 19 April 2013
ER -