TY - JOUR
T1 - Is Image Encoding Beneficial for Deep Learning in Finance?
AU - Wang, Dan
AU - Wang, Tianrui
AU - Florescu, Ionut
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2022/4/15
Y1 - 2022/4/15
N2 - In 2012, the Securities and Exchange Commission (SEC) mandated that all corporate filings for any company doing business in the U.S. be entered into the Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system. In this work, we investigate ways to analyze the data available through the EDGAR database. Such analysis may help portfolio managers (pension funds, mutual funds, insurance, and hedge funds) gain automated insights into the companies they invest in and better manage their portfolios. The analysis is based on artificial neural networks applied to the data. In particular, the convolutional neural network (CNN) architecture, one of the most popular machine learning methods, was originally developed to interpret and classify images and is now being used to interpret financial data. This work investigates the best way to input data collected from SEC filings into a CNN architecture. We incorporate accounting principles and mathematical methods into the design of three image encoding methods. Specifically, two methods are derived from accounting principles (sequential arrangement and category chunk arrangement), and one uses a purely mathematical technique, the Hilbert vector arrangement (HVA). We analyze fundamental financial data as well as financial ratio data, and we study companies from the financial, healthcare, and information technology sectors in the United States. We find that using imaging techniques to input data into a CNN works better for financial ratio data but is not significantly better than using the 1-D input directly for fundamental data. We do not find the HVA technique to be significantly better than the other imaging techniques.
AB - In 2012, the Securities and Exchange Commission (SEC) mandated that all corporate filings for any company doing business in the U.S. be entered into the Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system. In this work, we investigate ways to analyze the data available through the EDGAR database. Such analysis may help portfolio managers (pension funds, mutual funds, insurance, and hedge funds) gain automated insights into the companies they invest in and better manage their portfolios. The analysis is based on artificial neural networks applied to the data. In particular, the convolutional neural network (CNN) architecture, one of the most popular machine learning methods, was originally developed to interpret and classify images and is now being used to interpret financial data. This work investigates the best way to input data collected from SEC filings into a CNN architecture. We incorporate accounting principles and mathematical methods into the design of three image encoding methods. Specifically, two methods are derived from accounting principles (sequential arrangement and category chunk arrangement), and one uses a purely mathematical technique, the Hilbert vector arrangement (HVA). We analyze fundamental financial data as well as financial ratio data, and we study companies from the financial, healthcare, and information technology sectors in the United States. We find that using imaging techniques to input data into a CNN works better for financial ratio data but is not significantly better than using the 1-D input directly for fundamental data. We do not find the HVA technique to be significantly better than the other imaging techniques.
KW - Accounting principles
KW - convolutional neural network (CNN)
KW - corporate credit rating
KW - input features encoding
UR - http://www.scopus.com/inward/record.url?scp=85128344852&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85128344852&partnerID=8YFLogxK
U2 - 10.1109/JIOT.2020.3030492
DO - 10.1109/JIOT.2020.3030492
M3 - Article
AN - SCOPUS:85128344852
VL - 9
SP - 5617
EP - 5628
JO - IEEE Internet of Things Journal
JF - IEEE Internet of Things Journal
IS - 8
ER -