TY - JOUR
T1 - Corpus-level and Concept-based Explanations for Interpretable Document Classification
AU - Shi, Tian
AU - Zhang, Xuchao
AU - Wang, Ping
AU - Reddy, Chandan K.
N1 - Publisher Copyright:
© 2021 Association for Computing Machinery.
PY - 2021/10/22
Y1 - 2021/10/22
AB - Using attention weights to identify information that is important for a model's decision making is a popular approach to interpreting attention-based neural networks. In practice, this is commonly realized by generating a heat map for every document based on attention weights. However, this interpretation method is fragile, and contradictory examples are easy to find. In this article, we propose a corpus-level explanation approach, which aims to capture causal relationships between keywords and model predictions by learning the importance of keywords for predicted labels across a training corpus based on attention weights. Building on this idea, we further propose a concept-based explanation method that can automatically learn higher-level concepts and their importance to model prediction tasks. Our concept-based explanation method is built upon a novel Abstraction-Aggregation Network (AAN), which can automatically cluster important keywords during an end-to-end training process. We apply these methods to the document classification task and show that they are powerful in extracting semantically meaningful keywords and concepts. Our consistency analysis results based on an attention-based Naïve Bayes classifier (NBC) also demonstrate that these keywords and concepts are important for model predictions.
KW - Attention mechanism
KW - concept-based explanation
KW - document classification
KW - model interpretation
KW - sentiment classification
UR - http://www.scopus.com/inward/record.url?scp=85159151745&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85159151745&partnerID=8YFLogxK
U2 - 10.1145/3477539
DO - 10.1145/3477539
M3 - Article
AN - SCOPUS:85159151745
SN - 1556-4681
VL - 16
JO - ACM Transactions on Knowledge Discovery from Data
JF - ACM Transactions on Knowledge Discovery from Data
IS - 3
M1 - 48
ER -