TY - JOUR
T1 - An Alzheimers disease related genes identification method based on multiple classifier integration
AU - Miao, Yu
AU - Jiang, Huiyan
AU - Liu, Huiling
AU - Yao, Yu-Dong
N1 - Publisher Copyright: © 2017 Elsevier B.V.
PY - 2017/10
Y1 - 2017/10
AB - Background and Objective: Alzheimer's disease (AD) is a fatal neurodegenerative disease with an insidious onset. The set of AD-related genes (ADGs) is not yet fully understood. The National Center for Biotechnology Information (NCBI) provides an AD dataset of 22,283 genes, of which 71 have been identified as ADGs. However, the remaining 22,212 genes may still include ADGs that have not yet been identified. This paper aims to identify additional ADGs using machine learning techniques. Methods: To improve the accuracy of ADG identification, we propose a gene identification method based on multiple classifier integration. First, a feature selection algorithm is applied to select the most relevant attributes. Second, a two-stage cascading classifier is developed to identify ADGs. The first stage of classification is based on the relevance vector machine; in the second stage, the results of three classifiers (support vector machine, random forest and extreme learning machine) are combined through voting. Results: According to our results, feature selection improves accuracy and reduces training time, and the voting-based classifier reduces classification errors. The proposed ADG identification system achieves accuracy, sensitivity and specificity of 78.77%, 83.10% and 74.67%, respectively. Using the proposed method, additional potential ADGs are identified and the top 13 genes (predicted ADGs) are presented. Conclusions: This paper presents an ADG identification method. The proposed method, which combines feature selection, a cascading classifier and majority voting, leads to higher specificity and significantly increases the accuracy and sensitivity of ADG identification. Potentially new ADGs are identified.
KW - Alzheimers disease
KW - Cascading classifier
KW - Feature selection
KW - Gene identification
KW - Majority voting
UR - http://www.scopus.com/inward/record.url?scp=85027867396&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85027867396&partnerID=8YFLogxK
U2 - 10.1016/j.cmpb.2017.08.006
DO - 10.1016/j.cmpb.2017.08.006
M3 - Article
C2 - 28859826
AN - SCOPUS:85027867396
SN - 0169-2607
VL - 150
SP - 107
EP - 115
JO - Computer Methods and Programs in Biomedicine
JF - Computer Methods and Programs in Biomedicine
ER -