TY - GEN
T1 - Big Data architecture for IT incident management
AU - Liu, Rong
AU - Li, Qicheng
AU - Li, Feng
AU - Mei, Lijun
AU - Lee, Juhnyoung
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2014/11/17
Y1 - 2014/11/17
N2 - IT incident management aims to restore normal service quality and availability of IT systems after interruptions. IT incidents often have complicated causes aggregated from an IT environment composed of thousands of interdependent components. Incident diagnosis therefore requires collecting and analyzing large volumes of data about these components, often in real time, to find suspected causes. It is extremely difficult to fulfill this requirement using traditional techniques. In this paper, we propose a new analysis architecture using Big Data techniques. This architecture leverages stream computing and MapReduce techniques to analyze data from various data sources, uses NoSQL databases to store incident-related documents and their relationships, and further utilizes other analytical techniques to examine the documents for root causes and failure prediction. We demonstrate this approach using a real-world example and present evaluation results from a recent pilot study.
AB - IT incident management aims to restore normal service quality and availability of IT systems after interruptions. IT incidents often have complicated causes aggregated from an IT environment composed of thousands of interdependent components. Incident diagnosis therefore requires collecting and analyzing large volumes of data about these components, often in real time, to find suspected causes. It is extremely difficult to fulfill this requirement using traditional techniques. In this paper, we propose a new analysis architecture using Big Data techniques. This architecture leverages stream computing and MapReduce techniques to analyze data from various data sources, uses NoSQL databases to store incident-related documents and their relationships, and further utilizes other analytical techniques to examine the documents for root causes and failure prediction. We demonstrate this approach using a real-world example and present evaluation results from a recent pilot study.
KW - Big Data
KW - Incident management
KW - MapReduce
KW - NoSQL
KW - co-occurrence
KW - reoccurrence
KW - stream computing
UR - http://www.scopus.com/inward/record.url?scp=84915731880&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84915731880&partnerID=8YFLogxK
U2 - 10.1109/SOLI.2014.6960762
DO - 10.1109/SOLI.2014.6960762
M3 - Conference contribution
AN - SCOPUS:84915731880
T3 - Proceedings of 2014 IEEE International Conference on Service Operations and Logistics, and Informatics, SOLI 2014
SP - 424
EP - 429
BT - Proceedings of 2014 IEEE International Conference on Service Operations and Logistics, and Informatics, SOLI 2014
T2 - 2014 IEEE International Conference on Service Operations and Logistics, and Informatics, SOLI 2014
Y2 - 8 October 2014 through 10 October 2014
ER -