TY - GEN
T1 - Understanding data completeness in network monitoring systems
AU - Korn, Flip
AU - Liu, Ruilin
AU - Wang, Hui
PY - 2012
Y1 - 2012
N2 - In many networks including Internet Service Providers, transportation monitoring systems and the electric grid, measurements from a set of objects are continuously taken over time and used for important decisions such as provisioning and pricing. It is therefore vital to understand the completeness and reliability of such data. Given the large volume of data generated by these systems, rather than enumerating the times and objects incurringmissing or spurious data, it is more effective to provide patterns (groups of tuples) concisely summarizing trends that may not otherwise be readily observable. In this paper, we define the Graph Tableau Discovery Problem where the measured tuples can be thought of as edges in a bipartite graph on an ordered attribute (time) and an unordered attribute (object identifiers). We define the problem of finding an optimal summary, show that it is NP-complete, and then provide a polynomial-time approximation algorithm with guarantees to find a good summary. Experiments on real and synthetic data demonstrate the effectiveness and efficiency of our approach.
AB - In many networks including Internet Service Providers, transportation monitoring systems and the electric grid, measurements from a set of objects are continuously taken over time and used for important decisions such as provisioning and pricing. It is therefore vital to understand the completeness and reliability of such data. Given the large volume of data generated by these systems, rather than enumerating the times and objects incurringmissing or spurious data, it is more effective to provide patterns (groups of tuples) concisely summarizing trends that may not otherwise be readily observable. In this paper, we define the Graph Tableau Discovery Problem where the measured tuples can be thought of as edges in a bipartite graph on an ordered attribute (time) and an unordered attribute (object identifiers). We define the problem of finding an optimal summary, show that it is NP-complete, and then provide a polynomial-time approximation algorithm with guarantees to find a good summary. Experiments on real and synthetic data demonstrate the effectiveness and efficiency of our approach.
UR - http://www.scopus.com/inward/record.url?scp=84874062630&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84874062630&partnerID=8YFLogxK
U2 - 10.1109/ICDM.2012.149
DO - 10.1109/ICDM.2012.149
M3 - Conference contribution
AN - SCOPUS:84874062630
SN - 9780769549057
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 359
EP - 368
BT - Proceedings - 12th IEEE International Conference on Data Mining, ICDM 2012
T2 - 12th IEEE International Conference on Data Mining, ICDM 2012
Y2 - 10 December 2012 through 13 December 2012
ER -