TY - CHAP
T1 - High performance big data clustering
AU - Agrawal, Ankit
AU - Patwary, Md Mostofa Ali
AU - Hendrix, William
AU - Liao, Wei Keng
AU - Choudhary, Alok
PY - 2013
Y1 - 2013
N2 - Scientific advances are collectively exploding the amount, diversity, and complexity of data becoming available. Our ability to collect huge amounts of data has greatly surpassed our analytical capacity to make sense of it. Efficient use of high performance computing techniques is critical for the success of the data-driven paradigm to scientific discovery. Data clustering is one of the fundamental analytics tasks heavily relied upon in many application domains, like astrophysics, climate science, bioinformatics, etc. In this book chapter, we illustrate the challenges and opportunities in mining big data using two recently developed scalable parallel clustering algorithms. Experimental results on millions of high-dimensional data points clustered in parallel on thousands of processor cores are also presented.
AB - Scientific advances are collectively exploding the amount, diversity, and complexity of data becoming available. Our ability to collect huge amounts of data has greatly surpassed our analytical capacity to make sense of it. Efficient use of high performance computing techniques is critical for the success of the data-driven paradigm to scientific discovery. Data clustering is one of the fundamental analytics tasks heavily relied upon in many application domains, like astrophysics, climate science, bioinformatics, etc. In this book chapter, we illustrate the challenges and opportunities in mining big data using two recently developed scalable parallel clustering algorithms. Experimental results on millions of high-dimensional data points clustered in parallel on thousands of processor cores are also presented.
KW - big data
KW - clustering
KW - density-based clustering
KW - hierarchical clustering
UR - http://www.scopus.com/inward/record.url?scp=84895107082&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84895107082&partnerID=8YFLogxK
U2 - 10.3233/978-1-61499-322-3-192
DO - 10.3233/978-1-61499-322-3-192
M3 - Chapter
AN - SCOPUS:84895107082
SN - 9781614993216
T3 - Advances in Parallel Computing
SP - 192
EP - 211
BT - Cloud Computing and Big Data
ER -