Parallel hierarchical clustering on shared memory platforms

William Hendrix, Md Mostofa Ali Patwary, Ankit Agrawal, Wei Keng Liao, Alok Choudhary

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

19 Scopus citations

Abstract

Hierarchical clustering has many advantages over traditional clustering algorithms like k-means, but it suffers from higher computational costs and a less obvious parallel structure. Thus, in order to scale this technique up to larger datasets, we present SHRINK, a novel shared-memory algorithm for single-linkage hierarchical clustering based on merging the solutions from overlapping sub-problems. In our experiments, we find that SHRINK provides a speedup of 18-20 on 36 cores on both real and synthetic datasets of up to 250,000 points. Source code for SHRINK is available for download on our website, http://cucis.ece.northwestern.edu.
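The idea of merging solutions from overlapping sub-problems can be illustrated with a small sketch. It is not the SHRINK source code. It assumes Euclidean distance, relies on the standard fact that the single-linkage merge heights are the edge weights of a minimum spanning tree (MST), and uses a round-robin partition. All function names (`mst_edges`, `kruskal`, `single_linkage_by_merging`) and the parameter `k` are hypothetical and chosen only for illustration. Each pair of partitions forms one sub-problem, and the sub-problems overlap because every partition appears in several pairs.

```python
import itertools
import math


def mst_edges(points, ids):
    """Prim's algorithm on the complete Euclidean graph over the given point ids."""
    ids = list(ids)
    if len(ids) < 2:
        return []
    root = ids[0]
    # best[v] = (distance to the growing tree, nearest tree vertex)
    best = {v: (math.dist(points[root], points[v]), root) for v in ids[1:]}
    edges = []
    while best:
        v = min(best, key=lambda i: best[i][0])
        d, u = best.pop(v)
        edges.append((d, min(u, v), max(u, v)))
        for w in best:
            dw = math.dist(points[v], points[w])
            if dw < best[w][0]:
                best[w] = (dw, v)
    return edges


def kruskal(n, edges):
    """Keep the spanning-tree edges; sorted by weight they are the merge steps."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    merges = []
    for d, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            merges.append((d, u, v))
    return merges


def single_linkage_by_merging(points, k=3):
    """Solve one sub-problem per pair of partitions, then merge the results.

    Assumption: k >= 2 partitions assigned round-robin (hypothetical choice).
    """
    if k < 2:
        raise ValueError("need at least two partitions to form pairs")
    n = len(points)
    parts = [list(range(i, n, k)) for i in range(k)]
    candidates = set()
    # Every global MST edge joins two points that share at least one
    # sub-problem, so it survives in that sub-problem's MST.
    for a, b in itertools.combinations(parts, 2):
        candidates.update(mst_edges(points, a + b))
    return kruskal(n, candidates)
```

Each call to `mst_edges` inside the loop is independent of the others, so in a shared-memory setting the pairwise sub-problems could be handed to separate threads or processes, with only the final `kruskal` merge run over the combined edge set. The returned list of `(distance, u, v)` tuples is in ascending order of merge height.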

Original language: English
Title of host publication: 2012 19th International Conference on High Performance Computing, HiPC 2012
State: Published - 2012
Event: 2012 19th International Conference on High Performance Computing, HiPC 2012 - Pune, India
Duration: 18 Dec 2012 to 21 Dec 2012

Publication series

Name: 2012 19th International Conference on High Performance Computing, HiPC 2012

Conference

Conference: 2012 19th International Conference on High Performance Computing, HiPC 2012
Country/Territory: India
City: Pune
Period: 18/12/12 to 21/12/12
