TY - GEN
T1 - Towards Scalable Spectral Embedding and Data Visualization via Spectral Coarsening
AU - Zhao, Zhiqiang
AU - Zhang, Ying
AU - Feng, Zhuo
N1 - Publisher Copyright:
© 2021 ACM.
PY - 2021/8/3
Y1 - 2021/8/3
N2 - This paper proposes a scalable multilevel framework for the spectral embedding of large undirected graphs. The proposed method first computes much smaller yet sparse graphs while preserving the key spectral (structural) properties of the original graph, by exploiting a nearly-linear time spectral graph coarsening approach. Then, the resultant spectrally-coarsened graphs are leveraged for the development of much faster algorithms for multilevel spectral graph embedding (clustering) as well as visualization of large data sets. We conducted extensive experiments using a variety of large graphs and datasets and obtained very promising results. For instance, we are able to coarsen the "coPapersCiteseer"graph with 0.43 million nodes and 16 million edges into a much smaller graph with only 13K (32X fewer) nodes and 17K (950X fewer) edges in about 16 seconds; the spectrally-coarsened graphs allow us to achieve up to 1,100X speedup for multilevel spectral graph embedding (clustering) and up to 60X speedup for t-SNE visualization of large data sets.
AB - This paper proposes a scalable multilevel framework for the spectral embedding of large undirected graphs. The proposed method first computes much smaller yet sparse graphs while preserving the key spectral (structural) properties of the original graph, by exploiting a nearly-linear time spectral graph coarsening approach. Then, the resultant spectrally-coarsened graphs are leveraged for the development of much faster algorithms for multilevel spectral graph embedding (clustering) as well as visualization of large data sets. We conducted extensive experiments using a variety of large graphs and datasets and obtained very promising results. For instance, we are able to coarsen the "coPapersCiteseer"graph with 0.43 million nodes and 16 million edges into a much smaller graph with only 13K (32X fewer) nodes and 17K (950X fewer) edges in about 16 seconds; the spectrally-coarsened graphs allow us to achieve up to 1,100X speedup for multilevel spectral graph embedding (clustering) and up to 60X speedup for t-SNE visualization of large data sets.
KW - data visualization
KW - graph clustering
KW - spectral embedding
KW - spectral graph theory
UR - http://www.scopus.com/inward/record.url?scp=85103054143&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85103054143&partnerID=8YFLogxK
U2 - 10.1145/3437963.3441767
DO - 10.1145/3437963.3441767
M3 - Conference contribution
AN - SCOPUS:85103054143
T3 - WSDM 2021 - Proceedings of the 14th ACM International Conference on Web Search and Data Mining
SP - 869
EP - 877
BT - WSDM 2021 - Proceedings of the 14th ACM International Conference on Web Search and Data Mining
T2 - 14th ACM International Conference on Web Search and Data Mining, WSDM 2021
Y2 - 8 March 2021 through 12 March 2021
ER -