TY - GEN
T1 - Topology-aware optimizations for multi-GPU ptychographic image reconstruction
AU - Yu, Xiaodong
AU - Biçer, Tekin
AU - Kettimuthu, Rajkumar
AU - Foster, Ian T.
N1 - Publisher Copyright: © 2021 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2021/6/3
Y1 - 2021/6/3
N2 - Ptychography is an advanced high-resolution X-ray imaging technique that can generate extremely large datasets. Ptychographic reconstruction transforms reciprocal-space experimental data into high-resolution 2D real-space images. GPUs have been used extensively to meet the computational requirements of the reconstruction. Generic multi-GPU reconstruction solutions use common communication topologies, such as P2P graph and ring, that are provided by MPI and NCCL libraries, to establish inter-GPU communications. However, these common topologies assume homogeneous physical links between GPUs, resulting in sub-optimal performance on heterogeneous configurations that are composed of both high-speed (e.g., NVLink) and low-speed (e.g., PCIe) interconnects. This mismatch between application-level communication topology and physical interconnection can cause data transfer congestion, inefficient memory access, and under-utilization of network resources. Here we present topology-aware designs and optimizations to address this mismatch and boost end-to-end application performance. We introduce topology-aware data splitting, propose a novel communication topology, and incorporate asynchronous data movement and computation. We evaluate our design and optimizations using real and artificial datasets and compare their performance with that of the direct P2P and NCCL-based approaches. The results show that our optimizations always outperform their counterparts and achieve up to 5.13× and 1.63× communication and end-to-end application speedups, respectively.
KW - GPU
KW - Heterogeneous inter-GPU connections
KW - Image reconstruction
KW - NVLink
KW - Neighborhood communication
KW - Ptychography
UR - http://www.scopus.com/inward/record.url?scp=85107486724&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85107486724&partnerID=8YFLogxK
U2 - 10.1145/3447818.3460380
DO - 10.1145/3447818.3460380
M3 - Conference contribution
AN - SCOPUS:85107486724
T3 - Proceedings of the International Conference on Supercomputing
SP - 354
EP - 366
BT - ICS 2021 - Proceedings of the 2021 ACM International Conference on Supercomputing
T2 - 35th ACM International Conference on Supercomputing, ICS 2021
Y2 - 14 June 2021 through 17 June 2021
ER -