TY - GEN
T1 - Ultrafast Error-bounded Lossy Compression for Scientific Datasets
AU - Yu, Xiaodong
AU - Di, Sheng
AU - Zhao, Kai
AU - Tian, Jiannan
AU - Tao, Dingwen
AU - Liang, Xin
AU - Cappello, Franck
N1 - Publisher Copyright: © 2022 ACM.
PY - 2022/6/27
Y1 - 2022/6/27
N2 - Today's scientific high-performance computing applications and advanced instruments are producing vast volumes of data across a wide range of domains, which impose a serious burden on data transfer and storage. Error-bounded lossy compression has been developed and widely used in the scientific community because it not only can significantly reduce the data volumes but also can strictly control the data distortion based on the user-specified error bound. Existing lossy compressors, however, cannot offer ultrafast compression speed, which is highly demanded by numerous applications or use cases (such as in-memory compression and online instrument data compression). In this paper, we propose a novel ultrafast error-bounded lossy compressor that can obtain fairly high compression performance on both CPUs and GPUs and with reasonably high compression ratios. The key contributions are threefold. (1) We propose a generic error-bounded lossy compression framework, called SZx, that achieves ultrafast performance through its novel design comprising only lightweight operations such as bitwise and addition/subtraction operations, while still keeping a high compression ratio. (2) We implement SZx on both CPUs and GPUs and optimize the performance according to their architectures. (3) We perform a comprehensive evaluation with six real-world production-level scientific datasets on both CPUs and GPUs. Experiments show that SZx is 2∼16x faster than the second-fastest existing error-bounded lossy compressor (either SZ or ZFP) on CPUs and GPUs, with respect to both compression and decompression.
AB - Today's scientific high-performance computing applications and advanced instruments are producing vast volumes of data across a wide range of domains, which impose a serious burden on data transfer and storage. Error-bounded lossy compression has been developed and widely used in the scientific community because it not only can significantly reduce the data volumes but also can strictly control the data distortion based on the user-specified error bound. Existing lossy compressors, however, cannot offer ultrafast compression speed, which is highly demanded by numerous applications or use cases (such as in-memory compression and online instrument data compression). In this paper, we propose a novel ultrafast error-bounded lossy compressor that can obtain fairly high compression performance on both CPUs and GPUs and with reasonably high compression ratios. The key contributions are threefold. (1) We propose a generic error-bounded lossy compression framework, called SZx, that achieves ultrafast performance through its novel design comprising only lightweight operations such as bitwise and addition/subtraction operations, while still keeping a high compression ratio. (2) We implement SZx on both CPUs and GPUs and optimize the performance according to their architectures. (3) We perform a comprehensive evaluation with six real-world production-level scientific datasets on both CPUs and GPUs. Experiments show that SZx is 2∼16x faster than the second-fastest existing error-bounded lossy compressor (either SZ or ZFP) on CPUs and GPUs, with respect to both compression and decompression.
KW - error-bounded lossy compression
KW - gpu
KW - high-speed compressor
KW - scientific data
UR - http://www.scopus.com/inward/record.url?scp=85134162366&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85134162366&partnerID=8YFLogxK
U2 - 10.1145/3502181.3531473
DO - 10.1145/3502181.3531473
M3 - Conference contribution
AN - SCOPUS:85134162366
T3 - HPDC 2022 - Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing
SP - 159
EP - 171
BT - HPDC 2022 - Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing
T2 - 31st International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2022
Y2 - 27 June 2022 through 30 June 2022
ER -