CereSZ: Enabling and Scaling Error-bounded Lossy Compression on Cerebras CS-2

Shihui Song, Yafan Huang, Peng Jiang, Xiaodong Yu, Weijian Zheng, Sheng Di, Qinglei Cao, Yunhe Feng, Zhen Xie, Franck Cappello

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

3 Scopus citations

Abstract

Today's scientific applications running on supercomputers produce large volumes of data, leading to critical data storage and communication challenges. To address these challenges, error-bounded lossy compression is commonly adopted, since it can drastically reduce data size while keeping the error within a user-defined threshold. Previous work has shown that compression techniques can significantly reduce storage and I/O overhead while retaining good data quality. However, existing compressors are mainly designed for CPUs and GPUs. As new AI chips are incorporated into supercomputers and increasingly used to accelerate scientific computing, there is a growing demand for efficient data compression on these new architectures. In this paper, we propose an efficient lossy compressor, CereSZ, based on the Cerebras CS-2 system. The compression algorithm is mapped onto Cerebras using both data parallelism and pipeline parallelism. To achieve a balanced workload on each processing unit, we propose an algorithm that evenly distributes the pipeline stages. Our experiments with six scientific datasets demonstrate that CereSZ achieves a throughput from 227.93 GB/s to 773.8 GB/s, 2.43x to 10.98x faster than existing GPU compressors.
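
To make the phrase "user-defined error threshold" concrete, the following is a minimal sketch of uniform scalar quantization, a common building block in error-bounded lossy compressors. It is an illustration only and is not taken from the paper: the function names, the bin width of twice the error bound, and the synthetic sine data are all assumptions, and the abstract does not say which quantization scheme CereSZ uses.

```python
import math

def quantize(values, error_bound):
    """Map each float to an integer bin index (hypothetical helper).

    Bins are 2 * error_bound wide, so the bin center is never farther
    than error_bound from any value that falls in the bin.
    """
    step = 2.0 * error_bound
    return [round(v / step) for v in values]

def dequantize(codes, error_bound):
    """Reconstruct each value as the center of its bin."""
    step = 2.0 * error_bound
    return [c * step for c in codes]

def max_abs_error(original, reconstructed):
    """Largest pointwise absolute difference between two sequences."""
    return max(abs(a - b) for a, b in zip(original, reconstructed))

# Synthetic smooth signal standing in for simulation output (assumption).
data = [math.sin(0.01 * i) for i in range(1000)]
eb = 1e-3  # user-defined absolute error bound (assumption)

codes = quantize(data, eb)
restored = dequantize(codes, eb)
```

Here max_abs_error(data, restored) stays within eb up to floating-point rounding. The integer codes repeat far more often than the raw floats do, which is what a later lossless encoding stage would typically exploit to shrink the data.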

Original language: English
Title of host publication: HPDC 2024 - Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing
Pages: 309-321
Number of pages: 13
ISBN (Electronic): 9798400704130
State: Published - 3 Jun 2024
Event: 33rd International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2024 - Pisa, Italy
Duration: 3 Jun 2024 → 7 Jun 2024

Publication series

Name: HPDC 2024 - Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing

Conference

Conference: 33rd International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2024
Country/Territory: Italy
City: Pisa
Period: 3/06/24 → 7/06/24

Keywords

  • AI-optimized architecture
  • error-bounded lossy compression
  • high-speed compressor
  • parallel computing
  • scientific simulation
