A Portable, Fast, DCT-based Compressor for AI Accelerators

Milan Shah, Xiaodong Yu, Sheng Di, Michela Becchi, Franck Cappello

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Lossy compression can be an effective tool in AI training and inference to reduce memory requirements, storage footprint, and in some cases, execution time. With the rise of novel architectures designed to accelerate AI workloads, compression can continue to serve these purposes, but it must be adapted to the new accelerators. Due to programmability and architectural differences, existing lossy compressors cannot be directly ported to these AI accelerators, nor are they optimized for them, thus requiring new compression designs. In this paper, we propose a novel, portable, DCT-based lossy compressor that can be used across a variety of AI accelerators. More specifically, we make the following contributions: 1) We propose a DCT-based lossy compressor design for training data that uses operators supported across four state-of-the-art AI accelerators: Cerebras CS-2, SambaNova SN30, Groq GroqChip, and Graphcore IPU. 2) We design two optimization techniques that allow for higher-resolution compressed data on certain platforms and an improved compression ratio on the IPU. 3) We evaluate our compressor's ability to preserve accuracy on four benchmarks, three of which are AI-for-science benchmarks going beyond image classification. Our experiments show that accuracy degradation can be limited to 3% or less, and in some cases compression even improves accuracy. 4) We study compression/decompression time as a function of resolution and batch size, finding that our compressor can achieve throughputs on the scale of tens of GB/s, depending on the platform.
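The core idea of DCT-based lossy compression, as the abstract describes it, is to transform data into the frequency domain and discard high-frequency coefficients that contribute little to reconstruction quality. The following is a minimal illustrative sketch of that general technique in pure Python, not the paper's actual implementation; the function names (`compress`, `decompress`) and the 1-D truncation scheme are assumptions for illustration only.

```python
import math

def dct2(x):
    """Forward DCT-II of a list of floats (unnormalized)."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi / N * (n + 0.5) * k) for n in range(N))
            for k in range(N)]

def idct2(X):
    """Inverse of dct2 above (DCT-III with 1/N scaling)."""
    N = len(X)
    return [(X[0] + 2.0 * sum(X[k] * math.cos(math.pi / N * (n + 0.5) * k)
                              for k in range(1, N))) / N
            for n in range(N)]

def compress(x, keep):
    # Lossy step: keep only the `keep` lowest-frequency coefficients.
    return dct2(x)[:keep]

def decompress(coeffs, n):
    # Zero-fill the discarded high-frequency coefficients, then invert.
    X = list(coeffs) + [0.0] * (n - len(coeffs))
    return idct2(X)
```

For smooth inputs, most of the signal energy concentrates in the low-frequency coefficients, so truncation yields a high compression ratio at small reconstruction error; the paper's contribution is realizing this kind of transform pipeline using only the operators each accelerator supports.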

Original language: English
Title of host publication: HPDC 2024 - Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing
Pages: 109-121
Number of pages: 13
ISBN (Electronic): 9798400704130
DOIs
State: Published - 3 Jun 2024
Event: 33rd International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2024 - Pisa, Italy
Duration: 3 Jun 2024 - 7 Jun 2024

Publication series

Name: HPDC 2024 - Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing

Conference

Conference: 33rd International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2024
Country/Territory: Italy
City: Pisa
Period: 3/06/24 - 7/06/24

Keywords

  • AI accelerator
  • compression
  • ML training
