HEAT: A Highly Efficient and Affordable Training System for Collaborative Filtering Based Recommendation on CPUs

Chengming Zhang, Shaden Smith, Baixi Sun, Jiannan Tian, Jonathan Soifer, Xiaodong Yu, Shuaiwen Leon Song, Yuxiong He, Dingwen Tao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Collaborative filtering (CF) has been proven to be one of the most effective techniques for recommendation. Among all CF approaches, SimpleX is the state-of-the-art method that adopts a novel loss function and a proper number of negative samples. However, no prior work optimizes SimpleX on multi-core CPUs, which leads to limited performance. To this end, we perform in-depth profiling and analysis of existing SimpleX implementations and identify their performance bottlenecks, including (1) irregular memory accesses, (2) unnecessary memory copies, and (3) redundant computations. To address these issues, we propose an efficient CF training system (called HEAT) that fully exploits the multi-level caching and multi-threading capabilities of modern CPUs. Specifically, the optimization of HEAT is threefold: (1) it tiles the embedding matrix to increase data locality and reduce cache misses, thereby reducing read latency; (2) it optimizes stochastic gradient descent (SGD) with sampling by parallelizing vector products instead of matrix-matrix multiplications, in particular the similarity computation therein, to avoid memory copies for matrix data preparation; and (3) it aggressively reuses intermediate results from the forward phase in the backward phase to alleviate redundant computation. Evaluation on five widely used datasets with both x86- and ARM-architecture processors shows that HEAT achieves up to a 45.2× speedup over the existing CPU solution, and a 4.5× speedup and 7.9× cost reduction in the cloud over the existing GPU solution running on an NVIDIA V100 GPU.
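
As a rough illustration of optimization (3), the sketch below caches the vector norms and the similarity value computed in the forward phase and reuses them when forming the gradient in the backward phase. It assumes a cosine similarity between one user embedding and one item embedding; the function names `cosine_forward` and `cosine_backward_u` are hypothetical and are not taken from HEAT or any SimpleX implementation.

```python
import math


def cosine_forward(u, v):
    """Forward phase: return cos(u, v) and the norms needed by the backward phase."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos = dot / (norm_u * norm_v)
    # Cache the norms so the backward phase does not recompute them.
    return cos, (norm_u, norm_v)


def cosine_backward_u(u, v, cos, cache):
    """Backward phase: gradient of cos(u, v) with respect to u.

    Uses d cos / d u = v / (|u||v|) - cos * u / |u|^2, with |u|, |v| and cos
    taken from the forward phase instead of being computed again.
    """
    norm_u, norm_v = cache
    inv = 1.0 / (norm_u * norm_v)
    scale = cos / (norm_u * norm_u)
    return [b * inv - a * scale for a, b in zip(u, v)]


if __name__ == "__main__":
    u, v = [3.0, 4.0], [4.0, 3.0]
    cos, cache = cosine_forward(u, v)
    grad_u = cosine_backward_u(u, v, cos, cache)
    print(cos, grad_u)
```

Without the cache, the backward phase would repeat the two norm computations and the dot product for every sampled pair; this is only a scalar-loop sketch of the reuse idea, not a reproduction of the tiling or multi-threading described in the abstract.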

Original language: English
Title of host publication: ACM ICS 2023 - Proceedings of the International Conference on Supercomputing
Pages: 324-335
Number of pages: 12
ISBN (Electronic): 9798400700569
State: Published - 21 Jun 2023
Event: 37th ACM International Conference on Supercomputing, ICS 2023 - Orlando, United States
Duration: 21 Jun 2023 → 23 Jun 2023

Publication series

Name: Proceedings of the International Conference on Supercomputing

Conference

Conference: 37th ACM International Conference on Supercomputing, ICS 2023
Country/Territory: United States
City: Orlando
Period: 21/06/23 → 23/06/23

Keywords

  • multi-core processor
  • performance
  • recommender system
