Fast spectral analysis for approximate nearest neighbor search

Jing Wang, Jie Shen

Research output: Contribution to journalArticlepeer-review

2 Scopus citations

Abstract

In large-scale machine learning, of central interest is the problem of approximate nearest neighbor (ANN) search, where the goal is to query particular points that are close to a given object under certain metric. In this paper, we develop a novel data-driven ANN search algorithm where the data structure is learned by fast spectral technique based on s landmarks selected by approximate ridge leverage scores. We show that with overwhelming probability, our algorithm returns the (1 + ϵ/ 4) -ANN for any approximation parameter ϵ∈ (0 , 1). A remarkable feature of our algorithm is that it is computationally efficient. Specifically, learning k-length hash codes requires O((s3+ ns2) log n) running time and O(d2) extra space, and returning the (1 + ϵ/ 4) -ANN of the query needs O(klog n) running time. The experimental results on computer vision and natural language understanding tasks demonstrate the significant advantage of our algorithm compared to state-of-the-art methods.

Original languageEnglish
Pages (from-to)2297-2322
Number of pages26
JournalMachine Learning
Volume111
Issue number6
DOIs
StatePublished - Jun 2022

Keywords

  • Approximate nearest neighbor search
  • Hashing
  • Noise
  • Spectral analysis
  • Subspace

Fingerprint

Dive into the research topics of 'Fast spectral analysis for approximate nearest neighbor search'. Together they form a unique fingerprint.

Cite this