TY - JOUR
T1 - On the distortion of locality sensitive hashing
AU - Chierichetti, Flavio
AU - Kumar, Ravi
AU - Panconesi, Alessandro
AU - Terolli, Erisa
N1 - Publisher Copyright:
© 2019 Society for Industrial and Applied Mathematics
PY - 2019
Y1 - 2019
N2 - Given a notion of pairwise similarity between objects, locality sensitive hashing (LSH) aims to construct a hash function family over the universe of objects such that the probability two objects hash to the same value is their similarity. LSH is a powerful algorithmic tool for large scale applications and much work has been done to understand LSHable similarities, i.e., similarities that admit an LSH. In this paper we focus on similarities that are provably non-LSHable and propose a notion of distortion to capture the approximation of such a similarity by an LSHable similarity. We consider several well-known non-LSHable similarities and show tight upper and lower bounds on their distortion.
AB - Given a notion of pairwise similarity between objects, locality sensitive hashing (LSH) aims to construct a hash function family over the universe of objects such that the probability two objects hash to the same value is their similarity. LSH is a powerful algorithmic tool for large scale applications and much work has been done to understand LSHable similarities, i.e., similarities that admit an LSH. In this paper we focus on similarities that are provably non-LSHable and propose a notion of distortion to capture the approximation of such a similarity by an LSHable similarity. We consider several well-known non-LSHable similarities and show tight upper and lower bounds on their distortion.
KW - Distortion
KW - Locality sensitive hashing
KW - Similarity
UR - http://www.scopus.com/inward/record.url?scp=85065465919&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85065465919&partnerID=8YFLogxK
U2 - 10.1137/17M1127752
DO - 10.1137/17M1127752
M3 - Article
AN - SCOPUS:85065465919
SN - 0097-5397
VL - 48
SP - 350
EP - 372
JO - SIAM Journal on Computing
JF - SIAM Journal on Computing
IS - 2
ER -