TY - GEN
T1 - Towards publishing recommendation data with predictive anonymization
AU - Chang, Chih Cheng
AU - Thompson, Brian
AU - Wang, Hui
AU - Yao, Danfeng
PY - 2010
Y1 - 2010
AB - Recommender systems are used to predict user preferences for products or services. To seek better prediction techniques, data owners of recommender systems such as Netflix sometimes make their customers' reviews available to the public, which raises serious privacy concerns. With only a small amount of knowledge about individuals and their ratings of some items in a recommender system, an adversary may easily identify the users and breach their privacy. Unfortunately, most of the existing privacy models (e.g., k-anonymity) cannot be directly applied to recommender systems. In this paper, we study the problem of privacy-preserving publishing of recommendation datasets. We represent recommendation data as a bipartite graph, and identify several attacks that can re-identify users and determine their item ratings. To deal with these attacks, we first give formal privacy definitions for recommendation data, and then develop a robust and efficient anonymization algorithm, Predictive Anonymization, to achieve our privacy goals. Our experimental results show that Predictive Anonymization can prevent the attacks with very little impact on prediction accuracy.
KW - anonymization
KW - clustering
KW - prediction
KW - privacy
KW - sparsity
UR - http://www.scopus.com/inward/record.url?scp=77954475442&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77954475442&partnerID=8YFLogxK
U2 - 10.1145/1755688.1755693
DO - 10.1145/1755688.1755693
M3 - Conference contribution
AN - SCOPUS:77954475442
SN - 9781605589367
T3 - Proceedings of the 5th International Symposium on Information, Computer and Communications Security, ASIACCS 2010
SP - 24
EP - 35
BT - Proceedings of the 5th International Symposium on Information, Computer and Communications Security, ASIACCS 2010
T2 - 5th ACM Symposium on Information, Computer and Communications Security, ASIACCS 2010
Y2 - 13 April 2010 through 16 April 2010
ER -