TY - GEN
T1 - Data Privacy Examination against Semi-Supervised Learning
AU - Lou, Jiadong
AU - Yuan, Xu
AU - Pan, Miao
AU - Wang, Hao
AU - Tzeng, Nian Feng
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/7/10
Y1 - 2023/7/10
N2 - Semi-supervised learning, which trains on a small amount of labeled data together with a large volume of collected unlabeled data, has recently achieved promising performance, but it also raises a serious privacy concern: whether a user's data has been collected and used without authorization. In this paper, we propose a novel membership inference method against semi-supervised learning, serving to protect user data privacy. Because training involves both labeled and unlabeled data, existing membership inference solutions cannot capture the membership patterns of semi-supervised learning's training data well. To this end, we propose two new metrics tailored specifically to the semi-supervised learning paradigm, inter-consistency and intra-entropy, which respectively measure the similarity and the cross-entropy among prediction vectors obtained from perturbed versions of an input. By exploiting the two metrics, our method uncovers membership patterns imprinted on the prediction outputs of semi-supervised learning models, thus enabling effective membership inference. We conduct extensive experiments comparing our method with five rectified baseline inference techniques across four datasets and six semi-supervised learning algorithms. The results show that our inference method achieves over 80% accuracy under every experimental setting, substantially outperforming all baseline techniques.
KW - data privacy
KW - membership inference
KW - Semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85168130968&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85168130968&partnerID=8YFLogxK
U2 - 10.1145/3579856.3590333
DO - 10.1145/3579856.3590333
M3 - Conference contribution
AN - SCOPUS:85168130968
T3 - Proceedings of the ACM Conference on Computer and Communications Security
SP - 136
EP - 148
BT - ASIA CCS 2023 - Proceedings of the 2023 ACM Asia Conference on Computer and Communications Security
T2 - 18th ACM ASIA Conference on Computer and Communications Security, ASIA CCS 2023
Y2 - 10 July 2023 through 14 July 2023
ER -