TY - JOUR
T1 - An optimizing and differentially private clustering algorithm for mixed data in SDN-based smart grid
AU - Lv, Zefang
AU - Wang, Lirong
AU - Guan, Zhitao
AU - Wu, Jun
AU - Du, Xiaojiang
AU - Zhao, Hongtao
AU - Guizani, Mohsen
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2019
Y1 - 2019
N2 - Software-defined networking (SDN) is widely used in the smart grid for monitoring and managing the communication network. Big data analytics for the SDN-based smart grid has attracted increasing attention. Using machine learning technologies to analyze the large amount of data generated in the SDN-based smart grid is a promising approach. However, the disclosure of personal privacy information must receive considerable attention. For instance, data clustering in user electricity behavior analysis may lead to the disclosure of personal privacy information. In this paper, an optimizing and differentially private clustering algorithm named ODPCA is proposed. In the ODPCA, the differentially private K-means algorithm and K-modes algorithm are combined to cluster mixed data in a privacy-preserving manner. The allocation of privacy budgets is optimized to improve the accuracy of clustering results. Specifically, the loss function that considers both the numerical and categorical attributes between true centroids and noisy centroids is analyzed to optimize the allocation of the privacy budget; the number of clustering iterations is set to a fixed value based on the total privacy budget and the minimal privacy budget allocated to each iteration. It is proved that the ODPCA meets the differential privacy requirements, and a comparison with other popular algorithms shows that it performs better.
AB - Software-defined networking (SDN) is widely used in the smart grid for monitoring and managing the communication network. Big data analytics for the SDN-based smart grid has attracted increasing attention. Using machine learning technologies to analyze the large amount of data generated in the SDN-based smart grid is a promising approach. However, the disclosure of personal privacy information must receive considerable attention. For instance, data clustering in user electricity behavior analysis may lead to the disclosure of personal privacy information. In this paper, an optimizing and differentially private clustering algorithm named ODPCA is proposed. In the ODPCA, the differentially private K-means algorithm and K-modes algorithm are combined to cluster mixed data in a privacy-preserving manner. The allocation of privacy budgets is optimized to improve the accuracy of clustering results. Specifically, the loss function that considers both the numerical and categorical attributes between true centroids and noisy centroids is analyzed to optimize the allocation of the privacy budget; the number of clustering iterations is set to a fixed value based on the total privacy budget and the minimal privacy budget allocated to each iteration. It is proved that the ODPCA meets the differential privacy requirements, and a comparison with other popular algorithms shows that it performs better.
KW - Differential privacy
KW - SDN-based smart grid
KW - big data
KW - clustering
KW - machine learning
UR - http://www.scopus.com/inward/record.url?scp=85064544712&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85064544712&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2019.2909048
DO - 10.1109/ACCESS.2019.2909048
M3 - Article
AN - SCOPUS:85064544712
VL - 7
SP - 45773
EP - 45782
JO - IEEE Access
JF - IEEE Access
M1 - 8681031
ER -