TY - GEN
T1 - One-Pass Distribution Sketch for Measuring Data Heterogeneity in Federated Learning
AU - Liu, Zichang
AU - Coleman, Benjamin
AU - Xu, Zhaozhuo
AU - Shrivastava, Anshumali
N1 - Publisher Copyright:
© 2023 Neural Information Processing Systems Foundation. All rights reserved.
PY - 2023
Y1 - 2023
AB - Federated learning (FL) is a machine learning paradigm where multiple client devices train models collaboratively without exchanging data. The data heterogeneity problem is inherent in FL, since data on different clients follow diverse distributions. To mitigate the negative influence of data heterogeneity, we need to start by measuring it across clients. However, efficiently measuring the divergence between distributions is challenging, especially in high dimensions. In this paper, we propose a one-pass distribution sketch to represent the client data distribution. Our sketching algorithm requires only a single pass over the client data, making it efficient in both time and memory. Moreover, we show in both theory and practice that the distance between two distribution sketches represents the divergence between their corresponding distributions. Furthermore, we demonstrate with extensive experiments that our distribution sketch improves client selection in FL training. We also show that our distribution sketch is an efficient solution to the cold-start problem in FL for new clients with unlabeled data.
UR - http://www.scopus.com/inward/record.url?scp=85185913234&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85185913234&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85185913234
T3 - Advances in Neural Information Processing Systems
BT - Advances in Neural Information Processing Systems 36 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
A2 - Oh, A.
A2 - Naumann, T.
A2 - Globerson, A.
A2 - Saenko, K.
A2 - Hardt, M.
A2 - Levine, S.
T2 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
Y2 - 10 December 2023 through 16 December 2023
ER -