TY - JOUR
T1 - Communication-Efficient Federated Learning
T2 - A Variance-Reduced Stochastic Approach With Adaptive Sparsification
AU - Wang, Bin
AU - Fang, Jun
AU - Li, Hongbin
AU - Zeng, Bing
N1 - Publisher Copyright:
© 1991-2012 IEEE.
PY - 2023
Y1 - 2023
N2 - Federated learning (FL) is an emerging distributed machine learning paradigm that aims to realize model training without gathering data from the data sources at a central processing unit. A traditional FL framework consists of a central server and a number of computing devices (also known as workers). Training a model under the FL framework usually consumes a massive amount of communication resources because the server and the devices must frequently communicate with each other. To alleviate this communication burden, in this paper we propose to adaptively sparsify the gradient vector transmitted by each device to the server, thus significantly reducing the amount of information that needs to be sent to the central server. The proposed algorithm builds on sparsified SAGA, a well-known variance-reduced stochastic algorithm. After the gradient vector is sparsified using conventional sparsification operators, an adaptive sparsification step is added to identify the most informative elements in the sparsified gradient vector. Convergence analysis indicates that the proposed algorithm enjoys a linear convergence rate. Numerical results show that the adaptive sparsification mechanism can substantially improve communication efficiency. Specifically, to achieve the same classification accuracy, the proposed method reduces the communication overhead by at least 60% compared with existing state-of-the-art sparsification-based methods.
AB - Federated learning (FL) is an emerging distributed machine learning paradigm that aims to realize model training without gathering data from the data sources at a central processing unit. A traditional FL framework consists of a central server and a number of computing devices (also known as workers). Training a model under the FL framework usually consumes a massive amount of communication resources because the server and the devices must frequently communicate with each other. To alleviate this communication burden, in this paper we propose to adaptively sparsify the gradient vector transmitted by each device to the server, thus significantly reducing the amount of information that needs to be sent to the central server. The proposed algorithm builds on sparsified SAGA, a well-known variance-reduced stochastic algorithm. After the gradient vector is sparsified using conventional sparsification operators, an adaptive sparsification step is added to identify the most informative elements in the sparsified gradient vector. Convergence analysis indicates that the proposed algorithm enjoys a linear convergence rate. Numerical results show that the adaptive sparsification mechanism can substantially improve communication efficiency. Specifically, to achieve the same classification accuracy, the proposed method reduces the communication overhead by at least 60% compared with existing state-of-the-art sparsification-based methods.
KW - Federated learning
KW - adaptive sparsification
KW - variance-reduction
UR - http://www.scopus.com/inward/record.url?scp=85174943334&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85174943334&partnerID=8YFLogxK
U2 - 10.1109/TSP.2023.3316588
DO - 10.1109/TSP.2023.3316588
M3 - Article
AN - SCOPUS:85174943334
SN - 1053-587X
VL - 71
SP - 3562
EP - 3576
JO - IEEE Transactions on Signal Processing
JF - IEEE Transactions on Signal Processing
ER -