TY - GEN
T1 - Efficient parameter aggregation in federated learning with hybrid convergecast
AU - Tao, Yangyang
AU - Zhou, Junxiu
AU - Yu, Shucheng
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/1/9
Y1 - 2021/1/9
N2 - In federated learning, workers train local models on their private data sets and upload only local gradients to the remote aggregator. Data privacy is well preserved and parallelism is achieved. In large-scale deep learning tasks, however, frequent interactions between workers and the aggregator to transmit parameters can severely degrade system performance in terms of communication costs, the number of iterations needed, the latency of each iteration, and the accuracy of the trained model because of system 'churns' (i.e., devices frequently joining and leaving the network). Existing research leverages different network topologies to improve the performance of federated learning. In this paper, we propose a novel hybrid network topology design that integrates ring (R) and n-ary tree (T) structures to provide flexible and adaptive convergecast in federated learning. Specifically, multiple participating peers within one hop form a local ring to adapt to device dynamics (i.e., 'churns') and carry out local cooperative shuffling; an n-ary convergecast tree is formed from the local rings to the aggregator to ensure communication efficiency. Theoretical analysis shows the superiority of the proposed hybrid (R+T) convergecast design in terms of system latency compared to existing topologies. Prototype-based simulation on CloudLab shows that the hybrid (R+T) design reduces the number of iteration rounds while achieving the best model accuracy under system 'churns' compared to the state of the art.
AB - In federated learning, workers train local models on their private data sets and upload only local gradients to the remote aggregator. Data privacy is well preserved and parallelism is achieved. In large-scale deep learning tasks, however, frequent interactions between workers and the aggregator to transmit parameters can severely degrade system performance in terms of communication costs, the number of iterations needed, the latency of each iteration, and the accuracy of the trained model because of system 'churns' (i.e., devices frequently joining and leaving the network). Existing research leverages different network topologies to improve the performance of federated learning. In this paper, we propose a novel hybrid network topology design that integrates ring (R) and n-ary tree (T) structures to provide flexible and adaptive convergecast in federated learning. Specifically, multiple participating peers within one hop form a local ring to adapt to device dynamics (i.e., 'churns') and carry out local cooperative shuffling; an n-ary convergecast tree is formed from the local rings to the aggregator to ensure communication efficiency. Theoretical analysis shows the superiority of the proposed hybrid (R+T) convergecast design in terms of system latency compared to existing topologies. Prototype-based simulation on CloudLab shows that the hybrid (R+T) design reduces the number of iteration rounds while achieving the best model accuracy under system 'churns' compared to the state of the art.
KW - CloudLab
KW - Convergecast
KW - Federated Learning
KW - N-ary Tree
KW - Parallel Machine Learning
KW - Ring
UR - http://www.scopus.com/inward/record.url?scp=85102979888&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85102979888&partnerID=8YFLogxK
U2 - 10.1109/CCNC49032.2021.9369497
DO - 10.1109/CCNC49032.2021.9369497
M3 - Conference contribution
AN - SCOPUS:85102979888
T3 - 2021 IEEE 18th Annual Consumer Communications and Networking Conference, CCNC 2021
BT - 2021 IEEE 18th Annual Consumer Communications and Networking Conference, CCNC 2021
T2 - 18th IEEE Annual Consumer Communications and Networking Conference, CCNC 2021
Y2 - 9 January 2021 through 13 January 2021
ER -