TY - GEN
T1 - Seizing Critical Learning Periods in Federated Learning
AU - Yan, Gang
AU - Wang, Hao
AU - Li, Jian
N1 - Publisher Copyright:
Copyright © 2022, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2022/6/30
Y1 - 2022/6/30
AB - Federated learning (FL) is a popular technique to train machine learning (ML) models with decentralized data. Extensive work has studied the performance of the global model; however, it is still unclear how the training process affects the final test accuracy. Exacerbating this problem is the fact that FL executions differ significantly from traditional ML, with heterogeneous data characteristics across clients and more hyperparameters involved. In this work, we show that the final test accuracy of FL is dramatically affected by the early phase of the training process, i.e., FL exhibits critical learning periods, in which small gradient errors irrecoverably impact the final test accuracy. To further explain this phenomenon, we generalize the trace of the Fisher Information Matrix (FIM) to FL and define a new notion called FedFIM, a quantity reflecting the local curvature of each client from the beginning of training in FL. Our findings suggest that the initial learning phase plays a critical role in understanding FL performance. This is in contrast to many existing works, which generally do not connect the final accuracy of FL to early-phase training. Finally, seizing critical learning periods in FL is of independent interest and could be useful for other problems, such as the choice of hyperparameters, including but not limited to the number of clients selected per round and the batch size, so as to improve the performance of FL training and testing.
UR - http://www.scopus.com/inward/record.url?scp=85147674016&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85147674016&partnerID=8YFLogxK
U2 - 10.1609/aaai.v36i8.20859
DO - 10.1609/aaai.v36i8.20859
M3 - Conference contribution
AN - SCOPUS:85147674016
T3 - Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022
SP - 8788
EP - 8796
BT - AAAI-22 Technical Tracks 8
T2 - 36th AAAI Conference on Artificial Intelligence, AAAI 2022
Y2 - 22 February 2022 through 1 March 2022
ER -