TY - JOUR
T1 - An Efficient On-Device Federated Learning System Through the Interplay of Client Selection and Batch Size With Watermarked Data
AU - Ling, Tao
AU - Shi, Siping
AU - Wang, Hao
AU - Hu, Chuang
AU - Wang, Dan
N1 - Publisher Copyright: © 2002-2012 IEEE.
PY - 2025
Y1 - 2025
AB - Federated Learning (FL) enables edge devices to collaboratively train a global model using local data. However, the increasing prevalence of watermarks in datasets presents a new challenge to efficient FL. While watermarks assert data ownership and copyright, they introduce complexities that can lead to shortcut learning problems and mislead utility measurements for client selection. These issues are further exacerbated by batch size variations in efficient FL frameworks, ultimately undermining their time-to-accuracy performance. We introduce LotusFL, an FL system designed to address the challenges posed by watermarked datasets in efficient FL. Specifically, it tackles the increased time-to-accuracy due to erroneous client selection and the accuracy degradation observed with larger batch sizes. LotusFL first estimates the characteristics of watermarks through statistical estimation and then adjusts the batch size using this estimated watermark information to balance the negative impact of the watermark against device idle waiting time. Additionally, its client selection mechanism, based on historical information, avoids the misleading utility signals from watermarks. This mechanism, working in conjunction with batch size adjustment, aims to accurately predict device runtime and identify potentially valuable devices. We evaluated LotusFL through a real-world deployment on 40 edge devices. Compared to state-of-the-art efficient FL frameworks, LotusFL achieves superior performance, enhancing accuracy by up to 8.2% and reducing training time by 1.97×.
KW - Federated learning
KW - batch size
KW - client selection
KW - data and system heterogeneity
KW - machine learning systems
KW - watermark
UR - https://www.scopus.com/pages/publications/105010057912
UR - https://www.scopus.com/pages/publications/105010057912#tab=citedBy
U2 - 10.1109/TMC.2025.3585033
DO - 10.1109/TMC.2025.3585033
M3 - Article
AN - SCOPUS:105010057912
SN - 1536-1233
VL - 24
SP - 11480
EP - 11493
JO - IEEE Transactions on Mobile Computing
JF - IEEE Transactions on Mobile Computing
IS - 11
ER -