TY - JOUR
T1 - Nitro
T2 - 51st International Conference on Very Large Data Bases, VLDB 2025
AU - Yu, Hanfei
AU - Carter, Jacob
AU - Wang, Hao
AU - Tiwari, Devesh
AU - Li, Jian
AU - Park, Seung Jong
N1 - Publisher Copyright:
© 2024, VLDB Endowment. All rights reserved.
PY - 2024
Y1 - 2024
AB - Deep reinforcement learning (DRL) has demonstrated significant potential in various applications, including gaming AI, robotics, and system scheduling. DRL algorithms produce, sample, and learn from training data online through a trial-and-error process, demanding considerable time and computational resources. To address this, distributed DRL algorithms and paradigms have been developed to expedite training using extensive resources. Through carefully designed experiments, we are the first to observe that strategically increasing the actor-environment interactions by spawning more concurrent actors at certain training rounds within ephemeral time frames can significantly enhance training efficiency. Yet, current distributed DRL solutions, which are predominantly server-based (or serverful), fail to capitalize on these opportunities due to their long startup times, limited adaptability, and cumbersome scalability. This paper proposes Nitro, a generic training engine for distributed DRL algorithms that enforces timely and effective boosting with concurrent actors instantaneously spawned by serverless computing. With serverless functions, Nitro adjusts data sampling strategies dynamically according to the DRL training demands. Nitro seizes the opportunity of real-time boosting by accurately and swiftly detecting an empirical metric. To achieve cost efficiency, we design a heuristic actor scaling algorithm to guide Nitro for cost-aware boosting budget allocation. We integrate Nitro with state-of-the-art DRL algorithms and frameworks and evaluate them on AWS EC2 and Lambda. Experiments with Mujoco and Atari benchmarks show that Nitro improves the final rewards (i.e., training quality) by up to 6× and reduces training costs by up to 42%.
UR - http://www.scopus.com/inward/record.url?scp=85213877461&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85213877461&partnerID=8YFLogxK
U2 - 10.14778/3696435.3696441
DO - 10.14778/3696435.3696441
M3 - Conference article
AN - SCOPUS:85213877461
VL - 18
SP - 66
EP - 79
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 1
Y2 - 1 September 2025 through 5 September 2025
ER -