TY - GEN
T1 - A Two-Timescale Reinforcement Learning Approach for Control Co-Design Problems
AU - Sadat, Eddieb
AU - Saremi, Mostaan Lotfalian
AU - Bayrak, Alparslan Emrah
N1 - Publisher Copyright: © 2023 American Society of Mechanical Engineers (ASME). All rights reserved.
PY - 2023
Y1 - 2023
N2 - Designing smart (or active) systems that perform automated tasks intelligently by interacting with their environments requires solving the physical and control system design problems together. In this paper, we present a model-free on-policy reinforcement learning approach to solve control co-design problems for such smart systems. The approach uses a discrete two-timescale reinforcement learning scheme that addresses the control system design in an inner loop on a fast timescale and the physical system design in an outer loop on a slower timescale. Both design problems use the same temporal difference-based Q-learning formulation. We apply this two-timescale reinforcement learning approach to the online video game EcoRacer, where the physical system design involves choosing a gear ratio for an electric vehicle and the control system design involves acceleration and braking decisions over time to finish a track with minimum energy consumption within a limited time. The results show that the proposed approach can find the system-optimal solution for the EcoRacer case study within a reasonable computation time without requiring any knowledge of the physics governing the system. The proposed method is generalizable and has the potential to take advantage of ongoing developments in the field of reinforcement learning.
KW - Control co-design
KW - model-free learning
KW - reinforcement learning
KW - video games
UR - http://www.scopus.com/inward/record.url?scp=85178563455&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85178563455&partnerID=8YFLogxK
U2 - 10.1115/DETC2023-116567
DO - 10.1115/DETC2023-116567
M3 - Conference contribution
AN - SCOPUS:85178563455
T3 - Proceedings of the ASME Design Engineering Technical Conference
BT - 49th Design Automation Conference (DAC)
T2 - ASME 2023 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, IDETC-CIE 2023
Y2 - 20 August 2023 through 23 August 2023
ER -