TY - JOUR
T1 - Automated design of energy efficient control strategies for building clusters using reinforcement learning
AU - Odonkor, Philip
AU - Lewis, Kemper
N1 - Publisher Copyright:
Copyright © 2019 by ASME.
PY - 2019/2/1
Y1 - 2019/2/1
N2 - The control of shared energy assets within building clusters has traditionally been confined to a discrete action space, owing in part to a computationally intractable decision space. In this work, we leverage the current state of the art in reinforcement learning (RL) for continuous control tasks, the deep deterministic policy gradient (DDPG) algorithm, toward addressing this limitation. The goals of this paper are twofold: (i) to design an efficient charge/discharge dispatch policy for a shared battery system within a building cluster and (ii) to address the continuous-domain task of determining how much energy should be charged or discharged at each decision cycle. Experimentally, our results demonstrate an ability to exploit factors such as energy arbitrage, along with the continuous action space, toward demand peak minimization. This approach is shown to be computationally tractable, achieving efficient results after only 5 h of simulation. Additionally, the agent showed an ability to adapt to different building clusters, designing unique control strategies to address the energy demands of the clusters studied.
AB - The control of shared energy assets within building clusters has traditionally been confined to a discrete action space, owing in part to a computationally intractable decision space. In this work, we leverage the current state of the art in reinforcement learning (RL) for continuous control tasks, the deep deterministic policy gradient (DDPG) algorithm, toward addressing this limitation. The goals of this paper are twofold: (i) to design an efficient charge/discharge dispatch policy for a shared battery system within a building cluster and (ii) to address the continuous-domain task of determining how much energy should be charged or discharged at each decision cycle. Experimentally, our results demonstrate an ability to exploit factors such as energy arbitrage, along with the continuous action space, toward demand peak minimization. This approach is shown to be computationally tractable, achieving efficient results after only 5 h of simulation. Additionally, the agent showed an ability to adapt to different building clusters, designing unique control strategies to address the energy demands of the clusters studied.
UR - http://www.scopus.com/inward/record.url?scp=85059069042&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85059069042&partnerID=8YFLogxK
U2 - 10.1115/1.4041629
DO - 10.1115/1.4041629
M3 - Article
AN - SCOPUS:85059069042
SN - 1050-0472
VL - 141
JO - Journal of Mechanical Design
JF - Journal of Mechanical Design
IS - 2
M1 - 021704
ER -