Hierarchical Q-learning network for online simultaneous optimization of energy efficiency and battery life of the battery/ultracapacitor electric vehicle

Bin Xu, Quan Zhou, Junzhe Shi, Sixu Li

Research output: Contribution to journal › Article › peer-review


Abstract

Reinforcement learning has been gaining attention in the energy management of hybrid power systems for its low computation cost and strong energy-saving performance. However, the potential of reinforcement learning (RL) has not been fully explored in electric vehicle (EV) applications because most RL studies have focused on only a single design target. This paper studies the online optimization of the supervisory control system of an EV (powered by a battery and an ultracapacitor) with two design targets: maximizing energy efficiency and battery life. Based on Q-learning, a widely used reinforcement learning method, a hierarchical learning network is proposed. Within the hierarchical Q-learning network, two independent Q tables, Q1 and Q2, are allocated to two control layers. In addition to the baseline power-split layer, which determines the power-split ratio between the battery and the ultracapacitor based on the knowledge stored in Q1, an upper layer is developed to trigger the engagement of the ultracapacitor based on Q2. In the learning process, Q1 and Q2 are updated during real driving using measured signals of states, actions, and rewards. The hierarchical Q-learning network is developed and evaluated on a full propulsion system model. With a single-layer Q-learning method and a rule-based method introduced as two baselines, the performance of the EV under the three control methods (two baselines and the proposed one) is simulated over different driving cycles. The results show that adding an ultracapacitor to the electric vehicle reduces battery capacity loss by 12%. The proposed hierarchical Q-learning network outperforms the two baseline methods, reducing battery capacity loss by a further 8%. The vehicle range is slightly extended along with the battery life extension. Moreover, the proposed strategy is validated under different driving cycles and measurement noise. The proposed hierarchical strategy can be adapted and applied to reinforcement-learning-based energy management in other hybrid power systems.
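
As a rough illustration of the two-layer structure described in the abstract, the sketch below maintains two independent Q tables: an upper layer (Q2) that decides whether to engage the ultracapacitor, and a power-split layer (Q1) that chooses the battery/ultracapacitor split ratio when it is engaged, with both tables updated online from measured states, actions, and rewards. The state and action discretizations, the reward terms, the hyperparameters, and the measure_step plant interface are illustrative assumptions, not the paper's actual design.

import numpy as np

N_STATES = 50          # assumed discretised state index (e.g. binned power demand x SOC)
N_SPLIT_ACTIONS = 11   # lower layer (Q1): split ratio in {0.0, 0.1, ..., 1.0}
N_ENGAGE_ACTIONS = 2   # upper layer (Q2): 0 = battery only, 1 = engage ultracapacitor

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1   # assumed learning rate, discount, exploration rate

Q1 = np.zeros((N_STATES, N_SPLIT_ACTIONS))   # power-split knowledge
Q2 = np.zeros((N_STATES, N_ENGAGE_ACTIONS))  # ultracapacitor-engagement knowledge

def epsilon_greedy(q_row, rng):
    # Explore with probability EPSILON, otherwise exploit the best known action.
    if rng.random() < EPSILON:
        return int(rng.integers(len(q_row)))
    return int(np.argmax(q_row))

def td_update(Q, s, a, r, s_next):
    # Standard one-step Q-learning (temporal-difference) update.
    Q[s, a] += ALPHA * (r + GAMMA * Q[s_next].max() - Q[s, a])

def control_step(s, measure_step, rng):
    # One supervisory-control step: Q2 decides engagement, Q1 the split ratio.
    # measure_step(engage, split_ratio) is a hypothetical plant interface that
    # returns the next state plus measured efficiency and battery-life rewards.
    engage = epsilon_greedy(Q2[s], rng)
    a1, split_ratio = 0, 0.0
    if engage:
        a1 = epsilon_greedy(Q1[s], rng)
        split_ratio = a1 / (N_SPLIT_ACTIONS - 1)   # share of demand sent to the ultracapacitor
    s_next, r_eff, r_life = measure_step(engage, split_ratio)
    reward = r_eff + r_life                        # assumed combined reward
    td_update(Q2, s, engage, reward, s_next)       # upper layer learns at every step
    if engage:
        td_update(Q1, s, a1, reward, s_next)       # split layer learns only when engaged
    return s_next

In use, control_step would be called at each supervisory sampling instant with the current discretised state and a random generator such as np.random.default_rng(); how the combined reward weights energy efficiency against battery degradation is left open here, since the abstract does not specify it.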
Original language: English
Article number: 103925
Journal: Journal of Energy Storage
Volume: 46
Early online date: 5 Jan 2022
DOIs
Publication status: Published - Feb 2022

Keywords

  • Battery
  • Electric vehicle
  • Energy management
  • Q-learning
  • Reinforcement learning
  • Ultracapacitor
