
Hierarchical Reinforcement Learning-Based Routing Algorithm With Grouped RSU in Urban VANETs

Bibliographic Details
Published in:	IEEE Transactions on Intelligent Transportation Systems, 2024-08, Vol. 25 (8), pp. 10131-10146
Main Authors:	Yang, Qin; Yoo, Sang-Jo
Format:	Article
Language:	English
Subjects:	Data routing; distributed learning; Heuristic algorithms; intelligent transportation systems (ITS); Measurement; Q-learning; reinforcement learning (RL); Roads; roadside unit (RSU); Routing; Vehicle-to-everything; vehicular ad hoc networks (VANETs)

Description
Summary:	The rapid growth of the Internet of Vehicles (IoV) has generated significant interest in routing techniques for vehicular ad hoc networks (VANETs) in both academic and industrial communities. To address the complexity of urban environments and dynamic vehicle mobility, we propose a hierarchical Q-learning-based routing algorithm with grouped roadside units (RSUs) for VANETs. RSUs are grouped, and a Q-vector containing group information is exchanged through vehicle-to-everything (V2X) communications. Q-vector-based road-segment (QVRS) control messages are broadcast periodically to refresh the V2X evaluation metric, which considers vehicle positions, velocities, directions, and communication conditions. To adapt to the nonstationary vehicular environment, a multi-agent reinforcement learning (RL) algorithm runs on the RSU at each intersection to achieve distributed learning and local decisions. On each RSU, the hierarchical Q-learning algorithm trains a group Q-table and a local Q-table separately for reaching destinations. Data routing decisions are made from these two Q-tables, with the integrated V2X metric serving as the reward function. Simulation results demonstrate that the proposed method reduces broadcasting overhead, prolongs path lifetime, and maintains a high packet delivery ratio and a low average end-to-end delay. The group design also accelerates the learning process, which facilitates more efficient communication in VANETs.
ISSN:	1524-9050, 1558-0016
DOI:	10.1109/TITS.2024.3353258
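
Illustrative sketch:	The summary describes each RSU keeping two Q-tables, one for destinations inside its own group and one for other groups, both trained with a reward derived from the V2X metric. The Python below is a minimal sketch of that two-level idea only, not the authors' implementation. The class name `RsuAgent`, its methods, the table layout, the learning rate `alpha`, the discount factor `gamma`, and the scalar `reward` values are all assumptions made for illustration; the record does not give the actual V2X metric formula or the Q-vector message format, so a plain number stands in for the reward.

```python
from collections import defaultdict


class RsuAgent:
    """Hypothetical RSU agent holding a local and a group Q-table."""

    def __init__(self, rsu_id, group_id, neighbors, alpha=0.5, gamma=0.9):
        self.rsu_id = rsu_id
        self.group_id = group_id
        self.neighbors = list(neighbors)  # adjacent RSUs (candidate next hops)
        self.alpha = alpha                # learning rate (assumed value)
        self.gamma = gamma                # discount factor (assumed value)
        # local_q[dest_rsu][next_hop] and group_q[dest_group][next_hop]
        self.local_q = defaultdict(lambda: defaultdict(float))
        self.group_q = defaultdict(lambda: defaultdict(float))

    def _table_and_key(self, dest_rsu, dest_group):
        # Same group: route by destination RSU. Other group: route by group.
        if dest_group == self.group_id:
            return self.local_q, dest_rsu
        return self.group_q, dest_group

    def best_value(self, dest_rsu, dest_group):
        """Largest Q-value over next hops; a neighbor would report this."""
        table, key = self._table_and_key(dest_rsu, dest_group)
        return max((table[key][n] for n in self.neighbors), default=0.0)

    def choose_next_hop(self, dest_rsu, dest_group):
        """Greedy choice; ties resolve to the first neighbor listed."""
        table, key = self._table_and_key(dest_rsu, dest_group)
        return max(self.neighbors, key=lambda n: table[key][n])

    def update(self, dest_rsu, dest_group, next_hop, reward, next_best):
        """Standard Q-learning update on whichever table applies."""
        table, key = self._table_and_key(dest_rsu, dest_group)
        old = table[key][next_hop]
        target = reward + self.gamma * next_best
        table[key][next_hop] = old + self.alpha * (target - old)


# RSU "A" in group 1 with two neighboring RSUs.
agent = RsuAgent("A", group_id=1, neighbors=["B", "C"])

# Destination "D" is in the same group, so the local Q-table is updated.
agent.update("D", 1, next_hop="B", reward=1.0, next_best=0.0)

# Destination "Z" is in group 2, so the group Q-table is updated.
agent.update("Z", 2, next_hop="C", reward=0.8, next_best=0.5)

local_choice = agent.choose_next_hop("D", 1)
group_choice = agent.choose_next_hop("Y", 2)  # any RSU in group 2 shares one entry
```

In this sketch every destination in a remote group shares a single row of the group table, so the number of entries an RSU must learn grows with the number of groups rather than the number of RSUs. That is one plausible reading of why the summary reports that grouping accelerates learning.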