Learning-Based Computation Offloading Approaches in UAVs-Assisted Edge Computing

Bibliographic Details
Published in:	IEEE Transactions on Vehicular Technology, 2021-01, Vol. 70 (1), pp. 928-944
Main Authors:	Zhu, Shichao; Gui, Lin; Zhao, Dongmei; Cheng, Nan; Zhang, Qi; Lang, Xiupu
Format:	Article
Language:	English
Subjects:	Algorithms; Bandwidth allocation; Cellular communication; Computation offloading; Edge computing; Finishes; Heuristic algorithms; inter-dependencies; Machine learning; Markov processes; Missions; multi-agent reinforcement learning; Multiagent systems; Reinforcement learning; Resource management; Response time; Task analysis; Time factors; UAV; Unmanned aerial vehicles; Vehicle dynamics

Description
Summary:	Technological advances in the unmanned aerial vehicle (UAV) industry have given UAVs more computing and storage resources, leading to the vision of UAVs-assisted edge computing, in which computing missions can be offloaded from a cellular network to a UAV cloudlet. In this paper, we propose a UAVs-assisted computation offloading paradigm in which a group of UAVs fly around while providing value-added edge computing services. Complex computing missions are decomposed into typical task-flows with inter-dependencies. Taking into account the inter-dependencies of the tasks, the dynamic network states, and the energy constraints of the UAVs, we formulate the average mission response time minimization problem and model it as a Markov decision process. Specifically, each time a mission arrives or a task execution finishes, we decide the target helper for the next task execution and the fraction of the bandwidth allocated to the communication. To separate the evaluation of the integrated decision, we propose multi-agent reinforcement learning (MARL) algorithms in which the target helper and the bandwidth allocation are determined by two agents. We design a separate advantage evaluation function for each agent to address the multi-agent credit assignment challenge, and further extend the on-policy algorithm to an off-policy one. Simulation results show that the proposed MARL-based approaches have desirable convergence properties and can adapt to the dynamic environment. The proposed approaches significantly reduce the average mission response time compared with other benchmark approaches.
ISSN:	0018-9545, 1939-9359
DOI:	10.1109/TVT.2020.3048938