Learning-Based Computation Offloading Approaches in UAVs-Assisted Edge Computing

Bibliographic Details
Published in: IEEE Transactions on Vehicular Technology 2021-01, Vol. 70 (1), p. 928-944
Main Authors: Zhu, Shichao; Gui, Lin; Zhao, Dongmei; Cheng, Nan; Zhang, Qi; Lang, Xiupu
Format: Article
Language: English
Description
Summary: Technological advances in the unmanned aerial vehicle (UAV) industry have given UAVs more computing and storage resources, leading to the vision of UAVs-assisted edge computing, in which computing missions can be offloaded from a cellular network to a UAV cloudlet. In this paper, we propose a UAVs-assisted computation offloading paradigm in which a group of UAVs fly around while providing value-added edge computing services. Complex computing missions are decomposed into typical task-flows with inter-dependencies. Taking into consideration the inter-dependencies of the tasks, dynamic network states, and energy constraints of the UAVs, we formulate the average mission response time minimization problem and then model it as a Markov decision process. Specifically, each time a mission arrives or a task execution finishes, we decide the target helper for the next task execution and the fraction of the bandwidth allocated to the communication. To separate the evaluation of the integrated decision, we propose multi-agent reinforcement learning (MARL) algorithms in which the target helper and the bandwidth allocation are determined by two agents. We design a separate advantage evaluation function for each agent to address the multi-agent credit assignment challenge, and further extend the on-policy algorithm to an off-policy one. Simulation results show that the proposed MARL-based approaches have desirable convergence properties and can adapt to the dynamic environment. The proposed approaches significantly reduce the average mission response time compared with benchmark approaches.
ISSN: 0018-9545, 1939-9359
DOI: 10.1109/TVT.2020.3048938
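
The decision step described in the summary (whenever a task becomes ready, pick a target helper and a bandwidth fraction, with the two choices made by separate agents) can be illustrated with a minimal Python sketch. This is not the paper's MARL algorithm: the two "agents" below are random placeholder policies, and the names `Task`, `Helper`, `task_delay`, and `run_mission`, as well as the delay model (transmission time plus computation time, with no queueing, shared-bandwidth contention, or energy constraints), are assumptions made only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    data_bits: float   # input data sent to the helper (hypothetical field)
    cpu_cycles: float  # computation demand (hypothetical field)
    deps: tuple = ()   # names of tasks that must finish first


@dataclass
class Helper:
    name: str
    cpu_hz: float      # processing speed of the helper
    link_bps: float    # link rate when the full bandwidth is allocated


def ready_tasks(tasks, done):
    """Return unfinished tasks whose dependencies have all finished."""
    return [t for t in tasks
            if t.name not in done and all(d in done for d in t.deps)]


def choose_helper(task, helpers, rng):
    # Placeholder for the first agent: a learned policy would go here.
    return rng.choice(helpers)


def choose_bandwidth_fraction(task, helper, rng):
    # Placeholder for the second agent: a fraction in [0.1, 1.0].
    return rng.uniform(0.1, 1.0)


def task_delay(task, helper, fraction):
    """Assumed delay model: transmission time plus computation time."""
    transmit = task.data_bits / (fraction * helper.link_bps)
    compute = task.cpu_cycles / helper.cpu_hz
    return transmit + compute


def run_mission(tasks, helpers, seed=0):
    """Schedule a task-flow and return the mission response time."""
    rng = random.Random(seed)
    finish = {}  # task name -> finish time
    while len(finish) < len(tasks):
        ready = ready_tasks(tasks, finish)
        if not ready:
            raise ValueError("cyclic or unsatisfiable dependencies")
        for task in ready:
            helper = choose_helper(task, helpers, rng)
            fraction = choose_bandwidth_fraction(task, helper, rng)
            # A task starts once its latest dependency has finished.
            start = max((finish[d] for d in task.deps), default=0.0)
            finish[task.name] = start + task_delay(task, helper, fraction)
    return max(finish.values())
```

For example, `run_mission` can be called with a three-task flow in which `c` depends on both `a` and `b`, plus a list of two `Helper` objects. Replacing the two placeholder functions with learned policies, and the delay model with one that tracks network state and UAV energy, would bring the sketch closer to the setting the summary describes.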