Electrical and Computer Engineering ETDs
Publication Date
Spring 5-11-2024
Abstract
In the dynamic landscape of autonomous aerial systems, the integration of uncrewed aerial vehicles (UAVs) has sparked a paradigm shift, offering unprecedented opportunities and challenges in collaborative decision-making and navigation. This thesis explores the application of multi-agent reinforcement learning (MARL) for the planning and coordination of UAVs in complex environments.
The first part of this thesis provides an introduction to single-agent reinforcement learning and MARL, with examples of MARL applied to countering uncrewed aerial systems (C-UAS). We formulate the counter-UAS problem as a multi-agent partially observable Markov decision process (MAPOMDP) and propose Multi-AGent partial observable deep reiNforcement lEarning for pursuer conTrol optimization (MAGNET) to train a group of UAS, acting as pursuers (agents), to pursue and intercept a faster UAS, the evader, which tries to escape capture while navigating through crowded airspace containing several moving non-cooperating interacting entities (NCIEs). In MAGNET, we integrate a Control Barrier Function (CBF) based safety layer into proximal policy optimization (PPO) to provide safety guarantees during both training and testing. In addition, we incorporate a DeepSet network into MAGNET to handle the time-varying dimension of an agent’s observations. Extensive simulations show that MAGNET maintains a collision-free environment at the cost of a slight reduction in the evader capture rate compared to the baseline implementations.
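The CBF-based safety layer described above can be illustrated with a minimal sketch. Assuming single-integrator agent dynamics and a single obstacle (simplifications not stated in the abstract), the filter minimally perturbs the nominal PPO action so that the CBF condition holds; with one linear constraint, the quadratic program has a closed-form projection. Function and variable names here are illustrative, not the thesis implementation.

```python
import numpy as np

def cbf_safety_filter(p, p_obs, u_rl, d_safe=1.0, alpha=1.0):
    """Project a nominal (e.g., PPO) action onto the CBF-safe set.

    Single-integrator agent p_dot = u with one obstacle; the CBF is
    h(p) = ||p - p_obs||^2 - d_safe^2, and the safety condition is
    grad_h . u + alpha * h >= 0. With a single linear constraint, the
    QP  min ||u - u_rl||^2  has the closed-form projection below.
    """
    h = np.dot(p - p_obs, p - p_obs) - d_safe**2
    a = 2.0 * (p - p_obs)                  # gradient of h w.r.t. p
    slack = a @ u_rl + alpha * h
    if slack >= 0.0:                       # nominal action already safe
        return u_rl
    return u_rl - (slack / (a @ a)) * a    # minimal safe correction
```

An action driving the agent toward the obstacle is bent onto the boundary of the safe set, while an action pointing away passes through unchanged, which is the behavior a safety layer wrapped around a learned policy should exhibit.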
The second part of this thesis deals with learning safe control methods for multi-agent systems. We explore a more complicated scenario in advanced air mobility applications, where a group of autonomous UAVs may need to cooperate to arrive at their predefined destinations simultaneously, for example, to attack a target or to carry heavy cargo together. Controlling a group of UAVs to arrive simultaneously is nontrivial because they must meet spatial constraints: the control algorithm has to avoid collisions not only among the UAVs but also between UAVs and non-cooperative flying objects (NCFOs), which are not coordinated by the control algorithm. Existing time-coordinated control algorithms can achieve simultaneous arrival for a multi-UAV system but cannot guarantee collision-free operation. To address this, we propose a safe linear quadratic optimal control algorithm comprising two major parts: a time-coordinated planner, which derives the accelerations of the UAVs to minimize the difference between each UAV’s arrival time and the predefined termination time, and a safety layer, which applies a control barrier function based solution to generate feasible accelerations that ensure a collision-free environment.
Finally, we use the MARL framework to solve the terminal time-coordination problem, achieving simultaneous arrival of the UAVs at their destinations while avoiding collisions with other UAVs and NCFOs.
Keywords
reinforcement learning, machine learning, uncrewed aerial vehicles, safety
Sponsors
Sandia National Laboratories, Air Force Research Laboratory.
Document Type
Dissertation
Language
English
Degree Name
Computer Engineering
Level of Degree
Doctoral
Department Name
Electrical and Computer Engineering
First Committee Member (Chair)
Rafael Fierro
Second Committee Member
Xiang Sun
Third Committee Member
Marios Pattichis
Fourth Committee Member
Claus Danielson
Recommended Citation
Pierre, Jean-Elie. "Securing The Skies: Safety-Constrained Decentralized Multi-UAV Coordination with Deep Reinforcement Learning." (2024). https://digitalrepository.unm.edu/ece_etds/647
Included in
Computational Engineering Commons, Electrical and Computer Engineering Commons, Robotics Commons