Research Student at ISAE-SUPAERO

Vicente ACOSTA

PPO-Clip Deep Reinforcement Learning Controller for MAVION VTOL UAV

About Me

  • I'm Vicente Acosta, an aerospace engineer from Universidad Nacional de La Plata (UNLP), Argentina, currently pursuing an MSc in Aerospace Engineering – Systems & Control at ISAE-SUPAERO in Toulouse, France.

    I work at the Centro Tecnológico Aeroespacial (CTA) in Argentina, focusing on dynamics simulations, trajectory design, and guidance and control for launch vehicles.

    My current research, supervised by Professor Philippe Pastor, explores control of VTOL tail-sitter drones using Deep Reinforcement Learning. This project bridges my experience in dynamics and control with AI applications in aerospace systems.

  • Deep Reinforcement Learning (DRL) has shown promise for developing controllers that can handle complex scenarios for Hybrid Aerial Vehicles (HAVs), which combine fixed-wing and rotorcraft dynamics. While DRL may outperform traditional methods on highly nonlinear systems operating under uncertain or dynamically changing conditions, its performance and robustness in real-world applications remain active areas of research. In this work, the Proximal Policy Optimization (PPO) algorithm, as implemented in the open-source Stable-Baselines3 library, is used to train unified policy controllers for the take-off, cruise, and landing phases, including the transitions between vertical and horizontal flight. A custom simulation and training environment was developed around the MAVION platform, a Vertical Take-Off and Landing (VTOL) Unmanned Aerial Vehicle (UAV) designed at ISAE-SUPAERO. Separate two-dimensional (2D) trajectory controllers were trained for each phase under symmetric flight assumptions, demonstrating accurate tracking of complete flight profiles with minimal error. In addition, generalization techniques were introduced that enable the trained policies to reliably track unseen target trajectories within a predefined flight envelope. The trained controllers were also evaluated under light atmospheric turbulence, with encouraging results that suggest potential for robust real-world applications.

    Keywords: Proximal Policy Optimization (PPO), Unmanned Aerial Vehicle (UAV), Hybrid Aerial Vehicle (HAV), Vertical Take-Off and Landing (VTOL), Deep Reinforcement Learning (DRL), Trajectory Controller

  • Philippe PASTOR, Researcher & Professor at ISAE-SUPAERO in Flight Dynamics & Aircraft Design, PhD in Automatic Control & AI from ISAE-SUPAERO

  • I have implemented flight dynamics simulators for various aerial vehicles in Python, C++, and MATLAB/Simulink, focusing on accurate modeling of aerodynamics, propulsion, and control systems. This work has supported projects in trajectory optimization, guidance design, and performance analysis. Recently, I’ve been extending these simulation capabilities to explore Deep Reinforcement Learning (DRL) for flight control applications. In particular, I’m working with Proximal Policy Optimization (PPO) to train agents capable of handling the full flight envelope of VTOL drones, aiming for seamless control across hover, transition, and forward flight regimes.

  • DRL control algorithm deployed in autonomous UAVs for both civil and defense applications.
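
For readers unfamiliar with the "Clip" in PPO-Clip, the sketch below shows the standard clipped surrogate objective that the algorithm maximizes during each policy update. It is a minimal, library-free illustration and not the project's training code: the function name ppo_clip_objective and its argument names are hypothetical, and the clip range of 0.2 is assumed here only because it is the commonly used default, not a value reported for the MAVION controllers.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_range=0.2):
    """Mean PPO-Clip surrogate objective over a batch (to be maximized).

    new_log_probs / old_log_probs: log-probabilities of the taken actions
    under the current policy and under the policy that collected the data.
    advantages: advantage estimates for the same actions.
    clip_range: assumed value (0.2 is a common default), not from the source.
    """
    if not (len(new_log_probs) == len(old_log_probs) == len(advantages)):
        raise ValueError("all inputs must have the same length")
    if not advantages:
        raise ValueError("inputs must not be empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the current and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Ratio restricted to the interval [1 - clip_range, 1 + clip_range].
        clipped_ratio = min(max(ratio, 1.0 - clip_range), 1.0 + clip_range)
        # Pessimistic choice: take the smaller of the two surrogate terms,
        # which removes the incentive to move the policy far in one update.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

When the current policy equals the data-collecting policy, every ratio is 1 and the objective reduces to the mean advantage. As the ratio moves outside the clip interval in the direction that would increase the objective, the clipped term takes over and the gradient with respect to that sample vanishes, which is what keeps each update close to the previous policy.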

Paper


