Unmanned underwater vehicles and dual-use remote sensing applications
The increasing demand for advanced technologies in dual-use scientific exploration and military operations has driven significant advances in unmanned underwater vehicles (UUVs). These versatile systems have become indispensable for a wide range of maritime missions, including environmental monitoring, resource exploration, and defense-related applications (Roh et al. 2025).
As scientific and military operations in marine environments expand, UUVs have become essential owing to their adaptability across diverse maritime missions, and autonomous control systems further extend this flexibility, particularly in mission-specific operations. However, much of the existing research focuses on 2D environments and neglects the complexities and orientation challenges that arise in real-world 3D underwater scenarios. To address these issues, this paper introduces a 5-degree-of-freedom (DOF) control approach for UUVs that improves waypoint-based path planning for 3D maritime missions through directional policy optimization (DPO). Built on a deep reinforcement learning (DRL) framework, the proposed method employs an efficient directional policy that accounts for the maneuvering characteristics unique to UUVs, including the maximum feasible rotation angles under dynamic constraints. The DPO algorithm enables efficient path planning by minimizing the number of waypoints required. In addition, by accounting for the impact angle, which dictates the approach angle to the target, the method supports a mission execution strategy that minimizes the UUV's exposure and thereby enhances stealth during military operations.
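To make the reward structure described above concrete, the sketch below illustrates one plausible way a per-waypoint reward could combine progress toward the target, a penalty on the number of waypoints, a feasibility check on the turn angle, and a terminal impact-angle bonus. This is a minimal illustration under assumed parameters, not the paper's actual DPO formulation; names such as MAX_TURN_RAD, WAYPOINT_PENALTY, and IMPACT_TOLERANCE_RAD are hypothetical.

```python
import numpy as np

# Assumed constants for illustration only; the paper's actual values and
# reward terms may differ.
MAX_TURN_RAD = np.deg2rad(30.0)          # assumed max feasible rotation per waypoint
WAYPOINT_PENALTY = 0.1                   # assumed cost for each added waypoint
IMPACT_TOLERANCE_RAD = np.deg2rad(10.0)  # assumed allowed deviation from desired impact angle


def unit(v):
    """Return the unit vector of a 3D vector."""
    n = np.linalg.norm(v)
    return v / n if n > 0 else v


def turn_angle(prev_dir, new_dir):
    """Angle between two consecutive heading vectors."""
    cos_a = np.clip(np.dot(unit(prev_dir), unit(new_dir)), -1.0, 1.0)
    return np.arccos(cos_a)


def step_reward(pos, prev_dir, waypoint, target, desired_impact_dir):
    """Reward for appending one 3D waypoint.

    Combines (i) progress toward the target, (ii) a per-waypoint penalty that
    pushes the policy toward shorter waypoint sequences, (iii) an infeasibility
    penalty when the required turn exceeds the assumed dynamic limit, and
    (iv) a terminal bonus when the final approach matches the desired impact angle.
    """
    new_dir = waypoint - pos
    # Dynamic-constraint check: turning more than the assumed limit is penalised.
    if turn_angle(prev_dir, new_dir) > MAX_TURN_RAD:
        return -1.0, False

    progress = np.linalg.norm(target - pos) - np.linalg.norm(target - waypoint)
    reward = progress - WAYPOINT_PENALTY

    reached = np.linalg.norm(target - waypoint) < 1.0  # assumed capture radius (m)
    if reached:
        # Impact-angle term: bonus if the approach direction is close to the desired one.
        impact_error = turn_angle(new_dir, desired_impact_dir)
        reward += 5.0 if impact_error < IMPACT_TOLERANCE_RAD else 1.0
    return reward, reached


if __name__ == "__main__":
    pos = np.array([0.0, 0.0, -10.0])
    prev_dir = np.array([1.0, 0.0, 0.0])
    target = np.array([50.0, 5.0, -20.0])
    desired_impact = np.array([1.0, 0.0, -0.2])
    r, done = step_reward(pos, prev_dir, np.array([10.0, 1.0, -12.0]), target, desired_impact)
    print(f"reward={r:.2f}, reached={done}")
```

A DRL agent would receive this reward at each waypoint decision, so minimizing the waypoint count and respecting the turn limit emerge directly from the reward shaping rather than from a separate post-processing step.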
This paper introduced a novel DPO algorithm for autonomous, mission-oriented UUV control, integrating directional rewards and efficient path planning to enhance navigation in dynamic 3D underwater environments. The evaluation results showed that the proposed DPO algorithm achieves improved path efficiency and higher mission success rates by effectively minimizing the number of waypoints required for targeting missions.
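The two evaluation quantities mentioned above could be computed as shown in the brief sketch below; the specific definitions (straight-line distance divided by travelled distance for path efficiency, and the fraction of successful episodes for success rate) are assumptions for illustration, not the paper's reported metrics.

```python
import numpy as np


def path_efficiency(waypoints):
    """Straight-line distance over travelled distance (1.0 = perfectly direct path)."""
    waypoints = np.asarray(waypoints, dtype=float)
    direct = np.linalg.norm(waypoints[-1] - waypoints[0])
    travelled = np.sum(np.linalg.norm(np.diff(waypoints, axis=0), axis=1))
    return direct / travelled if travelled > 0 else 0.0


def success_rate(outcomes):
    """Fraction of evaluation episodes that reached the target (booleans)."""
    return float(np.mean(outcomes)) if len(outcomes) else 0.0


if __name__ == "__main__":
    path = [[0, 0, -10], [10, 1, -12], [25, 3, -16], [50, 5, -20]]
    print(f"path efficiency: {path_efficiency(path):.3f}")
    print(f"success rate: {success_rate([True, True, False, True]):.2f}")
```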