Sensors, Vol. 23, Pages 4925: Object Detection of Flexible Objects with Arbitrary Orientation Based on Rotation-Adaptive YOLOv5
Sensors doi: 10.3390/s23104925
Authors: Jiajun Wu Lumei Su Zhiwei Lin Yuhan Chen Jiaming Ji Tianyou Li
It is challenging to accurately detect flexible objects with arbitrary orientation from monitoring images in power grid maintenance and inspection sites. This is because these images exhibit a significant imbalance between the foreground and background, which can lead to low detection accuracy when using a horizontal bounding box (HBB) as the detector in general object detection algorithms. Existing multi-oriented detection algorithms that use irregular polygons as the detector can improve accuracy to some extent, but their accuracy is limited due to boundary problems during the training process. This paper proposes a rotation-adaptive YOLOv5 (R_YOLOv5) with a rotated bounding box (RBB) to detect flexible objects with arbitrary orientation, effectively addressing the above issues and achieving high accuracy. Firstly, a long-side representation method is used to add the degree of freedom (DOF) for bounding boxes, enabling accurate detection of flexible objects with large spans, deformable shapes, and small foreground-to-background ratios. Furthermore, the further boundary problem induced by the proposed bounding box strategy is overcome by using classification discretization and symmetric function mapping methods. Finally, the loss function is optimized to ensure training convergence for the new bounding box. To meet various practical requirements, we propose four models with different scales based on YOLOv5, namely R_YOLOv5s, R_YOLOv5m, R_YOLOv5l, and R_YOLOv5x. Experimental results demonstrate that these four models achieve mean average precision (mAP) values of 0.712, 0.731, 0.736, and 0.745 on the DOTA-v1.5 dataset and 0.579, 0.629, 0.689, and 0.713 on our self-built FO dataset, exhibiting higher recognition accuracy and a stronger generalization ability. Among them, R_YOLOv5x achieves a mAP that is about 6.84% higher than ReDet on the DOTAv-1.5 dataset and at least 2% higher than the original YOLOv5 model on the FO dataset.