Abstract: To make efficient use of the onboard computing power of autonomous vehicles, a multi-task perception model, OLAD, is built on YOLOv5. It simultaneously performs traffic object detection, lane line recognition, and drivable area segmentation. An improved SPPFCSPC module and a feature fusion network redesigned around Slim Neck enhance OLAD's feature extraction capability, inference speed, and detection accuracy, and the loss function incorporates MPDIoU to improve the accuracy of traffic object detection. To validate the model, a comprehensive performance evaluation is conducted on the BDD100K validation set supplemented with a self-built domestic road dataset. The results show that OLAD outperforms the state-of-the-art YOLOP in both detection accuracy and speed. In addition, public road images from different time periods in Suzhou are randomly selected to test the model on domestic roads. The results show that OLAD's perception results are more accurate and better suited to domestic roads.
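The abstract names MPDIoU as the bounding-box regression term but does not give its form. As a rough illustration only, the sketch below follows the MPDIoU formulation as published by Ma and Xu, as we understand it: the IoU of the two boxes is penalized by the squared distances between their top-left corners and between their bottom-right corners, each normalized by the squared diagonal of the input image. The function name `mpdiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for this example; this is not OLAD's actual implementation.

```python
def mpdiou_loss(pred, target, img_w, img_h, eps=1e-9):
    """Hypothetical helper: MPDIoU loss for one pair of axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) tuples with x1 < x2 and y1 < y2,
    in the same pixel coordinates as the input image of size img_w x img_h.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and plain IoU; eps guards against division by zero.
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / (union + eps)

    # Squared distances between matching corners of the two boxes.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners

    # Normalize by the squared diagonal of the input image.
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d1_sq / norm - d2_sq / norm

    return 1.0 - mpdiou


if __name__ == "__main__":
    # Identical boxes give a loss close to zero; shifted boxes give more.
    print(mpdiou_loss((0, 0, 10, 10), (0, 0, 10, 10), 640, 640))
    print(mpdiou_loss((0, 0, 10, 10), (2, 2, 12, 12), 640, 640))
```

Unlike plain IoU, the corner-distance terms remain informative when the predicted and ground-truth boxes do not overlap at all, so the loss still increases as the boxes move further apart.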