Abstract: Traditional loop closure detection algorithms mostly rely on handcrafted features to represent images. Such features are not robust to environmental changes such as illumination and viewpoint variations, and extracting them is time-consuming, so these algorithms cannot meet real-time requirements. To address these problems, we propose an improved algorithm that uses a deep neural network to extract more robust image features. Specifically, an atrous spatial pyramid pooling (ASPP) module is introduced into the classic NetVLAD model to describe the image. By fusing multi-scale features, the network produces feature maps with lower dimensionality and higher resolution, which yields more accurate and compact image features. Experiments on public datasets show that the proposed algorithm achieves higher precision and recall. It copes with illumination and viewpoint changes to a certain extent and takes less time to extract image features.
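The pipeline summarized above (multi-scale atrous branches fused into a compact feature map, followed by NetVLAD-style aggregation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the dilation rates (1, 2, 3), the channel counts, the number of clusters, and the random weights are all assumptions chosen for illustration. The ASPP part is also simplified to three dilated 3x3 branches plus a 1x1 projection, without the image-level pooling branch used in DeepLab.

```python
import numpy as np

def atrous_conv3x3(feat, weight, rate):
    """3x3 dilated convolution with zero padding that preserves H and W.
    feat: (C_in, H, W); weight: (C_out, C_in, 3, 3); rate: dilation rate."""
    _, h, w = feat.shape
    padded = np.pad(feat, ((0, 0), (rate, rate), (rate, rate)))
    out = np.zeros((weight.shape[0], h, w))
    for i in range(3):
        for j in range(3):
            # Input samples seen by kernel tap (i, j) at every output position.
            patch = padded[:, i * rate:i * rate + h, j * rate:j * rate + w]
            out += np.einsum("oc,chw->ohw", weight[:, :, i, j], patch)
    return out

def aspp(feat, branch_weights, rates, proj):
    """Simplified ASPP: parallel dilated branches, channel concatenation,
    then a 1x1 projection (proj: (C_out, C_cat)) to a compact feature map."""
    branches = [np.maximum(atrous_conv3x3(feat, w, r), 0.0)
                for w, r in zip(branch_weights, rates)]
    fused = np.concatenate(branches, axis=0)
    return np.maximum(np.einsum("oc,chw->ohw", proj, fused), 0.0)

def netvlad(feat, centers, assign_w, assign_b, eps=1e-12):
    """NetVLAD-style pooling. feat: (D, H, W); centers, assign_w: (K, D);
    assign_b: (K,). Returns an L2-normalized descriptor of length K * D."""
    d = feat.shape[0]
    x = feat.reshape(d, -1).T                               # (N, D) local descriptors
    x = x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)
    logits = x @ assign_w.T + assign_b                      # (N, K)
    logits = logits - logits.max(axis=1, keepdims=True)     # numerically stable softmax
    a = np.exp(logits)
    a = a / a.sum(axis=1, keepdims=True)                    # soft assignment weights
    # V[k] = sum_n a[n, k] * (x[n] - centers[k])
    v = a.T @ x - a.sum(axis=0)[:, None] * centers          # (K, D)
    v = v / (np.linalg.norm(v, axis=1, keepdims=True) + eps)  # intra-normalization
    v = v.ravel()
    return v / (np.linalg.norm(v) + eps)

# Toy example with random weights (all sizes are illustrative assumptions).
rng = np.random.default_rng(0)
c_in, c_branch, c_out, k = 8, 4, 4, 3
rates = (1, 2, 3)
branch_weights = [rng.normal(size=(c_branch, c_in, 3, 3)) for _ in rates]
proj = rng.normal(size=(c_out, c_branch * len(rates)))
centers = rng.normal(size=(k, c_out))
assign_w = rng.normal(size=(k, c_out))
assign_b = np.zeros(k)

def describe(feature_map):
    """Feature map from a hypothetical backbone -> global image descriptor."""
    fused = aspp(feature_map, branch_weights, rates, proj)
    return netvlad(fused, centers, assign_w, assign_b)

desc = describe(rng.normal(size=(c_in, 6, 6)))
```

Because the descriptors are L2-normalized, the similarity between a query image and a candidate loop-closure image can be scored with a plain dot product of their descriptors, which equals their cosine similarity.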