Abstract: In this paper, we use a generative adversarial network (GAN) to improve the semantic segmentation of images. The model consists of a semantic segmentation network and a discriminator network: the segmentation network generates the segmentation result, while the discriminator network detects differences between the generated result and the ground-truth labels at the level of global structure, which in turn improves segmentation quality. To extract context information, the segmentation network adopts a spatial pyramid pooling module, which performs pooling over sub-regions at multiple levels. To reduce the large amount of manual annotation that semantic segmentation datasets require, we use the discriminator network to generate pseudo labels, enabling semi-supervised training of the segmentation network. The model was evaluated on the PASCAL VOC 2012 dataset, and the results show that the proposed supervised and semi-supervised approaches outperform existing methods.
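As a rough illustration of the two mechanisms named in the abstract, the following sketch shows pooling over sub-regions at several pyramid levels and confidence-based selection of pseudo labels. It is a minimal sketch under stated assumptions, not the implementation used in the paper: the pyramid levels (1, 2, 3, 6) follow the original pyramid scene parsing network, the threshold of 0.5 is a hypothetical value, and the function names `adaptive_avg_pool`, `pyramid_pool`, and `pseudo_labels` are introduced here for illustration only. The abstract does not specify how the discriminator output is turned into pseudo labels; the thresholding of a per-pixel confidence map shown here is one common choice in adversarial semi-supervised segmentation.

```python
IGNORE = 255  # PASCAL VOC convention for pixels excluded from the loss


def adaptive_avg_pool(fmap, bins):
    """Average-pool a 2-D map (list of rows) into a bins x bins grid."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(bins):
        # Sub-region bounds: floor for the start, ceiling for the end.
        r0 = (i * h) // bins
        r1 = ((i + 1) * h + bins - 1) // bins
        row = []
        for j in range(bins):
            c0 = (j * w) // bins
            c1 = ((j + 1) * w + bins - 1) // bins
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


def pyramid_pool(fmap, levels=(1, 2, 3, 6)):
    """Pool one feature map at every pyramid level (levels are an assumption)."""
    return {n: adaptive_avg_pool(fmap, n) for n in levels}


def pseudo_labels(class_scores, confidence, threshold=0.5):
    """Keep the predicted class where the discriminator is confident.

    class_scores[r][c] is a list of per-class scores from the segmentation
    network; confidence[r][c] is the discriminator's score in [0, 1].
    The threshold is a hypothetical value, not taken from the paper.
    """
    labels = []
    for score_row, conf_row in zip(class_scores, confidence):
        row = []
        for scores, conf in zip(score_row, conf_row):
            if conf > threshold:
                # argmax over classes
                row.append(max(range(len(scores)), key=scores.__getitem__))
            else:
                row.append(IGNORE)
        labels.append(row)
    return labels


if __name__ == "__main__":
    fmap = [[r * 6 + c for c in range(6)] for r in range(6)]
    pooled = pyramid_pool(fmap)
    print(pooled[1], pooled[2])
    print(pseudo_labels([[[0.1, 0.9], [0.8, 0.2]]], [[0.9, 0.3]]))
```

In a full model, the pooled maps would be upsampled and concatenated with the original features, and pixels marked `IGNORE` would be skipped when computing the semi-supervised loss.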