Deformable ConvNet with aspect ratio constrained NMS for object detection in remote sensing imagery

Zhaozhuo Xu, Xin Xu, Lei Wang, Rui Yang, Fangling Pu

Research output: Contribution to journal › Article › peer-review

117 Scopus citations

Abstract

Convolutional neural networks (CNNs) have demonstrated their ability in object detection on very high resolution (VHR) remote sensing images. However, CNNs have obvious limitations in modeling the geometric variations of remote sensing targets. In this paper, we introduce a CNN structure, namely the deformable ConvNet, to address geometric modeling in object recognition. By adding offsets to the convolution layers, the feature mapping of a CNN can be applied to unfixed locations, enhancing the CNN's understanding of visual appearance. In our work, a deformable region-based fully convolutional network (R-FCN) was constructed by substituting the regular convolution layer with a deformable convolution layer. To use this deformable convolutional neural network (ConvNet) efficiently, a training mechanism was developed. We first set the parameters of an R-FCN model pre-trained on natural images as the default network parameters of the deformable R-FCN. This deformable ConvNet was then fine-tuned on VHR remote sensing images. To remedy the resulting increase in line-like false region proposals, we developed aspect ratio constrained non-maximum suppression (arcNMS), which improved the precision of the deformable ConvNet in detecting objects. An end-to-end approach was then developed by combining the deformable R-FCN, a smart fine-tuning strategy, and aspect ratio constrained NMS. The developed method outperformed a state-of-the-art benchmark in object detection without data augmentation.
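The abstract does not spell out how arcNMS is computed, so the following is only a minimal sketch of one plausible reading: proposals whose long-side to short-side ratio exceeds a threshold (the "line-like" boxes) are discarded, and standard greedy NMS is run on the rest. The function names, the `max_ratio` and `iou_thresh` parameters, and their default values are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def aspect_ratio(box: Box) -> float:
    """Long side divided by short side; infinite for degenerate boxes."""
    w, h = box[2] - box[0], box[3] - box[1]
    if w <= 0 or h <= 0:
        return float("inf")
    return max(w, h) / min(w, h)


def arc_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    max_ratio: float = 4.0,   # hypothetical threshold, not from the paper
    iou_thresh: float = 0.5,  # hypothetical threshold, not from the paper
) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    # Assumed constraint: drop elongated, line-like proposals up front.
    candidates = [i for i, b in enumerate(boxes) if aspect_ratio(b) <= max_ratio]
    candidates.sort(key=lambda i: scores[i], reverse=True)

    # Standard greedy NMS on the remaining proposals.
    keep: List[int] = []
    for i in candidates:
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep
```

For example, given the boxes `(0, 0, 10, 10)`, `(1, 1, 11, 11)`, `(0, 0, 100, 2)` and `(50, 50, 60, 60)` with scores `0.9`, `0.8`, `0.95` and `0.7`, the third box is rejected for its 50:1 aspect ratio despite having the highest score, the second is suppressed by its overlap with the first, and the call returns `[0, 3]`.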

Original language: English
Article number: 1312
Journal: Remote Sensing
Volume: 9
Issue number: 12
State: Published - 1 Dec 2017

Keywords

  • Deformable ConvNet
  • Non maximum suppression
  • Object detection
  • Training mechanism
  • Very high resolution remote sensing imagery
