TY - JOUR
T1 - Deformable ConvNet with aspect ratio constrained NMS for object detection in remote sensing imagery
AU - Xu, Zhaozhuo
AU - Xu, Xin
AU - Wang, Lei
AU - Yang, Rui
AU - Pu, Fangling
N1 - Publisher Copyright: © 2017 by the author.
PY - 2017/12/1
Y1 - 2017/12/1
N2 - Convolutional neural networks (CNNs) have demonstrated their ability in object detection on very high resolution remote sensing images. However, CNNs have obvious limitations in modeling geometric variations in remote sensing targets. In this paper, we introduce a CNN structure, namely deformable ConvNet, to address geometric modeling in object recognition. By adding offsets to the convolution layers, the feature mapping of the CNN can be applied to unfixed locations, enhancing the CNN's understanding of visual appearance. In our work, a deformable region-based fully convolutional network (R-FCN) was constructed by substituting the regular convolution layer with a deformable convolution layer. To use this deformable convolutional neural network (ConvNet) efficiently, a training mechanism was developed. We first set the pre-trained R-FCN natural image model as the default network parameters in deformable R-FCN. Then, this deformable ConvNet was fine-tuned on very high resolution (VHR) remote sensing images. To remedy the increase in line-like false region proposals, we developed aspect ratio constrained non maximum suppression (arcNMS), which improved the precision of the deformable ConvNet for detecting objects. An end-to-end approach was then developed by combining deformable R-FCN, a smart fine-tuning strategy and aspect ratio constrained NMS. The developed method outperformed a state-of-the-art benchmark in object detection without data augmentation.
AB - Convolutional neural networks (CNNs) have demonstrated their ability in object detection on very high resolution remote sensing images. However, CNNs have obvious limitations in modeling geometric variations in remote sensing targets. In this paper, we introduce a CNN structure, namely deformable ConvNet, to address geometric modeling in object recognition. By adding offsets to the convolution layers, the feature mapping of the CNN can be applied to unfixed locations, enhancing the CNN's understanding of visual appearance. In our work, a deformable region-based fully convolutional network (R-FCN) was constructed by substituting the regular convolution layer with a deformable convolution layer. To use this deformable convolutional neural network (ConvNet) efficiently, a training mechanism was developed. We first set the pre-trained R-FCN natural image model as the default network parameters in deformable R-FCN. Then, this deformable ConvNet was fine-tuned on very high resolution (VHR) remote sensing images. To remedy the increase in line-like false region proposals, we developed aspect ratio constrained non maximum suppression (arcNMS), which improved the precision of the deformable ConvNet for detecting objects. An end-to-end approach was then developed by combining deformable R-FCN, a smart fine-tuning strategy and aspect ratio constrained NMS. The developed method outperformed a state-of-the-art benchmark in object detection without data augmentation.
KW - Deformable ConvNet
KW - Non maximum suppression
KW - Object detection
KW - Training mechanism
KW - Very high resolution remote sensing imagery
UR - http://www.scopus.com/inward/record.url?scp=85038212265&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85038212265&partnerID=8YFLogxK
U2 - 10.3390/rs9121312
DO - 10.3390/rs9121312
M3 - Article
AN - SCOPUS:85038212265
VL - 9
JO - Remote Sensing
JF - Remote Sensing
IS - 12
M1 - 1312
ER -