Deep Semantic Domain Adaptation for Remote Sensing Image Classification: A Universal Approach
Keywords:
Remote sensing, Image scene classification, Deep learning (DL), Convolutional neural network (CNN), Feature learning, Spatial information, Multi-source data, Multi-granularity feature learning module, Socio-economic semantic features, Attention-based pooling
Abstract:
Remote sensing image scene classification with deep learning (DL) is a rapidly growing field that has gained significant attention in the past few years, and it remains one of the most challenging problems in understanding high-resolution remote sensing images. Previous review papers in this domain cover work only up to 2024, and an up-to-date review that traces the progression of research into the present is lacking. Deep learning techniques, especially the convolutional neural network (CNN), have improved scene classification performance through their powerful feature learning and reasoning capabilities, and they now attain state-of-the-art results on remote sensing imagery. However, most CNN models end with several fully connected layers, which capture the hierarchical structure of entities in the image inefficiently and do not fully account for the spatial information that is important to classification. To address these challenges, we propose a novel scene classification model that integrates heterogeneous features of multi-source data. First, we design a multi-granularity feature learning module that samples images on uniform grids to learn features at multiple granularities. In addition to the features used in our previous research, this module incorporates socio-economic semantic features of the scene and uses attention-based pooling to represent images at different levels. Second, we adopt feature-level fusion to reduce the feature dimension. Third, a maxout-based module fuses the features of different granularities and extracts the most discriminative second-order latent ontological essence features.
Experimental results show that the proposed scene classification method achieves state-of-the-art results on limited datasets.
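
The three steps named in the abstract (uniform grid sampling, attention-based pooling, and maxout-based fusion) can be illustrated with a minimal NumPy sketch. This is not the model proposed here. The function names, the mean/std patch descriptor standing in for CNN features, the random query vector, and the random projection matrices are all hypothetical choices made for illustration. The socio-economic features and the second-order feature extraction are not modelled.

```python
import numpy as np

def grid_patches(image, grid):
    """Split an H x W x C image into grid x grid equal, non-overlapping patches."""
    h, w = image.shape[0] // grid, image.shape[1] // grid
    return [image[r * h:(r + 1) * h, c * w:(c + 1) * w]
            for r in range(grid) for c in range(grid)]

def patch_descriptor(patch):
    """Stand-in for a CNN feature extractor (assumption): per-channel mean and std."""
    return np.concatenate([patch.mean(axis=(0, 1)), patch.std(axis=(0, 1))])

def attention_pool(features, query):
    """Softmax-weighted sum of patch features; features is (n, d), query is (d,)."""
    scores = features @ query
    weights = np.exp(scores - scores.max())  # subtract max for numerical stability
    weights /= weights.sum()
    return weights @ features, weights

def maxout_fuse(granularity_features, projections):
    """Project each granularity's vector, then take the element-wise maximum."""
    projected = np.stack([W @ f for W, f in zip(projections, granularity_features)])
    return projected.max(axis=0)

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))          # synthetic image, illustration only
grids = (1, 2, 4)                        # three granularities (assumed)
d = 6                                    # descriptor size: 3 means + 3 stds
query = rng.standard_normal(d)           # would be learned in a real model

pooled = []
for g in grids:
    feats = np.stack([patch_descriptor(p) for p in grid_patches(image, g)])
    vec, _ = attention_pool(feats, query)
    pooled.append(vec)

projections = [rng.standard_normal((8, d)) for _ in grids]  # would be learned
fused = maxout_fuse(pooled, projections)
print(fused.shape)  # (8,)
```

In this sketch each granularity yields one pooled vector regardless of how many patches the grid produces, so the fusion step always receives inputs of the same size.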