Preprint Article · Version 1 · Preserved in Portico · This version is not peer-reviewed

Instructional Mask AutoEncoder: A Powerful Pretrained Model for Hyperspectral Image Classification

Version 1 : Received: 7 September 2023 / Approved: 8 September 2023 / Online: 8 September 2023 (07:42:04 CEST)

A peer-reviewed article of this Preprint also exists.

Kong, W., Liu, B., Bi, X., Pei, J., & Chen, Z. (2023). Instructional Mask AutoEncoder: A Scalable Learner for Hyperspectral Image Classification. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.

Abstract

"Finding fresh water in the ocean of data." is a challenge that all deep learning domains struggle with, especially in the area of hyperspectral image analysis. As hyperspectral remote sensing technology advances by leaps and bounds, there are increasing amounts of hyperspectral images(HSIs) can be available. Whereas in fact, these unlabeled HSIs are powerless to be used as material to driven a supervised learning task due to the extremely expensive labeling costs and some unknown regions. Although learning-based methods have achieved remarkable performance due to their superior ability to represent features, at the cost, these methods are complex, inflexible and tough to carry out transfer learning. In this paper, we propose the "Instructional Mask AutoEncoder"(IMAE), which is a simple and powerful self-supervised learner for HSI classification that uses a transformer-based mask autoencoder to extract the general features of HSIs through a self-reconstructing agent task. Moreover, we utilize the metric learning to perform an instructor which can direct the model focus on the human interested region of the input so that we can alleviate the defects of transformer-based model such as local attention distraction, lack of inductive bias and tremendous training data requirement. In downstream forward propagation, instead of global average pooling, we employ a learnable aggregation to put the tokens into fullplay. The obtained results illustrate that our method effectively accelerates the convergence rate and promotes the performance in downstream task.

Keywords

Self-Supervised Learning; Pretrained Model; Transfer Learning; Metric Learning; Transformer; Mask AutoEncoder; Hyperspectral Image Classification

Subject

Environmental and Earth Sciences, Geochemistry and Petrology
