Onboard Spectral and Spatial Cloud Detection for Hyperspectral Remote Sensing Images

It is strongly desirable to accurately detect clouds in hyperspectral images onboard before compression. However, conventional onboard cloud detection methods are not appropriate to all situations: shadowed clouds and dark snow-covered surfaces, for example, are not identified properly by the NDSI test. In this paper, we propose a new spectral-spatial classification strategy to enhance orbiting cloud-screening performance on hyperspectral images by integrating a threshold exponential spectral angle map (TESAM), an adaptive Markov random field (aMRF) and dynamic stochastic resonance (DSR). TESAM coarsely classifies cloud pixels based on spectral information. Then aMRF performs an optimization using spatial information, which improves the classification performance significantly. Some misclassified points still exist after aMRF processing because of the noisy data in the onboard environment; DSR is used to eliminate these points in the binary label image produced by aMRF. Taking level 0.5 data from Hyperion as the dataset, the average overall accuracy of the proposed algorithm is 96.28%. The method can provide cloud masks for ongoing EO-1 images and related satellites with the same spectral settings without manual intervention. The experiments indicate that the proposed method performs better than classical onboard cloud detection and current state-of-the-art hyperspectral classification methods.


Introduction
With the development of hyperspectral remote sensing technology, hyperspectral imaging techniques have been widely used in many applications, including meteorology, earth observation and military affairs. Meteorological satellites have obvious advantages in monitoring the continuity, spatiality and tendency of qualitative change of the atmospheric environment. They provide important information for omnidirectional monitoring of the atmospheric state over the entire globe. Different from the concerns of meteorological satellites, earth observation satellites focus on changes of the earth surface, city planning, geological prospecting, military reconnaissance and natural disasters. No matter what the application background is, most remote sensing images encompass clouds that, especially in the visible and infrared range, strongly affect the received electromagnetic radiation. Clouds cover about 70% of the earth's surface [1] and play a dominant role in the energy and water cycle of our planet. However, the earth's radiative budget and aerosol detection as influenced by clouds are not the focus of this research. Typically, one hyperspectral image has over 200 spectral bands, which presents a challenge for both data transmission and storage [2]. Future earth exploration missions will face unprecedented data volumes. Data sizes have steadily increased due to improvements in detectors, optics and onboard data handling technology.
Compared with meteorological satellites, the data size of earth observation satellites is larger due to higher spatial resolution and revisiting frequency. The satellite-ground link (download speed) is therefore under greater pressure; details are given in the appendix. Almost all these sensors have only limited memory capacity, so data transmission from the sensors to the ground becomes inevitable for further data analysis [3]. The large data volumes affect mission requirements for the entire data handling chain, including onboard digitization, storage, downlink, ground processing, and distribution [4]. Bottlenecks along this path constrain the instrument duty cycle, reducing science and application yield [5]. Depending on the application, clouds act as important information for meteorological research [6][7][8] or as unwanted corruption for earth observation [9][10]. For meteorological users, the image data can be retained and transmitted to the ground completely for further research. For non-meteorological users, clouds are a disturbance factor that shades the surface features of interest; as invalid data, the cloud regions can be discarded onboard directly. These are the two kinds of onboard processing strategy.
Data compression is necessary for onboard processing, but lossy compression methods may not be suitable for hyperspectral images used in accuracy-demanding applications, because the images are intended to be analyzed automatically by computers [11]. Bandwidth constraints have motivated advanced lossless compression techniques, such as KLT-based algorithms [12][13][14], that achieve compression ratios of four or greater. Efforts to optimize lossless methods eventually face theoretical limits, while data sizes continue to increase. This challenge has driven research into other techniques that can further reduce data volumes while preserving science yield. For a specific application, it is very likely that only part of the entire image carries the information of interest.
Rather than compressing the entire image, it is sometimes sufficient to compress only the region of interest (ROI) of the image [15]. Compression benefits significantly from simply not compressing invalid data regions, yielding a higher compression ratio. A cloud region is treated as a region of arbitrary shape; an ROI map is generated to describe its shape, and the ROI map can be encoded using the ARLE [16] algorithm. An ROI map of 400×256 pixels can be maximally compressed into 3200 bits, a compression ratio of 1:256. That is 0.002% of the original data size, which is fairly small. Excising the cloud-region data before compression can therefore significantly reduce the data size.
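ARLE [16] is the encoder referenced above and is not reproduced here; as a rough illustrative sketch of why a binary ROI map compresses so well, a plain run-length encoder over the flattened mask already reduces a rectangular cloud region to a handful of integers:

```python
import numpy as np

def rle_encode(mask):
    """Run-length encode a flattened binary mask (0 = ground, 1 = cloud).

    Returns the first value followed by the run lengths, so a mostly
    uniform ROI map shrinks to a short list of integers.
    """
    flat = np.asarray(mask, dtype=np.uint8).ravel()
    # Indices where the value changes, plus the two ends of the array.
    change = np.flatnonzero(np.diff(flat)) + 1
    bounds = np.concatenate(([0], change, [flat.size]))
    runs = np.diff(bounds)
    return int(flat[0]), runs.tolist()

# A 4x8 toy ROI map containing one rectangular cloud region.
mask = np.zeros((4, 8), dtype=np.uint8)
mask[1:3, 2:6] = 1
first, runs = rle_encode(mask)
```

The 32-pixel toy mask collapses to five run lengths; an arithmetic-coded variant such as ARLE compresses the run lengths further.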
However, the community lacks accurate algorithms for real-time cloud detection in instrument hardware.
Most onboard cloud detection methods are based on the radiometric features of clouds. "Classical" cloud detection applies threshold tests to spectral properties of the image [17][18]: pixels whose values fall outside valid ranges are marked as clouds. For example, the algorithms developed for MODIS compare selected visible and near-infrared (VNIR) bands to predetermined thresholds to flag cloud contamination in acquired scenes [26][27][28]. These onboard cloud-detection methods are generally based on a threshold decision tree (TDT).
Even more complex algorithms, which belong to the ground segment, have been proposed. Some state-of-the-art cloud-screening techniques estimate the optical path from absorption features such as the oxygen A band, as in Gómez-Chova et al. [29] or Taylor et al. [30]. Thermal infrared (TIR) channels can add brightness-temperature information: Minnis et al. predict clear-sky brightness temperature values using ambient temperature and humidity and then excise pixels outside these intervals [31].
Texture cues can also be utilized to recognize clouds by their high spatial heterogeneity [32]. The proposed methodology for cloud detection is introduced in section 3. A performance evaluation for different operation scenarios, using a decade-long historical image archive of the "classic" Hyperion spectrometer, is given in section 4. Section 5 discusses the advantages, limitations and applicability of the proposed method. Section 6 presents the conclusions.

Related work
Current onboard cloud detection usually uses the TDT method. The bands used by several typical TDT methods are shown in table 1. All three TDT methods use the normalized difference snow index (NDSI). The NDSI test has difficulties with shadowed clouds and dark snow-covered surfaces [41].
It also has difficulties with thin clouds; one such scene is shown in figure 1. Among the methods in table 1, HCC [23] uses the 0.55 µm, 0.66 µm and 0.86 µm bands, and DCC-ASE [24] uses the 0.43 µm, 0.56 µm, 0.66 µm and 0.86 µm bands. The pure threshold method is a simple, efficient and practical approach for cloud detection, but it is sensitive to the background and the cloud condition, which makes it impractical for general use [42]. Compared with the threshold method, the spectral angle map (SAM) has better cloud-detection performance because it takes advantage of more spectral information. In this paper, we present a cloud-detection algorithm, TESAM-aMRF-DSR, which combines the threshold exponential spectral angle map (TESAM), an adaptive Markov random field (aMRF) and dynamic stochastic resonance (DSR) to obtain an accurate cloud cover region. The following sections describe the theoretical method behind the algorithm.

Proposed method
To address the problems mentioned above, a new method is proposed. The general framework of the proposed method is shown in figure 3.

T-ESAM
SAM calculates the angle θ(x, y) between two N-dimensional spectra x = {x_i}_{i=1}^N and y = {y_i}_{i=1}^N:

θ(x, y) = arccos( ⟨x, y⟩ / (‖x‖ ‖y‖) ),    (1)

where ⟨x, y⟩ = Σ_{i=1}^N x_i y_i is the scalar product between x and y, and ‖·‖ denotes the Euclidean norm, i.e., ‖x‖² = ⟨x, x⟩. Here x is the target spectral vector and y is the referenced spectral vector.
TDT methods for onboard cloud detection, such as the ACCA algorithm [26] for multispectral and the HCC algorithm [23] for hyperspectral data, are good discriminators in most circumstances, but their performance is not good enough (only 75% of the ACCA scores were within 10% of the actual cloud cover content) [26]. This situation can be improved with SAM. We encapsulate the SAM metric inside an exponential function with gain parameter k to produce the ESAM function, a positive semi-definite function. The resolution of ESAM decreases as k decreases. Generally, k is set to 0.5 (between 0 and 1). ESAM amplifies the angular distance between two vectors.
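As a sketch of the spectral metric, the following assumes the standard SAM definition of equation (1) and an exponential of the form exp(k·θ); the exact exponential form used in the paper is an assumption here:

```python
import numpy as np

def sam(x, y):
    """Spectral angle between two spectra, in radians (equation (1))."""
    cos = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    # Clip guards against tiny floating-point excursions outside [-1, 1].
    return np.arccos(np.clip(cos, -1.0, 1.0))

def esam(x, y, k=0.5):
    """Exponential spectral angle (assumed form exp(k * theta))."""
    return np.exp(k * sam(x, y))

x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])
theta = sam(x, y)   # orthogonal spectra give the maximum angle pi/2
```

Note that the angle is unchanged when one spectrum is multiplied by a positive scalar, which is the illumination-invariance property discussed later.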
After processing the 3-D original hyperspectral image I[L, W, H] with ESAM, we obtain a 2-D result in which the lowest values indicate the spectra most similar to the reference; these pixels are most likely cloud if clouds are present in the image. Simultaneously, the threshold algorithm also detects a cloud region. We then obtain a classifier by combining ESAM with TDT, as shown in figure 4. Through the TDT method, we obtain the number of cloud pixels n_TDT, shown as the red solid line. The curve of cumulative frequency can be drawn once the histogram of the image has been calculated. The intersection between n_TDT and the cumulative-frequency curve locates the threshold value "a" on the ESAM histogram, where "histogram(ESAM(I, y) = i)" denotes the histogram count of ESAM results between the hyperspectral image and the referenced spectrum that equal i. The terms g(min) and g(n) indicate the frequencies corresponding to the minimum gray level and gray level n, respectively. We then obtain a classifier parameter g(n), which detects the cloud region coarsely when g(n) satisfies equations (4) and (5) at the same time. The coarse cloud-detection classifier is defined as follows. The observed instrument spectrum forms a vector x with multiple spectral channels per pixel. The cloud-screening decision maps these pixel brightness values to a binary classification c = f(x), where c_1 represents that a cloud is present and c_2 represents the event that clear sky is observed. The classifier f(x) detects the cloud coarsely.
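The intersection rule above can be sketched as follows; this is a simplified stand-in, assuming n_TDT is already available from the threshold decision tree and that low ESAM values indicate cloud:

```python
import numpy as np

def esam_threshold(esam_map, n_tdt, bins=256):
    """Pick the ESAM threshold 'a' where the cumulative histogram of the
    ESAM map first reaches the TDT cloud-pixel count (the intersection
    in figure 4).
    """
    hist, edges = np.histogram(esam_map, bins=bins)
    cumulative = np.cumsum(hist)
    idx = int(np.searchsorted(cumulative, n_tdt))
    # Return the upper edge of the bin where the count is reached.
    return edges[min(idx + 1, bins)]

# Toy example: 100 'cloud' values near 1.0, 900 'ground' values near 2.0.
rng = np.random.default_rng(0)
esam_map = np.concatenate([rng.normal(1.0, 0.01, 100),
                           rng.normal(2.0, 0.01, 900)])
a = esam_threshold(esam_map, n_tdt=100)
mask = esam_map <= a
```

Because the threshold is taken from the ESAM histogram rather than fixed, the coarse classifier adapts to the scene while staying anchored to the TDT cloud count.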

Algorithm 1 TDT assisted ESAM
Input: the remote sensing image data I with K pixels, each pixel an N-dimensional spectral vector; the referenced spectrum Y
Output: the class labels map M
step1: for k = 1 to K do E_I = ψ(X_k, Y) end (ψ computes the exponential spectral angle according to equations (1)-(3));
       for k = 1 to K do n_TDT_I = φ(X_k) end (φ computes the number of cloud pixels according to TDT)
step2: compute the histogram of E_I
step3: for k = 1 to n do g(n)_I = Ω(E_I) end (Ω computes the threshold for ESAM according to equations (4)-(5))
step4: for k = 1 to K do f(x)_I = Υ(E_I) end (Υ determines the binary class label according to equation (6))

aMRF model
The MRF model provides an accurate feature representation of pixels and their neighborhoods. The basic principle of aMRF is to incorporate spatial correlation information into the posterior probability of the spectral features. Based on the maximum a posteriori principle, the classic MRF model can be expressed as follows, where m_k and Σ_k are the mean vector and covariance matrix of class k, respectively, and the neighborhood and class of pixel i are represented by ε_i and ψ_k, respectively. Equation (6) separates the pixels of the remote sensing image into two classes, ground pixels and cloud pixels. The parameter γ_i is a weight coefficient used to control the influence of the spatial term. To obtain the local spatial weight coefficients γ_i, Haoyang Yu et al. [43] used the noise-adjusted principal components (NAPC) transform to obtain the first principal component, where var_k represents the class-decision variance of the neighborhood of pixel i as determined by majority voting and var_i is the local variance of pixel i [44]. When RHI_i is high, pixel i is located in a homogeneous region; by contrast, pixel i is on a boundary when RHI_i is low. The local spatial weight coefficient equals γ_0 when var_i = var_k; usually, γ_0 = 1.
According to equation (7), the aMRF model can be divided into two components: the energy of the spectral term a_i(k) and the energy of the spatial term b_i(k). Thus, equation (7) can be represented in a form where δ(ψ_ki, ψ_εi) is the Kronecker delta function. The pseudocode for the TESAM algorithm combined with the aMRF algorithm, abbreviated TESAM-aMRF, is shown in Algorithm 2.
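The two energy terms can be sketched as follows, under the assumption that a_i(k) is the standard Gaussian class energy and b_i(k) counts agreeing neighbors through the Kronecker delta; the exact forms of equations (7)-(10) are not reproduced here:

```python
import numpy as np

def spectral_energy(x, mean, cov):
    """Gaussian spectral term a_i(k):
    0.5*ln|Sigma_k| + 0.5*(x - m_k)^T Sigma_k^{-1} (x - m_k).
    """
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * logdet + 0.5 * diff @ np.linalg.solve(cov, diff)

def spatial_energy(label_k, neighbor_labels):
    """Spatial term b_i(k): minus the number of 8-neighbours in class k."""
    return -sum(1 for lbl in neighbor_labels if lbl == label_k)

def amrf_energy(x, mean, cov, label_k, neighbor_labels, gamma=1.0):
    """Total energy a_i(k) + gamma_i * b_i(k); the pixel takes the class
    minimising it.
    """
    return spectral_energy(x, mean, cov) + gamma * spatial_energy(label_k, neighbor_labels)

# Toy pixel near the 'ground' class mean, with 6 of 8 neighbours labelled 0.
x = np.array([0.1, -0.1])
means = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]   # ground, cloud
cov = np.eye(2)
neighbors = [0, 0, 0, 0, 0, 0, 1, 1]
energies = [amrf_energy(x, means[k], cov, k, neighbors) for k in (0, 1)]
label = int(np.argmin(energies))
```

Both the spectral distance and the neighborhood agreement pull the toy pixel toward the ground class, which is the regional-scale smoothing aMRF contributes.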

Algorithm 2 TESAM-aMRF
Input: the remote sensing image data I with K pixels, each pixel an n-dimensional spectral vector X = {x_i}_{i=1}^n; the referenced spectrum Y = {y_i}_{i=1}^n; the class labels map M
Output: the class labels map M
step1: compute the labels map M (result of TDT-ESAM) according to Algorithm 1
step2: compute m_k and Σ_k from the class labels map and I (k = 2)
step3: compute p(x_i) according to equations (7)-(10), evaluating equation (10) with the class labels map
step4: refresh the class labels map M with the class minimizing p(x_i)
step5: iterate steps 2-4

DSR model
The DSR model is used here to denoise the cloud mask. In analogy to Benzi's double-well model, a binary image pixel value is treated as the position of a particle in a double well. The addition of stochastic energy effects its transition to the strong-signal state, just as a particle makes a transition from one well to another. Such a change of pixel state under noise can be modelled by the Brownian motion of a particle placed in the double-well potential system shown in figure 5. Particle A is located in the left well. After stochastic energy is given to A, its state may or may not flip in the double well: if it does not flip, the particle may move to point B; if it flips, it may move to point C. The left and right wells represent the black and white pixels of the binary cloud mask, respectively.
A classic 1-D nonlinear dynamic system that exhibits SR is modelled with the Langevin equation of motion:

m (d²x/dt²) = −γ (dx/dt) − dU(x)/dx + ξ(t).    (11)

This equation describes the motion of a particle of mass m moving in the presence of friction γ. The restoring force is expressed as the gradient of a bistable potential function U(x). In addition, there is an additive stochastic force ξ(t) of intensity D.
If the system is heavily damped, the inertial m (d²x/dt²) term can be neglected. Rescaling the system in (11) with the damping term γ gives the stochastic overdamped Duffing equation, which is frequently used to model non-equilibrium critical phenomena:

dx/dt = a x − b x³ + ξ(t),    (12)

where U(x) is the bistable quartic potential

U(x) = −(a/2) x² + (b/4) x⁴.    (13)

Here, a and b are positive bistable double-well parameters. The double-well system is stable at x_m = ±√(a/b), separated by a barrier of height ΔU = a²/(4b) when ξ(t) is zero. The Langevin equation describes the motion of a particle in a general double well.
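A minimal numerical sketch of this dynamic, using a forward-Euler discretization of the overdamped Duffing equation (the step size and parameter values are illustrative assumptions):

```python
import numpy as np

def double_well(x, a=1.0, b=1.0):
    """Bistable quartic potential U(x) = -(a/2) x^2 + (b/4) x^4 of (13)."""
    return -0.5 * a * x**2 + 0.25 * b * x**4

def dsr_step(x, noise, a=1.0, b=1.0, dt=0.01):
    """One Euler step of the overdamped Duffing equation (12):
    dx/dt = a*x - b*x**3 + xi(t).
    """
    return x + dt * (a * x - b * x**3 + noise)

# Without noise, a particle released off-center relaxes into the nearest
# well at x = +sqrt(a/b) = 1; only the forcing term can push it across
# the barrier of height a^2 / (4b) = 0.25.
x = 0.1
for _ in range(2000):
    x = dsr_step(x, noise=0.0)
```

With a nonzero ξ(t), sufficiently strong forcing drives transitions between the two wells, which is the mechanism DSR exploits to flip noisy mask pixels.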

Algorithm 3 aMRF-DSR
Input: the class labels map M
Output: the final class labels map M_final
step1: for k = 1 to K do
       compute the numbers of pixels in the 8-neighborhood of pixel k belonging to ground (G_k) and cloud (C_k);
       compare C_k and G_k, assigning the larger count to ξ(t);
       refresh x according to equations (12)-(13)
       end
step2: refresh M
step3: iterate steps 1-2
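The denoising idea of Algorithm 3 can be sketched as below. This is a simplified interpretation under stated assumptions: pixel states live at the two wells x = −1 (ground) and x = +1 (cloud), and the signed 8-neighborhood majority plays the role of the forcing ξ(t); the exact mapping of C_k and G_k to the forcing in the paper is not reproduced here.

```python
import numpy as np

def dsr_denoise(mask, a=1.0, b=1.0, dt=0.1, steps=50, coupling=0.5):
    """Denoise a binary cloud mask with a DSR-style double-well iteration.

    Each pixel evolves under dx/dt = a*x - b*x**3 + coupling*nb, where nb
    is the summed state of its 8 neighbours; isolated pixels that disagree
    with their neighbourhood are driven across the barrier and flip.
    """
    x = np.where(np.asarray(mask, dtype=bool), 1.0, -1.0)
    for _ in range(steps):
        p = np.pad(x, 1)  # zero padding at the image borders
        nb = (p[:-2, :-2] + p[:-2, 1:-1] + p[:-2, 2:] +
              p[1:-1, :-2] +               p[1:-1, 2:] +
              p[2:, :-2] + p[2:, 1:-1] + p[2:, 2:])
        x = x + dt * (a * x - b * x**3 + coupling * nb)
        x = np.clip(x, -2.0, 2.0)  # keep the Euler iteration bounded
    return (x > 0).astype(np.uint8)

# A compact cloud block plus one isolated false cloud pixel.
mask = np.zeros((9, 9), dtype=np.uint8)
mask[1:6, 1:6] = 1
mask[7, 7] = 1          # isolated noise point
clean = dsr_denoise(mask)
```

The isolated point is flipped to ground while the interior of the cloud block survives, which is exactly the refinement role DSR plays after aMRF.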

Dataset
In this section, we evaluate the performance of the proposed algorithms using the widely used hyperspectral data from the Hyperion sensor on EO-1. The data used in onboard processing is level 0.5. In meteorological research, clouds are labeled pixel by pixel through particle scattering models.
The single-scattering properties of liquid water clouds are calculated from Mie theory [48] and are integrated over a modified Gamma droplet size distribution. The single-scattering properties of ice clouds are obtained from Yang et al. [49]. The computed single-scattering properties (single-scattering albedo, asymmetry parameter, extinction efficiency, phase function) for both ice and liquid water clouds are stored in a LUT. However, for earth observation satellites, the resolution is higher than for meteorological satellites, and particle scattering models cannot guarantee that each cloud pixel is labeled correctly from the spectrum alone. The ground truth for clouds was therefore labeled manually using the Visual Cloud-Cover Assessment (VCCA) method, which serves as the measure of true cloud cover in the scene. The magic wand and freehand lasso tools of Photoshop were used to isolate clouds. The wand employs a seed-fill threshold algorithm to compute regions of brightness similarity based on a mouse click on a single pixel: it compares the selected pixel's brightness values to all other pixels and retains those within a selectable tolerance threshold. Additional cloud pixels were added by using the wand repeatedly until the cumulative selection of visible clouds had essentially zero possibility of VCCA omission errors. Snowfields and other unwanted bright features were then manually subtracted using the lasso tool to reduce VCCA commission errors. All this work was done by well-trained professionals. After the VCCA scores were established, the result was a binary cloud mask, from which a cloud-cover percentage was computed; this served as the cloud "truth" for validating the accuracy of our proposed method. The remaining uncertainty of manual labeling concerns the border of thin clouds and cirrus floating over snow, especially in the visible bands. It is therefore necessary to use infrared bands to assist in labeling cloud pixels, but which bands best separate cloud pixels from ground pixels depends on the surface features; this is another kind of uncertainty.

Accuracy Assessment
Three accuracy measures, the overall accuracy, precision and recall, were used to assess the algorithm results. Define True Positives (TP) as the number of cloud pixels correctly labeled as cloud, False Negatives (FN) as the number of cloud pixels incorrectly labeled as non-cloud, False Positives (FP) as the number of non-cloud pixels labeled as cloud, and True Negatives (TN) as the number of non-cloud pixels correctly labeled as non-cloud. Recall is then defined as

Recall = TP / (TP + FN).    (14)

In the cloud case, precision denotes the proportion of correctly detected cloud pixels in the cloud-detection results, while recall is the fraction of all pixels that are actually cloud that were detected as cloud. Precision and recall reflect the errors of cloud classification better than overall accuracy does. Classification improves at each iteration compared with the previous one, and after the 8th iteration the classification has good agreement with the real cloud region.
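The three measures, written out with the standard definitions (overall accuracy and precision follow the usual confusion-matrix forms; the toy counts are illustrative):

```python
def overall_accuracy(tp, tn, fp, fn):
    """Fraction of all pixels labelled correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """Fraction of detected cloud pixels that really are cloud."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of true cloud pixels that were detected (equation (14))."""
    return tp / (tp + fn)

# Toy confusion counts for a 1000-pixel scene.
tp, tn, fp, fn = 90, 850, 10, 50
```

With these counts the overall accuracy is high (0.94) even though a third of the true cloud pixels were missed, which is why precision and recall are the more informative pair for cloud masks.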

Detection results
This image tended to converge at the 16th iteration, mainly because thin clouds are mixed with other spectra and cannot be learned sufficiently. The ROC curve and the precision/recall curve can be seen in figures 11 and 12.

The Effectiveness of Combining Threshold Decision Tree and Spectral Angle Map
The spectral angle map is widely used due to its simplicity and geometric interpretability. SAM is invariant to (unknown) multiplicative scalings of spectra caused by differences in illumination and angular orientation; this invariance is one of the most important properties of the spectral angle distance. Owing to it, the spectral angle between two pixels is more sensitive to the pattern (shape) of the spectral signatures than to absolute intensities. Traditional TDT methods sometimes overestimate or underestimate the cloud region because their fixed parameters are inappropriate for varying illumination and angular orientation. In theory, the TESAM method reduces this misclassification error.

The Usefulness of Spatial Information for Cloud Detection
For the wrongly classified pixels remaining after TESAM, aMRF is used to synthesize the spectral and spatial information into an energy index to determine the class attribute at regional scales. In general, the optimal status is achieved when the energy is stable, and the iteration then terminates. The aMRF mainly uses the water-vapor bands (1.38 µm-1.39 µm and 1.46 µm-1.55 µm). The spectra of thin-cloud and dark-cloud pixels deviate considerably from the threshold, but aMRF can recover these cloud pixels. The cloud mask resulting from aMRF contains noisy points because the data processed onboard is level 0.5, which is not fully calibrated; the radiance and reflectance values for the SWIR bands of level 0.5 should be considered pseudo-radiance and pseudo-reflectance. DSR eliminates these noisy points in the binary mask; it is a refinement process for cloud detection. Figure 12 shows the overall accuracy of the aMRF and DSR iteration results for randomly selected parts of the dataset. The overall accuracy of each aMRF iteration is shown in figure 12(a); the 0th iteration means the overall accuracy after TESAM. During the aMRF iterations, the accuracy increased at each step, though the final improvement depends mainly on the cloud condition. The convergence criterion for the aMRF iteration is that the class attribute of no more than 0.5% of pixels changes between two adjacent iterations. The overall accuracy of each DSR iteration is shown in figure 12(b).
The 0th iteration means the overall accuracy after aMRF. During the DSR iterations, the accuracy increases only slightly in numerical terms, but many isolated noise points are eliminated, which greatly benefits ROI compression. The convergence criterion for the DSR iteration is that the class attribute of no more than 0.01% of pixels changes between two adjacent iterations.

Error Sources of the proposed Method
In brief, the cloud detection results indicate that the proposed method performs well in detecting clouds in EO-1 images. However, two error sources that might influence the algorithm accuracy should be pointed out. The first arises when the cloud region detected by the TDT algorithm is bigger than the actual size, probably because of unsuitable parameters. Correspondingly, TESAM will overestimate the cloud area, because the size of the cloud region is decided jointly by TDT and the TESAM histogram; this increases the false-positive region of the TESAM results, and the impure cloud spectrum may lead to large-area misclassification under aMRF. The second error source is that the bands selected for aMRF may not be optimal for all kinds of surface features; in this case, the advantage of the seed region is lost if the contribution from the neighbor weights is insufficient.

Effect of compression based on cloud detection
The compression effect is worth mentioning. After the cloud mask is obtained, the cloud region is filled with a special value, so the data of the cloud region is effectively removed by compression. For a Hyperion scene with a 30.12% cloud cover rate, the data size of the value-filled image is 71.27% of the original after both have been losslessly compressed. On the one hand, it is necessary to consider the difference in lossless compression ratio between ground and cloud; on the other hand, the non-filled cloud contributes less to compression than the filled cloud. The statistics of cloud cover and of the ratio of compression gain between filled and non-filled cloud regions are shown in figure 13. The regression line shows that this ratio is approximately proportional to the cloud cover ratio; the tendency is linear.
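The effect is easy to reproduce in miniature. In the sketch below, zlib stands in for the onboard lossless coder (an assumption; the paper's coder is not reproduced), and a synthetic noisy scene stands in for radiance data:

```python
import zlib
import numpy as np

rng = np.random.default_rng(1)
# Toy 'scene': 12-bit noise in 16-bit words standing in for radiance.
scene = rng.integers(0, 4096, size=(100, 100), dtype=np.uint16)

cloud = np.zeros(scene.shape, dtype=bool)
cloud[20:60, 20:60] = True            # 16% synthetic cloud cover

filled = scene.copy()
filled[cloud] = 0                     # fill the cloud region with a special value

original_size = len(zlib.compress(scene.tobytes(), 9))
filled_size = len(zlib.compress(filled.tobytes(), 9))
```

The constant-valued cloud region costs almost nothing after entropy coding, so the filled image compresses to a noticeably smaller size, in line with the linear tendency reported in figure 13.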
The closer this ratio is to 1:1, the better the compression performance of value filling. Points higher than 1 indicate scenes containing a little thin cloud; points close to zero indicate scenes totally covered by cloud.

Applicability of the Developed Methods in the Future
The proposed method is highly automatic and efficient when processing tremendously large volumes of imagery in real time. It can easily be implemented on a parallel processor such as an FPGA.
The proposed method needs external storage devices or architectures such as a ping-pong structure, because it must store data to support the use of spatial context. Moreover, classifiers instantiated in hardware logic have already achieved implementations of arccosine [45] and exponential [46][47] functions, and even floating-point operations, supporting many nonlinear classifiers and naive implementations of linear classifiers. Additionally, there are real-time requirements for the processing procedure: the bandwidth of multiple DDRs can satisfy the Gb/s throughput of the algorithm by using a small fixed number of arithmetic operations on locally available data. The method can also be adapted for images acquired by similar satellite instruments with similar spectral bands and temporal resolutions. The method in this paper is general, and future efforts will test it on other regions with different environments.

These algorithms compare the selected bands to predetermined thresholds and then aggregate the results in different combinations depending on land type [19][20][21][22], using a combination of 14 wavelengths and over 40 tests. This underscores the intrinsic difficulty of constructing a universal and complete cloud-screening procedure. We focus on the visible short-wave infrared (VSWIR) electromagnetic spectrum from 0.4 to 2.5 µm. There are many studies of cloud detection in these wavelengths, and the algorithms vary in their assumptions and complexity. Of direct relevance to this work, onboard cloud detection has been demonstrated on the EO-1 spacecraft [23]. EO-1 cloud detection uses the solar zenith angle to compute the apparent top-of-atmosphere (TOA) reflectance and then applies a branching sequence of threshold tests based on carefully crafted spectral ratios to distinguish clouds from bright landforms such as snow, ice, and desert sand. The EO-1 cloud detection also acts as a data-filtering step prior to onboard cryospheric and flood classification [24][25]. To our knowledge, it is the only previous case of cloud screening performed on orbit. Another kind of onboard cloud detection algorithm is mainly based on ACCA; these algorithms give cloud-cover (CC) predictions to reduce cloud contamination in acquired scenes [26][27][28]. Because of the complexity of the various factors, it is hard to classify cloud pixels completely in the spectral feature space alone, as shown in figure 1(a) and (e). There are some omission errors in the cloud detection (yellow region of figure 1(e)) because the optical thickness of clouds varies. The three spectral curves in figure 1(b) are sampled from the three crosses marked in (a). The spectral difference between thin cloud and thick cloud is obvious, especially in the NIR bands.
The influence of clouds on solar radiation is due to reflectance, absorption and scattering of the radiation by cloud particles. It depends strongly on the dimensions, opacity, thickness and composition of the clouds. There are different types of clouds with different dimensions, opacity and properties, resulting in different effects on solar radiation. The clouds are divided into ten types, as seen in table 2. Ice crystals and water drops have different impacts on the absorption and scattering of solar radiation, especially in the SWIR. According to statistics from 184 scenes of Hyperion level 0.5 data, the solar reflectance of the 10 cloud types and of different ground types across the 0.4-2.5 µm electromagnetic spectrum is shown in figure 2. Different clouds may have different amplitudes of reflectance; after normalization, the envelopes of the spectral curves are roughly the same, as shown in figure 2(a). However, different surface features have different spectral reflectance, as shown in figure 2(b). In this paper, we mainly focus on how to detect cloud pixels instead of recognizing different types of cloud.

Figure 1. Cloud detection result under the TDT method. (a) Original image; (b) spectra of thick cloud, thin cloud and surface features sampled at the red, blue and green crosses in (a); (c) two scenes containing liquid cloud, mixed-phase cloud, ice cloud and snow, labeled by boxes; (d) spectra of liquid cloud, mixed-phase cloud, ice cloud and snow sampled from the boxed regions of (c); (e) cloud detection result under the TDT method (red denotes correctly extracted cloud region, yellow denotes omission errors and green denotes commission errors); (f) diagrammatic sketch of misclassification between ground and cloud pixels under the TDT method.

Figure 2. Spectral curve statistics of cloud and ground reflectance. (a) Normalized spectral reflectance curves of different cloud types. (b) Normalized spectral reflectance curves of different materials.

Figure 3. General framework and flowchart of the proposed methods.

Figure 5. SR in double-well potential valley.

Figure 6. Test dataset description. (a) Geographical distribution of the selected scenes; (b) season distribution of the selected scenes; (c) time distribution of the selected scenes; (d) number of scenes for each terrain.

Figure 7 shows the cloud detection results for different terrains. Just by visually comparing the results with the false-color composites, it is clear that the algorithm developed in this study achieves good performance in detecting cloud pixels.

Figure 7(a) is a summer image acquired on 8 August 2013 with cirrostratus over desert. The detection result indicates that the algorithm shows a strong ability to separate cloud from desert even when the cloud is so thin that its spectrum is mixed with that of the desert pixels. Figure 7(b) is a winter image acquired on 3 June 2013 with dark stratus over the ocean and coast. Clouds contain water droplets, the same material as the ocean in this season; however, ocean water is liquid while cloud water is in aerosol form, and the spectrum of the same material differs with form and temperature. About 1.73% omission errors (the yellow region), differing from the manually labeled cloud mask, exist on the border of the thin cloud. Figure 7(c) is a spring image acquired at noon on 22 May 2012 with cumulus and stratocumulus around the Himalaya mountains, and figure 7(d) is a winter image acquired at dusk on 3 January 2007 with altocumulus over mountains, with 0.62% omission errors. Comparing figure 7(c) with figure 7(d), the cloud of the former appears lighter than the latter due to the smaller solar zenith angle, but both images have good cloud detection results, and even the darker cloud is detected. Figure 7(f) is a winter image acquired on 28 March 2005 with cumulus over the city of Harbin. About 0.23% commission errors occur where both the frozen river and city highlights are classified as cloud, and about 0.16% omission errors exist in the suspected cloud region.

Figure 7(e), Figure 7(i) and Figure 7(j) all show clouds over snow or ice. Figure 7(e) is an image acquired on 12 May 2012 with stratocumulus over a snowfield in the cryosphere. The cloud pixels, about 4.8% of the whole image, are indistinguishable to the naked eye; these pixels float over the snow field. There are 0.41% commission errors when compared with the manually labeled cloud mask. Figure 7(i) is a spring image acquired on 17 March 2007 with altostratus over a snow-covered mountain. The edge of the altostratus looks similar to the ground because it has no clear outline in the visible bands. Although about 2.97% of the cloud pixels are hard to distinguish by eye in the visible bands, they were classified correctly by the proposed method. Figure 7(j) was a

Figure 8 illustrates the algorithm's performance at each processing stage. It depicts the cloud detection results for EO-1 Hyperion images from four different states. By visually comparing

Figure 10 shows a comparison of the cloud detection performance of several methods. The terrains from the first row to the last row are ocean, mountain, city, desert, ice and cryosphere. It can be observed that the proposed method produced the best precision and recall ratios, and its error was lower than that of the other methods. ACCA has a high FN rate for ordinary terrain and a high FP rate for special terrain because it lacks the thermal infrared band. HCC has difficulty detecting thin or dark cloud. The Decision Theoretic Method (DTM) classifies the majority of thin clouds as ground, so its error rate under these conditions is high. The support vector machine adaptive Markov random field (SVM-aMRF) and the rolling guidance filter and vertex component analysis network (R-VCANet) achieve higher recall and precision ratios than the previous two, but they still show some classification errors for thin clouds.
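The per-scene precision and recall ratios and FP/FN fractions discussed above follow directly from pixel-wise comparison of a predicted cloud mask against a manually labeled reference. A minimal sketch of that bookkeeping (the function name and the toy masks are illustrative, not from the paper):

```python
import numpy as np

def cloud_detection_scores(pred, truth):
    """Compare a predicted binary cloud mask against a reference mask.

    `pred` and `truth` are boolean arrays of the same shape
    (True = cloud pixel). Returns precision, recall, and the
    false-positive / false-negative pixel fractions.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)    # cloud correctly detected
    fp = np.sum(pred & ~truth)   # ground labeled as cloud (commission error)
    fn = np.sum(~pred & truth)   # cloud missed (omission error)
    n = pred.size
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, fp / n, fn / n

# Toy 3x3 masks with one commission and one omission error.
pred = np.array([[1, 1, 0], [0, 1, 0], [1, 0, 0]], dtype=bool)
truth = np.array([[1, 1, 0], [0, 1, 1], [0, 0, 0]], dtype=bool)
p, r, fp_rate, fn_rate = cloud_detection_scores(pred, truth)
```

Sweeping the detector's decision threshold and recording these scores at each setting is what produces the ROC and precision-recall curves in Figure 11.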

Figure 11. Comparison of the performance of the different algorithms. (a) ROC curve of cloud detection performance for each method; (b) precision versus recall curve for each method.

Figure 12. Statistics of each iteration of aMRF and DSR. (a) Overall accuracy of the aMRF iteration results; (b) overall accuracy of the DSR iteration results.

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 November 2017 doi:10.20944/preprints201705.0014.v2
Peer-reviewed version available at Remote Sens. 2018, 10, 152; doi:10.3390/rs10010152

A simple spatial analysis, i.e., the standard deviation of VNIR isotropic reflectance in a 3 × 3 pixel window, has been demonstrated to reliably discriminate clouds from aerosol plumes over ocean scenes [33]. Jinhu Bian et al. proposed a spectral-signature and spatio-temporal-context method to distinguish snow from cloud [34]. Markov random field models have also been developed to segment hyperspectral images: Murtagh et al. represented spatial dependency using a probabilistic Markov random field prior [35], and Haoyang Yu et al. proposed an adaptive MRF method combined with SVM that achieved good terrain classification performance [36]. Probabilistic models are another class of cloud detection methods. Gómez-Chova et al. used a Gaussian mixture model to produce posterior probabilities. The Bayesian probabilistic model of Merchant et al. combines observational data with prior predictions from atmospheric forecasts, leading to true probabilistic predictions [37]. David R. proposed the decision theoretic method (DTM), based on a Bayesian probabilistic model, which achieved negligible false positives in cloud screening [38]. Recently, deep learning has been widely used in HSI classification: Li Wei et al. proposed hyperspectral image classification using deep pixel-pair features [39], and Bin Pan et al. proposed a kind of vertex component analysis network.
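The 3 × 3 window standard-deviation test mentioned at the start of this paragraph can be sketched as follows. The function name, scene values, and the 0.1 threshold are illustrative assumptions, not values from the cited work; the idea is only that a cloud edge produces high local variance over a smooth ocean background, while a diffuse aerosol plume stays comparatively flat:

```python
import numpy as np

def local_std_3x3(refl):
    """Standard deviation of reflectance in each 3x3 window (valid region only).

    Stacks the nine shifted copies of the image and reduces over them,
    so the output is (H-2) x (W-2).
    """
    r = np.asarray(refl, dtype=float)
    h, w = r.shape
    windows = np.stack([r[i:h - 2 + i, j:w - 2 + j]
                        for i in range(3) for j in range(3)])
    return windows.std(axis=0)

# Smooth "ocean" with a bright "cloud" patch: only windows touching
# the patch border light up in the variance map.
scene = np.full((8, 8), 0.05)            # hypothetical ocean reflectance
scene[3:5, 3:5] = 0.60                   # hypothetical cloud reflectance
sigma = local_std_3x3(scene)
mask = sigma > 0.1                       # illustrative threshold
```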

Table 1. Spectra used by threshold methods and their disadvantages.

1.55–1.75 µm: this test uses the NDSI = (ρ0.56 − ρ1.65)/(ρ0.56 + ρ1.65) index, which contains spectral bands near 1.65 µm, to discriminate snow from cloud. However, snow-covered surfaces and clouds sometimes cannot be separated clearly by NDSI, because the reflectance features of cloud and snow particles can be similar in particular spectral regions.
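As a minimal sketch of this test, NDSI exploits the fact that snow is bright at 0.56 µm but dark near 1.65 µm, while most clouds stay bright in both bands. The reflectance values and the 0.4 threshold below are illustrative assumptions (0.4 is a common heuristic), not parameters from the paper:

```python
import numpy as np

def ndsi(rho_056, rho_165):
    """NDSI = (rho_0.56 - rho_1.65) / (rho_0.56 + rho_1.65).

    High NDSI suggests snow; low NDSI suggests cloud or other
    bright surfaces. A small epsilon guards against division by zero.
    """
    rho_056 = np.asarray(rho_056, dtype=float)
    rho_165 = np.asarray(rho_165, dtype=float)
    return (rho_056 - rho_165) / (rho_056 + rho_165 + 1e-12)

# Illustrative reflectance pairs:
snow = ndsi(0.8, 0.1)    # bright at 0.56 um, dark at 1.65 um -> high NDSI
cloud = ndsi(0.8, 0.7)   # bright in both bands -> low NDSI
# A rule such as NDSI > 0.4 flags snow, but fails exactly in the cases
# described above, where cloud and snow reflectances are similar.
```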

Table 2. Characteristics of the 10 cloud types.

Table A1. Meteorological satellites vs. Earth observation satellites.