Robust Regression Estimators in Magnetotelluric Data Processing: Performance & Evaluation

Wenjing Shan; Chengliang Xie

doi:10.20944/preprints202606.1085.v1

Submitted: 12 June 2026

Posted: 17 June 2026

Abstract

Magnetotelluric (MT) transfer functions are commonly estimated using statistical regression methods. Reliable MT-derived geoelectrical structures are essential for investigating deep geodynamic processes and metallogenic systems, but noise contamination may distort impedance estimates and lead to misleading geological interpretations. In this study, an MM-estimator was introduced into MT data processing and evaluated together with M- and S-estimators using synthetic and field datasets. Under high signal-to-noise ratio (SNR) conditions, all three methods produced generally consistent apparent resistivity and phase responses, although S-estimation showed relatively scattered results. As the SNR decreased, M-estimation exhibited stronger fluctuations and local outliers, whereas MM-estimation maintained the smoothest and most stable responses, demonstrating superior robustness against mixed-noise contamination. Field-data experiments further showed that the MM-estimation results were highly consistent with those obtained using the remote reference method, while M- and S-estimation displayed larger deviations in the mid- and long-period bands. These results indicate that MM-estimation can effectively suppress complex background noise and improve the stability and reliability of MT impedance estimation, thereby strengthening the contribution of subsurface resistivity models to studies of mineralization mechanisms.

Keywords: magnetotelluric; robust regression; impedance tensor; MM-estimation; S-estimation

Subject: Environmental and Earth Sciences - Geophysics and Geology

1. Introduction

As a conventional electromagnetic (EM) geophysical method, the magnetotelluric (MT) method is widely used to investigate subsurface geoelectrical structures, including crustal and/or lithospheric architecture, deep geodynamic processes, resource exploration targets, and orogenic deformation systems. In particular, subsurface electrical structures provide important constraints on metallogenic systems because conductive anomalies are commonly associated with magma reservoirs, fluid migration pathways, partial melts, and fault-controlled weak zones that play critical roles in ore-forming processes. Numerous classical MT studies on mineralization mechanisms have been published in recent decades [1,2,3,4,5,6]. Among them, Heinson et al. [3,4] presented an impressive resistivity model of the Olympic Dam district, Australia, demonstrating that MT-derived electrical structures can provide critical information for understanding mineralization. However, it is generally challenging to record high-quality EM field data in mining areas, which are usually dominated by strong and complex industrial background noise. Consequently, the reliability of the geoelectrical model and of the subsequent geological interpretation is limited by strongly distorted MT transfer functions.

MT transfer functions are derived from orthogonal, naturally time-varying EM fields. Partial transfer functions and power spectral density matrices are computed from discrete spectra obtained using short-time Fourier transforms within individual time windows. Under the assumption of noise-free conditions, the frequency-dependent impedance tensor $Z(ω)$ is defined as [7–9],

$$\begin{pmatrix} E_{x}(ω) \\ E_{y}(ω) \end{pmatrix} = \begin{pmatrix} Z_{xx}(ω) & Z_{xy}(ω) \\ Z_{yx}(ω) & Z_{yy}(ω) \end{pmatrix} \begin{pmatrix} H_{x}(ω) \\ H_{y}(ω) \end{pmatrix} \tag{1}$$

where $H_{x}$ and $H_{y}$ denote the magnetic field components, and $E_{x}$ and $E_{y}$ denote the electric field components. In practice, EM signals are generally contaminated by background noise, and the linear regression system is typically expressed as,

$$r(ω) = E(ω) - Z(ω) H(ω) \tag{2}$$

where $r(ω)$ is the residual of the dependent variable (i.e., $E(ω)$). The ordinary least-squares (LS) method minimizes the Euclidean norm of the residuals and is therefore sensitive to outliers when the Gaussian noise assumption is violated. Since the early development of MT, various statistical approaches have been proposed for robust impedance estimation [10–17].

On the other hand, outliers associated with predictor variables, i.e., magnetic field data, commonly referred to as leverage points [18,19], cannot be effectively handled by conventional robust approaches. The remote reference technique was proposed to suppress local magnetic noises that are uncorrelated with the noise at reference stations [20,21,22]. In cases where a remote reference station is available, both robust and remote reference methods are employed in combination [12,15,23,24]. However, if both local and reference channels are contaminated by correlated noises, the results from the remote reference method will be distorted [25,26]. Different from the classical remote reference approach, in which the cross-power spectral density between the local and reference magnetic fields is introduced into the spectral density matrix, other algorithms have been proposed to suppress leverage points associated with noise in the local magnetic field. The two-stage bounded influence estimator is one of the effective methods used to remove correlated noise in local EM fields using clean remote stations [18,27]. Since the remote reference technique can be considered a two-input–multiple-output linear system, Usui et al. [17] proposed a robust remote reference estimator based on robust multivariate linear regression. Ogawa et al. [28] separated signals from noise-affected components by applying a frequency-domain independent component analysis to local EM data and reference magnetic data. A similar approach based on frequency-domain independent component analysis was proposed by Sato et al. [29]. It is worth noting that high-quality reference data are generally required by both conventional and derivative remote reference methods. 
Furthermore, when array MT stations and synchronous data are available, a robust multivariate errors-in-variables estimator that utilizes principal component analysis has proven to be a practical approach for extracting signals corresponding to the two polarizations of plane-wave MT fields [30,31,32]. Smirnov and Egbert [33] further extended the multiple-station technique [31] to large incomplete arrays, accommodating practical deployment strategies such as the “rolling array”.

Pre-screening of data segments in the frequency domain is generally employed prior to robust estimation and/or remote reference methods to exclude severely noisy data. Different pre-selection strategies have been suggested in previous studies, and here we list some of the proposed criteria. To deal with low signal-to-noise ratios in the dead band, Egbert and Livelybrooks [14] selected only data segments with high electric-to-magnetic field multiple coherence. Signals meeting a required magnitude that is significantly higher than the self-noise (sensitivity) of magnetic sensors were selected as effective data [34]. In auroral zones, limited vertical magnetic field variations were considered as data contaminated by non-uniform sources [35]. In a more general case, two partial coherences calculated from orthogonal electric and magnetic components were suggested as criteria for removing highly noisy segments [16]. Moreover, physical properties, including spectral power densities, polarization directions, and coherences, were also evaluated to improve MT transfer functions [36], and a further automated pre-selection approach based on Mahalanobis distance and magnetic field constraints was developed by Platz and Weckmann [37]. A multi-criteria data sorting strategy was suggested in the study by Wang et al. [38].

In addition, various approaches have been employed in the time domain to enhance data quality, including time-series denoising using mathematical transforms [39,40], decomposition methods [41,42], digital signal filtering [43], autoregressive models [16], and signal reconstruction algorithms based on compressive sensing [44], as well as many other algorithms not listed in this paper. It should be acknowledged that artificial neural networks offer distinct advantages in time-series data cleaning [45,46]. Conventional artificial neural networks have been applied to MT data processing and perform well for typical synthetic noise [47,48,49,50,51,52]; however, misfits are inevitable when the data contain complex, persistent, or long-duration noise. Furthermore, supervised learning–based approaches typically require time-consuming manual identification and labeling [45], and the robustness of the trained models depends on relatively large datasets, which can be challenging to obtain. Unsupervised learning–based methods typically require careful selection of parameters to achieve optimal performance [53,54,55], which remains challenging in practical applications.

Although various approaches in both the time and frequency domains are generally employed to obtain final transfer functions, robust techniques combined with the remote reference method remain the most practical and widely adopted approaches for addressing the frequency-dependent linear regression relationships between orthogonal EM fields. In practice, the most widely used approach is M-estimation [56], which generalizes the maximum likelihood estimation framework. The M-estimation employs the median absolute deviation as a univariate scale estimator, which is independent of the overall data distribution [57]. An alternative approach is the S-estimator [58], which is based on robust scale estimators. Usui et al. [17] applied the S-estimator to their multiple-output system and illustrated the differences between the M- and S-estimators using a simple linear regression problem. To combine the advantages of both M- and S-estimation, the robustness and high breakdown point of S-estimation and the efficiency of M-estimation, MM-estimation was subsequently proposed by Yohai [59]. Theoretically, the performance of different estimators is expected to vary across typical statistical scenarios.

It is difficult to evaluate the contribution of each step (time- and/or frequency-dependent algorithms) to noise suppression along the path from time-series data to transfer functions because EM noise is complex and variable. Although a combination of all possible approaches in both the time and frequency domains seems to be an optimal strategy, it is still meaningful to analyze the performance of a specified method. In this study, we developed a robust estimator based on the MM-estimation algorithm and applied the M-, S-, and MM-estimators to both synthetic and field MT datasets to evaluate their performance. Our study suggests that a comprehensive comparative analysis of results obtained from different estimators combined with remote reference techniques should be conducted in practical applications, which would in turn support studies of mineralization mechanisms.

2. Theory and Methods

2.1. MT Data Processing

Here, we briefly describe the basic procedures of MT data processing, while details are introduced in many previous studies [16,17,27,60]. Time-series data of observed orthogonal EM fields are generally checked and/or corrected using time-domain algorithms, including filtering, artificial neural networks, decomposition methods, etc. Pre-cleaned time-series data are then subdivided into overlapping data segments (time windows), and subsequent cascade decimation with low-pass filtering is generally performed. In general, analysis and processing proceed for each data segment as follows: (1) Pre-whitening is performed to remove long-period trends and mean values. (2) The segment is tapered using a Hanning or Hamming window to reduce spectral leakage. (3) Fast Fourier transforms (FFTs) are performed to obtain Fourier coefficients for all channels, and calibration correction is also applied. (4) A spectral density matrix consisting of auto- and cross-spectral values is calculated. (5) Frequency-domain pre-selection is carried out using coherence sorting and other methods.

As a result of the above steps, spectral density matrices from all segments at specified evaluation frequencies are prepared for the subsequent regression procedures.
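The per-segment steps listed above can be illustrated with a minimal sketch. It assumes a 512-sample window with 25% overlap chosen purely for illustration, removes only the mean (no trend removal or calibration), and evaluates a single discrete Fourier coefficient directly rather than using an FFT library. The function names `segment_starts`, `hamming`, and `fourier_coefficient` are hypothetical and are not taken from the processing code used in this study.

```python
import cmath
import math

def segment_starts(n_samples, win_len=512, overlap=0.25):
    """Start indices of overlapping windows that fit entirely in the record."""
    step = int(win_len * (1.0 - overlap))
    return list(range(0, n_samples - win_len + 1, step))

def hamming(win_len):
    """Hamming taper used to reduce spectral leakage."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (win_len - 1))
            for n in range(win_len)]

def fourier_coefficient(segment, k):
    """Remove the mean, apply the taper, and return the DFT coefficient at bin k."""
    n = len(segment)
    mean = sum(segment) / n
    taper = hamming(n)
    return sum((x - mean) * w * cmath.exp(-2j * math.pi * k * i / n)
               for i, (x, w) in enumerate(zip(segment, taper)))

# Hypothetical test signal: exactly 8 cycles per 512-sample window.
signal = [math.cos(2.0 * math.pi * 8 * i / 512) for i in range(2048)]
starts = segment_starts(len(signal))
coeffs = [fourier_coefficient(signal[s:s + 512], 8) for s in starts]
leak = fourier_coefficient(signal[:512], 20)   # a bin far from the signal frequency

print(len(starts), [round(abs(c), 1) for c in coeffs], abs(leak))
```

In this sketch every window yields one complex coefficient per channel and evaluation frequency; in a full workflow these coefficients would form the auto- and cross-spectra that enter the regression.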

2.2. Robust Estimation

M-estimation [56], developed within the maximum-likelihood framework, provides a robust generalization of the classical LS method. For the linear regression problem defined in Equations (1) and (2), M-estimation introduces a general loss function, $ρ$, and minimizes the following objective function,

$$\hat{θ}_{M} = \arg\min_{θ} \sum_{i=1}^{N} ρ(r_{i}) \tag{3}$$

where “M” denotes “maximum likelihood–type,” and $N$ denotes the total number of observations, defined as the product of the number of observation equations (for either $E_{x}(ω)$ or $E_{y}(ω)$) and the number of time windows. A scale estimate based on the median absolute deviation is typically used, which is largely independent of the overall data distribution [57].

In practice, the residuals are standardized as shown in Equation (4) [61],

$$\sum_{i=1}^{N} ψ\left(\frac{r_{i}}{\hat{σ}}\right) H_{i} = 0 \tag{4}$$

where $ψ$ is the derivative of $ρ$, and the Huber function [56] is commonly employed. $\hat{σ}$ is a scale parameter,

$$\hat{σ} = \frac{\mathrm{MAD}}{0.6745} = \frac{\mathrm{median}\,|r_{i} - \mathrm{median}(r_{i})|}{0.6745} \tag{5}$$
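To illustrate why the scale in Equation (5) is preferred over the sample standard deviation, the following minimal sketch compares the two on a small set of real-valued residuals with and without one gross outlier. The function name `mad_scale` and the numerical values are hypothetical; for complex MT residuals one would first have to decide how to map them to real numbers, which is not specified here.

```python
import statistics

def mad_scale(residuals):
    """Robust scale of Equation (5): median absolute deviation divided by 0.6745."""
    med = statistics.median(residuals)
    mad = statistics.median(abs(r - med) for r in residuals)
    return mad / 0.6745

clean = [-1.2, -0.4, 0.1, 0.3, 0.9, -0.7, 0.5]
contaminated = clean + [50.0]            # one gross outlier appended

print("MAD scale:", mad_scale(clean), mad_scale(contaminated))
print("std. dev.:", statistics.stdev(clean), statistics.stdev(contaminated))
```

The MAD-based scale changes only slightly when the outlier is added, whereas the standard deviation is inflated by more than an order of magnitude.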

To solve the nonlinear optimization problem in Equation (4), the iteratively reweighted least squares (IRLS) algorithm is typically used. The core idea is to transform the objective function of robust estimation into a weighted least-squares problem, where the weights depend on residuals from the previous iteration. Specifically, the weight matrix at the $k$-th iteration is defined as $W^{(k)} = \mathrm{diag}\{ω(r_{1}^{(k)}/\hat{σ}), \dots, ω(r_{N}^{(k)}/\hat{σ})\}$, where $ω(\cdot)$ is a weight function of the standardized residuals (not to be confused with the angular frequency $ω$ in Equations (1) and (2)). Writing $ψ(u) = ω(u)\,u$, Equation (4) becomes

$$\sum_{i=1}^{N} ω_{i}\left(\frac{r_{i}}{\hat{σ}}\right) r_{i} H_{i} = 0 \tag{6}$$

Consequently, the updated estimate that solves Equation (6) can be written explicitly as,

$$\hat{Z}^{(k+1)} = (H^{T} W^{(k)} H)^{-1} H^{T} W^{(k)} E \tag{7}$$

This iterative procedure continues until convergence, effectively attenuating the influence of outliers by assigning them lower weights according to their relative residuals.
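The IRLS loop can be sketched for the simplest case of a single scalar impedance, where Equation (7) reduces to a ratio of weighted sums. This is an illustrative sketch under several assumptions that are not taken from the study: Huber weights with the commonly quoted constant 1.345, a scale computed from the stacked real and imaginary parts of the residuals, a fixed number of iterations, and synthetic data in which 15% of the windows follow a wrong impedance. All function names are hypothetical.

```python
import random
import statistics

def huber_weight(u, k=1.345):
    """Huber weight: 1 for |u| <= k, k/|u| otherwise."""
    a = abs(u)
    return 1.0 if a <= k else k / a

def weighted_ls(h, e, w):
    """Scalar case of Equation (7): Z = sum(w H* E) / sum(w |H|^2)."""
    num = sum(wi * hi.conjugate() * ei for hi, ei, wi in zip(h, e, w))
    den = sum(wi * abs(hi) ** 2 for hi, wi in zip(h, w))
    return num / den

def mad_scale(values):
    """Equation (5) applied to a list of real numbers."""
    med = statistics.median(values)
    return statistics.median(abs(v - med) for v in values) / 0.6745

def irls_m_estimate(h, e, n_iter=20):
    z = weighted_ls(h, e, [1.0] * len(h))            # least-squares start
    for _ in range(n_iter):
        r = [ei - z * hi for hi, ei in zip(h, e)]    # residuals, Equation (2)
        # Assumption: one scale from stacked real and imaginary residual parts.
        sigma = mad_scale([x.real for x in r] + [x.imag for x in r])
        if sigma == 0.0:
            break
        w = [huber_weight(abs(ri) / sigma) for ri in r]
        z = weighted_ls(h, e, w)                     # reweighted update
    return z

rng = random.Random(0)
z_true = 2.0 + 1.0j
h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
e = [z_true * hi + complex(rng.gauss(0, 0.05), rng.gauss(0, 0.05)) for hi in h]
for i in range(30):                                  # 15% of windows follow a wrong impedance
    e[i] = (z_true + 5.0) * h[i]

z_ols = weighted_ls(h, e, [1.0] * len(h))
z_m = irls_m_estimate(h, e)
print("LS error:", abs(z_ols - z_true), "IRLS error:", abs(z_m - z_true))
```

In the tensor case the same loop applies with the two-column matrix $H$ and the matrix form of Equation (7); the scalar version is used here only to keep the example short.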

In this study, we employed Tukey’s bisquare weight function (Equation (8)) [62] as the weighting scheme in the estimation procedure,

$$ω_{i}(u_{i}) = \begin{cases} \left[1 - \left(\frac{u_{i}}{c}\right)^{2}\right]^{2}, & |u_{i}| \leq c \\ 0, & |u_{i}| > c \end{cases} \tag{8}$$

where $u_{i}$ denotes the standardized residual, i.e., $u_{i} = r_{i}/\hat{σ}$. Under the assumption of a normal distribution, the estimator achieves approximately 95% asymptotic efficiency when $c = 4.685$.
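Equation (8) translates directly into a few lines of code. The sketch below is illustrative only; the function name `bisquare_weight` is hypothetical, and the modulus is used so that the same function accepts real or complex standardized residuals.

```python
def bisquare_weight(u, c=4.685):
    """Tukey bisquare weight of Equation (8) for a standardized residual u."""
    a = abs(u)
    if a > c:
        return 0.0                      # residuals beyond c are rejected entirely
    return (1.0 - (a / c) ** 2) ** 2

for u in (0.0, 1.0, 3.0, 4.685, 6.0):
    print(u, bisquare_weight(u))
```

Unlike weights that decay but never vanish, the bisquare weight is exactly zero beyond $c$, so sufficiently large residuals have no influence on the weighted least-squares update.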

To overcome the limitations of the median-based scale in M-estimation, S-estimation employs a robust estimate of the residual standard deviation [57] and is formulated to minimize the scale of the residuals, as shown in Equation (9) [61]:

$$\hat{θ}_{s} = \arg\min_{θ} \hat{σ}_{s}\{r_{1}(θ), \dots, r_{N}(θ)\} \tag{9}$$

where $\hat{σ}_{s}$ is obtained by solving Equation (10),

$$\frac{1}{N} \sum_{i=1}^{N} ρ\left(\frac{r_{i}}{\hat{σ}_{s}}\right) = K \tag{10}$$

where $K$ represents the expected value of the chosen loss function $ρ$ under the standard normal distribution. For high-breakdown-point S-estimation, a commonly used value is $K \approx 0.199$. The initial estimate of $\hat{σ}_{s}$ is obtained from Equation (5), and the subsequent iterative update formula is given as follows,

$$\hat{σ}_{s}^{(k+1)} = \hat{σ}_{s}^{(k)} \sqrt{\frac{1}{N K} \sum_{i=1}^{N} ρ\left(\frac{r_{i}}{\hat{σ}_{s}^{(k)}}\right)} \tag{11}$$
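The fixed-point iteration of Equation (11) can be sketched as follows. The sketch assumes that $ρ$ is the bisquare loss whose derivative gives the weights of Equation (8), and that $K \approx 0.199$ corresponds to the tuning constant $c \approx 1.547$, for which $K$ equals half the maximum of $ρ$; both are assumptions made for illustration rather than settings confirmed by the text. The names `rho_bisquare` and `m_scale` are hypothetical.

```python
import random
import statistics

C_S = 1.547                        # assumed bisquare constant for the scale equation
K = 0.5 * C_S ** 2 / 6.0           # about 0.199: half the maximum of rho

def rho_bisquare(u, c=C_S):
    """Bisquare loss, bounded above by c**2 / 6."""
    a = min(abs(u) / c, 1.0)
    return (c * c / 6.0) * (1.0 - (1.0 - a * a) ** 3)

def m_scale(residuals, n_iter=100, tol=1e-10):
    """Solve Equation (10) for the scale by iterating Equation (11)."""
    med = statistics.median(residuals)
    sigma = statistics.median(abs(r - med) for r in residuals) / 0.6745  # Equation (5)
    n = len(residuals)
    for _ in range(n_iter):
        if sigma == 0.0:
            break
        mean_rho = sum(rho_bisquare(r / sigma) for r in residuals) / n
        new_sigma = sigma * (mean_rho / K) ** 0.5
        converged = abs(new_sigma - sigma) <= tol * sigma
        sigma = new_sigma
        if converged:
            break
    return sigma

rng = random.Random(42)
clean = [rng.gauss(0.0, 2.0) for _ in range(2000)]
dirty = clean[:1600] + [100.0] * 400          # 20% gross outliers

print(m_scale(clean), m_scale(dirty), statistics.stdev(dirty))
```

Because $ρ$ is bounded, each gross outlier contributes at most $c^{2}/6$ to the sum in Equation (10), which is why the resulting scale stays close to that of the clean residuals while the standard deviation does not.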

S-estimation [58] incorporates a robust scale estimator and attenuates the influence of outliers by minimizing a scale-based criterion of the residuals. Due to the non-convex nature of the S-estimation objective function, a multi-start strategy is employed, where each run is initialized from a random subset. The final solution is selected as the one yielding the minimum scale.

One key distinction between S-estimation and M-estimation is that S-estimation minimizes its objective function by iteratively updating the scale parameter $\hat{σ}_{s}$ until no further decrease is observed. Accordingly, Equation (4) can be reformulated into the form given in Equation (12),

$$\sum_{i=1}^{N} ψ\left(\frac{r_{i}}{\hat{σ}_{s}}\right) H_{i} = 0 \tag{12}$$

M-estimation exhibits good robustness against outliers but remains sensitive to high-leverage points. In contrast, S-estimation achieves extremely high robustness by maximizing the breakdown point through constrained scale estimation; however, this robustness comes at the cost of increased computational complexity. Therefore, S-estimation is typically used as the initial stage of MM-estimation, while the subsequent refinement is carried out using M-estimation. MM-estimation [59] combines the advantages of both approaches, achieving the high breakdown robustness of S-estimation while retaining the high statistical efficiency of M-estimation, thereby overcoming the limitations associated with using either method alone.

Specifically, MM-estimation begins by computing a high-breakdown scale parameter $\hat{σ}_{s}$ (Equation (10)) using S-estimation. Based on this initial scale estimate, an M-estimation is subsequently constructed using the weight function in Equation (8), thereby ensuring high statistical efficiency. The final MM-estimator is then obtained by solving the following estimating equation,

$$\sum_{i=1}^{N} ψ\left(\frac{r_{i}}{\hat{σ}_{MM}}\right) H_{i} = 0 \tag{13}$$

where $\hat{σ}_{MM}$ is the scale held fixed at the S-estimate $\hat{σ}_{s}$.

In practice, the three approaches are typically implemented using an iterative strategy, in which specific scale definitions and estimating functions are applied for M- and S-estimation, while MM-estimation is achieved by combining the M-estimator with the scale obtained from S-estimation.

Figure 1 illustrates the algorithmic workflows of the three robust estimation methods. The detailed procedure is as follows:

(1) Initialization: The impedance tensor $Z_{0}$ is initialized using the ordinary least-squares (OLS) method.
(2) Residual computation: For the current estimate $Z_{i}$, the residuals $r_{i}$ are calculated from the observed magnetic (H) and electric (E) field components.
(3) Scale estimation and residual updating:
    M-estimation: Compute the scale parameter $\hat{σ}$ using the median absolute deviation (MAD), and update $r_{i}$ using the M-estimator with $\hat{σ}$.
    S-estimation: Initialize the scale parameter $\hat{σ}_{s}$ using MAD, then update $\hat{σ}_{s}$ according to Equations (10) and (11). The residuals $r_{i}$ are updated using the S-estimator with the updated $\hat{σ}_{s}$.
    MM-estimation: Update $r_{i}$ using the M-estimator together with the scale parameter $\hat{σ}_{s}$ obtained from S-estimation.
(4) Impedance tensor update: The impedance tensor $Z_{i}$ is updated based on the weighted residuals $r_{i}$.
(5) Iteration: Repeat steps (2)–(4) until the number of iterations reaches a defined maximum value.
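The workflow above can be sketched end to end for a single scalar impedance. The sketch is a simplified illustration and not the implementation used in this study. It assumes bisquare constants of 1.547 for the S-stage and 4.685 for the M-stage, a scale computed from stacked real and imaginary residual parts, one-sample random subsets as starting candidates for the S-stage, and synthetic data in which 35% of the windows follow a wrong impedance. All function names are hypothetical.

```python
import random
import statistics

C_S, C_M = 1.547, 4.685             # assumed S-stage and M-stage bisquare constants
K = 0.5 * C_S ** 2 / 6.0            # about 0.199

def rho(u, c):
    a = min(abs(u) / c, 1.0)
    return (c * c / 6.0) * (1.0 - (1.0 - a * a) ** 3)

def weight(u, c):
    a = abs(u) / c
    return (1.0 - a * a) ** 2 if a <= 1.0 else 0.0

def parts(h, e, z):
    """Real and imaginary parts of the residuals E - Z H as one real list."""
    r = [ei - z * hi for hi, ei in zip(h, e)]
    return [x.real for x in r] + [x.imag for x in r]

def m_scale(res, n_iter=30):
    """MAD start (Equation (5)) followed by the updates of Equation (11)."""
    med = statistics.median(res)
    sigma = statistics.median(abs(x - med) for x in res) / 0.6745
    for _ in range(n_iter):
        if sigma == 0.0:
            break
        sigma *= (sum(rho(x / sigma, C_S) for x in res) / (len(res) * K)) ** 0.5
    return sigma

def weighted_ls(h, e, w):
    num = sum(wi * hi.conjugate() * ei for hi, ei, wi in zip(h, e, w))
    return num / sum(wi * abs(hi) ** 2 for hi, wi in zip(h, w))

def s_estimate(h, e, n_starts=30, seed=0):
    """Multi-start S-stage: keep the candidate with the smallest robust scale."""
    rng = random.Random(seed)
    best_z = weighted_ls(h, e, [1.0] * len(h))        # least-squares candidate
    best_s = m_scale(parts(h, e, best_z))
    for _ in range(n_starts):
        i = rng.randrange(len(h))                      # one-sample random subset
        if h[i] == 0:
            continue
        z = e[i] / h[i]
        s = m_scale(parts(h, e, z))
        if s < best_s:
            best_z, best_s = z, s
    return best_z, best_s

def mm_estimate(h, e, n_iter=20):
    """M-stage IRLS started from the S-estimate, with the S-scale held fixed."""
    z, sigma = s_estimate(h, e)
    if sigma == 0.0:
        return z, sigma
    for _ in range(n_iter):
        w = [weight(abs(ei - z * hi) / sigma, C_M) for hi, ei in zip(h, e)]
        z = weighted_ls(h, e, w)
    return z, sigma

rng = random.Random(1)
z_true = 2.0 + 1.0j
h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(150)]
e = [z_true * hi + complex(rng.gauss(0, 0.05), rng.gauss(0, 0.05)) for hi in h]
for i in range(52):                                    # about 35% contaminated windows
    e[i] = (z_true + 5.0) * h[i]

z_ols = weighted_ls(h, e, [1.0] * len(h))
z_mm, sigma_s = mm_estimate(h, e)
print("LS error:", abs(z_ols - z_true), "MM error:", abs(z_mm - z_true), "scale:", sigma_s)
```

The design choice that distinguishes the MM-stage from plain M-estimation in this sketch is that the scale is computed once, at the high-breakdown S-solution, and is not re-estimated from the residuals during the reweighting iterations.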

3. Implementation

3.1. Synthetic Experiments

3.1.1. Synthetic Data

To ensure that the synthetic experiments are driven entirely by artificially imposed noise, we prepared a clean time-series dataset by computing the electric field response of a theoretical model to observed magnetic field data. The raw MT time-series data were recorded with a LEMI-423 instrument at a sampling rate of 0.5 Hz over a total duration of approximately 10 days. The measured magnetic flux density $B(t)$ was first converted into magnetic field intensity $H(t)$ using

$$B(t) = μ_{0} H(t) \tag{14}$$

where $μ_{0}$ is the magnetic permeability of free space. In the frequency domain, the horizontal electric field components were computed using the impedance tensor,

$$E_{x}(ω) = Z_{xy}(ω) H_{y}(ω) \tag{15}$$

$$E_{y}(ω) = Z_{yx}(ω) H_{x}(ω) \tag{16}$$

We adopted a homogeneous half-space model, for which the impedance is

$$Z(ω) = \sqrt{\frac{i ω μ_{0}}{σ}} \tag{17}$$

where $σ$ is the electrical conductivity.
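As a quick check of Equation (17), the sketch below evaluates the half-space impedance and converts it to apparent resistivity and phase using the standard relation $ρ_{a} = |Z|^{2}/(ω μ_{0})$. The conductivity of 0.01 S/m and the frequency of 0.01 Hz are hypothetical values chosen for illustration, and the function names are not taken from the study.

```python
import cmath
import math

MU0 = 4.0e-7 * math.pi              # magnetic permeability of free space (H/m)

def halfspace_impedance(freq_hz, conductivity):
    """Equation (17): Z = sqrt(i * omega * mu0 / sigma)."""
    omega = 2.0 * math.pi * freq_hz
    return cmath.sqrt(1j * omega * MU0 / conductivity)

def apparent_resistivity(z, freq_hz):
    """rho_a = |Z|^2 / (omega * mu0), in ohm-m."""
    return abs(z) ** 2 / (2.0 * math.pi * freq_hz * MU0)

def phase_deg(z):
    return math.degrees(cmath.phase(z))

z = halfspace_impedance(0.01, 0.01)     # hypothetical 100 ohm-m half-space at 100 s
print(apparent_resistivity(z, 0.01), phase_deg(z), phase_deg(-z))
```

For a half-space the apparent resistivity equals $1/σ$ at every frequency and the phase is 45°; if the off-diagonal element $Z_{yx}$ is taken as $-Z$, its phase is −135°.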

To mitigate spectral bias caused by long time series, the data were processed using a segmented windowing approach. Each segment was transformed into the frequency domain via FFT, multiplied by the impedance, and then transformed back using inverse FFT. The final time series were reconstructed using weighted overlap-add, where the weights are derived from the window functions.

The theoretical magnetic fields were then reconstructed from the synthetic electric field. A stabilized inverse formulation was adopted,

$$H(ω) = \frac{E(ω) Z^{*}(ω)}{|Z(ω)|^{2} + λ} \tag{18}$$

where $λ$ is a small regularization parameter to ensure numerical stability.
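The effect of the regularization in Equation (18) can be seen in a short sketch. The value of `lam` and the function name are hypothetical, since the value of $λ$ used in the study is not stated.

```python
def stabilized_magnetic_field(e, z, lam=1e-12):
    """Equation (18): H = E * conj(Z) / (|Z|^2 + lambda)."""
    return e * z.conjugate() / (abs(z) ** 2 + lam)

z = 3.0 + 4.0j
h_true = 1.0 - 2.0j
e = z * h_true

h_rec = stabilized_magnetic_field(e, z)              # close to h_true
h_zero = stabilized_magnetic_field(e, 0.0 + 0.0j)    # finite even though Z vanishes
print(h_rec, h_zero)
```

When $|Z|^{2}$ is much larger than $λ$ the expression reduces to $E/Z$, and when $Z$ approaches zero the result tends to zero instead of diverging.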

The vertical magnetic field was obtained using tipper transfer functions,

$$H_{z}(ω) = T_{x}(ω) H_{x}(ω) + T_{y}(ω) H_{y}(ω) \tag{19}$$

and both horizontal and vertical magnetic fields were subsequently converted back to magnetic flux density.

3.1.2. Robust Estimations

To evaluate the robustness of the proposed estimation methods, different types of noise, including Gaussian, square-wave, peak, and sawtooth noise, were added to the clean synthetic MT time-series data. The amplitude–frequency characteristics of the noise were controlled using the signal-to-noise ratio (SNR),

$$\mathrm{SNR} = 20 \log_{10}\left(\frac{σ_{signal}}{σ_{noise}}\right) \tag{20}$$

To simulate realistic non-stationary EM contamination, noise was superimposed on randomly selected subsets of samples. Table 1 shows the parameter settings for the four types of noise added to the time series, which were subsequently normalized using a unified SNR control scheme. Specifically, 30%, 50%, and 70% of the sample points were randomly selected using a fixed random seed to ensure reproducibility, and noise was added to all channels. This strategy preserved the temporal correlation of noise across different channels and more realistically simulated EM interference in practical environments. Two SNR levels, 30 dB and 5 dB, were tested.
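The noise-scaling and subset-selection scheme described above can be sketched as follows. The sketch assumes that $σ_{signal}$ and $σ_{noise}$ in Equation (20) are the standard deviations of the full signal and noise series, which the text does not state explicitly, and it uses a hypothetical sinusoidal signal and square-wave noise. The function names and the seed value are likewise illustrative.

```python
import math
import random
import statistics

def snr_db(signal, noise):
    """Equation (20), with sigma taken as the population standard deviation."""
    return 20.0 * math.log10(statistics.pstdev(signal) / statistics.pstdev(noise))

def scale_to_snr(signal, noise, target_db):
    """Rescale the noise so that Equation (20) gives the target SNR."""
    factor = statistics.pstdev(signal) / (
        statistics.pstdev(noise) * 10.0 ** (target_db / 20.0))
    return [factor * n for n in noise]

def contaminate(signal, noise, fraction, seed=0):
    """Add noise only at a randomly selected, reproducible subset of samples."""
    rng = random.Random(seed)
    idx = rng.sample(range(len(signal)), int(fraction * len(signal)))
    out = list(signal)
    for i in idx:
        out[i] += noise[i]
    return out, sorted(idx)

n = 1000
signal = [math.sin(2.0 * math.pi * i / 50) for i in range(n)]
square = [1.0 if (i // 20) % 2 == 0 else -1.0 for i in range(n)]   # square-wave noise
noise = scale_to_snr(signal, square, 5.0)
noisy, idx = contaminate(signal, noise, 0.3)

print(snr_db(signal, noise), len(idx))
```

Reusing the same seed and the same index set for every channel would keep the contaminated samples aligned across channels, in line with the stated aim of preserving the temporal correlation of the noise.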

Figure 2 presents comparisons between clean and noise-contaminated time series with an SNR of 30 dB. The standard preprocessing procedures described above were applied to both the synthetic and observed datasets in this study. The mean value of each channel was removed to suppress DC components and reduce low-frequency spectral leakage. Seven levels of decimation were employed. At each level, the sampling frequency was reduced by a factor of 8. The evaluation frequencies were logarithmically distributed with seven frequencies per octave to ensure adequate spectral resolution across the investigated bandwidth. The decimated time series were segmented into overlapping windows of 512 samples with 25% overlap. A Hamming window was applied prior to the Fourier transform to suppress spectral leakage and reduce sidelobe effects. Fourier transforms were performed on each windowed segment to obtain complex spectra for subsequent impedance tensor estimation. Frequency-domain quality control was primarily based on correlation analysis between the electric and magnetic field components.

Figure 3 presents the MT sounding results for three synthetic noise contamination cases under an SNR of 30 dB, where Gaussian, square-wave, peak, and sawtooth noise were superimposed onto the original clean time series. The contaminated proportions of the time series were set to 30%, 50%, and 70%, respectively. Overall, as the noise contamination ratio increases from 30% to 70%, the apparent resistivity and phase curves obtained using the three robust estimation methods remain relatively stable and consistent. Only the S-estimation exhibits a slight apparent resistivity deviation in the high-frequency band, while no significant distortion is observed in the phase results. These observations indicate that, under an SNR of 30 dB, all three robust estimation methods can effectively suppress the influence of mixed noise.

In comparison, the S-estimation exhibits relatively more pronounced scatter at several frequency points, followed by the M-estimation, whereas the MM-estimation shows the fewest outliers. It is noteworthy that, although the noise contamination ratio increases continuously from 30% to 70%, the phase curves do not exhibit obvious collapse or large-scale distortion. This indicates that all three robust estimation algorithms can effectively reduce the weights of anomalous spectral windows and non-stationary noise segments under low-noise conditions, thereby suppressing the influence of transient interference and intermittent noise on impedance tensor estimation. Among them, the MM-estimation demonstrates comparatively superior robustness.

Figure 4 presents the final weight distributions of the impedance tensors at ~181 s using different robust estimation methods. The weights were computed based on the linear regression models in Equations (1) and (2), and were iteratively updated during the optimization process of each robust estimator. For the $Z_{xy}$ and $Z_{yx}$ components, a common characteristic across all three cases is that, as the mixed-noise contamination ratio increases, all three methods are still able to maintain most high-weight points clustered within relatively stable magnitude and phase regions. However, for components with phases around 45° and −135°, the MM-estimation assigns the largest number of high-weight points. Combined with the previously presented apparent resistivity and phase results (Figure 3), these observations demonstrate that the MM-estimation exhibits superior robustness and noise resistance under high mixed-noise contamination conditions.

To compare the robustness of the three methods more clearly, the noise intensity was further increased by setting the SNR to 5 dB, while keeping all other conditions unchanged. Figure 5 presents a comparison between clean and noise-contaminated time series with an SNR of 5 dB.

Figure 6 presents the MT sounding results under the condition of SNR = 5 dB for different levels of mixed-noise contamination. Panels (a-c) correspond to 30%, 50%, and 70% of the time series being contaminated by a combination of Gaussian, square-wave, peak, and sawtooth noise, respectively. Compared with the results obtained at SNR = 30 dB, the apparent resistivity curves under the SNR of 5 dB exhibit more fluctuations and scattered points for all three robust estimation methods, particularly in the high-frequency band. This indicates that, as the background noise level increases, the influence of mixed noise on impedance estimation becomes increasingly significant. Nevertheless, despite the intensified noise contamination, all three methods are still able to preserve the overall structural characteristics of the apparent resistivity and phase curves, indicating that the robust estimation procedures retain the ability to suppress anomalous data under strong-noise conditions. Specifically, as the noise contamination ratio increases, the apparent resistivity curves obtained using the MM-estimation remain smoother than those produced by the M- and S-estimation methods. It is noteworthy that when the contamination ratio increases from 30% to 70%, the short-period apparent resistivity responses are affected much more strongly than the mid- and long-period responses. This is likely because short-period responses are more sensitive to high-frequency transient noise, while the peak and square-wave components of the mixed noise introduce stronger interference in local spectral estimation. Under such low-SNR conditions, the MM-estimation demonstrates relatively better overall stability and stronger resistance to outliers, indicating that this method possesses strong robustness under severe mixed-noise contamination.

Figure 7 presents the weight distributions of the impedance tensors at ~181 s under a noisy condition of SNR = 5 dB. Compared with the results at SNR = 30 dB (Figure 4), the scatter distributions in Figure 7 exhibit stronger dispersion, with a substantial increase in peripheral low-weight points, indicating enhanced instability of impedance estimation under stronger mixed-noise interference. Nevertheless, all three robust estimation methods still maintain most high-weight points concentrated within relatively stable regions, demonstrating their ability to identify anomalous spectral windows and suppress noise contamination. Among them, the MM-estimation exhibits the most pronounced clustering behavior. Its high-weight points are mainly concentrated near the stable phase direction, whereas the peripheral anomalous scatter points are generally assigned relatively low weights, resulting in a clearer weight-separation pattern. Combined with the results shown in Figure 4, it can be concluded that the MM-estimation achieves a better balance between anomalous spectral-window rejection and valid spectral-window preservation. Because the MM-estimation combines the high breakdown point of the S-estimation with the high statistical efficiency of the M-estimation, it can more effectively distinguish valid subsurface responses from noise-dominated anomalous spectral windows under strong mixed-noise conditions, thereby maintaining more stable impedance estimation results. These observations further demonstrate that the MM-estimation exhibits superior robustness and noise suppression capability under high mixed-noise contamination conditions.

3.2. Application to Observed Data

To further evaluate the performance of the robust estimation methods, a remote reference technique was introduced in the processing of field data. The experimental data are from a northwest–southeast oriented broadband MT profile deployed in northeastern China [63]. The profile starts in the Songliao block and extends into the Xing’an block, regions in which mineral deposits and oil and gas fields are developed. The field data were acquired in April 2016 in Inner Mongolia, China, using the ADU instrument. A station located approximately 15 km away from the local station was used as a remote reference station, and a total of 16 hours of data were recorded synchronously. Our synthetic experiments focused on long-period MT data with a low sampling rate of 0.5 Hz. Therefore, field data recorded at higher sampling rates of 128 Hz and 4096 Hz were used to evaluate the performance of the three algorithms more comprehensively, so that their applicability to both broadband and long-period MT data, and to multi-scale electromagnetic signals, could be assessed.

Figure 8 compares the results from different methods. Overall, the four methods exhibit consistent trends over most of the investigated period range; however, noticeable differences can still be observed in the mid- to long-period band. Among the three robust estimation approaches, M-estimation exhibits the most pronounced outliers and local fluctuations, particularly in the apparent resistivity curves around the intermediate-period range. S-estimation also shows a relatively large number of scattered points and oscillatory variations, although its deviations are generally smaller than those of M-estimation. In contrast, the MM-estimation results are highly consistent with those obtained using the remote reference method, with the corresponding apparent resistivity and phase curves almost completely overlapping over most of the period band. In particular, the MM-estimation results exhibit smoother transitions and fewer abnormal deviations in the mid- to low-frequency range, indicating a more stable recovery of the MT transfer functions under complex field-noise conditions.

The observed differences among the methods are closely related to their respective robust weighting strategies and statistical properties. These robust estimations are primarily based on iterative residual weighting, where the influence of outliers is gradually reduced through a weighting function [64]. However, due to the relatively mild weight adjustment strategy of M-estimation, its ability to suppress outliers remains limited. S-estimation also exhibits a relatively large number of deviated points, which is likely because, although it achieves a high breakdown point by minimizing a robust scale, its statistical efficiency remains relatively low [58]. When a substantial portion of the frequency band is contaminated by noise, the scale estimate can become distorted. In real field measurements, where noise is typically more complex than in synthetic datasets, this limitation becomes more pronounced, leading to the widespread oscillations and scattered points observed in the curves. By contrast, MM-estimation combines the advantages of both S-estimation and M-estimation. The initial S-estimation provides a high-breakdown, noise-resistant scale that prevents early contamination, while the subsequent M-estimation step refines the solution with higher efficiency and improved smoothness. It is noteworthy that the MM-estimation results are nearly identical to those obtained using the remote reference method, indicating that MM-estimation can effectively recover subsurface electrical responses consistent with remote reference processing without requiring external reference station information. The remote reference method essentially reduces local uncorrelated noise by introducing remote reference signals, whereas MM-estimation actively suppresses anomalous spectral windows through robust statistical weighting functions. 
These results demonstrate that MM-estimation provides the highest robustness in processing real MT data and represents an effective choice for field records affected by complex background noise.
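
The bias-removal idea behind the remote reference method described above can be illustrated with a minimal sketch. It assumes a scalar impedance (one electric channel related to one magnetic channel by E = Z·H), which is a simplification of the tensor problem treated in this paper. The function names `ls_impedance` and `rr_impedance`, the synthetic spectra, and all numerical values are hypothetical and are not taken from the implementation used in this study.

```python
import random

def ls_impedance(e, h):
    """Single-station least squares: Z = <H* E> / <H* H>."""
    num = sum(hi.conjugate() * ei for ei, hi in zip(e, h))
    den = sum(abs(hi) ** 2 for hi in h)  # contains local magnetic noise power
    return num / den

def rr_impedance(e, h, r):
    """Remote reference: Z = <R* E> / <R* H>, with R the remote magnetic channel."""
    num = sum(ri.conjugate() * ei for ei, ri in zip(e, r))
    den = sum(ri.conjugate() * hi for hi, ri in zip(h, r))
    return num / den

def cgauss(rng, sigma):
    """Complex Gaussian sample with standard deviation sigma per component."""
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))

rng = random.Random(0)
z_true = 2.0 + 1.0j  # hypothetical scalar impedance
n = 4000             # number of spectral windows (assumed)

source = [cgauss(rng, 1.0) for _ in range(n)]          # common natural field
h_local = [s + cgauss(rng, 1.0) for s in source]       # local H with its own noise
h_remote = [s + cgauss(rng, 1.0) for s in source]      # remote H, independent noise
e_local = [z_true * s + cgauss(rng, 0.1) for s in source]

z_ls = ls_impedance(e_local, h_local)
z_rr = rr_impedance(e_local, h_local, h_remote)
print("least squares   :", z_ls)
print("remote reference:", z_rr)
```

In this setup the single-station estimate is biased toward zero because the local magnetic noise power enters the denominator, whereas the cross-spectra with the remote channel average out noise that is uncorrelated between the two stations.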

4. Discussions

4.1. Theoretical Analysis

Since natural EM signals observed in practical MT surveys are often non-stationary, time-varying, truncated, and contaminated by noise, the assumption of normally distributed noise is generally not satisfied for the linear regression model of MT transfer functions (Equations (1) and (2)). The primary distinction among the estimators lies in the definition of their objective functions: M-estimation minimizes a residual-based objective function, whereas S-estimation minimizes a scale-based objective function. These differences lead to distinct stability behaviors under varying SNR conditions. At an SNR of 30 dB, the results obtained using S-estimation exhibit relatively stronger dispersion (Figure 3 and Figure 4). When the SNR decreases to 5 dB, all three methods exhibit larger fluctuations and deviations in the mid- and long-period ranges (Figure 6 and Figure 7). This indicates that the robust estimation methods respond differently to changes in noise intensity, reflecting a clear trade-off between robustness and statistical efficiency. S-estimation employs a high-breakdown-point scale estimation strategy that strongly down-weights spectral windows with large residuals, thereby improving resistance to severe noise contamination. Under relatively weak noise, however, the contamination does not destroy the overall stability of the data, and this aggressive weighting may also suppress valid spectral windows, reducing the fraction of useful data that contributes to the estimate. In contrast, the weight adjustment in M-estimation is relatively moderate, so its ability to suppress outliers is limited and some severely contaminated spectral windows may still retain non-negligible weights.

The MM-estimation results are generally consistent with those obtained using the remote reference method for the field data (Figure 8), indicating that the single-station processing approach based on robust statistical weighting can achieve noise suppression performance comparable to that of remote reference processing under the investigated conditions, without requiring additional reference stations. It is noteworthy that, since MM-estimation relies on the residuals of the dependent variables (electric field data), methods such as the remote reference technique or bounded-influence estimators [18,27] are still required to suppress outliers originating from the predictor variables (magnetic field data). We further suggest that a combination of MM-estimation and the remote reference method could provide advantages in handling both electric-field and magnetic-field noise at local stations. The superior performance of MM-estimation can be attributed to its two-stage structure, which combines the high breakdown point of S-estimation with the high statistical efficiency of M-estimation [59]. The S-step effectively suppresses high-amplitude transients, such as peak noise, by down-weighting outliers through the scale parameter, thereby providing a more reliable initial solution. The subsequent M-step refines the estimate using a smooth redescending weight function, improving convergence and reducing sensitivity to high-leverage points.
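
The two-stage structure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: it treats a scalar complex regression E = Z·H instead of the full impedance tensor, approximates the S-step by searching over single-window candidates e_k/h_k, and uses the textbook bisquare constants for real Gaussian residuals (1.547 with b = 0.5 for a 50% breakdown point, 4.685 for 95% efficiency) on residual moduli. All function names and the synthetic data are hypothetical.

```python
import random
import statistics

def rho_bisquare(u, c):
    """Bisquare loss scaled to [0, 1]; equals 1 for |u| >= c."""
    if abs(u) >= c:
        return 1.0
    return 1.0 - (1.0 - (u / c) ** 2) ** 3

def w_bisquare(u, c):
    """Bisquare weight; exactly zero for |u| >= c (redescending)."""
    return (1.0 - (u / c) ** 2) ** 2 if abs(u) < c else 0.0

def residuals(e, h, z):
    """Residual moduli |E - Z H| for every spectral window."""
    return [abs(ei - z * hi) for ei, hi in zip(e, h)]

def weighted_ls(e, h, w):
    """Weighted least-squares solution of the scalar model E = Z H."""
    num = sum(wi * hi.conjugate() * ei for wi, ei, hi in zip(w, e, h))
    den = sum(wi * abs(hi) ** 2 for wi, hi in zip(w, h))
    return num / den

def m_scale(res, c=1.547, b=0.5, n_iter=30):
    """Solve mean(rho(r / s)) = b for s by fixed-point iteration."""
    s = statistics.median(res) / 0.6745
    if s == 0.0:
        return 0.0
    for _ in range(n_iter):
        s *= (sum(rho_bisquare(r / s, c) for r in res) / (len(res) * b)) ** 0.5
    return s

def mm_estimate(e, h, n_candidates=60, n_iter=20, seed=0):
    rng = random.Random(seed)
    picks = rng.sample(range(len(e)), min(n_candidates, len(e)))
    # S-step: keep the candidate with the smallest robust residual scale.
    cands = [e[k] / h[k] for k in picks if h[k] != 0]
    scored = [(z, m_scale(residuals(e, h, z))) for z in cands]
    z_s, scale = min(scored, key=lambda pair: pair[1])
    # M-step: reweighted least squares with the scale held fixed.
    z = z_s
    for _ in range(n_iter):
        w = [w_bisquare(r / scale, 4.685) for r in residuals(e, h, z)]
        z = weighted_ls(e, h, w)
    return z, z_s, scale

# Hypothetical data: 40% of the windows follow a coherent-noise relation.
rng = random.Random(1)
z_true, z_noise = 2.0 + 1.0j, -5.0 + 8.0j
h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
e = [(z_noise if i % 5 < 2 else z_true) * hi
     + complex(rng.gauss(0, 0.05), rng.gauss(0, 0.05))
     for i, hi in enumerate(h)]

z_ols = weighted_ls(e, h, [1.0] * len(h))
z_mm, z_s, scale = mm_estimate(e, h)
print("OLS:", z_ols)
print("MM :", z_mm)
```

Holding the S-scale fixed during the M-step is what lets the second stage gain efficiency without giving up the resistance of the first stage. Note that this sketch, like the estimator it illustrates, weights residuals of the electric channel only and offers no protection against outliers in the magnetic channel.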

In addition, Tukey’s bisquare weight function [62] was employed in this study, although alternative weight functions also merit consideration. Weight functions derived from a redescending ψ-function can effectively balance robustness and efficiency. During the iterative procedure, the bisquare scheme progressively suppresses the influence of outliers, enabling the estimator to focus on the dominant structure of the clean data [65]. Notably, the tuning constants of the weight functions are usually chosen from conventional empirical values; their optimal selection remains an open problem and requires adaptive strategies for field datasets with varying noise distributions.
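
To make the role of the tuning constant concrete, the short sketch below compares the bisquare weight with a monotone Huber-type weight as a function of the standardized residual u = r/s. The Huber weight is included only as a familiar example of an alternative weight function and is an assumption here, not necessarily the M-estimator weight used in this study. The constants 1.345 and 4.685 are the commonly quoted values for 95% Gaussian efficiency, and 1.547 is the bisquare value usually paired with a 50% breakdown point; the function names are hypothetical.

```python
def huber_weight(u, k=1.345):
    """Monotone weight: decays as k/|u| but never reaches zero."""
    a = abs(u)
    return 1.0 if a <= k else k / a

def bisquare_weight(u, c=4.685):
    """Redescending weight: exactly zero once |u| >= c."""
    a = abs(u)
    return (1.0 - (a / c) ** 2) ** 2 if a < c else 0.0

# Tabulate the weights for a few standardized residuals.
for u in (0.5, 2.0, 4.0, 8.0):
    print(f"u={u:4.1f}  huber={huber_weight(u):.3f}  "
          f"bisquare(c=4.685)={bisquare_weight(u):.3f}  "
          f"bisquare(c=1.547)={bisquare_weight(u, 1.547):.3f}")
```

A smaller constant rejects more windows outright (higher resistance, lower efficiency), while a larger one keeps more of the data; this is the trade-off that an adaptive selection strategy would need to manage.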

4.2. Limitations and Suggestions

Two major limitations of this study should be acknowledged: (1) the synthetic noise scenarios may not fully capture the complexity of industrial EM noise, and natural noise sources—such as long-period geomagnetic variations, instrument drift, and site-specific ground coupling effects—can also produce outliers; (2) the synthetic experiments were conducted by superimposing noise onto randomly selected segments of the time-series. This configuration simplifies the noise distributions, ensuring that sufficiently clean windows remain available for robust estimation. As a result, the evaluated performance may represent an optimistic scenario compared with real-field conditions. In practical MT surveys, cultural EM interference, instrument grounding issues, and environmental transients are generally persistent, widespread, or intermittently distributed throughout the entire recording. Under such circumstances, the breakdown point of the estimator, the availability of uncontaminated windows, and the convergence behavior of the iterative weighting scheme may all be adversely affected.
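
The segment-wise contamination scheme discussed in limitation (2) can be sketched as below. This is one possible reading of Table 1, not the generator used in this study: the Gaussian amplitude is interpreted as a standard deviation, the sawtooth duty cycle as the rising fraction of each period, the peak probability as a per-sample probability, and the segment length of 450 samples (15 minutes at 0.5 Hz) is a hypothetical choice. All function names are illustrative.

```python
import random

FS = 0.5  # sampling rate in Hz, as stated for the synthetic data

def square(t, period, duty):
    """Square wave of unit amplitude; +1 for the first `duty` fraction."""
    return 1.0 if (t % period) < duty * period else -1.0

def sawtooth(t, period, width):
    """Unit sawtooth rising over `width` of the period, then falling."""
    phase = (t % period) / period
    if phase < width:
        return -1.0 + 2.0 * phase / width
    return 1.0 - 2.0 * (phase - width) / (1.0 - width)

def mixed_noise(n, alpha, rng):
    """Sum of the four noise types of Table 1 for n samples."""
    out = []
    for k in range(n):
        t = k / FS
        v = rng.gauss(0.0, 0.2 * alpha)               # Gaussian (assumed std)
        v += 1.5 * alpha * square(t, 300.0, 0.5)      # square, 300 s, 50%
        v += 1.0 * alpha * sawtooth(t, 200.0, 0.5)    # sawtooth, 200 s, 50%
        if rng.random() < 2e-5:                       # rare peak noise
            v += 6.0 * alpha * rng.choice((-1.0, 1.0))
        out.append(v)
    return out

def contaminate(series, fraction, seg_len, alpha, seed=0):
    """Add mixed noise to a randomly chosen fraction of the segments."""
    rng = random.Random(seed)
    n_seg = len(series) // seg_len
    chosen = set(rng.sample(range(n_seg), round(fraction * n_seg)))
    noisy = list(series)
    for s in chosen:
        start = s * seg_len
        for j, v in enumerate(mixed_noise(seg_len, alpha, rng)):
            noisy[start + j] += v
    return noisy, chosen

clean = [0.0] * 4500  # placeholder for a clean synthetic channel
noisy, chosen = contaminate(clean, fraction=0.3, seg_len=450, alpha=1.0)
print("contaminated segments:", sorted(chosen))
```

Because the untouched segments remain exactly clean in such a scheme, a robust estimator always has uncontaminated windows to anchor on, which is why the resulting performance may be optimistic relative to persistent field noise.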

In this study, the primary purpose of the synthetic experiments is to validate the proposed method. It would be valuable to incorporate a more comprehensive and quantitative assessment of the estimation approaches by introducing noise disturbances with diverse frequency-domain signatures and a broader range of noise types. Evaluating robust estimators under these more realistic and complex noise conditions will help clarify their performance limits and inform the development of noise-specific weighting strategies. In addition, it is recognized that further research should focus on integrating adaptive weight selection procedures, multivariate robust estimators, and hybrid approaches that combine frequency-domain robust regression with time-domain machine-learning-based noise suppression.

5. Conclusions

We propose an iterative MM-estimation algorithm for the frequency-domain linear regression procedure in MT data processing. Together with the traditional M-estimation and S-estimation methods, we systematically evaluated the performance of all three robust regression estimators using both noise-contaminated synthetic data and field datasets. The results demonstrate that all estimators can produce reliable impedance tensors when the data are contaminated by a small amount of noise. For cases with a high proportion of noisy segments, MM-estimation demonstrates superior performance in suppressing outliers. We suggest that MM-estimation is less sensitive to industrial EM noise, which is increasingly becoming a key factor limiting the applicability of the MT method. We propose that MM-estimation is a strong candidate for routine MT processing workflows, particularly when data quality is highly variable.

Author Contributions

Conceptualization, W.S. and C.X.; writing - original draft preparation, W.S. and C.X.; writing - review and editing, W.S. and C.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Science and Technology Major Project for Deep Earth Probe and Mineral Resources Exploration, grant number 2024ZD1000403, and the National Natural Science Foundation of China, grant numbers 42474111 and 42074083.

Data Availability Statement

The datasets generated during the current study are publicly available in the Zenodo repository: https://doi.org/10.5281/zenodo.20197019.

Acknowledgments

We used the high-performance computing facilities at China University of Geosciences (Beijing) for data processing. We thank Mr. Tian Jifeng for providing the data used in the field-data experiments. Our implementation and code extensions were developed based on the open-source software “resistics” developed by Neeraj Shah.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Jin, S.; Sheng, Y.; Liu, C.; Wei, W.; Ye, G.; Jing, J.; Zhang, L.; Dong, H.; Yin, Y.; Xie, C. A Review of Relationship between the Metallogenic System of Metallic Mineral Deposits and Lithospheric Electrical Structure: Insight from Magnetotelluric Imaging. Minerals 2024, 14, 541.
2. Lü, Q.; Meng, G.; Zhang, K.; Liu, Z.; Yan, J.; Shi, D.; Han, J.; Gong, X. The Lithospheric Architecture of the Lower Yangtze Metallogenic Belt, East China: Insights into an Extensive Fe–Cu Mineral System. Ore Geol. Rev. 2021, 132, 103989.
3. Heinson, G.S.; Direen, N.G.; Gill, R.M. Magnetotelluric Evidence for a Deep-Crustal Mineralizing System beneath the Olympic Dam Iron Oxide Copper-Gold Deposit, Southern Australia. Geology 2006, 34, 573–576.
4. Heinson, G.; Didana, Y.; Soeffky, P.; Thiel, S.; Wise, T. The Crustal Geophysical Signature of a World-Class Magmatic Mineral System. Sci. Rep. 2018, 8, 10608.
5. Zhang, K.; Lü, Q.; Lan, X.; Guo, D.; Wang, Q.; Yan, J.; Zhao, J. Magnetotelluric Evidence for Crustal Decoupling: Insights into Tectonic Controls on the Magmatic Mineral System in the Nanling–Xuancheng Area, SE China. Ore Geol. Rev. 2021, 131, 104045.
6. Xu, L.; Jin, S.; Yin, Y.; Wei, W.; Ye, G.; Dong, H.; Zhang, L.; Jing, J.; Xie, C. Multiscale 3-D Imaging of the Crustal Electrical Structure beneath the Caosiyao Porphyry Mo Deposit, North China. Geophys. J. Int. 2022, 231, 1880–1897.
7. Berdichevsky, M.N. Theoretical Basis of Magnetotelluric Profiling. Prikl. Geofiz. (Applied Geophysics) 1960, 28, 27–42.
8. Berdichevski, M.N. Linear Relationships in the Magnetotelluric Field. Appl. Geophys. 1964, 38, 99–108.
9. Berdichevsky, M.N.; Dmitriev, V.I. The Magnetotelluric Response Functions. In Models and Methods of Magnetotellurics; Springer, 2008; pp. 1–49.
10. Egbert, G.D.; Booker, J.R. Robust Estimation of Geomagnetic Transfer Functions. Geophys. J. Int. 1986, 87, 173–194.
11. Chave, A.D.; Thomson, D.J.; Ander, M.E. On the Robust Estimation of Power Spectra, Coherences, and Transfer Functions. J. Geophys. Res. Solid Earth 1987, 92, 633–648.
12. Chave, A.D.; Thomson, D.J. Some Comments on Magnetotelluric Response Function Estimation. J. Geophys. Res. 1989, 94, 14215–14225.
13. Jones, A.G.; Chave, A.D.; Egbert, G.; Auld, D.; Bahr, K. A Comparison of Techniques for Magnetotelluric Response Function Estimation. J. Geophys. Res. 1989, 94, 14201–14213.
14. Egbert, G.D.; Livelybrooks, D.W. Single Station Magnetotelluric Impedance Estimation: Coherence Weighting and the Regression M-Estimate. Geophysics 1996, 61, 964–970.
15. Larsen, J.C.; Mackie, R.L.; Manzella, A.; Fiordelisi, A.; Rieven, S. Robust Smooth Magnetotelluric Transfer Functions. Geophys. J. Int. 1996, 124, 801–819.
16. Smirnov, M.Y. Magnetotelluric Data Processing with a Robust Statistical Procedure Having a High Breakdown Point. Geophys. J. Int. 2003, 152, 1–7.
17. Usui, Y.; Uyeshima, M.; Sakanaka, S.; Hashimoto, T.; Ichiki, M.; Kaida, T.; Yamaya, Y.; Ogawa, Y.; Masuda, M.; Akiyama, T. New Robust Remote Reference Estimator Using Robust Multivariate Linear Regression. Geophys. J. Int. 2024, 238, 943–959.
18. Chave, A.D.; Thomson, D.J. Bounded Influence Magnetotelluric Response Function Estimation. Geophys. J. Int. 2004, 157, 988–1006.
19. Maronna, R.A.; Martin, R.D.; Yohai, V.J.; Salibián-Barrera, M. Robust Statistics: Theory and Methods (with R); John Wiley & Sons, 2019; ISBN 1-119-21468-8.
20. Goubau, W.M.; Gamble, T.D.; Clarke, J. Magnetotelluric Data Analysis: Removal of Bias. Geophysics 1978, 43, 1157–1166.
21. Gamble, T.D.; Goubau, W.M.; Clarke, J. Magnetotellurics with a Remote Magnetic Reference. Geophysics 1979, 44, 53–68.
22. Clarke, J.; Gamble, T.D.; Goubau, W.M.; Koch, R.H.; Miracky, R.F. Remote-Reference Magnetotellurics: Equipment and Procedures. Geophys. Prospect. 1983, 31, 149–170.
23. Larsen, J.C. Transfer Functions: Smooth Robust Estimates by Least-Squares and Remote Reference Methods. Geophys. J. Int. 1989, 99, 645–663.
24. Oettinger, G.; Haak, V.; Larsen, J.C. Noise Reduction in Magnetotelluric Time-Series with a New Signal–Noise Separation Method and Its Application to a Field Experiment in the Saxonian Granulite Massif. Geophys. J. Int. 2001, 146, 659–669.
25. Pedersen, L.B. The Magnetotelluric Impedance Tensor—Its Random and Bias Errors. Geophys. Prospect. 1982, 30, 188–210.
26. Ritter, O.; Junge, A.; Dawes, G.J. New Equipment and Processing for Magnetotelluric Remote Reference Observations. Geophys. J. Int. 1998, 132, 535–548.
27. Chave, A.D. Estimation of the Magnetotelluric Response. In The Magnetotelluric Method: Theory and Practice; Cambridge University Press, 2012; pp. 165–218.
28. Ogawa, H.; Asamori, K.; Negi, T.; Ueda, T. A Novel Method for Processing Noisy Magnetotelluric Data Based on Independence of Signal Sources and Continuity of Response Functions. J. Appl. Geophys. 2023, 213, 105012.
29. Sato, S.; Goto, T.-N.; Kasaya, T.; Ichihara, H. Method for Obtaining Response Functions from Noisy Magnetotelluric Data Using Frequency-Domain Independent Component Analysis. Geophysics 2021, 86, E21–E35.
30. Egbert, G.D.; Booker, J.R. Multivariate Analysis of Geomagnetic Array Data: 1. The Response Space. J. Geophys. Res. Solid Earth 1989, 94, 14227–14247.
31. Egbert, G.D. Robust Multiple-Station Magnetotelluric Data Processing. Geophys. J. Int. 1997, 130, 475–496.
32. Egbert, G.D. Processing and Interpretation of Electromagnetic Induction Array Data. Surv. Geophys. 2002, 23, 207–249.
33. Smirnov, M.Yu.; Egbert, G.D. Robust Principal Component Analysis of Electromagnetic Arrays with Missing Data: Robust PCA of EM Arrays with Missing Data. Geophys. J. Int. 2012, 190, 1423–1438.
34. Garcia, X.; Jones, A.G. Atmospheric Sources for Audio-Magnetotelluric (AMT) Sounding. Geophysics 2002, 67, 448–458.
35. Jones, A.G.; Spratt, J. A Simple Method for Deriving the Uniform Field MT Responses in Auroral Zones. Earth Plan. Space 2014.
36. Weckmann, U.; Magunia, A.; Ritter, O. Effective Noise Separation for Magnetotelluric Single Site Data Processing Using a Frequency Domain Selection Scheme. Geophys. J. Int. 2005.
37. Platz, A.; Weckmann, U. An Automated New Pre-Selection Tool for Noisy Magnetotelluric Data Using the Mahalanobis Distance and Magnetic Field Constraints. Geophys. J. Int. 2019.
38. Wang, P.; Chen, X.; Zhang, Y. Strong Interference Magnetotelluric Data Processing Method Based on Robust Estimation, Data Screening and Rhoplus Constraint. Chin. J. Geophys. (in Chinese) 2024, 67, 4325–4342.
39. Garcia, X.; Jones, A.G. Robust Processing of Magnetotelluric Data in the AMT Dead Band Using the Continuous Wavelet Transform. Geophysics 2008, 73, F223–F234.
40. Cai, J.-H.; Tang, J.-T.; Hua, X.-R.; Gong, Y.-R. An Analysis Method for Magnetotelluric Data Based on the Hilbert–Huang Transform. Explor. Geophys. 2009, 40, 197–205.
41. Chen, J.; Heincke, B.; Jegen, M.; Moorkamp, M. Using Empirical Mode Decomposition to Process Marine Magnetotelluric Data: Using EMD to Process Marine MT Data. Geophys. J. Int. 2012, 190, 293–309.
42. Li, J.; Zhang, X.; Tang, J. Noise Suppression for Magnetotelluric Using Variational Mode Decomposition and Detrended Fluctuation Analysis. J. Appl. Geophys. 2020, 180, 104127.
43. Kappler, K.N. A Data Variance Technique for Automated Despiking of Magnetotelluric Data with a Remote Reference. Geophys. Prospect. 2012, 60, 179–191.
44. Tang, J.; Li, G.; Xiao, X.; Li, J.; Zhou, C.; Zhu, H. Strong Noise Separation for Magnetotelluric Data Based on a Signal Reconstruction Algorithm of Compressive Sensing. Chin. J. Geophys. (in Chinese) 2017, 60, 3642–3654.
45. Manoj, C.; Nagarajan, N. The Application of Artificial Neural Networks to Magnetotelluric Time-Series Analysis. Geophys. J. Int. 2003, 153, 409–423.
46. Dramsch, J.S. 70 Years of Machine Learning in Geoscience in Review. In Advances in Geophysics; Elsevier, 2020; Vol. 61, pp. 1–55. ISBN 978-0-12-821669-9.
47. Han, Y.; An, Z.; Di, Q.; Wang, Z.; Kang, L. Research on Noise Suppression of Magnetotelluric Signal Based on Recurrent Neural Network. Chin. J. Geophys. (in Chinese) 2023, 65, 4317–4331.
48. Li, G.; Zhou, X.; Chen, C.; Xu, L.; Zhou, F.; Shi, F.; Tang, J. Multitype Geomagnetic Noise Removal via an Improved U-Net Deep Learning Network. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–12.
49. Li, J.; Liu, Y.; Tang, J.; Ma, F. Magnetotelluric Noise Suppression via Convolutional Neural Network. Geophysics 2023, 88, WA361–WA375.
50. Li, J.; Liu, Y.; Tang, J.; Peng, Y.; Zhang, X.; Li, Y. Magnetotelluric Data Denoising Method Combining Two Deep-Learning-Based Models. Geophysics 2023, 88, E13–E28.
51. Li, G.; Gu, X.; Chen, C.; Zhou, C.; Xiao, D.; Wan, W.; Cai, H. Low-Frequency Magnetotelluric Data Denoising Using Improved Denoising Convolutional Neural Network and Gated Recurrent Unit. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–16.
52. Tian, Y.; Xie, C.; Wang, Y. Long Short-Term Memory Recurrent Network Architectures for Electromagnetic Field Reconstruction Based on Underground Observations. Atmosphere 2024, 15, 734.
53. Li, G.; Liu, X.; Tang, J.; Deng, J.; Hu, S.; Zhou, C.; Chen, C.; Tang, W. Improved Shift-Invariant Sparse Coding for Noise Attenuation of Magnetotelluric Data. Earth Plan. Space 2020, 72, 45.
54. Li, J.; Peng, Y.; Tang, J.; Li, Y. Denoising of Magnetotelluric Data Using K-SVD Dictionary Training. Geophys. Prospect. 2021, 69, 448–473.
55. Li, G.; He, Z.; Tang, J.; Deng, J.-Z.; Liu, X.; Zhu, H. Dictionary Learning and Shift-Invariant Sparse Coding Denoising for Controlled-Source Electromagnetic Data Combined with Complementary Ensemble Empirical Mode Decomposition. Geophysics 2021, 86, E185–E198.
56. Huber, P.J. Robust Estimation of a Location Parameter. Ann. Math. Stat. 1964, 35, 73–101, 129.
57. Susanti, Y.; Pratiwi, H.; Sulistijowati, H.; Liana, S. T. M Estimation, S Estimation, and MM Estimation in Robust Regression. Int. J. Pure Appl. Math. 2014, 91.
58. Rousseeuw, P.; Yohai, V. Robust Regression by Means of S-Estimators. In Robust and Nonlinear Time Series Analysis; Lecture Notes in Statistics; Franke, J., Härdle, W., Martin, D., Eds.; Springer US: New York, NY, 1984; Vol. 26, pp. 256–272. ISBN 978-0-387-96102-6.
59. Yohai, V.J. High Breakdown-Point and High Efficiency Robust Estimates for Regression. Ann. Stat. 1987, 15.
60. Simpson, F.; Bahr, K. Practical Magnetotellurics; Cambridge University Press, 2005; ISBN 978-0-521-81727-1.
61. Pitselis, G. A Review on Robust Estimators Applied to Regression Credibility. J. Comput. Appl. Math. 2013, 239, 231–249.
62. Beaton, A.E.; Tukey, J.W. The Fitting of Power Series, Meaning Polynomials, Illustrated on Band-Spectroscopic Data. Technometrics 1974, 16, 147–185.
63. Tian, J.; Ye, G.; Xie, C.; Li, L.; Wei, W.; Jin, S.; Liu, Z. Two-Dimensional Electrical Resistivity Structure of the Crust and Upper Mantle across the North-South Gravity Lineament in NE China. Tectonophysics 2022, 837, 229459.
64. Huber, P.J. The Behavior of Maximum Likelihood Estimates under Nonstandard Conditions. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics; University of California Press, 1967; Vol. 5.1, pp. 221–234.
65. Seheult, A.; Green, P.; Rousseeuw, P.; Leroy, A. Robust Regression and Outlier Detection. J. R. Stat. Soc. Ser. A (Statistics in Society) 1989, 152, 133.

Figure 1. Flowchart of the M-, S-, and MM-estimation. OLS, ordinary least squares.

Figure 2. Comparison of clean and noise-contaminated EM time series with an SNR of 30 dB. Fifteen-minute data segments are shown, whereas a total of 10 days of data were processed. Panels (a–c) show cases in which 30%, 50%, and 70% of the dataset, respectively, were randomly superimposed with the mixed noise listed in Table 1. The orange and blue lines represent clean and noise-contaminated data, respectively.

Figure 3. Apparent resistivity and phase results under mixed-noise contamination with an SNR of 30 dB, using M-estimation, S-estimation, and MM-estimation. Panels (a–c) show cases in which 30%, 50%, and 70% of the dataset, respectively, were randomly superimposed with the mixed noise listed in Table 1.

Figure 4. Distributions of the final weights of the impedance tensors at ~181 s for different contamination cases under an SNR of 30 dB, using M-estimation, S-estimation, and MM-estimation. Panels (a–c) show cases in which 30%, 50%, and 70% of the dataset, respectively, were randomly superimposed with the mixed noise listed in Table 1. The radial axis denotes impedance magnitude, the angular axis represents impedance phase, and the color intensity indicates the robust weight values of the last iteration, where darker colors correspond to higher weights.

Figure 5. Comparison of clean and noise-contaminated EM time series with an SNR of 5 dB. Fifteen-minute data segments are shown, whereas a total of 10 days of data were processed. Panels (a–c) show cases in which 30%, 50%, and 70% of the dataset, respectively, were randomly superimposed with the mixed noise listed in Table 1. The orange and blue lines represent clean and noise-contaminated data, respectively.

Figure 6. Apparent resistivity and phase results under mixed-noise contamination with an SNR of 5 dB, using M-estimation, S-estimation, and MM-estimation. Panels (a–c) show cases in which 30%, 50%, and 70% of the dataset, respectively, were randomly superimposed with the mixed noise listed in Table 1.

Figure 7. Distributions of the final weights of the impedance tensors at ~181 s for different contamination cases under an SNR of 5 dB, using M-estimation, S-estimation, and MM-estimation. Panels (a–c) show cases in which 30%, 50%, and 70% of the dataset, respectively, were randomly superimposed with the mixed noise listed in Table 1. The radial axis denotes impedance magnitude, the angular axis represents impedance phase, and the color intensity indicates the robust weight values of the last iteration, where darker colors correspond to higher weights.

Figure 8. Comparison of apparent resistivity and impedance phase curves obtained from field MT data using the remote reference method combined with M-estimation and single-station (local) estimation with the M-, S-, and MM-estimators.

Table 1. Parameters of different superimposed synthetic noise.

Noise type	Noise amplitude	Occurrence probability	Period (s)	Duty cycle (%)
Gaussian	$0.2 \times α$	-	-	-
Square	$1.5 \times α$	-	300	50
Peak	$6.0 \times α$	$2 \times 10^{-5}$	-	-
Sawtooth	$1.0 \times α$	-	200	50

Note. α, amplitude coefficient.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
