3. Pipeline Organization
The processing workflow proposed in this paper is organized in a data pipeline composed of several independent blocks. Each block is designed to implement functions and utilities covering the most common data processing steps, which can be personalized and recombined according to needs. The aim is to provide a solution that seamlessly integrates with different machine setups and allows a degree of customization to accommodate data-driven models and processing requirements. Given the variability of sensor installations, the different data formats they produce and all the possible end usages of the devices, the robustness of the structure is a priority, also to account for common disruptions like signal downtimes and human interventions. The modular structure, with independent, customizable, and recombinable blocks, provides sufficient flexibility to adapt to different machine configurations. The setup has been explicitly evaluated for industrial gearbox diagnosis and prognosis, and it is primarily intended to handle condition monitoring tasks, both online and offline.
The general planned workflow of the data management part is summarized in Figure 1.
The initial section of the pipeline (Section 3.1) encompasses all the building blocks designed for data management and preparation. It directly communicates with the sensors to supervise the acquisition and loading of raw data. It implements the most common signal processing techniques to extract and prepare relevant features that will be employed by the models. The final step of this section entails preparing the data for the domain adaptation phase, whereby the samples are split and normalized into the assigned domains.
Following this preprocessing stage, the subsequent module supports the extraction of supplementary anomaly detection indices (Section 3.2). It employs both well-established machine learning tools and autoencoder-like models, which facilitate the extraction of features that are more domain-independent.
The final blocks of the structure (Section 3.3–Section 3.5) are dedicated to the design and implementation of two general deep-learning models that can address prognostics and diagnostics tasks. Due to the variability of the problem, novel domain adaptation techniques are adopted to enhance the ability to generalize to unforeseen situations and machines, which are frequent problems in real-world scenarios.
The pipeline’s modular design allows users to safely incorporate new custom steps into the infrastructure without modifying existing blocks. This feature enables, for instance, the transition to an online mode via a simple plugin. This plugin communicates with the sensor data stream to save intermediate files that are compatible with the format required by the first block, thus permitting the standard workflow through the rest of the pipeline.
The following sections examine each module in detail, describing its functionality and its role within the infrastructure. The technical discussion also demonstrates the resilience of each block to common disruptions in real machine environments.
3.1. Data ETL
The first part of the infrastructure covers the fundamental data ETL (Extract, Transform, and Load) operations. Setting aside the transmission of data by the sensors, which will be explored in the “online section” (Section 4), three fundamental blocks can be identified within this first preprocessing stage:
Signal Extraction
Features Extraction
Domain Splitting
The high level of modularity of this infrastructure allows for the isolation and utilization of each pipeline component separately. Indeed, the project was realized with the objective that different users should be able to reassemble the blocks according to their specific needs and the machine setup. The only requirement is rigorous control of the input and output data formats throughout the flow, as incompatibilities may result in the generation of undesirable errors and behaviors. Moreover, as a stable solution for an industrial setup, the pipeline is designed to operate with a fair number of heterogeneous experiments even simultaneously. For this reason, folders, file names, tags, and versioning are required to follow a strict hierarchical structure with a clearly defined internal nomenclature. This enables the user to run multiple evaluations.
3.1.1. Signal Extraction
The monitoring of industrial systems is often characterized by the presence of many sensors that collect and dispatch variables of diverse types and meanings, each with its own acquisition rate. For collecting all this information, HDF5 is among the most effective data formats since it allows storing large amounts of heterogeneous data in a hierarchical structure [11]. Although the memory access is not optimal, this format ensures high flexibility, particularly for real-time acquisition and mapping. Finally, it permits the dynamic introduction of new channels without necessarily affecting existing stable operations.
The expected internal key structure of the HDF5 files is "sensor" → "channel" → "RAW/PreProcess" → "Timestamp". This approach ensures the potential for transmitting diverse signals from the same sensors and accounts for the possibility of preliminary operations performed directly by the devices. Metadata pertaining, for example, to the acquisition time, working conditions, and sample rate can be embedded directly within the structure as independent channels.
Since data transmission is not always continuous, dividing the signal into multiple “sessions” is advisable. These are defined as data periods containing multiple time windows of arbitrary length, called “acquisitions”. The length of both sessions and acquisitions may vary according to the characteristics and operational routines of the observed machine. This solution was initially developed to address the occasional downtimes of the devices involved, minimizing the consequent loss of data chunks. Additionally, it helps mitigate storage costs for long data collection periods at high frequencies.
The extraction of meaningful information from the HDF5 files is achieved by looping over all available sessions and creating Python objects containing both the signals and metadata of each acquisition window. As a result, one binary Pickle [12] dump file is generated for each individual session. Also, while the HDF5 files contain the entire set of signals transmitted by the sensors, after this step, only the required variables are sent to the subsequent modules, significantly reducing the dataset dimension.
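A minimal sketch of this extraction step follows. Because reading HDF5 requires a third-party library, the sketch assumes the file content has already been loaded into nested dictionaries mirroring the sensor → channel → RAW → timestamp hierarchy. The function name `extract_session`, the channel names, the metadata field, and the output file name are hypothetical and not taken from the pipeline's code.

```python
import pickle
import tempfile
from pathlib import Path


def extract_session(session, required, out_path, metadata=None):
    """Keep only the required (sensor, channel) pairs and dump one pickle per session.

    `session` is a nested dict standing in for the HDF5 hierarchy:
    sensor -> channel -> "RAW" -> timestamp -> list of samples.
    """
    windows = {}
    for sensor, channel in required:
        for ts, samples in session[sensor][channel]["RAW"].items():
            # One object per acquisition window, holding signals and metadata.
            acq = windows.setdefault(
                ts, {"timestamp": ts, "signals": {}, "metadata": dict(metadata or {})}
            )
            acq["signals"][f"{sensor}/{channel}"] = list(samples)
    # Sort acquisitions by timestamp so later blocks receive them in order.
    acquisitions = [windows[ts] for ts in sorted(windows)]
    with open(out_path, "wb") as fh:
        pickle.dump(acquisitions, fh)
    return acquisitions


# Toy session: two sensors are stored, but only the accelerometer is requested.
session = {
    "acc1": {"x": {"RAW": {"t1": [0.3, 0.4], "t0": [0.1, 0.2]}}},
    "temp1": {"value": {"RAW": {"t0": [21.5], "t1": [21.7]}}},
}
out_file = Path(tempfile.mkdtemp()) / "session_000.pkl"
extracted = extract_session(session, [("acc1", "x")], out_file, {"sample_rate_hz": 25000})

with open(out_file, "rb") as fh:
    reloaded = pickle.load(fh)
```

Plain dictionaries are used here instead of custom classes so that the pickle files can be reloaded without importing the code that produced them; whether the pipeline makes the same choice is not stated in the text.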
In offline mode, when multiple acquisition files are available, the signal extraction phase can be accelerated by enabling multiprocessing, which launches one parallel job for each HDF5 file. Nevertheless, this feature is disabled by default in order to prevent an increase in the hardware requirements for this infrastructure.
Given common practices in industrial environments, the pipeline needs to be robust to periodic changes in the machine setup. To enhance its flexibility, the configuration accepts a customized set of input channels. This is achieved by providing a simple JSON file [13], which allows the user to reassign all the default channels with the new nomenclature employed by the sensors. This ensures the pipeline’s operability even with new devices. The online procedure (Section 4) provides a dedicated mapping of the channel streams, which can reproduce intermediate HDF5 files with the same structure, thus easily connecting to the rest of the pipeline. This step is of critical importance for scalability. If different data sources are provided, developing an ad-hoc plugin to connect the new signals to the default key structure and proceed with the rest of the flow will be sufficient. On the other hand, the usage of Pickle files enables an initial unfolding of the larger HDF5 files and the elimination of redundant or unnecessary variables.
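The channel reassignment through a JSON file can be illustrated with a short sketch. The default channel names, the role keys, and the helper `load_channel_map` are assumptions made for illustration; the actual configuration schema is not given in the text.

```python
import json

# Default channel names expected by the pipeline (hypothetical values).
DEFAULT_CHANNELS = {
    "vibration": "acc1/x",
    "temperature": "temp1/value",
    "speed": "bus/speed",
}


def load_channel_map(json_text, defaults=DEFAULT_CHANNELS):
    """Merge user overrides into the default mapping and reject unknown roles."""
    overrides = json.loads(json_text)
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown channel roles: {sorted(unknown)}")
    # Roles that are not overridden keep their default channel name.
    return {**defaults, **overrides}


# A new installation renames two channels and keeps the default thermometer.
user_config = '{"vibration": "new_acc/z", "speed": "plc/rpm"}'
channel_map = load_channel_map(user_config)
```

Rejecting unknown keys early is a design choice of this sketch: a misspelled role would otherwise be silently ignored and the default channel used instead.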
The pipeline has been tested using data from various sources, including accelerometers (vibration), thermometers (temperature), multimeters (current, voltage), and field buses (speed, torque). Such data was sampled at high (25 kHz) and low (1 Hz) frequencies. The procedure is described in a general manner, as it must be applicable to all types of signals collected by the sensors. However, the literature [14, 15, 16] clearly shows that vibrational signals are the most informative in the context of predictive maintenance for rotating systems.
3.1.2. Feature Extraction
The second block of the pipeline encompasses the entire feature extraction stage. In predictive maintenance focusing on diagnostics and prognostics of rotating machinery, most state-of-the-art models require a preprocessing step aimed at extracting significant features that more accurately reflect the degradation status of the observed component. The binary files generated in the previous stage contain the time series associated with the sensors’ measurements, organized in sessions, sorted by acquisition time, and with the associated metadata. The module, iterating over each file, extracts all the single acquisition objects within the sessions and further fragments them into non-overlapping sub-acquisition windows of configurable length. This procedure yields a larger number of samples and equal-length sub-signals, thus generating more consistent features across the entire dataset.
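The fragmentation into non-overlapping windows can be sketched in a few lines. The function name `split_windows` is hypothetical, and dropping the incomplete trailing window is an assumption of this sketch, chosen so that all sub-signals have equal length.

```python
def split_windows(signal, window_len):
    """Split a sequence into non-overlapping windows of equal length.

    Trailing samples that do not fill a whole window are dropped so that
    every sub-acquisition has the same length (an assumption of this sketch).
    """
    if window_len <= 0:
        raise ValueError("window_len must be positive")
    n_windows = len(signal) // window_len
    return [signal[i * window_len:(i + 1) * window_len] for i in range(n_windows)]


# Ten samples split into windows of four: the last two samples are discarded.
acquisition = list(range(10))
windows = split_windows(acquisition, 4)
```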
Most of the developed functions have been designed explicitly for analyzing vibrational signals, given their superior informative value in condition monitoring tasks. Information brought by other variables, such as the device temperature or the operational velocity of the apparatus, can also be incorporated into the models, for example, through statistical representations (average, standard deviation).
Digital Signal Processing (DSP) is a crucial tool in extracting meaningful features from the signals acquired by the sensors. The effectiveness of predictive maintenance is heavily reliant on the relevance of this information, as these features serve as the building blocks for predictive models, providing valuable insights into the health and performance of the machinery. These features can encompass various characteristics, including frequency-domain analysis, time-domain analysis, statistical measures, and spectral analysis. Consequently, selecting variables that most accurately reflect the deterioration of the apparatus’s operational regime represents a critical stage in formulating maintenance strategies. The pipeline implements the most common DSP features that have been used over the years as condition indicators for tasks related to condition monitoring ([14, 15, 17, 18]). For simplicity, they are grouped by output dimensionality.
The actual list of one-dimensional features implemented is:
- Time-Domain Statistics
- Wavelet Packet Energy
- Hilbert-Huang
- Kurtogram statistics
- Wavelet statistics
- Frequency-domain statistics (from the power spectrum)
- Rolling Mean, Variance, RMS
Most 1-dimensional features consist of statistics (mean, variance, maximum, peaks, etc.) extracted from the transformed signal. In this way, the meaningful content of the signal is preserved and compressed in a few informative variables.
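As an illustration of how a window is compressed into a few scalar indicators, the sketch below computes a small set of time-domain statistics. The selection (mean, variance, RMS, peak, kurtosis, crest factor) is a common choice in vibration analysis and is an assumption here; the exact statistics implemented in the pipeline are not enumerated in the text.

```python
import math


def time_domain_stats(x):
    """Compress one signal window into a dictionary of scalar statistics."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n      # population variance
    rms = math.sqrt(sum(v * v for v in x) / n)     # root mean square
    peak = max(abs(v) for v in x)                  # largest absolute amplitude
    m4 = sum((v - mean) ** 4 for v in x) / n       # fourth central moment
    return {
        "mean": mean,
        "variance": var,
        "rms": rms,
        "peak": peak,
        # Kurtosis and crest factor are undefined for a constant or zero signal.
        "kurtosis": m4 / var ** 2 if var > 0 else float("nan"),
        "crest_factor": peak / rms if rms > 0 else float("nan"),
    }


stats = time_domain_stats([1.0, -1.0, 1.0, -1.0])
```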
The actual list of two-dimensional features implemented is:
- Short-Time Fourier Transform (STFT)
- Continuous Wavelet Transform (CWT) [19]
- Mel-Frequency Cepstral Coefficients (MFCC) [18, 21]
The module also supports the extraction of complex-valued features, like the Short-Time Fourier Transform ([23, 24]), of which one can take the magnitude or continue with non-real-valued operations. The raw signal can be treated as a feature in itself and carried forward without additional processing, as many recent transformer-based models report high performance when taking time series directly as input ([25, 26]). Alternatively, rolling variables (the pipeline implements rolling mean, rolling variance, and rolling root mean square) can represent a good trade-off between maintaining the time series structure and, at the same time, compressing the information stored in the data.
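A rolling RMS can be sketched as follows. The stride of one sample and the restriction to fully covered window positions are assumptions of this sketch; the pipeline's actual window and stride parameters are not given in the text.

```python
import math


def rolling_rms(x, window):
    """RMS over a sliding window of `window` samples (stride 1, full windows only)."""
    if not 1 <= window <= len(x):
        raise ValueError("window must be between 1 and len(x)")
    squares = [v * v for v in x]
    acc = sum(squares[:window])            # running sum of squares
    out = [math.sqrt(acc / window)]
    for i in range(window, len(x)):
        # Slide the window: add the newest square, drop the oldest one.
        acc += squares[i] - squares[i - window]
        out.append(math.sqrt(max(acc, 0.0) / window))
    return out


rms_series = rolling_rms([3.0, 4.0, 3.0, 4.0], 2)
```

The output has `len(x) - window + 1` values, so the time ordering of the signal is preserved while each value summarizes `window` samples.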
The module also implements some canonical filters to be applied to the signal before processing. They could help reduce the noise or remove undesired frequencies:
- Band reject based on Chebyshev type II [27]
- Band pass FFT denoising [28]
- Wavelet soft-thresholding denoising [29]
Another functionality implemented in the block is the automatic management of gaps in the signals. Given that data acquisitions and transmissions are not always synchronized and stable, the acquisition objects created in the preceding block are monitored and, if necessary, realigned to preserve the dataset’s coherence and avoid missing or duplicate samples. Finally, the pipeline enables the signal to be downsampled to a lower frequency before feature extraction [31]. This allows the results to be aligned across sensors with different sampling rates and facilitates the comparison of the results achieved on a narrower spectrum. The features extracted from each sub-acquisition window are reformatted as dictionary-like objects and saved as binary files. This approach ensures the scalability of the pipeline even with heterogeneous data. The extraction phase is completely configurable, allowing for the inclusion of specific features or parameter variations tailored to each experiment. Additionally, the two cases of diagnosis and prognosis may require separate setups, and thus, they are handled independently. A metadata file is compiled to maintain a log of the domain-specific details of every acquisition, with one entry for each distinct binary file. The hardware requirements vary with the specific features selected: tasks involving images, like spectrograms or nested plots, require more memory than those involving sparser or one-dimensional variables. Nevertheless, the files resulting from this phase serve a more transient purpose, as they will be subject to recombination and rescaling in the subsequent step. Consequently, under strict constraints on the available memory, these files could theoretically be deleted after processing, since they remain recoverable from the original HDF5 files.
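The gap monitoring mentioned at the start of this paragraph could, for instance, rely on a check of the spacing between consecutive acquisition timestamps. The sketch below is one possible approach under stated assumptions: timestamps are sorted numbers in seconds, the nominal acquisition period is known, and the tolerance `tol` and the function name are hypothetical.

```python
def find_gaps_and_duplicates(timestamps, period, tol=0.5):
    """Scan sorted timestamps and report missing and duplicated acquisitions.

    A gap is reported when two consecutive timestamps are more than
    (1 + tol) * period apart; a duplicate when they are identical.
    """
    gaps, duplicates = [], []
    for prev, curr in zip(timestamps, timestamps[1:]):
        delta = curr - prev
        if delta == 0:
            duplicates.append(curr)
        elif delta > period * (1 + tol):
            # Estimated number of acquisitions missing between prev and curr.
            missing = round(delta / period) - 1
            gaps.append((prev, curr, missing))
    return gaps, duplicates


# One duplicated acquisition at t=2 and two missing acquisitions (t=3, t=4).
gaps, dups = find_gaps_and_duplicates([0, 1, 2, 2, 5, 6], period=1)
```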
For the case study presented below in Section 5, better prediction performance was obtained using MFCC, Time-Domain Features, and Rolling RMS of the signal.
3.1.3. Splitting
The third and final block of the ETL part of the pipeline deals with the final preparation and organization of the samples just before the training and inference modules of the models. The main objective of this section is to arrange a dedicated setup for the various domain adaptation algorithms (Section 3.3), which are the final part of the infrastructure. Consequently, the previously extracted features are organized into source and target sets, each including train, validation, and inference subsets.
The splitting step, which is fully configurable, allows the train-validation-test division to be defined by either stating the percentage or the exact number of hours of data that should be added to each subset. Furthermore, the pipeline automatically assigns the "healthy" or "faulty" labels to the samples indicated for diagnosis tasks. The module copies the files into the appropriate subfolder and updates a metadata file with the new destination path. This allows the pipeline to keep track of the files and recombine them in multiple source-target combinations.
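A percentage-based split can be sketched as below. The sketch assumes that the samples are already sorted by acquisition time and that the subsets are taken as contiguous chronological blocks; the text does not state how the pipeline orders the subsets, and the function and file names are hypothetical.

```python
def chronological_split(samples, train_frac, val_frac):
    """Split time-ordered samples into train, validation, and inference subsets."""
    if train_frac < 0 or val_frac < 0 or train_frac + val_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    n = len(samples)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    return {
        "train": samples[:n_train],
        "validation": samples[n_train:n_train + n_val],
        # Everything after the validation block is left for inference.
        "inference": samples[n_train + n_val:],
    }


files = [f"window_{i:03d}.pkl" for i in range(10)]
split = chronological_split(files, train_frac=0.6, val_frac=0.2)
```

A split expressed in hours of data, as also supported by the pipeline, would follow the same pattern with the boundaries computed from the acquisition timestamps instead of from sample counts.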
Duplicating all files is not an optimal storage solution. However, this approach can be beneficial in guaranteeing the best performance from the actual setup when working with heterogeneous data and running multiple experiments. In real-time monitoring, data generated within this block can be deleted right after being used by models to free up space.
Once per experiment, a normalization object is fitted on a custom portion of the dataset and then used to rescale all the upcoming features.
Among the methods implemented, the most common choices are standardization and min-max scaling. These practices allow a memory-friendly data-loading step during the subsequent model training and inference phases. In industrial applications, sensors often collect data for very long periods; thus, the possibility of efficiently selecting and loading the samples assumes high relevance. A more detailed discussion of domain adaptation algorithms is presented later in the paper.
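The fit-once, apply-everywhere normalization can be illustrated with two small scalers. The class names and the list-of-rows data layout are assumptions of this sketch; in practice a library implementation would likely be used.

```python
import math


class Standardizer:
    """Column-wise z-score scaling, fitted once and reused on later samples."""

    def fit(self, rows):
        cols = list(zip(*rows))
        self.mean = [sum(c) / len(c) for c in cols]
        # A zero standard deviation is replaced by 1.0 to avoid division by zero.
        self.std = [
            math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, self.mean)
        ]
        return self

    def transform(self, rows):
        return [[(v - m) / s for v, m, s in zip(r, self.mean, self.std)] for r in rows]


class MinMaxScaler:
    """Column-wise scaling that maps the fitted range onto [0, 1]."""

    def fit(self, rows):
        cols = list(zip(*rows))
        self.low = [min(c) for c in cols]
        self.span = [(max(c) - min(c)) or 1.0 for c in cols]
        return self

    def transform(self, rows):
        return [[(v - l) / s for v, l, s in zip(r, self.low, self.span)] for r in rows]


# Fit on a chosen portion of the dataset, then rescale samples that arrive later.
fit_rows = [[0.0, 10.0], [2.0, 30.0]]
new_rows = [[1.0, 20.0], [4.0, 50.0]]
z_scores = Standardizer().fit(fit_rows).transform(new_rows)
min_max = MinMaxScaler().fit(fit_rows).transform(new_rows)
```

Because the scalers are fitted only on the reference portion, later samples can fall outside the fitted range, as the second row shows; this is expected when the machine drifts away from its initial condition.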
3.2. High-Level Feature Extraction
A common and straightforward approach to fault detection tasks is to compare signals’ features with some reference thresholds, which may be obtained from literature or earlier studies ([32, 33]). However, a direct comparison may be misleading because signals are frequently acquired in different dynamic situations or from different devices or sensors.
In the presented framework, an intermediate module provides an alternative linkage between the low-level features extracted by the ETL stage and the models implemented afterward. The objective is to provide a novel set of more domain-independent variables that could enable a fairer comparison. Furthermore, integrating these higher-level features may reduce the workload for subsequent diagnostic and prognostic algorithms by supplying new data that are more compressed, more sensitive to changes in the machine’s health, and, therefore, better suited for anomaly detection tasks. It is still possible to feed features such as raw signals or time-frequency representations, like MFCCs, directly into the models by simply skipping this block. However, this may compromise the scalability of the solution, especially in situations where an extended period of data acquisition or a demanding transfer learning step is required.
In the tested solution, autoencoders were selected to provide greater abstraction in the feature engineering process. An autoencoder is an artificial neural network designed to learn efficient latent representations of data, typically by reducing the input’s dimensionality and capturing the most relevant features. Composed of an encoder-decoder pair, autoencoders are trained via backpropagation to minimize a reconstruction loss, typically a mean-squared error between the input and its reconstruction. Autoencoders are well established in the literature for anomaly detection tasks and are also widely used in the area of condition monitoring [25, 34, 35]. The solution implements a Multi-Variational Autoencoder (MultiVAE) model, a composite architecture that processes the different feature channels from the preceding modules, automatically adapting to their types and dimensions. Additionally, the MultiVAE is designed to enhance the information content of the lower-level features, thereby providing a more accurate reflection of the machine’s degradation status at the subsequent stages of the pipeline.
As the name suggests, the architecture employs three different categories of autoencoders:
- a Variational Autoencoder for one-dimensional vectors of features (RMS, statistics, ...) [34, 35];
- a Convolutional Autoencoder for two-dimensional features (MFCC, Fourier transforms, spectrograms) [36];
- an LSTM Autoencoder for time-series (raw signal, rolling mean) [25, 37].
The training procedure and the network’s primary parameters are entirely customizable. At the same time, the pipeline automatically selects the optimal architecture, depending on the collection of selected features generated by the earlier stages. The complete MultiVAE setup workflow is shown in Figure 2.
The model employs the features extracted from the signals as independent channels to generate a compressed latent representation of the input data. The latent space is deliberately formed of a limited number of variables, thereby enforcing the compression of the information contained in the input. Typically, only data from the experiment’s initial period, comprising samples still deemed “healthy”, are utilized to train the model. From this perspective, the model can be conceptualized as a denoising autoencoder, capable of recognizing artifacts in the healthy features but gradually losing its generalization capabilities beyond this working point. As the machine status gradually deteriorates, the characteristics retrieved from the vibration signal start to deviate from stable conditions, accumulating more noise. Therefore, the changes in the latent space can be used as a reliable measure of this degradation and exploited for anomaly detection. Empirical evidence suggests that the most effective variable for replicating this decline is the model’s reconstruction loss, whose gradual deterioration across the machine’s lifetime is clear in the examples reported in Figure 2.
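The use of the reconstruction loss as a degradation indicator can be illustrated independently of any specific network. In the sketch below the reconstructions are hard-coded toy values standing in for the output of an autoencoder trained on healthy data, and the alarm rule (mean plus three standard deviations of the healthy errors) is an assumption made for illustration, not a rule stated in the paper.

```python
import math


def reconstruction_error(x, x_hat):
    """Mean-squared error between an input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(healthy_errors, k=3.0):
    """Alarm threshold from healthy-period errors: mean + k * std (assumed rule)."""
    mean = sum(healthy_errors) / len(healthy_errors)
    std = math.sqrt(sum((e - mean) ** 2 for e in healthy_errors) / len(healthy_errors))
    return mean + k * std


# (input, reconstruction) pairs from the initial healthy period (toy values).
healthy_pairs = [
    ([1.0, 2.0], [1.1, 1.9]),
    ([1.0, 2.0], [0.9, 2.1]),
    ([1.0, 2.0], [1.2, 2.0]),
]
threshold = fit_threshold([reconstruction_error(x, r) for x, r in healthy_pairs])

# Later samples: the first is reconstructed well, the second poorly.
later_pairs = [([1.0, 2.0], [1.05, 2.0]), ([1.0, 2.0], [1.5, 1.4])]
later_errors = [reconstruction_error(x, r) for x, r in later_pairs]
flags = [e > threshold for e in later_errors]
```

In the pipeline the error itself, rather than a binary flag, is passed on as a feature; the threshold only makes the interpretation of the index explicit.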
Autoencoders ensure enhanced operational efficacy, particularly in terms of hardware requirements. In support of this consideration, the technical details of the implemented architectures, the training phase, and the devices utilized in the case study experiments are presented in Appendix A. These models exhibit a contained parameter count and are designed to learn only from a portion of the data, typically spanning the initial hours of the experiment. This allows the training phase to be run completely offline, which is particularly useful in scenarios characterized by stringent online device constraints. The updated network’s weights can be deployed within a lightweight online setup. Portability is, in fact, another of the main qualities of this pipeline. Finally, autoencoders facilitate the compression of high-dimensional features, such as the frequently used MFCCs, into subsets of variables rearranged in tabular formats, expediting successive domain adaptation steps and drastically reducing the memory requirements.
In the development stage, autoencoders’ architectures were engineered to overfit the training dataset on purpose. This strategy aimed to compress input information of a healthy configuration, leveraging the divergence of points in the latent space and the reconstruction error as metrics for assessing health degradation. Notably, a small latent space leads to an efficient latent representation, enhancing model sensitivity to deviations in system health.
To enhance the resilience of diagnostic and prognostic models, the pipeline facilitates the incorporation of anomaly detection indices with the variables generated by the autoencoders. Over the years, numerous machine learning algorithms have been developed to identify irregularities and deviations from “ordinary conditions”. The pipeline integrates some general and common methodologies, allowing users to increase the number of signals connected with the machinery’s health status degeneration. The available methods include One-Class SVM [38], Local Outlier Factor [39], Isolation Forest [40] and Elliptic Envelope [41], which can be applied to either the signal itself or the latent space of the autoencoders.
Each experiment is processed separately within this pipeline block, and the final features are eventually combined, rearranged, and gathered to create a distinct tabular dataset that ensures a compressed depiction of the monitored component’s health condition.
In the forthcoming case study (Section 5), the MultiVAE model has been exclusively employed for the prognosis task, combining the compressed latent representation of MFCCs, time-domain features, and rolling RMS values from the vibration signal (as detailed in Section 3.1, paragraph Feature Extraction). The novel tabular data were constructed utilizing the reconstruction errors of the three autoencoders and their respective latent variables. Conversely, within the diagnostic problem, the original MFCCs underwent direct processing without intermediary compression, accordingly undergoing the domain adaptation phase.
As previously stated, industrial applications require an approach that accommodates their inherent diversity and specificity, which precludes a single, universally applicable methodology. It is important to remark that this paper does not aim to delineate an optimal strategy for resolving condition monitoring tasks; rather, it focuses on presenting a flexible and complete framework. This objective is complemented by a validation conducted in a real-world scenario, for which a valid and reliable solution is presented.
3.3. Domain Adaptation
The final module of the pipeline contains the models implemented to fulfill tasks of diagnosis and prognosis of the machinery’s health status (the complete workflow is summarized in Figure 3).
One of the main challenges inherent in industrial systems lies in the distinctive nature of each possible setup configuration. The unique attributes encompassing operational modalities, mechanical component heterogeneity, variability in consumption and utilization, and sensor acquisition disparities further underscore each setup’s individuality. One approach to mitigate this challenge involves conducting comparative analyses with a designated reference experiment characterized by a well-documented lifespan. Predictive maintenance (PM) models can attempt to project this information onto the new monitored setup by extrapolating insights derived from this reference scenario. Consequently, the incorporation of domain adaptation techniques becomes essential in the development of a trustworthy and general solution [42, 43].
Domain adaptation (DA) is a subcategory of transfer learning, providing algorithms and techniques designed to address the common problem in which the distribution of data used for training a model (source domain) differs from the distribution of data where the model is deployed (target domain). In a monitored system, the accuracy of predictive models is often compromised because the source domain, typically represented by historical data or simulated environments, diverges substantially from the operational context. Domain adaptation will facilitate the seamless knowledge transfer, thereby mitigating performance degradation and enhancing generalization capabilities across heterogeneous domains.
Finally, domain adaptation promotes the optimization of resource utilization and cost-effectiveness by eliminating the need for domain-specific model retraining in the target operational context.
This work encompasses some of the most common deep learning domain adaptation algorithms, re-adapted from the PyTorch implementation proposed in the public GitHub repository "DeepDA" [44]. The authors presented their work as a lightweight, extendable, and easily learnable toolkit that precisely met the needs of the pipeline. Given the rapid development in Condition Monitoring, it would have been impractical to include a unique solution in the infrastructure, given the frequency with which new frameworks are proposed. Consequently, the pipeline incorporates a few general and well-established algorithms while remaining flexible and adaptable to setup-specific routines or emerging innovative architectures.
The fundamental infrastructure of the DA module involves essentially three networks:
- a Transfer Network, the main core of the setup, which implements the transfer rule and produces a latent high-dimensional representation of the input data;
- a Domain Discriminator, which learns to distinguish the domain of each sample (source or target);
- a Predictive Model, focused on learning the main task (a classifier for diagnosis problems and a regressor for prognosis ones).
This simple design facilitates the implementation of several DA algorithms proposed in recent literature while concurrently accommodating user-defined optimization functions or case-specific transfer networks. At the time of writing, the repository hosts two adversarial-based algorithms: Domain-Adversarial Neural Networks [45] and Dynamic Adversarial Adaptation Networks [46]. These algorithms, operating at the training level, aim to map input variables onto a shared latent space, rendering target samples indistinguishable from labeled source samples. Other techniques, such as Maximum Mean Discrepancy (MMD [47]), Correlation Alignment for Unsupervised Domain Adaptation (CORAL [48]), Batch Nuclear-norm Maximization (BNM [49]) and Local MMD (LMMD [50]), on the other hand, are metric-based procedures that minimize specific loss functions to align features originating from distinct domains. For these, the Domain Discriminator is usually replaced by additional terms within the cost function. Moreover, the proposed module accommodates a multi-source modality, wherein training batches are assembled proportionally by mixing samples from various reference experiments. This approach is expected to enhance information exchange in scenarios characterized by the availability of multiple data sources.
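To make the idea of a metric-based alignment loss concrete, the sketch below computes a biased estimate of the squared Maximum Mean Discrepancy with a single Gaussian (RBF) kernel. It is a plain-Python illustration, not the DeepDA implementation; the kernel choice, the bandwidth parameter `gamma`, and the toy feature vectors are assumptions.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian kernel exp(-gamma * ||a - b||^2) between two feature vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def mmd2(source, target, gamma=1.0):
    """Biased estimate of the squared MMD between two sets of feature vectors."""

    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    # MMD^2 = E[k(s, s')] + E[k(t, t')] - 2 E[k(s, t)]
    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2 * mean_kernel(source, target)
    )


source = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
shifted = [[2.0, 2.0], [2.1, 1.9], [1.9, 2.1]]

same_domain = mmd2(source, source)      # identical samples: no discrepancy
cross_domain = mmd2(source, shifted)    # shifted samples: large discrepancy
```

When such a term is added to the task loss, minimizing it pushes the transfer network to produce latent features whose distributions match across the source and target domains.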
The backbone of the transfer learning infrastructure is designed to be versatile, supporting both diagnosis and prognosis tasks. A detailed discussion of specific use cases will be provided in subsequent Sections, while in what follows, the main focus is on the domain adaptation step.
The considered feature set comprises a combination of the high-level variables generated by the MultiVAE (Section 3.2), alongside the features extracted from the raw signals through digital signal processing operations. The former should be more helpful in this context, as they have been designed to be more domain invariant. Nevertheless, since each machine is processed independently, a transfer learning step can still help map the insights obtained from historical data to the ongoing experiment.
In the case study discussed later, the Domain-Adversarial Neural Network emerged as the most proficient algorithm for exploiting vibration signals. Here, the transfer network functions as a novel feature extractor aiming to align the latent spaces of the source and target domains through its transfer loss, rendering them indistinguishable. In operational terms, these networks were trained simultaneously but with different learning rates, minimizing a composite loss function comprising the regressor’s mean-squared error and the discriminator’s cross-entropy. Further details of the employed architectures, parameters, and training methodologies are provided in Appendix A.
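The composite objective described above can be written compactly. The expression below is a sketch following the standard Domain-Adversarial Neural Network formulation [45], assuming that only source samples carry labels; the symbols $F$ (transfer network), $R$ (regressor), $D$ (domain discriminator), the domain label $d_j$, and the weighting factor $\lambda$ are notation introduced here and are not taken from the paper's implementation.

```latex
\mathcal{L}(\theta_F, \theta_R, \theta_D)
  = \underbrace{\frac{1}{n_s}\sum_{i=1}^{n_s}
      \bigl(R(F(x_i^{s})) - y_i^{s}\bigr)^{2}}_{\text{regressor MSE on source samples}}
  \;+\; \lambda\,
    \underbrace{\frac{1}{n}\sum_{j=1}^{n}
      \mathrm{CE}\bigl(D(F(x_j)),\, d_j\bigr)}_{\text{discriminator cross-entropy on source and target samples}}
```

In the standard formulation, a gradient reversal layer between $F$ and $D$ flips the sign of the discriminator gradient that reaches $F$, so that $D$ learns to separate the domains while $F$ learns features that make them indistinguishable.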
Selecting an appropriate source domain is a crucial step within condition monitoring applications, significantly influencing the accuracy and validity of predictive outcomes. Without a universal criterion, the choice hinges upon empirical validation. The overarching strategy involves selecting an experiment that closely mirrors the monitored machine’s mechanical attributes and usage dynamics, guided by any available prior information. This approach should facilitate the transfer of knowledge between coherent environments. In industrial contexts, to ensure adequate preparation for new components to be monitored, it is imperative to organize a diverse array of experiments that encapsulate a wide spectrum of configurations, so that a reliable set of source data can always be provided (as in the case presented in Appendix B).
Another advantageous aspect of employing this domain adaptation framework is its minimal hardware demands, in continuity with the objectives of the autoencoder-based feature extraction methodology outlined in Section 3.2. In addition to the modest parameter count of the neural networks involved (Appendix A), the training phase occurs only once, after the initial data collection phase of the target experiment. In a manner analogous to the MultiVAE model, the initial data samples obtained pertain to a system still in a healthy condition, thus establishing a baseline for the machine status.
An alternative approach could have been the periodic retraining of the domain adaptation model in a continual learning framework as new data arrive. However, the monitoring of real-world components may last for years, accumulating large volumes of data that would require longer training times and more capable hardware. Moreover, changes in machine usage and wear could degrade model performance, a problem compounded by the absence of a standardized labeling protocol for newly acquired samples.
Consequently, it is believed that adopting a lightweight yet reliable and portable setup is imperative. This approach significantly diminishes the duration of simulations and tests, obviating the need for a progressive retraining process. It also cuts the costs associated with periodic data transmission and storage. Furthermore, online operations are reduced to only model inference and forecasting tasks by conducting the training phase offline on dedicated hardware.
3.4. Fault Diagnosis
In industrial system monitoring, the topic of fault detection and diagnosis, namely locating and understanding the underlying cause of anomalies or malfunctions in a system, is crucial. Accurately identifying parts of a complex apparatus that exhibit abnormal behavior enables timely intervention and remediation, thereby helping to prevent machine downtimes, increase dependability, and avoid costly system failures.
Most of the features extracted in the stages described until this point can be used to indicate anomalies in the sensor measurements, at least in comparison with a baseline of ordinary behavior. In the case study under consideration, the optimal strategy was to rely on the MFCC (
Section 3.1) of the vibration signal. The feature extractor network comprised one-dimensional convolutional and linear layers in this case. Although the model required two-dimensional features, the preprocessing step involved computing the mean value across the rows to reduce the fluctuation of the features in the time domain and to focus the analysis on the frequency domain. Further technical details regarding the training phase and the architecture are provided in
Appendix A. Fault diagnosis often involves different techniques, varying according to the monitored component or available sensors. This pipeline implements a dedicated module that reframes the problem as a classification task, thereby enabling the generalization to new experiments by applying the domain adaptation algorithms presented in
Section 3.3.
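The averaging step applied to the MFCC before classification can be sketched as follows. This is a minimal example under the assumption that each row of the MFCC matrix corresponds to one time frame and each column to one cepstral coefficient; the function name and the layout are illustrative and may differ from the actual preprocessing code.

```python
def mean_over_frames(mfcc):
    """Collapse an MFCC matrix to one mean value per coefficient.

    mfcc: list of time frames, each a list of coefficients
          (assumed layout: rows = frames, columns = coefficients).
    """
    if not mfcc:
        raise ValueError("the MFCC matrix must contain at least one frame")
    n_frames = len(mfcc)
    # zip(*mfcc) iterates over columns, i.e., over coefficients
    return [sum(column) / n_frames for column in zip(*mfcc)]


# Two frames with two coefficients each
features = mean_over_frames([[1.0, 2.0], [3.0, 4.0]])  # [2.0, 3.0]
```

The result is a one-dimensional vector per acquisition, which matches the one-dimensional convolutional layers of the feature extractor.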
The implementation requires a set of source case studies composed of samples collected (and processed) from healthy and faulty conditions. A class represents each condition, and the classifier is trained to distinguish such classes. The classifier is trained using the adversarial domain adaptation framework in which the hidden features generated by the healthy part of the target experiment are aligned with the healthy hidden features of the source sets.
Being set up as an anomaly detection classifier, the algorithms can also be trained offline, with the initial "healthy" hours of the monitored experiment providing the model with the baseline conditions. During the inference phase, in online running, the module outputs two fundamental metrics:
the classification probability of the new features representing a healthy configuration or one of the known fault classes;
the relative risk, defined as the ratio between the current probabilities the diagnosis model returned and those observed during the training phase.
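The second metric can be illustrated with a short Python sketch. It assumes that the baseline is the mean class probability recorded over the healthy training hours and that a small `eps` guards against division by zero; both choices, as well as the class order in the example, are hypothetical.

```python
def relative_risk(current_probs, baseline_probs, eps=1e-6):
    """Ratio between current class probabilities and those seen during training."""
    if len(current_probs) != len(baseline_probs):
        raise ValueError("probability vectors must have the same length")
    return [c / max(b, eps) for c, b in zip(current_probs, baseline_probs)]


# Hypothetical class order: healthy, pinion fault, bearing fault
baseline = [0.75, 0.125, 0.125]  # mean probabilities over the healthy training hours
current = [0.375, 0.5, 0.125]    # probabilities returned for a new acquisition
risk = relative_risk(current, baseline)  # [0.5, 4.0, 1.0]
```

A value above 1 indicates that a class has become more likely than it was during the baseline period; in this example, the pinion class is four times more likely than at training time.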
As a machine component gradually deteriorates, the pipeline is expected to warn users about the specific part presenting anomalies. This is achieved by returning, for each acquisition, a vector of probabilities associated with the different health conditions. Conversely, the probability associated with the healthy class is expected to decline over time as the component wears. An illustrative example of typical behavior observed in a real-world experiment is presented in
Figure 4, which depicts a typical scenario ending with the pinion breaking. As the test progresses, it becomes evident that the probability associated with the healthy class declines while the values connected with the classes of the faults increase. Among these, the pinion class emerges as the most prominent, as evidenced by the relative risk on the right. A detailed explanation of the case study is provided subsequently in
Section 5.
3.5. RUL Estimation
Machine health prognostics is the second area of interest in this context. Unlike diagnostics, prognostic models take a step further by predicting the potential evolution of a device’s damage over time, enabling timely interventions to prevent costly downtimes and maintain operational reliability. By leveraging predictive techniques, prognostic models can anticipate and warn about potential failures, optimize maintenance strategies, enhance safety, and increase the machine lifespan in industrial plants. Prognostic models are designed to provide reliable estimates of the future state of a system based on its current condition and operational history. They also allow the forecasting of critical parameters such as the Health Index (HI) and the Remaining Useful Life (RUL) [
51,
52], which are indicators of the health status of the component being actively monitored.
From an operational standpoint, the prognostic model implemented in the pipeline is based on the same domain adaptation backbone described in
Section 3.3. The setup employs domain adversarial learning to train a regressor on one or more complete run-to-failure sources. This enables the model to learn from the machine’s entire lifespan, thereby enhancing its capability of mapping the observed features’ anomalies to reliable health index values. The input variables typically comprise a combination of Digital Signal Processing features (
Section 3.1) and MultiVAE (
Section 3.2) features. However, training a regressor requires a characterization of the source information. In the absence of a general and objective strategy to label the training samples, empirical observations have led to the hypothesis of a linear degradation, which appears to be a satisfactory approximation over time. Consequently, each feature vector in the source set is assigned a corresponding Health Index value in the interval [0, 1], decreasing linearly with the running time of the experiment from 1 at the start to 0 at failure. This behavior is then transferred to the target latent space during the domain adaptation phase, essentially forcing the trends of the target features to collapse onto this linear profile and providing a point estimate of the Health Index for the new samples. As previously outlined in
Section 3.3, the model’s training during real-time monitoring requires only the initial “healthy” hours of the target experiment to deceive the discriminator in an adversarial learning setup. In contrast, the regressor is trained exclusively with the labeled source samples. The final state of the architecture should comprise a regressor capable of operating indistinctly across the two domains, thereby producing a point estimate of the Health Index during the forward pass.
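The linear labeling of the source samples can be sketched in a few lines of Python. The example assumes that each source experiment is a complete run-to-failure test whose first acquisition is fully healthy and whose last acquisition coincides with failure; the function name and the use of acquisition times in hours are illustrative choices.

```python
def linear_health_index(times):
    """Assign a Health Index that falls linearly from 1 (start) to 0 (failure).

    times: acquisition times of one run-to-failure source experiment,
           sorted in increasing order (e.g., hours since the start).
    """
    t_start, t_end = times[0], times[-1]
    if t_end <= t_start:
        raise ValueError("the experiment must span a positive duration")
    return [1.0 - (t - t_start) / (t_end - t_start) for t in times]


# Three acquisitions at 0, 5 and 10 hours
labels = linear_health_index([0, 5, 10])  # [1.0, 0.5, 0.0]
```

Using acquisition times rather than sample indices keeps the labels consistent when acquisitions are unevenly spaced, for instance after a signal downtime.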
The number of remaining useful hours until machine failure can be estimated by analyzing the general trend of the Health Index for an ongoing experiment. This is achieved by re-adapting, in the pipeline, a robust open-source forecasting algorithm, such as Facebook Prophet [
53], to support breaking-point prediction. Model predictions can fluctuate strongly on real-world data, which are often characterized by large amounts of noise and discontinuities. Consequently, simple forecasting approaches such as linear interpolation can be misleading if used without further consideration. For this reason, the proposed solution relies on Prophet [
53] to extrapolate the RUL. According to its documentation:
“Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series with strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend and typically handles outliers well.”
To our knowledge, no previous research has exploited this algorithm for condition monitoring applications, as it is primarily designed to analyze trends in economic contexts. However, even though machinery vibrations are not usually considered seasonal data, the robustness of the model to the frequent changes in the Health Index appears appropriate for this task. Furthermore, as the lifespan of industrial machines is commonly several years long, the algorithm could detect periodicity (or seasonality) in usage and maintenance and exploit this information to improve accuracy.
This choice is contingent on the case study and is not guaranteed to be the optimal solution for all configurations and setups. However, as the pipeline is designed to function in different environments, it allows for the parallel implementation of several alternatives, thereby enabling a fair comparison of the different methods. Potential alternatives to Prophet, depending on the type of machine and data the user will be working with, include:
- linear and polynomial interpolations (if the deterioration is sufficiently regular over time);
- Long Short-Term Memory (LSTM) networks [54];
- Auto-Regressive Integrated Moving Average (ARIMA) [55].
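To make the simplest of these alternatives concrete, the following Python sketch extrapolates the RUL from a least-squares line fitted to the Health Index history. It is not the Prophet-based procedure adopted in the pipeline; the function names and the `failure_level` threshold are assumptions made for illustration, and the approach is only reasonable when the deterioration is sufficiently regular.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rul_from_linear_trend(hours, health_index, failure_level=0.0):
    """Hours left until the fitted trend reaches failure_level.

    Returns None when the trend is not decreasing, since no failure
    time can be extrapolated in that case.
    """
    slope, intercept = fit_line(hours, health_index)
    if slope >= 0:
        return None
    failure_time = (failure_level - intercept) / slope
    return max(failure_time - hours[-1], 0.0)


# Health Index losing 0.01 per hour, observed for the first 30 hours
remaining = rul_from_linear_trend([0, 10, 20, 30], [1.0, 0.9, 0.8, 0.7])
```

In this noise-free example the fitted line reaches zero after about 100 hours, leaving roughly 70 hours of remaining life. With noisy or discontinuous Health Index estimates the fitted slope can change sharply between acquisitions, which is the main motivation for a more robust forecaster.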
Figure 5 depicts an illustrative example of the anticipated behavior of the forecasting algorithm at a specific point in the operational lifetime of the monitored device. It also serves as a comprehensive display of the machine’s status, offering immediate insights to the technician responsible for the pipeline. A detailed description of the scenarios in the image, with a quantitative analysis of the experiments analyzed, is provided in
Section 5.