Submitted:
27 July 2025
Posted:
28 July 2025
Abstract
Keywords:
1. Introduction
2. Related Works
2.1. Categorization of Non-Technical Loss Detection Methods
- Category and Concept: Refers to the specific classification and related subcategories of a research study.
- Algorithms: The computational methods employed to identify non-technical losses. The choice of algorithm depends on the nature of the data, the application context, and the specific performance requirements. Certain algorithms are more effective with large datasets, while others are optimized for greater accuracy on smaller datasets.
- Data Type: Specifies the data types required for each detection approach. This dimension is crucial when developing a new method or selecting between existing approaches.
- Dataset Size: Relates to the volume of data required for practical analysis, generally determined by the number of consumers involved.
- Features: In many scenarios, before any analysis, raw data is processed to extract essential features that will serve as input for classification techniques.
- Metrics: Represents the performance metrics adopted to evaluate the effectiveness of detection systems in different contexts, facilitating comparison between different approaches.
2.1.1. Categorization and Definitions of Data Types
2.1.2. Raw Data Used in NTL Detection
- Data-oriented methods rely primarily on time series and static data related to the consumer. All of them use energy consumption data in temporal sequence, and approximately half also integrate static data.
- Grid-focused methods use energy consumption data, at high or medium resolution, complemented by voltage and current measurements. They also attach importance to measurements from other devices, such as observer meters or RTUs, and require a detailed understanding of the network topology.
- As expected, hybrid methods draw on a broad spectrum of data, covering both previous categories (data-oriented and network-oriented).
2.1.3. Features Used in NTL Detection
- Basic Statistics: Comprising the average, maximum/minimum values, and standard deviation, calculated over a given interval.
- Power Factor: The ratio of active power (kW) to apparent power, which can be derived from active (kW) and reactive (kVAr) power measurements. An accurate estimate requires instantaneous power measurements and high-resolution data (preferably at intervals of 15 minutes or less).
- Load Factor: The ratio of average consumption to peak active energy consumption over a set period.
- Streaks: Denotes the frequency with which the consumption curve crosses a defined moving average.
- Consumption to Contracted Power Ratio: Relates total active energy consumption over a period to the contracted power value.
- Pearson’s Coefficient [11]: This coefficient assesses the adequacy of a linear regression between active energy consumption and time.
- Billed/Consumed Energy Ratio: Reflects the discrepancy between billed and consumed energy, normalized by contracted power.
- Consumption Projection: An estimate of future consumption or the discrepancy between a projection and observed values.
- Wavelet Coefficients [12]: Measure the discrepancy between the wavelet coefficients of a current consumption curve and those of previous periods.
- Fourier Coefficients: Similar to wavelet coefficients, but focused on Fourier analysis. The phase of the first coefficients can also be taken into consideration.
- Polynomial Coefficients: Contrast the coefficients of polynomials fitted to the current consumption curve with those of previous periods.
- Distance to the Average Consumer: Corresponds to the calculation of the Euclidean distance between an individual consumption curve and the average consumption of all consumers.
- Consumption Curve Slope: Measure of the slope of the best-fit line to the consumption curve time series.
- Principal Component Analysis (PCA): Components derived from Principal Component Analysis or, in more recent publications, from its kernelized counterpart. A subset of these components can be used as features.
- Fractional Order Dynamic Errors: These characteristics reflect the variations between a meter and real-time consumption records.
- Misadjustment Rate: Quantifies the divergence between measurements at the MV/LV transformer and the sum of measurements from smart meters and estimated technical losses, all normalized by the nominal power of the substation.
- Seasonal Consumption Ratios: Compare energy consumption in different seasons or consumption relative to the average of consumers at the same substation in a specific season.
- Discrete Cosine Transform Coefficients: Involve the first k coefficients of this transformation.
- Percentage Change in Consumption: Represents an x% reduction in consumption during a period T compared to a previous interval or relative to the average.
- Estimated Records: Refer to the number of records made by estimate, in the event of inaccessibility to the meter.
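As an illustration, a few of the features listed above can be computed from a single consumption curve along the following lines. This is a minimal sketch, not the extraction procedure of any cited work: the function name `basic_features`, the trailing moving-average window, and the returned dictionary keys are all assumptions made for the example.

```python
import numpy as np

def basic_features(load, window=4):
    """Compute a handful of the listed features for one consumption curve.

    `load` is a 1-D sequence of consumption readings; `window` is the
    (assumed) length of the trailing moving average used for streaks.
    """
    load = np.asarray(load, dtype=float)
    t = np.arange(load.size)
    mean, peak = load.mean(), load.max()

    # Load factor: average consumption divided by peak consumption.
    load_factor = mean / peak if peak > 0 else 0.0

    # Trailing moving average; element k averages load[k:k + window],
    # so it lines up with load[window - 1:].
    kernel = np.ones(window) / window
    moving_avg = np.convolve(load, kernel, mode="valid")
    above = load[window - 1:] > moving_avg

    # Streaks: number of times the curve crosses its moving average.
    streaks = int(np.count_nonzero(above[1:] != above[:-1]))

    # Slope of the least-squares line fitted to the curve over time.
    slope = float(np.polyfit(t, load, 1)[0])

    # Pearson coefficient between consumption and time.
    pearson = float(np.corrcoef(t, load)[0, 1])

    return {
        "mean": float(mean),
        "max": float(peak),
        "min": float(load.min()),
        "std": float(load.std()),
        "load_factor": float(load_factor),
        "streaks": streaks,
        "slope": slope,
        "pearson": pearson,
    }
```

For a steadily rising curve the slope and Pearson coefficient are both close to 1 and no streaks are counted, whereas an oscillating curve produces many crossings of the moving average.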
2.2. Performance Metrics Used in Non-Technical Loss Detection
- Accuracy: (TP + TN) / (TP + TN + FP + FN) (2.3.1)
- Detection Rate (DR): TP / (TP + FN) (2.3.2)
- Precision: TP / (TP + FP) (2.3.3)
- False Positive Rate (FPR): FP / (FP + TN) (2.3.4)
- True Negative Rate (TNR): TN / (FP + TN) (2.3.5)
- False Negative Rate (FNR): FN / (FN + TP) (2.3.6)
- F1 Score: 2TP / (2TP + FP + FN) (2.3.7)
- AUC (Area Under the Curve): The area under the ROC (Receiver Operating Characteristic) curve of the binary classifier. (2.3.8)
- Recognition Rate: 1 - 0.5 × (FP/N + FN/P), where P = TP + FN and N = FP + TN (2.3.9)
- Bayesian Detection Rate: P(I) × DR / (P(I) × DR + P(¬I) × FPR), where P(I) is the prior probability of fraud (2.3.10)
- Support: Refers to rule-driven systems. Represents the proportion of data to which a specific rule is applicable, relative to the total data set.
- Training Time (s): Represents the amount of time, in seconds, required to train an NTL detection model.
- Classification Time (s): Denotes the time, in seconds, it takes an NTL detection system to classify a single instance.
- Cost of Undetected Attack: Quantifies the financial impact of the most damaging attack that was not identified by the system.
- Energy Balance Mismatch: Corresponds to the discrepancy between the total energy consumed at the user level and the energy recorded at the substation.
- Average Bill Increase: Indicates the increase in the average bill if NTL losses were shared among all consumers.
- Normalized Labor Cost: Estimates the costs associated with inspecting all instances categorized as NTL by the system.
- Anomaly Coverage Rate: Defines the fraction of anomalous consumers under the supervision of an RTU compared to the total number of anomalous consumers.
- RTU (Remote Terminal Unit) Cost: Represents the total costs for implementing and maintaining an RTU.
- Minimum Deviation Detected: Specifies the smallest deviation from a predetermined standard that can be identified by the system.
- Reduction in Stolen Electricity: Quantifies the reduction in the volume of illicitly appropriated electricity when implementing a specific fraud detection system (FDS).
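The confusion-matrix metrics in Equations (2.3.1) to (2.3.10), with the exception of AUC, can be evaluated directly from the four counts TP, TN, FP and FN. The sketch below is illustrative only: the function name `ntl_metrics` and the `prior_fraud` argument (standing for P(I)) are assumptions for the example, and all denominators are assumed to be non-zero.

```python
def ntl_metrics(tp, tn, fp, fn, prior_fraud=None):
    """Evaluate the listed confusion-matrix metrics.

    tp, tn, fp, fn are confusion-matrix counts (assumed to give
    non-zero denominators). prior_fraud is an assumed value of P(I),
    the prior probability that a consumer is fraudulent.
    """
    p = tp + fn  # actual positives (fraud)
    n = tn + fp  # actual negatives (benign)
    dr = tp / p  # detection rate
    fpr = fp / n  # false positive rate

    metrics = {
        "accuracy": (tp + tn) / (p + n),
        "detection_rate": dr,
        "precision": tp / (tp + fp),
        "fpr": fpr,
        "tnr": tn / n,
        "fnr": fn / p,
        "f1": 2 * tp / (2 * tp + fp + fn),
        "recognition_rate": 1 - 0.5 * (fp / n + fn / p),
    }

    if prior_fraud is not None:
        # Bayesian detection rate: probability that a flagged
        # consumer is actually fraudulent, given the base rate.
        metrics["bayesian_detection_rate"] = (
            prior_fraud * dr
            / (prior_fraud * dr + (1 - prior_fraud) * fpr)
        )
    return metrics
```

With a low base rate the Bayesian detection rate can be far below the detection rate itself, which is why it is reported alongside DR and FPR when fraud is rare.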
2.3. Algorithms Used in Non-Technical Loss Detection Systems
2.3.1. Data-Oriented Methods
- Data processing and model selection: From a raw dataset, the appropriate model for NTL detection needs to be selected. The presence (or absence) of previously labeled data influences the decision between supervised and unsupervised methods. Additionally, the quality and diversity of the data influence algorithm selection. This selection may eventually disregard parts of the raw dataset, as configured in the data selection step. Subsequently, there is the data cleaning step, a common practice in knowledge discovery, followed by feature extraction, if applicable.
- Modeling: The modeling approach varies depending on whether it is supervised or unsupervised. Unsupervised models do not use labeled data during training, using it only for evaluation purposes. Supervised methods, on the other hand, segment the dataset into training and testing sets. Once the training set is established (usually through cross-validation), feature selection is often employed in the training phase. Simultaneously, parameter optimization uses metrics that can be determined based on the availability of labels in the data.
- Application: New data, which are not part of the original "Raw Data" set, are used to evaluate the effectiveness and operability of the model in question. The classification results are then processed to generate a list of potential offenders—that is, a list with the associated probability of each consumer committing fraud. This step can be related to the testing phase of the NTL detection model or its simulation.
2.3.2. Network-Oriented Methods
3. Methodology Based on the Use of the Fuzzy-ART Neural Network
3.1. Construction and Characterization of the Database
- Tariffs A, B, C and D: different prices depending on the time of day;
- Tariff E: fixed tariff, used as a control group.
- Tariff homogeneity: no hourly or weekly differences;
- Natural consumption profiles: not affected by economic incentives;
- Temporal consistency: essential for detecting fraud or genuine anomalous patterns.
3.2. Data Processing and Pre-Processing
3.2.1. Data Cleaning and Inconsistency Treatment
3.2.2. Min-Max Normalization
- X: original value of the characteristic,
- Xmin: minimum value observed for this characteristic in the entire dataset
- Xmax: maximum value.
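With these definitions, standard min-max normalization maps each value to (X − Xmin) / (Xmax − Xmin). The sketch below assumes the usual target range of [0, 1], which is also the input range Fuzzy-ART expects; the function name and the handling of a constant feature (returning zeros) are assumptions for the example.

```python
import numpy as np

def min_max_normalize(x, x_min=None, x_max=None):
    """Scale values to [0, 1] with the standard min-max formula.

    x_min and x_max default to the extremes of `x`; pass the
    dataset-wide extremes to normalize a subset consistently.
    """
    x = np.asarray(x, dtype=float)
    x_min = x.min() if x_min is None else x_min
    x_max = x.max() if x_max is None else x_max
    span = x_max - x_min
    if span == 0:
        # Constant feature: no spread to scale (assumed convention).
        return np.zeros_like(x)
    return (x - x_min) / span
```

For example, the values 2, 4 and 6 map to 0, 0.5 and 1 when the extremes are taken from the values themselves.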
3.3. Generating Artificial Fraudulent Samples for Training and Testing
- Shuffling and initial division of the benign base. First, all vectors containing daily consumption data that had been classified as benign (after the cleaning and preprocessing steps) were randomly shuffled. The full set of benign vectors was then divided into two parts of exactly equal size. This initial division creates one "pool" of data that is kept as benign and another "pool" that will be labeled as fraud.
- Definition of benign samples. The first half of the original benign vectors remained unchanged. These samples were labeled "0," indicating normal, expected consumption. They are essential for the Fuzzy-ART algorithm to build an accurate representation of what is "normal" in the dataset, serving as a comparison base for identifying anomalies.
- Fraud simulation base. The second half of the original benign vectors was reserved exclusively for fraud simulation. These samples were labeled "1", indicating their fraudulent nature once the manipulations had been applied.
- Type 1, constant consumption reduction: This scenario simulates tampering that uniformly affects all consumption recorded throughout the day. Each value of the daily consumption vector (one per 30-minute interval) was multiplied by a single random factor drawn uniformly from the range 0.1 to 0.3, so that the recorded consumption fell to between 10% and 30% of its true value at every measurement point.
- Type 2, partial measurement interruption: This type of fraud simulates a localized, temporary interruption in the consumption record, which could indicate a bypass or disconnection. A contiguous segment of the daily vector was selected at a random position in the day. Its length varied between 3 and 12 intervals of 30 minutes each (1.5 to 6 hours). The consumption values within this segment were replaced with zeros, while the rest of the vector remained unchanged. This manipulation represents temporary meter disconnection or an intervention aimed at eliminating consumption records for a specific period of the day.
- Type 3, random point-by-point reduction: Unlike constant reduction (Type 1), this class of fraud induces a more irregular and unpredictable distortion in consumption. The value in each 30-minute interval of the daily vector was scaled individually by a different random factor, also between 0.1 and 0.3. Each factor was sampled independently, introducing variability at each time point. The result is a non-uniform tampering with consumption, in which the usual pattern may be preserved but with erratic variations at each measurement point.
- Type 4, consumption reduction while preserving the original shape of the consumption profile: The objective is to simulate a proportional decrease in daily consumption while maintaining the original morphology of the load profile. The average daily consumption of each vector is calculated, and a random reduction factor of between 10% and 30% is applied to it. All 48 values for the day are then adjusted by a correction factor that brings the daily average down to the new target value. The profile thus keeps its peak and valley times, as well as the relative proportion between consumption intervals, making it visually similar to a real day but with consistently lower values.
- Type 5, artificially constant consumption: This type of fraud aims to completely mask the real consumption profile and replace it with an artificially stable pattern. All values in the daily vector were replaced by the average of the original vector: if the average daily consumption was X (kWh), all 48 consumption points were set equal to X. This generates a perfectly flat consumption profile, without any variation throughout the day, which is highly unlikely for a typical residential consumer, whose consumption fluctuates significantly (e.g., peaks in the morning, afternoon, and evening).
- Type 6, temporal inversion: This class of fraud exploits the temporal order of consumption data, a characteristic that is essential and predictable in real residential demand profiles. The order of the 48 values of the daily vector was reversed, so that consumption recorded in the early hours of the day is swapped with consumption in the later hours, and vice versa. For example, early morning consumption (which is usually low) is swapped with evening consumption (which is generally high). This maneuver was designed to exploit potential vulnerabilities in pricing or recording schemes that depend strictly on the time of use. A complete inversion of the profile is highly anomalous and would not represent the actual consumption behavior of a domestic user; it is therefore a clear indicator of manipulation.
- Type 7, substitution with random minimum values: In this scenario, the fraud aims to drastically reduce the energy bill, but with sufficient complexity to evade more basic detection systems. Each value in the daily vector was replaced with a random number between zero and the minimum value observed in the original consumption vector for that day. The result is an artificially low consumption profile that still presents minor random variations, simulating a severely underestimated reading. It could represent fraud in which the meter is constantly forced to register values very close to zero, with minor fluctuations, in an attempt to appear more conventional.
- For each fraud typology (Types 1 to 7): The original benign database, consisting of 7,000 consumption days, was divided into two equal parts. One half (3,500 rows) was retained to represent the benign days, while the other half (the remaining 3,500 rows) was used to generate the fraudulent samples specific to that typology. After generation, the benign case base used to create the fraudulent cases was discarded, resulting in a final database for each fraud typology composed of 3,500 benign days and 3,500 fraudulent days of the same type.
- For the general set: This set, which represents a balanced combination of all seven fraud types, was created similarly. A portion of 3,500 benign days was retained, and for the fraudulent days, 500 new instances were generated for each of the seven typologies, totaling 3,500 fraudulent days.
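The seven manipulations and the split into benign and fraudulent halves can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `make_fraud` and `build_dataset` are hypothetical, each day is assumed to be a vector of 48 half-hourly readings, and Type 4 is interpreted as lowering the daily mean by 10% to 30% (the text could also be read as reducing it to that fraction).

```python
import numpy as np

def make_fraud(day, kind, rng):
    """Return a tampered copy of one daily vector (fraud Types 1 to 7)."""
    day = np.asarray(day, dtype=float)
    n = day.size
    if kind == 1:  # constant reduction: one factor for the whole day
        return day * rng.uniform(0.1, 0.3)
    if kind == 2:  # zero out a contiguous segment of 3 to 12 intervals
        length = int(rng.integers(3, 13))  # upper bound is exclusive
        start = int(rng.integers(0, n - length + 1))
        out = day.copy()
        out[start:start + length] = 0.0
        return out
    if kind == 3:  # independent factor for every interval
        return day * rng.uniform(0.1, 0.3, size=n)
    if kind == 4:  # shape-preserving reduction of the daily mean
        # Assumption: the mean is lowered BY 10-30%, so every value
        # is scaled by the same correction factor (1 - reduction).
        return day * (1.0 - rng.uniform(0.1, 0.3))
    if kind == 5:  # flat profile at the daily average
        return np.full(n, day.mean())
    if kind == 6:  # temporal inversion: reverse the order of readings
        return day[::-1].copy()
    if kind == 7:  # random values between zero and the daily minimum
        return rng.uniform(0.0, day.min(), size=n)
    raise ValueError(f"unknown fraud type: {kind}")

def build_dataset(benign_days, kind, seed=0):
    """Shuffle benign days, keep one half, tamper with the other half."""
    rng = np.random.default_rng(seed)
    days = rng.permutation(np.asarray(benign_days, dtype=float))
    half = len(days) // 2
    benign = days[:half]
    fraud = np.array([make_fraud(d, kind, rng) for d in days[half:2 * half]])
    X = np.vstack([benign, fraud])
    y = np.concatenate([np.zeros(half, dtype=int), np.ones(half, dtype=int)])
    return X, y
```

Calling `build_dataset` on 7,000 benign days with a given `kind` would yield 3,500 samples labeled 0 and 3,500 labeled 1, mirroring the per-typology sets described above; the general set would instead call `make_fraud` with each of the seven types on 500 days.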
3.4. Configuration and Operation of the Fuzzy-ART Algorithm
- Choice parameter (α): Governs the initial selection of the cluster most similar to a new sample by influencing the affinity (choice) calculation that precedes the vigilance test (ρ). Higher values tend to favor larger or more general existing clusters, while lower values prioritize exact similarity, which can lead to the creation of more clusters.
- Vigilance parameter (ρ): This parameter can be thought of as a "similarity criterion," resonance, or tuning. It ranges from 0 to 1 and defines how similar a new consumption reading must be to an existing cluster to be added to it. A high value (close to 1) means the algorithm is very demanding: it will only accept a new reading in a cluster if it is rigorously similar to the patterns that the cluster already represents. This results in many small, particular clusters, each capturing a very distinct type of behavior. A low value (close to 0) makes the algorithm more flexible, as it will accept new readings even if they are significantly different from the cluster, resulting in fewer, but larger and more comprehensive clusters. Choosing an appropriate ρ is crucial for Fuzzy-ART to efficiently distinguish between fraudulent and standard consumption patterns without creating excessive data fragmentation.
- Learning parameter (β): This parameter can be thought of as the model's "adaptation speed." It also ranges from 0 to 1 and determines how quickly existing clusters adapt to new patterns they encounter:
  - A value close to 1.0 allows the model to learn quickly. Clusters adapt intensely to each new reading they absorb.
  - A lower value results in slower learning and more stable clusters. Clusters are less influenced by a single reading, becoming more representative of the average of all readings in that cluster. In certain situations, a lower β is preferable to prevent the model from being overly influenced by atypical or "noisy" readings.

The logic behind the Fuzzy-ART ANN involves a process of continuous comparison and adjustment.
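The interplay of the three parameters can be illustrated with a compact sketch of the standard Fuzzy-ART update. It is not the configuration used in this study: the class name, the default parameter values, and the use of complement coding (a common Fuzzy-ART preprocessing step that is not described in the text above) are assumptions for the example. Inputs are assumed to be already normalized to [0, 1].

```python
import numpy as np

class FuzzyART:
    """Minimal Fuzzy-ART clusterer (illustrative sketch only)."""

    def __init__(self, alpha=0.001, rho=0.75, beta=1.0):
        self.alpha = alpha  # choice parameter, alpha > 0
        self.rho = rho      # vigilance parameter, in [0, 1]
        self.beta = beta    # learning rate, in [0, 1]
        self.weights = []   # one weight vector per cluster

    @staticmethod
    def complement_code(a):
        # Assumed preprocessing: I = [a, 1 - a], so |I| is constant.
        a = np.asarray(a, dtype=float)
        return np.concatenate([a, 1.0 - a])

    def train_one(self, a):
        """Present one sample; return the index of its cluster."""
        i = self.complement_code(a)
        if self.weights:
            w = np.array(self.weights)
            # Fuzzy AND (element-wise minimum), then 1-norm.
            overlap = np.minimum(i, w).sum(axis=1)
            # Choice function T_j = |I ^ w_j| / (alpha + |w_j|).
            choice = overlap / (self.alpha + w.sum(axis=1))
            # Try clusters from highest to lowest choice value.
            for j in np.argsort(-choice, kind="stable"):
                # Vigilance test: |I ^ w_j| / |I| >= rho.
                if overlap[j] / i.sum() >= self.rho:
                    # Resonance: move the weights toward I ^ w_j.
                    self.weights[j] = (
                        self.beta * np.minimum(i, w[j])
                        + (1.0 - self.beta) * w[j]
                    )
                    return int(j)
        # No cluster passed the vigilance test: create a new one.
        self.weights.append(i.copy())
        return len(self.weights) - 1
```

Presenting the same samples with a high ρ and then with a low ρ shows the behavior described above: the stricter setting separates dissimilar readings into distinct clusters, while the looser one absorbs them into a single cluster.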
3.5. Model Performance Evaluation
4. Results Obtained
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Fuzzy-ART Neural Networks
- a: normalized input vector;
- ||.||: 1-norm of a vector.
- 4. Choice Parameter: α > 0;
- 5. Training rate: β ∈ [0,1];
- 6. Vigilance Parameter: ρ ∈ [0,1].
References
- Carpenter, G.A. and Grossberg, S. “A self-organizing neural network for supervised learning, recognition and prediction”, IEEE Communications Magazine, Vol. 30, No. 9, pp. 38–49, 1992.
- Grossberg, S. “Adaptive resonance theory: how a brain learns to consciously attend, learn, and recognize a changing world”, Neural Networks, Vol. 37, 2013, pp. 1–47. [CrossRef]
- Carpenter, G.A. and Grossberg, S. “Adaptive resonance theory”, Springer, 2016. [CrossRef]
- Jené-Vinuesa, M.; Aragüés-Peñalba, M.; Sumper, A. “Comprehensive data-driven framework for detecting and classifying non-technical distribution losses”, IEEE Access, Vol. 1, 2024. [CrossRef]
- Haykin, S. “Neural networks and learning machines”, 3. ed. Upper Saddle River: Prentice-Hall, 2008.
- Brazilian Senate Agency. “CTFC approves limits on the inclusion of non-technical losses in electricity bills”, Senate News, Nov-2021. (In Portuguese).
- Marchiori, S.C.; da Silveira, M.C.; Lotufo, A.D.P.; Minussi, C.R. and Lopes, M.L.M. “Neural network based on adaptive resonance theory with continuous training for multi-configuration transient stability analysis of electric power systems”, Applied Soft Computing, Vol. 11, No. 1, Jan-2011, pp. 706–715. [CrossRef]
- Zadeh, L. A. “Fuzzy sets”, Information and Control, 1965, Vol. 8, No. 3, p. 338-353. [CrossRef]
- Messinis, G. M.; Hatziargyriou, N. D. “Review of non-technical loss detection methods”, Electric Power Systems Research, 2018, Vol. 158, pp. 250–266. [CrossRef]
- Faria, L. T.; Melo, J. D.; Padilha-Feltrin, A. “Spatial-temporal estimation for nontechnical losses”, IEEE Transactions on Power Delivery, 2016, Vol. 31, No. 1, pp. 362–369.
- Pearson, K. “On lines and planes of closest fit to systems of points in space”, Philosophical Magazine, 1901, Vol. 2, No. 6, pp. 559–572. [CrossRef]
- Daubechies, I. “Ten lectures on wavelets”, Philadelphia: Society for Industrial and Applied Mathematics, 1992, 377 p.
- Axelsson, S. “The base-rate fallacy and the difficulty of intrusion detection”, ACM Transactions on Information and System Security (TISSEC), 2000, Vol. 3, No. 3, pp. 186–205.
- Duda, R. O. and Stork, D.G. “Pattern classification”, 2001, 2nd ed., New York: Wiley.
- Cortes, C. and Vapnik, V. "Support-vector networks", Machine Learning, 1995, Vol. 20, No. 3, pp. 273–297. [CrossRef]
- Silveira, V.G.; Silva-Santos, A.; Lopes, M.L.M.; da Silva, J.F.R. and Faria, L.T. “Detection of non-technical losses via ARTMAP-Fuzzy neural network in electrical energy distribution systems”, XXIV Brazilian Congress of Automation, 2022, pp. 1–8. (In Portuguese).
- Quinde, S.; Rengifo, J.; Vaca-Urbano, F. “Non-technical loss detection using data mining algorithms”, IEEE PES Innovative Smart Grid Technologies Conference, Sep. 2021. [CrossRef]
- Wang, Z.; Li, G.; Wang, X.; Chen, C.; Huan, L. “Analysis of 10kV non-technical loss detection with data-driven approaches”, IEEE Innovative Smart Grid Technologies - Asia, 2019, pp. 4154–4158. [CrossRef]
- Badawi, S. A.; Takruri, M.; Al-Bashayreh, M. G.; Salameh, K.; Humam, J.; Assaf, S.; Aziz, M. R.; Albadawi, A.; Guessoum, D. E.; Elbadawi, I. A.; Al-Hattab, M. “A novel two-stage method to detect non-technical losses in smart grids”, IET Smart Cities, 2024, pp. 96-111. [CrossRef]
- Breiman, L. "Arcing the edge", Technical Report 486. Statistics Department, University of California, Berkeley, 1997, pp.1-14.
- Moreno, D. A.; Holguin, M.; Holguín, G. A.; Hernandez, B. “An industry 4.0 based data analytics framework for the detection of non-technical losses in a smart grid”. 2023 IEEE 6th Colombian Conference on Automatic Control (CCAC), 2023, pp. 1–6. [CrossRef]
- Esmael, A. A.; da Silva, H.H.; Ji, T.; Torres, R.S. “Non-technical loss detection in power grid using information retrieval approaches: A comparative study”, IEEE Access, Vol. 9, pp. 40635–40648, 2021. [CrossRef]
- LeCun, Y.; Boser, B.; Denker, J. S.; Henderson, D.; Howard, R. E.; Hubbard, W.; Jackel, L. D. “Backpropagation applied to handwritten zip code recognition”, Neural Computation, 1989, Vol. 1, No. 4, pp. 541–551. [CrossRef]
- Medeiros, M. H.; Sanz-Bobi, M. A.; Domingo, J. M.; Picchi, D. “Network oriented approaches using smart metering data for non-technical losses detection”, IEEE PowerTech Conference, 2021. [CrossRef]
- Raggi, L. M. R.; Trindade, F. C. L.; Cunha, V. C.; Freitas, W. “Non-technical loss identification by using data analytics and customer smart meters”. IEEE Transactions on Power Delivery, 2020, Vol. 35, No. 6, pp.2700-2710. [CrossRef]
- Bezerra, U. H.; Soares, T. M.; Nunes, M. V. A.; Tostes, M. E.L.; Vieira, J.P.A.; Agamez, P.; Viana, P. R. A. “Non-technical losses estimation in distribution feeders using the energy consumption bill and the load flow Power Summation Method”, IEEE International Energy Conference, 2016, pp. 1–6. [CrossRef]
- Ferreira, T. S. D.; Trindade, F. C. L.; Vieira, J. C. M. “Load flow-based method for nontechnical electrical loss detection and location in distribution systems using smart meters”. IEEE Transactions on Power Systems, Vol. 35, No. 5, pp. 3671–3681, Sept. 2020. [CrossRef]
- Pengwah, A. B.; Razzaghi, R.; Andrew, L. L. H. “Model-less non-technical loss detection using smart meter data”, IEEE Transactions on Power Delivery, Vol. 38, No. 5, Oct. 2023. [CrossRef]
- Yeckle, J.; Tang, B. “Detection of Electricity Theft in Customer Consumption Using Outlier Detection Algorithms”. 2018 1st International Conference on Data Intelligence and Security (ICDIS), South Padre Island, TX, USA, 2018, pp.135-140. [CrossRef]
- Grossberg, S. “Conscious mind, resonant brain: how each brain makes a mind”, Oxford University Press, Jul-2021, 768 p.
| Tariff | Nocturnal (23h – 08h) | Diurnal (08h–17h / 19h – 23h) | Peak (17h–19h) |
|---|---|---|---|
| A | 12 | 14 | 20 |
| B | 11 | 13.5 | 26 |
| C | 10 | 13 | 32 |
| D | 9 | 12.5 | 38 |
| Type | Accuracy (C) | Accuracy (C+B) | Sensitivity (C) | Sensitivity (C+B) | Specificity (C) | Specificity (C+B) | MCC (C) | MCC (C+B) | Created Clusters (C) | Created Clusters (C+B) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.8767 | 0.8743 | 0.9476 | 0.9629 | 0.8057 | 0.7857 | 0.7610 | 0.7606 | 59 | 107 |
| 2 | 0.8771 | 0.6748 | 0.8410 | 0.7771 | 0.9133 | 0.5724 | 0.7563 | 0.3571 | 12 | 169 |
| 3 | 0.8790 | 0.8829 | 0.9019 | 0.9143 | 0.8562 | 0.8514 | 0.7589 | 0.7672 | 172 | 132 |
| 4 | 0.8733 | 0.8610 | 0.9257 | 0.9295 | 0.8210 | 0.7924 | 0.7508 | 0.7288 | 64 | 86 |
| 5 | 0.9495 | 0.9119 | 0.9914 | 0.9048 | 0.9076 | 0.9190 | 0.9022 | 0.8239 | 879 | 176 |
| 6 | 0.8824 | 0.7586 | 0.8248 | 0.7495 | 0.9400 | 0.7676 | 0.7699 | 0.5172 | 1126 | 720 |
| 7 | 0.9533 | 0.9357 | 0.9781 | 0.9895 | 0.9286 | 0.8819 | 0.9078 | 0.8765 | 213 | 124 |
| All | 0.8100 | 0.7529 | 0.8381 | 0.8162 | 0.7819 | 0.6895 | 0.6210 | 0.5098 | 155 | 352 |
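The accuracy and MCC columns in the table above can be cross-checked against the sensitivity and specificity columns. The sketch below assumes that each test set contains equal numbers of benign and fraudulent samples (an assumption, although it is in line with the balanced datasets described in Section 3.3); under that assumption, accuracy is the mean of sensitivity and specificity, and the Matthews correlation coefficient follows from the per-class rates. The function name is hypothetical.

```python
import math

def accuracy_and_mcc(sensitivity, specificity):
    """Derive accuracy and MCC from sensitivity and specificity.

    Assumes balanced classes, so the confusion matrix can be written
    in per-class rates (each class contributes a total of 1.0).
    """
    tp, fn = sensitivity, 1.0 - sensitivity
    tn, fp = specificity, 1.0 - specificity
    accuracy = (tp + tn) / 2.0
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return accuracy, mcc

# Type 1, column C: sensitivity 0.9476, specificity 0.8057.
acc, mcc = accuracy_and_mcc(0.9476, 0.8057)
```

For this row the derived values agree with the reported accuracy (0.8767) and MCC (0.7610) to within rounding.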
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).