Submitted: 26 January 2025
Posted: 26 January 2025
Abstract
Keywords:
1. Introduction
- We propose a novel PE-DOCC framework for detecting electricity theft, which combines unsupervised representation learning (i.e., pre-training the Periodicformer encoder to recover partially masked input sequences) with one-class classification (anomaly detection based on the local outlier factor); a schematic code sketch of this pipeline is given after this list;
- To extract richer periodic features, a novel Criss-Cross Periodic Attention (CCPA) mechanism is proposed within the encoder, which jointly considers horizontal (row-wise) and vertical (column-wise) periodic features;
- Extensive ETD experiments on the Irish dataset are conducted to compare the proposed scheme with various state-of-the-art methods. Appropriate metrics are selected for evaluation, and the effectiveness of the proposed ETD scheme is verified.
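As a minimal, hedged sketch of the two-stage pipeline described above (the encoder stand-in, mask ratio, and tensor shapes are illustrative assumptions, not the paper's exact implementation), the first stage recovers randomly masked positions of the load sequence, and the second stage fits LOF on the frozen encoder's representations of normal samples:

```python
import torch
import torch.nn as nn
from sklearn.neighbors import LocalOutlierFactor

# Hypothetical stand-in for the Periodicformer encoder of Section 3.1: any encoder mapping a
# load sequence (batch, L) to a reconstruction (batch, L) and a pooled representation (batch, d).
class Encoder(nn.Module):
    def __init__(self, d_model=128, n_heads=8, n_layers=3):   # d, H, N from the hyper-parameter table
        super().__init__()
        self.proj = nn.Linear(1, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=n_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, 1)                      # reconstruction head (pre-training only)

    def forward(self, x):                                      # x: (batch, L)
        h = self.backbone(self.proj(x.unsqueeze(-1)))          # (batch, L, d_model)
        return self.head(h).squeeze(-1), h.mean(dim=1)         # reconstruction, pooled representation

def pretrain_step(model, optimizer, x, mask_ratio=0.3):
    """One pre-training step: recover partially masked input sequences (assumed MSE on masked positions)."""
    mask = torch.rand_like(x) < mask_ratio                     # mask_ratio is an assumption
    recon, _ = model(x.masked_fill(mask, 0.0))
    loss = ((recon - x)[mask] ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def fit_one_class_classifier(model, normal_x):
    """Stage two: fit LOF on representations of normal training samples
    (k and contamination taken from the paper's hyper-parameter table)."""
    with torch.no_grad():
        _, z = model(normal_x)
    return LocalOutlierFactor(n_neighbors=200, contamination=0.01, novelty=True).fit(z.numpy())
```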
2. Related Work
3. Proposed Framework: PE-DOCC
3.1. Periodicformer Encoder for Unsupervised Representation Learning
3.1.1. Patching Operation
3.1.2. Criss-Cross Periodic Attention
- Period-based dependencies: Inspired by the Auto-Correlation mechanism [19], for an input sequence $\mathcal{X} = \{x_1, \dots, x_L\}$ of length $L$, the autocorrelation can be obtained by Equation 1:

  $$\mathcal{R}_{\mathcal{X}\mathcal{X}}(\tau) = \lim_{L \to \infty} \frac{1}{L} \sum_{t=1}^{L} x_t x_{t-\tau}, \tag{1}$$

  where $\mathcal{R}_{\mathcal{X}\mathcal{X}}(\tau)$ reflects the time-delay similarity between $\{x_t\}$ and its lag series $\{x_{t-\tau}\}$. Taking the Row auto-correlation in Figure 3 as an example, for a single head $h$ and input sequence $\mathcal{X} \in \mathbb{R}^{L \times d}$, after the linear projector we obtain the query $Q$, key $K$, and value $V$. We then treat $Q$ and $K$ as sequences along the row direction, i.e., $Q = \{q_1, \dots, q_L\}$ with $q_t \in \mathbb{R}^{d_h}$, where $d_h$ is the embedding dimension of each head, and $K$ is treated in the same manner. The Row auto-correlation is then obtained by Equation 2:

  $$\mathcal{R}^{\mathrm{row}}_{Q,K}(\tau) = \frac{1}{L} \sum_{t=1}^{L} q_t^{\top} k_{t-\tau}, \tag{2}$$

  where $\mathcal{R}^{\mathrm{row}}_{Q,K}(\tau)$ is the autocorrelation between $Q$ and $K$ at lag $\tau$. The calculation of Column auto-correlation is similar to that of Row auto-correlation, with the only difference being that $Q^{\top}$ and $K^{\top}$ are treated as sequences along the column direction. The formula for Column auto-correlation is provided in Equation 3:

  $$\mathcal{R}^{\mathrm{col}}_{Q,K}(\tau) = \frac{1}{d_h} \sum_{j=1}^{d_h} \tilde{q}_j^{\top} \tilde{k}_{j-\tau}, \tag{3}$$

  where $\tilde{q}_j$ and $\tilde{k}_j$ denote the $j$-th rows of $Q^{\top}$ and $K^{\top}$ (i.e., the columns of $Q$ and $K$), and $\mathcal{R}^{\mathrm{col}}_{Q,K}(\tau)$ is the autocorrelation between the transpose of $Q$ and the transpose of $K$;
- Criss-cross periodic attention: To comprehensively capture the different periodic features extracted by the row and column auto-correlation, we use scaled dot-product attention, with the product of the row and column auto-correlation serving as the new attention weights. For a single head $h$, the Criss-Cross Periodic Attention (CCPA) mechanism is given by Equation 4:

  $$\mathrm{CCPA}_h(Q, K, V) = \mathrm{softmax}\!\left(\frac{\mathcal{R}^{\mathrm{row}}_{Q,K} \cdot \mathcal{R}^{\mathrm{col}}_{Q,K}}{\sqrt{d_h}}\right) V. \tag{4}$$

  A minimal code sketch of this mechanism follows.
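As an illustration only (not the authors' released implementation), the sketch below computes the row and column auto-correlations with an FFT-based autocorrelation, fuses them into lag weights, and aggregates the value series by time delay; the exact fusion of the two lag profiles into attention weights is an assumption made for readability:

```python
import torch
import torch.nn.functional as F

def autocorrelation(a, b):
    """FFT-based autocorrelation between two sequences of shape (batch, length, channels),
    averaged over channels; returns one score per lag, shape (batch, length)."""
    a_fft = torch.fft.rfft(a, dim=1)
    b_fft = torch.fft.rfft(b, dim=1)
    corr = torch.fft.irfft(a_fft * torch.conj(b_fft), n=a.size(1), dim=1)   # (batch, length, channels)
    return corr.mean(dim=-1)

def ccpa_head(q, k, v):
    """Sketch of one Criss-Cross Periodic Attention head; q, k, v: (batch, L, d_h).
    Row autocorrelation scans lags along time (Eq. 2); column autocorrelation scans lags
    along the embedding dimension (Eq. 3); the fusion into attention weights is assumed."""
    L, d_h = q.size(1), q.size(2)
    r_row = autocorrelation(q, k)                                  # (batch, L): lags over time
    r_col = autocorrelation(q.transpose(1, 2), k.transpose(1, 2))  # (batch, d_h): lags over channels
    # Assumed fusion: modulate each time lag's row score by the pooled column score,
    # apply a scaled softmax over lags, then aggregate the value series rolled by each lag.
    lag_weights = F.softmax(r_row * r_col.mean(dim=-1, keepdim=True) / d_h ** 0.5, dim=-1)
    out = torch.zeros_like(v)
    for tau in range(L):                                           # time-delay aggregation
        out = out + lag_weights[:, tau, None, None] * torch.roll(v, shifts=-tau, dims=1)
    return out                                                     # (batch, L, d_h)

# Example usage with random tensors (batch=2, L=336, d_h=16):
# out = ccpa_head(*(torch.randn(2, 336, 16) for _ in range(3)))
```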
3.1.3. Unsupervised Representation Learning
3.2. One-Class Classifier for Anomaly Detection
- Definition 1: k-distance of data point $p$. For a given dataset $D$ and any positive integer $k$, the k-distance of $p$, denoted as $k\text{-}dist(p)$, is defined as the distance $d(p, o)$ between data point $p$ and a data point $o \in D$ such that:
  - for at least $k$ data points $o' \in D \setminus \{p\}$ it holds that $d(p, o') \le d(p, o)$;
  - for at most $k-1$ data points $o' \in D \setminus \{p\}$ it holds that $d(p, o') < d(p, o)$.
- Definition 2: k-distance neighborhood of data point $p$. Given the k-distance of $p$, the k-distance neighborhood of $p$ contains every data point whose distance from $p$ is not greater than the k-distance, as described in Equation 7:

  $$N_k(p) = \{\, q \in D \setminus \{p\} \mid d(p, q) \le k\text{-}dist(p) \,\}. \tag{7}$$
- Definition 3: reachability distance of data point $p$ with respect to data point $o$. Let $k$ be a natural number. The reachability distance of $p$ with respect to $o$ is defined as Equation 8:

  $$\mathrm{reach\text{-}dist}_k(p, o) = \max\{\, k\text{-}dist(o),\; d(p, o) \,\}. \tag{8}$$
- Definition 4: local reachability density of data point $p$. The local reachability density of $p$ is defined as Equation 9:

  $$\mathrm{lrd}_k(p) = \left( \frac{\sum_{o \in N_k(p)} \mathrm{reach\text{-}dist}_k(p, o)}{|N_k(p)|} \right)^{-1}, \tag{9}$$

  where $N_k(p)$ is the k-distance neighborhood of data point $p$.
- Definition 5: local outlier factor of data point $p$. The local outlier factor of $p$ is defined as Equation 10:

  $$\mathrm{LOF}_k(p) = \frac{\sum_{o \in N_k(p)} \frac{\mathrm{lrd}_k(o)}{\mathrm{lrd}_k(p)}}{|N_k(p)|}. \tag{10}$$

  A compact code sketch of these definitions follows.
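For illustration, a brute-force NumPy sketch of Definitions 1-5 under simplifying assumptions (Euclidean metric; the k-distance neighborhood is taken as exactly the k nearest points, ignoring ties); a production system would use scikit-learn's LocalOutlierFactor instead:

```python
import numpy as np

def lof_scores(X, k=20):
    """Brute-force LOF following Definitions 1-5: k-distance, k-distance neighborhood,
    reachability distance, local reachability density, and local outlier factor.
    X: (n_samples, n_features). Returns one LOF score per sample (larger = more outlying)."""
    n = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)    # pairwise distances
    np.fill_diagonal(dist, np.inf)                                    # exclude the point itself
    order = np.argsort(dist, axis=1)
    neighbors = order[:, :k]                                          # k-distance neighborhood (Eq. 7)
    k_dist = dist[np.arange(n), order[:, k - 1]]                      # k-distance (Definition 1)

    # Reachability distance (Eq. 8): reach_dist_k(p, o) = max(k_dist(o), d(p, o))
    reach = np.maximum(k_dist[neighbors], dist[np.arange(n)[:, None], neighbors])

    # Local reachability density (Eq. 9): inverse of the mean reachability distance
    lrd = 1.0 / (reach.mean(axis=1) + 1e-12)

    # Local outlier factor (Eq. 10): average ratio of neighbors' lrd to the point's own lrd
    return (lrd[neighbors] / lrd[:, None]).mean(axis=1)

# Example: scores close to 1 indicate inliers; clearly larger values indicate outliers.
# scores = lof_scores(np.random.rand(500, 48), k=20)
```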
4. Results
4.1. Dataset
4.2. Attack Model
4.3. Evaluation Metrics
4.4. Analysis of Model Hyper-Parameters
4.5. Comparison with Other Methods
- OCSVM [24]: OCSVM is a support vector machine-based algorithm designed for anomaly detection. It separates normal samples from anomalies by maximizing the margin between the data and the origin in the feature space. We set the kernel function to 'linear' and its parameter to 0.2;
- iForest [25]: iForest is an efficient unsupervised anomaly detection method based on an ensemble of randomly built isolation trees. It isolates anomalies by exploiting their tendency to be separated from the majority of the data with fewer random partitions. We set the contamination, maxFeatures and nEstim parameters to 0.1, 0.6 and 300, respectively;
- LOF [26]: LOF is a density-based anomaly detection algorithm. It detects outliers by comparing the local density of a sample with the densities of its neighbors, identifying anomalies as points with significantly lower density. We set the contamination rate to 0.1 while keeping the other parameters at their defaults (a configuration sketch of these three baselines is given after this list);
- Overall, the combination of the Periodicformer encoder and LOF achieved the best performance, with F1, AUC, Recall, and FPR of 0.833, 0.973, 0.877, and 0.025, respectively. Compared to the second-best approach, the combination of Autoencoder and LOF (F1, AUC, Recall, and FPR of 0.814, 0.952, 0.856, and 0.027, respectively), it achieved relative improvements of 2.3%, 1.7%, and 2.4% in F1, AUC, and Recall, and an 8% relative reduction in FPR;
- The three approaches that rely solely on the OC classifier exhibit relatively poor anomaly detection performance, with some even failing (defined as AUC or Recall values below 0.5). This observation aligns with the findings reported in Reference [27]. This suggests that the one-class classifier fails to effectively capture the distribution of normal samples in the training dataset, making it difficult to distinguish between normal and anomalous samples in the test set. This underscores the importance of designing a robust unsupervised representation learning method to extract meaningful features, aiding the one-class classifier in better modeling the distribution of normal samples and improving anomaly detection performance;
- Comparing the proposed combination of the Periodicformer encoder and an OC classifier with the corresponding Autoencoder-based combination (specifically, Periodicformer encoder + LOF versus Autoencoder + LOF), the Periodicformer encoder is better suited to modeling time-series data. In addition, our proposed criss-cross periodic attention further enhances the model's ability to capture richer periodic features, which helps distinguish normal data from anomalous data. In contrast, the Autoencoder relies solely on CNNs to capture local features and ignores the periodicity of time-series data, resulting in poorer performance;
- In the three combinations of the Periodicformer encoder with different OC classifiers, the scheme combined with iForest yields the poorest performance. According to Reference [28], iForest is sensitive only to global outliers and struggles with detecting local outliers. The distribution of power theft behavior data is particularly unfavorable for iForest detection. For instance, when considering an entire year (365 days), power theft typically occurs on specific days or weeks. These anomalies may not be noticeable when viewed in the context of the whole year. However, when compared to adjacent days or weeks, these outliers may become more apparent. This is precisely where iForest’s limitations lie. Regarding the combination with OCSVM, Reference [29] states that OCSVM is sensitive to noise and prone to false positives, which aligns with our experimental findings. Although it achieves a high recall, it also results in a relatively high false positive rate.
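For reference, a hedged scikit-learn sketch of the baseline configurations listed above; the mapping of the paper's 'maxFeatures', 'nEstim', and the unnamed OCSVM parameter onto scikit-learn's max_features, n_estimators, and nu is an assumption:

```python
from sklearn.svm import OneClassSVM
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

# OCSVM baseline: linear kernel; 0.2 assumed to be the nu parameter.
ocsvm = OneClassSVM(kernel="linear", nu=0.2)

# iForest baseline: contamination 0.1, max_features 0.6, 300 estimators (assumed parameter mapping).
iforest = IsolationForest(contamination=0.1, max_features=0.6, n_estimators=300)

# LOF baseline: contamination 0.1, remaining parameters left at their defaults.
lof = LocalOutlierFactor(contamination=0.1, novelty=True)

# Each detector is fit on (representations of) normal training samples and then used to flag
# test samples; with novelty=True, LOF also exposes predict()/decision_function().
# for model in (ocsvm, iforest, lof):
#     model.fit(train_features)
#     labels = model.predict(test_features)   # +1 = normal, -1 = anomaly
```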
4.6. Ablation Analysis
- RA-DOCC: This method uses a combination of a Periodicformer encoder with only row auto-correlation and LOF;
- CA-DOCC: This method uses a combination of a Periodicformer encoder with only column auto-correlation and LOF;
- Vanilla-DOCC: This method uses a combination of a vanilla Transformer encoder [30] and LOF.
- RA-DOCC and CA-DOCC produce comparable experimental results, and both show notable improvements over Vanilla-DOCC. Specifically, F1, AUC, Recall, and FPR improve by approximately 5%, 1%, 4%, and 34%, respectively. This suggests that row auto-correlation and column auto-correlation are both effective in capturing periodic features, which helps the model distinguish anomalous samples;
- Our proposed Criss-cross periodic attention effectively integrates multiple periodic features. Compared to the vanilla Transformer encoder, F1, AUC, Recall, and FPR have improved by 15.3%, 3.2%, 12.5%, and 104%, respectively.
5. Discussion and Conclusions
References
- Xia, X.; Xiao, Y.; Liang, W.; Cui, J. Detection Methods in Smart Meters for Electricity Thefts: A Survey. Proceedings of the IEEE 2022, 110, 273–319.
- Yan, Z.; Wen, H. Performance Analysis of Electricity Theft Detection for the Smart Grid: An Overview. IEEE Transactions on Instrumentation and Measurement 2022, 71, 1–28.
- Stracqualursi, E.; Rosato, A.; Di Lorenzo, G.; Panella, M.; Araneo, R. Systematic review of energy theft practices and autonomous detection through artificial intelligence methods. Renewable and Sustainable Energy Reviews 2023, 184, 113544.
- Xia, X.; Lin, J.; Jia, Q.; Wang, X.; Ma, C.; Cui, J.; Liang, W. ETD-ConvLSTM: A Deep Learning Approach for Electricity Theft Detection in Smart Grids. IEEE Transactions on Information Forensics and Security 2023, 18, 2553–2568.
- Zhu, Y.; Zhang, Y.; Liu, L.; Liu, Y.; Li, G.; Mao, M.; Lin, L. Hybrid-Order Representation Learning for Electricity Theft Detection. IEEE Transactions on Industrial Informatics 2023, 19, 1248–1259.
- Liao, W.; Yang, Z.; Liu, K.; Zhang, B.; Chen, X.; Song, R. Electricity Theft Detection Using Euclidean and Graph Convolutional Neural Networks. IEEE Transactions on Power Systems 2023, 38, 3514–3527.
- Takiddin, A.; Ismail, M.; Zafar, U.; Serpedin, E. Deep Autoencoder-Based Anomaly Detection of Electricity Theft Cyberattacks in Smart Grids. IEEE Systems Journal 2022, 16, 4106–4117.
- Liang, Q.; Zhao, S.; Zhang, J.; Deng, H. Unsupervised BLSTM-Based Electricity Theft Detection with Training Data Contaminated. ACM Transactions on Cyber-Physical Systems 2024, 8.
- Huang, Y.; Xu, Q. Electricity theft detection based on stacked sparse denoising autoencoder. International Journal of Electrical Power & Energy Systems 2021, 125, 106448.
- Cai, Q.; Li, P.; Wang, R. Electricity theft detection based on hybrid random forest and weighted support vector data description. International Journal of Electrical Power & Energy Systems 2023, 153, 109283.
- Senol, N.S.; Baza, M.; Rasheed, A.; Alsabaan, M. Privacy-Preserving Detection of Tampered Radio-Frequency Transmissions Utilizing Federated Learning in LoRa Networks. Sensors 2024, 24.
- Yi, X.; Yang, X.; Huang, Y.; Ke, S.; Zhang, J.; Li, T.; Zheng, Y. Gas-Theft Suspect Detection Among Boiler Room Users: A Data-Driven Approach. IEEE Transactions on Knowledge and Data Engineering 2022, 34, 5796–5808.
- Barbariol, T.; Susto, G.A. TiWS-iForest: Isolation forest in weakly supervised and tiny ML scenarios. Information Sciences 2022, 610, 126–143.
- Tirulo, A.; Chauhan, S.; Issac, B. Ensemble LOF-based detection of false data injection in smart grid demand response system. Computers and Electrical Engineering 2024, 116, 109188.
- Liang, D.; Wang, J.; Gao, X.; Wang, J.; Zhao, X.; Wang, L. Self-supervised Pretraining Isolated Forest for Outlier Detection. In Proceedings of the 2022 International Conference on Big Data, Information and Computer Network (BDICN), 2022; pp. 306–310.
- Kim, C.; Chang, S.Y.; Kim, J.; Lee, D.; Kim, J. Automated, Reliable Zero-Day Malware Detection Based on Autoencoding Architecture. IEEE Transactions on Network and Service Management 2023, 20, 3900–3914.
- Cui, X.; Liu, S.; Lin, Z.; Ma, J.; Wen, F.; Ding, Y.; Yang, L.; Guo, W.; Feng, X. Two-Step Electricity Theft Detection Strategy Considering Economic Return Based on Convolutional Autoencoder and Improved Regression Algorithm. IEEE Transactions on Power Systems 2022, 37, 2346–2359.
- Zeng, A.; Chen, M.; Zhang, L.; Xu, Q. Are Transformers Effective for Time Series Forecasting? In Proceedings of the AAAI Conference on Artificial Intelligence, 2023; Vol. 37, pp. 11121–11128.
- Wu, H.; Xu, J.; Wang, J.; Long, M. Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting. In Advances in Neural Information Processing Systems; Ranzato, M.; Beygelzimer, A.; Dauphin, Y.; Liang, P.; Vaughan, J.W., Eds.; Curran Associates, Inc., 2021; Vol. 34, pp. 22419–22430.
- Tuli, S.; Casale, G.; Jennings, N.R. TranAD: Deep Transformer Networks for Anomaly Detection in Multivariate Time Series Data. Proceedings of the VLDB Endowment 2022, 15, 1201–1214.
- Commission for Energy Regulation (CER). CER Smart Metering Project—Electricity Customer Behaviour Trial, 2009–2010; 2012.
- Qi, R.; Zheng, J.; Luo, Z.; Li, Q. A Novel Unsupervised Data-Driven Method for Electricity Theft Detection in AMI Using Observer Meters. IEEE Transactions on Instrumentation and Measurement 2022, 71, 1–10.
- Zerveas, G.; Jayaraman, S.; Patel, D.; Bhamidipaty, A.; Eickhoff, C. A Transformer-based Framework for Multivariate Time Series Representation Learning. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD '21), New York, NY, USA, 2021; pp. 2114–2124.
- Schölkopf, B.; Platt, J.C.; Shawe-Taylor, J.; Smola, A.J.; Williamson, R.C. Estimating the support of a high-dimensional distribution. Neural Computation 2001, 13, 1443–1471.
- Liu, F.T.; Ting, K.M.; Zhou, Z.H. Isolation Forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, 2008; pp. 413–422.
- Breunig, M.M.; Kriegel, H.P.; Ng, R.T.; Sander, J. LOF: identifying density-based local outliers. SIGMOD Record 2000, 29, 93–104.
- Kim, C.; Chang, S.Y.; Kim, J.; Lee, D.; Kim, J. Automated, Reliable Zero-Day Malware Detection Based on Autoencoding Architecture. IEEE Transactions on Network and Service Management 2023, 20, 3900–3914.
- Cheng, Z.; Zou, C.; Dong, J. Outlier detection using isolation forest and local outlier factor. In Proceedings of the Conference on Research in Adaptive and Convergent Systems (RACS '19), New York, NY, USA, 2019; pp. 161–168.
- Lu, T.; Wang, L.; Zhao, X. Review of Anomaly Detection Algorithms for Data Streams. Applied Sciences 2023, 13.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is All you Need. In Advances in Neural Information Processing Systems; Guyon, I.; Luxburg, U.V.; Bengio, S.; Wallach, H.; Fergus, R.; Vishwanathan, S.; Garnett, R., Eds.; Curran Associates, Inc., 2017; Vol. 30.






| Descriptions | Values |
|---|---|
| Number of smart meters | 25,730 |
| Time span | 14/7/2009–31/12/2010 |
| Recording interval | 30 minutes |
| Attack Type | Modification |
|---|---|
| 1 | |
| 2 | |
| 3 | |
| 4 | |
| 5 | |
| 6 | |
| Samples | Actually positive | Actually negative |
|---|---|---|
| Predicted positive | TP | FP |
| Predicted negative | FN | TN |
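From this confusion-matrix convention, the threshold-based metrics reported in the experiments can be computed as follows (a small helper using the standard definitions; AUC is not derivable from these counts alone, since it requires continuous anomaly scores):

```python
def detection_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """F1, Recall, and FPR from the confusion-matrix counts defined in the table above."""
    recall = tp / (tp + fn)                       # true positive rate
    fpr = fp / (fp + tn)                          # false positive rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {"F1": f1, "Recall": recall, "FPR": fpr}
```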
| Hyper-parameter | Range of values | Optimal value |
|---|---|---|
| k | 100, 200, 300 | 200 |
| contamination rate | 0.01, 0.1, 0.2 | 0.01 |
| distance metric | 'manhattan', 'euclidean' | 'euclidean' |
| d | 64, 128 | 128 |
| H | 8, 16 | 8 |
| N | 2, 3, 4 | 3 |
| Model for unsupervised representation learning | OC classifier | F1 | AUC | Recall | FPR |
|---|---|---|---|---|---|
| - | OCSVM | 0.464 | 0.893 | 0.816 | 0.217 |
| - | iForest | - | - | - | - |
| - | LOF | - | - | - | - |
| Periodicformer encoder | OCSVM | 0.517 | 0.940 | 0.918 | 0.189 |
| Periodicformer encoder | iForest | 0.434 | 0.860 | 0.752 | 0.194 |
| Periodicformer encoder | LOF | 0.833 | 0.973 | 0.877 | 0.025 |
| Autoencoder | OCSVM | 0.505 | 0.917 | 0.890 | 0.209 |
| Autoencoder | iForest | 0.420 | 0.840 | 0.743 | 0.223 |
| Autoencoder | LOF | 0.814 | 0.952 | 0.856 | 0.027 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).