Submitted: 25 July 2024
Posted: 26 July 2024
Abstract
Keywords:
1. Introduction
- wastewater from both circulating and open hydro-ash and slag-removal (HSR) systems of power plants that operate on solid fuels;
- blowdown water from the circulating water supply system of thermal power plants, which is discharged continuously;
- wastewater from water treatment plants, which can be discharged periodically or continuously, including reverse osmosis concentrates, wash water from mechanical filters, and eluates after the regeneration of ion exchange filters;
- blowdown water from steam boilers, evaporators, and steam converters that is discharged continuously;
- snow and rain runoff that contains suspended particles, such as various types of pollutants and petroleum products, including fuel oil;
- oily, contaminated external condensate, suitable for feeding steam evaporators after cleaning;
- spent washing solutions, acidic and alkaline, and wash waters, after chemical washing and preservation, for steam boilers, condensers, heaters, and other equipment, including periodic runoff, typically formed in the summertime”[3].
2. Literature Review
- noisy data and missing values, caused by the specifics of data transmission from sensors;
- imbalanced class distribution, caused by the nature of the events (rare and missed events);
- unlabeled data, caused by the specifics of monitoring exceedances of the pollutant MPC (indirect rather than direct measurement sensors are used);
- lack of metrics for tasks with unlabeled data.
3. Materials and Methods
3.1. Problem Statement
- removal of noise in the data (achieved by the moving average method);
- data normalization for stable operation of the algorithms;
- aggregation of data when it is recorded at unequal time intervals;
- log transformation of the data (to achieve homoscedasticity);
- differencing of the data (to remove the trend component and make the time series stationary).
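The preprocessing steps listed above can be sketched with pandas and numpy, both of which the paper names among its tools. This is a minimal illustration under assumptions that are not in the source: the function name `preprocess`, the order of the steps, the 60-minute aggregation grid, the window of 3 samples, z-score normalization, and the use of `log1p` (so that zero readings stay finite) are all hypothetical choices, and the input series is synthetic.

```python
import numpy as np
import pandas as pd


def preprocess(series: pd.Series, window: int = 3, rule: str = "60min") -> pd.Series:
    """Hypothetical preprocessing chain for one sensor channel.

    `series` must have a DatetimeIndex and non-negative values.
    """
    # Aggregation: bring unequal time intervals onto a regular grid,
    # filling empty bins by linear interpolation (an assumption).
    regular = series.resample(rule).mean().interpolate()
    # Noise removal: simple moving average.
    smooth = regular.rolling(window, min_periods=1).mean()
    # Log transformation to stabilize the variance.
    logged = np.log1p(smooth)
    # Differencing to remove the trend component.
    diffed = logged.diff().dropna()
    # Normalization: z-score is assumed; the source does not name the scheme.
    return (diffed - diffed.mean()) / diffed.std()


def make_irregular_series(n: int = 200, seed: int = 0) -> pd.Series:
    """Synthetic readings at irregular timestamps, for demonstration only."""
    rng = np.random.default_rng(seed)
    gaps = rng.integers(1, 40, size=n)  # minutes between consecutive readings
    index = pd.Timestamp("2000-01-01") + pd.to_timedelta(np.cumsum(gaps), unit="min")
    return pd.Series(10.0 + rng.random(n), index=index)


prepared = preprocess(make_irregular_series())
```

The result is a regularly spaced series with zero mean and unit variance. In practice the order of the steps and the choice of window would depend on the sensor and on the detector that follows.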
3.2. The Isolation Forest
- c(n) = 2H(n − 1) − 2(n − 1)/n if n > 2;
- c(n) = 1 if n = 2;
- c(n) = 0 if n < 2.
- if s is close to 1, then the instance can be considered anomalous with certainty;
- if s is well below 0.5, then the instance can be considered normal;
- if the anomaly score is s ≈ 0.5 for all instances, then the sample has no distinct anomalies.
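To make the scoring rules above concrete, the sketch below computes the anomaly score in the standard form given by Liu et al. in the Isolation Forest papers cited in the References: s = 2^(−E(h)/c(n)), where E(h) is the mean path length of an instance over the trees and c(n) is the average path length of an unsuccessful binary-search-tree lookup among n points. The function names are hypothetical, and the harmonic number is approximated by ln(i) + 0.5772 as in those papers. Whether the authors' implementation follows this form exactly is an assumption.

```python
import math

EULER_GAMMA = 0.5772156649015329


def harmonic(i: int) -> float:
    """Approximate harmonic number H(i) as ln(i) + Euler's constant."""
    return math.log(i) + EULER_GAMMA


def average_path_length(n: int) -> float:
    """c(n): normalizing constant for a subsample of n points."""
    if n > 2:
        return 2.0 * harmonic(n - 1) - 2.0 * (n - 1) / n
    if n == 2:
        return 1.0
    return 0.0


def anomaly_score(mean_path_length: float, n: int) -> float:
    """s = 2 ** (-E(h) / c(n)); requires n >= 2 so that c(n) > 0."""
    return 2.0 ** (-mean_path_length / average_path_length(n))
```

With this definition, an instance isolated almost immediately (mean path length near 0) scores close to 1, an instance whose mean path length equals c(n) scores exactly 0.5, and longer paths push the score below 0.5.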
3.3. Method IFPC (Isolation Forest – Predicates’ Conjunction)
1. Data preparation:
   - data download;
   - data concatenation;
   - preprocessing (normalization) of data.
2. For all sensor channels:
   - identifying anomalous sample instances by the basic method;
   - output of time series with marked anomalies;
   - detection of complex anomalies by the logical operation of conjunction.
3. Extracting and displaying the dates on which complex anomalies were detected.
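The steps above can be sketched as follows. This is an illustration rather than the authors' registered program: it assumes scikit-learn's `IsolationForest` as the basic per-channel detector, a hypothetical `contamination` of 0.05, hypothetical function and column names, and synthetic data in which both channels spike together on one day while one channel also spikes alone.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest


def channel_anomalies(series: pd.Series, contamination: float = 0.05,
                      seed: int = 0) -> pd.Series:
    """Flag anomalous readings of one sensor channel (True = anomalous)."""
    model = IsolationForest(n_estimators=100, contamination=contamination,
                            random_state=seed)
    labels = model.fit_predict(series.to_numpy().reshape(-1, 1))  # -1 marks an anomaly
    return pd.Series(labels == -1, index=series.index)


def ifpc_dates(frame: pd.DataFrame, contamination: float = 0.05) -> list:
    """Dates on which all channels are flagged at the same timestamp."""
    flags = pd.DataFrame({col: channel_anomalies(frame[col], contamination)
                          for col in frame.columns})
    complex_mask = flags.all(axis=1).to_numpy()  # conjunction of the channel predicates
    return sorted({ts.date().isoformat() for ts in flags.index[complex_mask]})


def make_synthetic_frame(seed: int = 0) -> pd.DataFrame:
    """Two synthetic channels of noise with injected spikes, for demonstration only."""
    rng = np.random.default_rng(seed)
    index = pd.date_range("2000-01-01", periods=240, freq="60min")
    frame = pd.DataFrame(rng.normal(size=(240, 2)), index=index,
                         columns=["ch_a", "ch_b"])
    joint = (index >= "2000-01-05 10:00") & (index <= "2000-01-05 12:00")
    frame.loc[joint, ["ch_a", "ch_b"]] += 20.0  # simultaneous spike in both channels
    frame.loc[index[30], "ch_a"] += 20.0        # spike in a single channel only
    return frame


dates = ifpc_dates(make_synthetic_frame())
```

The conjunction is what distinguishes a complex anomaly from a single-sensor glitch: a timestamp is reported only if every channel's detector flags it. Random coincidences between channels remain possible, so the contamination level controls the trade-off between missed events and false alarms.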
- pandas (data processing);
- matplotlib (data visualization);
- numpy (linear algebra);
- scipy.stats (statistics);
- statsmodels (test for stationarity of the series);
- sklearn (machine learning methods);
- pmdarima (ARIMA model);
- keras (deep learning methods).
- an unlabeled dataset for 2020-2021 consisting of 12 .csv files;
- an unlabeled dataset for June-July 2022 in .json format;
- three dates on which exceedances of the pollutant MPC were actually recorded by the laboratory: 7, 8, and 9 June 2022.
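Loading and concatenating data of this shape could look like the sketch below. The source does not describe the file layout, so everything specific here is an assumption: the function name `load_readings`, a `timestamp` column shared by all files, and a JSON file stored as a list of records.

```python
from pathlib import Path

import pandas as pd


def load_readings(csv_dir, json_path, time_col: str = "timestamp") -> pd.DataFrame:
    """Concatenate all CSV files in `csv_dir` with one JSON file of records.

    The column name and the JSON layout are hypothetical.
    """
    csv_frames = [pd.read_csv(path) for path in sorted(Path(csv_dir).glob("*.csv"))]
    # Keep timestamps as strings here and parse them once, after concatenation.
    json_frame = pd.read_json(Path(json_path), orient="records", convert_dates=False)
    merged = pd.concat(csv_frames + [json_frame], ignore_index=True)
    merged[time_col] = pd.to_datetime(merged[time_col])
    return merged.set_index(time_col).sort_index()
```

Sorting by the time index after concatenation keeps the result independent of the order in which the files are read.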
4. Results and Discussion
4.1. 3σPC Method
4.2. k-meansPC Method
4.3. IFPC Method
5. Conclusions
6. Patents
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Masoomi, B.; Sahebi, I.G.; Ghobakhloo, M.; Mosayebi, A. Do industry 5.0 advantages address the sustainable development challenges of the renewable energy supply chain? Sustainable Production and Consumption 2023, 43, 94-112. ISSN 2352-5509.
- Kumar, U.; Kaswan, M.S.; Kumar, R.; Chaudhary, R.; Garza-Reyes, J.A.; Rathi, R.; Joshi, R. A systematic review of Industry 5.0 from main aspects to the execution status. The TQM Journal 2023.
- Wastewater cleaning. Wastewater from thermal power plants. Access mode: http://www.nvbvu.ru/info/category/23012 [Date of reference: 23.02.2024].
- GN 2.1.5.1315-03 Maximum permissible concentrations (MPC) of chemical substances in water bodies of household, drinking and domestic water use. Access mode: https://files.stroyinf.ru/Data2/1/4294815/4294815336.pdf [Date of reference: 23.02.2024].
- Rashevskiy, N.; Sadovnikova, N.; Ereshchenko, T.; Parygin, D.; Ignatyev, A. Atmospheric Ecology Modeling for the Sustainable Development of the Urban Environment. Energies 2023, 16, 1766.
- Daoping, H.; Yiqi, L.; Yan, L. Soft sensor research and its application in wastewater treatment. CIESC J. 2011, 62, 7–15.
- Rehbach, F.; Moritz, S.; Chandrasekaran, S.; Rebolledo, M.; Friese, M.; Bartz-Beielstein, T. GECCO 2018 Industrial Challenge: Monitoring of drinking-water quality. 2018. Access mode: http://www.spotseven.de/wpcontent/uploads/2018/03/ rulesGeccoIc2018.pdf [Date of reference: 23.02.2023].
- Dogo, E.M.; Nwulu, N.I.; Twala, B.; Aigbavboa, C.O. A survey of machine learning methods applied to anomaly detection on drinking-water quality data. Urban Water Journal 2019, 16(3), 235-248.
- Dogo, E.M.; Nwulu, N.I.; Twala, B.; Aigbavboa, C.O. Empirical Comparison of Approaches for Mitigating Effects of Class Imbalances in Water Quality Anomaly Detection. IEEE Access 2020, 8, 218015-218036.
- Muharemi, F.; Logofătu, D.; Leon, F. Machine learning approaches for anomaly detection of water quality on a real-world data set. Journal of Information and Telecommunication 2019, pp. 1-14.
- Ribeiro, V.; Reynoso-Meza, G. Monitoring of drinking-water quality by means of a multi-objective ensemble learning approach. In Proc. of the Genetic and Evolutionary Computation Conference. 2018, pp. 1-2.
- Zhang, W.; Zhao, J.; Quan, P.; Wang, J.; Meng, X.; Li, Q. Prediction of influent wastewater quality based on wavelet transform and residual LSTM. Applied Soft Computing 2023, 148.
- Wu, D.; Wang, H.; Seidu, R. Smart data driven quality prediction for urban water source management. Future Generation Computer Systems 2020, 107, 418-432.
- Al-Gunaid, M.A.; Shcherbakov, M.; Artyushin, V.O.; Shkolny, D.V.; Belov, S.V. Detecting Anomalies in Multidimensional Time Series Using Binary Classification. In: Kravets, A.G., Shcherbakov, M.V., Groumpos, P.P. (eds) Creativity in Intelligent Technologies and Data Science. CIT&DS 2023. Communications in Computer and Information Science, vol 1909. Springer, Cham. 2023.
- Tatbul, N.; Lee, T.J.; Zdonik, S.; Alam, M.; Gottschlich, J. Precision and Recall for Time Series. 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montreal, Canada. 2018.
- Liu, F.T.; Ting, K.M.; Zhou, Z.-H. Isolation Forest. Data Mining. ICDM’08. Eighth IEEE International Conference on. 2008, pp. 413-422.
- Elmasry, W.; Wadi, M. Detection of Faults in Electrical Power Grids Using an Enhanced Anomaly-Based Method. Arab J Sci Eng 2022, 47, 14899–14914.
- Artemov, A.; Burnaev, E. Ensembles of detectors for online detection of transient changes. Eighth International Conference on Machine Vision (ICMV 2015). 2015, Vol. 9875. International Society for Optics and Photonics.
- Rayushkin, E.S.; Scherbakov, M.V.; Kazakov, I.D.; Kolesnikova, V.O. Detection of anomalies in multidimensional time series using R language package. Modeling, optimization and information technology. 2021, 9(3).
- Golovina, A.M.; Diakonov, A.G. Detection of anomalies in operation of mechanisms using machine learning methods. Proceedings of XIX International Conference Analytics and Data Management in Data-intensive areas (DAMDID/RCDL’2017). 2017.
- Tan, S.C.; Ting, K.M.; Liu, T.F. Fast anomaly detection for streaming data. Twenty-Second International Joint Conference on Artificial Intelligence. 2011.
- Liu, F.T.; Ting, K.M.; Zhou, Z.-H. Isolation-based anomaly detection. ACM Transactions on Knowledge Discovery from Data (TKDD). 2012, 6(1), 1–39.
- Barsky, M.E.; Shikov, A.N. Research of Isolation Forest anomaly search algorithm. Fundamental and Applied Scientific Research. 2019, 113–117.
- Certificate of state registration of the computer program “Program for verifying the method of detecting anomalies of multidimensional time series with the predicates’ conjunction condition implementation” No. 2023669725 dated 20.09.2023.











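| Method | Dates obtained | Dates of laboratory-recorded exceedances | Detected / recorded |
|---|---|---|---|
| 3σPC | (image in source) | 2022-06-07, 2022-06-08, 2022-06-09 | 0/3 |
| k-meansPC | (image in source) | 2022-06-07, 2022-06-08, 2022-06-09 | 0/3 |
| IFPC | (image in source) | 2022-06-07, 2022-06-08, 2022-06-09 | 3/3 |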
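The detection counts in the table compare the dates returned by a method with the three laboratory-recorded dates. A minimal sketch of that comparison is given below. The function name `detection_ratio` is hypothetical, and the `detected` list in the usage line is an invented example, not output of any of the methods.

```python
def detection_ratio(detected, recorded):
    """Return the recorded dates that were detected and a 'hits/total' string."""
    recorded_set = set(recorded)
    hits = sorted(set(detected) & recorded_set)
    return hits, f"{len(hits)}/{len(recorded_set)}"


# Laboratory-recorded dates as given in the table.
RECORDED = ["2022-06-07", "2022-06-08", "2022-06-09"]

# Invented example input: one recorded date plus one unrelated date.
example_hits, example_ratio = detection_ratio(["2022-06-08", "2022-07-01"], RECORDED)
```

This count ignores false alarms: dates reported by a method but not confirmed by the laboratory do not lower the ratio, so it should be read together with the full list of detected dates.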
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).


