Power Transformer Insulation Assessment based on Oil-Paper Measurement Data using SVM-Classifier

The condition of oil-immersed paper insulation is a crucial aspect of power transformer life diagnostics. Measurement databases collected over the years have made it possible for researchers to apply classification analysis to in-service power transformers. This article presents a classification analysis of transformer oil-immersed paper insulation condition. Measurement data (dielectric characteristics, dissolved gas analysis, and furanic compounds) of 149 transformers with a primary voltage of 150 kV were gathered and analyzed. The classification model was developed using the Support Vector Machine (SVM) algorithm and was trained and tested on separate datasets. Several models were created and the best one was chosen, yielding 90.63% accuracy in predicting the oil-immersed paper insulation condition. The model was then applied to classify the oil-paper condition of 19 transformers for which furan data are not available. The classification results were combined, reviewed, and compared with conventional assessment methods and standards, confirming that the developed model can classify the current oil-paper condition based on dissolved gases and dielectric characteristics.


Introduction
Oil-impregnated paper is a commonly used insulator in power transformers. Evaluating the degradation of paper insulation in an oil-filled transformer is critical because of the importance of power transformers in the electrical supply chain. While the condition of oil insulation can be monitored easily, assessing the state of paper insulation is more difficult because the paper is wrapped around the conductors and cannot be sampled without taking the transformer out of service [1]. Different diagnostic methods using Dissolved Gas Analysis (DGA) and ageing estimation from loading history have been used. The use of 2FAL (2-furaldehyde) as a specific chemical indicator of paper insulation ageing has received increased attention in the last 20 years [2].
The degradation of cellulose paper insulation in oil-filled power transformers is promoted by four agents: exposure to elevated temperature, oxygen, acid, and moisture. The corresponding degradation processes are thermal degradation, oxidation, and hydrolysis. These processes cause chain scission (depolymerization), which decreases the tensile strength of the paper and yields glucose. The glucose degrades further to form furans and other chemical products such as water and gases. The advantage of furan assessment is that furans are not produced by oil degradation.
Although furan is the most accessible yet reliable indicator for transformer paper assessment, this measurement is not performed periodically by the utility. Another inexpensive approach is therefore needed to find out the current condition of the paper insulation. A machine learning algorithm can be employed to model the current transformer paper condition level, and several studies have explored this possibility. Transformer remnant life prediction using furan in [3] used Fuzzy Logic. K-NN and Decision Tree based classification was used to predict transformer furan level in [4]. ANFIS was used to predict the Degree of Polymerization and then estimate the expected life of the transformer in [5], and a simple multiple regression model was also developed for comparison with the ANFIS model in [6].
SVM is one of the commonly adopted machine learning algorithms for data classification [7] [8] [9]. SVM was used in [10] to forecast electric load along with other algorithms such as Fuzzy Time Series and Global Harmony Search. A computational model was developed to estimate the mass concentration of boiler flue gas in [11]. The study in [12] implemented SVM to classify simulation results in defining the synchronization capability limits of a permanent-magnet motor. In power transformer diagnosis, SVM was implemented in [13] [14] [15] and [16] for fault detection. Several machine learning approaches were used in [17], where SVM was utilized along with Decision Tree, ANN, KNN, and Naïve Bayes to assess transformer furan content. That study reported relatively low accuracy for the SVM classifier.
There is no single scientific method available to determine the condition or end-of-life of an operating power transformer; a combination of analytical, inspection, and testing methods, when used together, helps form a complete picture of the condition of units in service [18]. This article implements classification analysis using SVM as an additional insight to help utilities assess transformer oil-impregnated paper insulation condition from transformer oil measurement data. The main issues in developing an SVM classification model are discussed, namely data preparation, feature selection, and model validation with different models to find the one that meets the intended accuracy level. The proposed model is then compared with conventional methods and standards to validate the classification result.

Methodology
This section presents the step-by-step methodology of SVM classifier model development. It covers the attributes observed, the guidelines for 2FAL assessment, the model development flowchart, preprocessing and outlier elimination, and the SVM classifier algorithm.

Sample
Measurement data (dissolved gases, oil dielectric characteristics, and furanic compounds) of 149 in-service transformers were gathered. Figure 1 shows one of the transformers observed in this study. All of the measurement data are from three-phase power transformers with a primary voltage of 150 kV and an operating life of 3 to 44 years.

Dielectric Characteristics
The characteristics of the transformer oil insulation were measured and interpreted based on [21]. They consist of Breakdown Voltage in kV (IEC 60156), Water Content in ppm (IEC 60814), Acidity in mg KOH/g (IEC 62021), Interfacial Tension in dyne/cm (ASTM D971), and Color Scale (ISO 2049).

Furanic Compounds
Furans are part of the degradation products of cellulose insulation paper in transformers, and they are partially soluble in the insulation fluid [2]. The five furanic compounds most often measured are 2-furaldehyde (2FAL), 5-methyl-2-furaldehyde (5M2F), 5-hydroxymethyl-2-furaldehyde (5H2F), 2-acetyl furan (2ACF), and 2-furfurol (2FOL). 2FAL is considered the main compound among these because of its higher generation rate and stability inside a transformer. 2FAL is usually correlated with the Degree of Polymerization (DP). Paper with an initial DP value of approximately 1000 is expected to last the lifetime of the transformer (25-30 years), but a DP of 150-250 is regarded as the end-of-life criterion for the transformer insulation because the paper is then at risk of mechanical failure [22].

Analysis Methods
2FAL is the most accessible measurement for assessing the insulation paper of a power transformer; however, 2FAL is not a routine test. This subsection discusses the methods of assessing oil-immersed paper in power transformers when furan measurements are available, and of using an SVM classifier when no furan measurement is available.

Determining Oil-Paper Condition based on Measurement Data
Table 1 shows the guidelines used for assessing the significance of 2FAL measurements, as used by several publications [23] [3] [24]. It shows the correlation between 2FAL and the Degree of Polymerization, together with the extent of degradation. Each 2FAL measurement falls into one of the categories in Table 1: 'Healthy', 'Moderate', 'Extensive', and 'End of Life'. When the degree of polymerization of transformer paper reaches a value of 250 or lower, the paper is considered to have lost its mechanical strength and the transformer has reached its end of life. Table 2 and Figure 2 show the number of transformer measurement data that fall into each category.
Support Vector Machine (SVM) is an efficient algorithm in learning theory, especially for classification problems. The classic SVM was introduced with polynomial kernels by Boser et al. in [25], and with general kernels by Cortes and Vapnik in [26]. The linear programming SVM is important because of its linearity and flexibility for large data settings [27]. SVM is a powerful supervised learning algorithm used for classification and regression analysis. It is known to be efficient, particularly in large classification problems, because SVM training is carried out so that the dimension of the classified vectors does not have a distinct influence on its performance. SVM therefore has the potential to handle very large feature spaces. SVM-based classifiers are also claimed to have good generalization properties compared with conventional classifiers, because in training an SVM classifier the structural misclassification risk is minimized, whereas traditional classifiers are usually trained so that the empirical risk is minimized [15].
SVM is one of the standard algorithms for data mining and machine learning, based on advances in statistical learning theory. Various binary classification methods are used for multi-category classification, such as 'one-against-all' and 'one-against-one' [28]. The linear programming SVM classifier is especially efficient for very large samples, but little is known about its convergence compared with the well-understood quadratic programming SVM classifier [27]. This study compares the classification accuracy of linear and quadratic SVM classifiers. Three of the categories shown in Table 1, 'Healthy', 'Moderate', and 'Extensive', are the target categories for the SVM classifier. The 'End of Life' category is not included because none of the collected transformer measurement data fell into it. Linear programming means the algorithm is based on linear programming optimization, while quadratic programming means it is based on quadratic programming optimization [29]. The difference between the linear and quadratic SVM is thoroughly discussed in [27]. In the linear programming formulation of Vapnik [29] (equation 1), $\alpha = (\alpha_1, \dots, \alpha_m)$ and the $\xi_i$ are slack variables. The trade-off parameter $C = C(m) > 0$ depends on $m$ and is crucial. If $(\alpha_z = (\alpha_{1,z}, \dots, \alpha_{m,z}),\ b_z)$ solves the optimization problem (equation 1), the Linear-SVM classifier is given by $\mathrm{sgn}(f_z)$, with $f_z$ defined in equation 2.
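Equations 1 and 2 are not reproduced in the text above. As a sketch only, assuming they follow the linear programming SVM formulation analyzed in [27] (this is an assumption, not a confirmed transcription), the optimization problem and the resulting decision function would take the following form, where $z = \{(x_i, y_i)\}_{i=1}^{m}$ denotes the training sample with labels $y_i \in \{-1, 1\}$ and $K$ is the Mercer kernel:

```latex
% Assumed form of equation 1: linear programming SVM (after [27])
\begin{equation}
\min_{\alpha \in \mathbb{R}^{m}_{+},\; b \in \mathbb{R}} \;
\frac{1}{m} \sum_{i=1}^{m} \xi_i + \frac{1}{C} \sum_{i=1}^{m} \alpha_i
\quad \text{subject to} \quad
y_i \Bigl( \sum_{j=1}^{m} \alpha_j y_j K(x_i, x_j) + b \Bigr) \ge 1 - \xi_i,
\quad \xi_i \ge 0, \quad i = 1, \dots, m .
\end{equation}

% Assumed form of equation 2: decision function built from the solution
\begin{equation}
f_z(x) = \sum_{i=1}^{m} \alpha_{i,z} \, y_i \, K(x, x_i) + b_z .
\end{equation}
```

In this form the objective is linear in both $\alpha$ and $\xi$, which is what separates it from the quadratic programming SVM, whose regularization term is quadratic in the coefficients.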

Classification Model Flowchart
Figure 3 shows the process of developing the classification analysis in this study. First, the measurement data were accessed and explored. These data include the transformer profile, dissolved gases, dielectric characteristics, and furanic compounds. The data from different sources were then converted to a common format. Outliers were eliminated using a one-class SVM, and the inlier data were separated into training and testing datasets.

Results and Discussion
This section presents the results of SVM model development for the classification analysis of transformer paper insulation condition. It covers data preparation, the classification result, and model validation.

Data Preprocessing
The measurement data gathered to develop the classification model consist of dielectric characteristics and dissolved gases, with 15 attributes in total. Before developing the model, the attributes were ranked by ANOVA and chi-squared criteria:
• Analysis of variance (ANOVA): the difference between the average values of the feature in different classes, used to find out whether an attribute is significant for model development. Steps for the ANOVA calculation [30]:
a. Calculate the correction factor using equation 5.
...
f. Calculate the mean square of error, $MS_{Error}$, using equation 10.
g. Use equation 11 to calculate the Variance Ratio (V.R.).
• Chi-squared: the dependence between the feature and the class as measured by the chi-square statistic, calculated using equation 8.
Table 3 shows the rank of attributes based on ANOVA and chi-squared. Color has the highest ANOVA and chi-squared scores, followed by IFT, CO, CO2, the accumulation of CO+CO2, TDCG, acidity, and the other attributes. This rank is then used for attribute selection in SVM model development.
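The two ranking criteria can be illustrated with a small worked sketch. It assumes the textbook one-way ANOVA sequence (correction factor, sums of squares, mean squares, variance ratio) and Pearson's chi-squared statistic; the intermediate steps, the function names, and the toy numbers are assumptions for illustration and are not taken from the study's data.

```python
def variance_ratio(groups):
    """One-way ANOVA variance ratio (V.R.) for a single attribute.

    groups: one list of measurements per class (e.g. Healthy, Moderate, Extensive).
    """
    values = [x for group in groups for x in group]
    n_total = len(values)
    n_groups = len(groups)

    # Correction factor: (sum of all observations)^2 / N
    correction = sum(values) ** 2 / n_total
    ss_total = sum(x * x for x in values) - correction
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_error = ss_total - ss_between

    ms_between = ss_between / (n_groups - 1)
    ms_error = ss_error / (n_total - n_groups)
    return ms_between / ms_error


def chi_squared(observed, expected):
    """Pearson's cumulative test statistic: sum of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Toy values only (hypothetical, not measurements from the study)
vr = variance_ratio([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
chi2_stat = chi_squared([10, 20, 30], [20, 20, 20])
print(vr, chi2_stat)
```

An attribute whose class means differ strongly relative to the within-class scatter gets a large variance ratio, which is the sense in which the ranking favors attributes such as Color and IFT.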

Data Reduction: Eliminating Outliers
All 149 collected transformer measurement records were analyzed in the Orange Data Mining program to find outliers using a one-class SVM with a non-linear (RBF) kernel. This is an unsupervised learning algorithm that learns a decision function for novelty detection: it classifies new data as similar or different to the training set [31]. The inlier data from this process (102 records) were used for SVM model development and validation.
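The outlier elimination step can be sketched as follows. The study used Orange; this sketch swaps in scikit-learn's `OneClassSVM` instead, and the `nu` and `gamma` values, the scaling step, the helper name `inlier_mask`, and the synthetic data are all assumptions rather than the authors' settings.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM


def inlier_mask(X, nu=0.1, gamma="scale"):
    """Return a boolean mask that is True for records kept as inliers."""
    # Scale attributes so that no single unit (ppm, kV, ...) dominates the RBF kernel
    X_scaled = StandardScaler().fit_transform(X)
    detector = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma)
    labels = detector.fit_predict(X_scaled)  # +1 for inliers, -1 for outliers
    return labels == 1


# Synthetic stand-in for 149 records with 4 attributes (hypothetical data)
rng = np.random.default_rng(0)
X = rng.normal(size=(149, 4))
X[:5] += 8.0  # a few records placed far from the bulk of the data

mask = inlier_mask(X)
X_inliers = X[mask]
print(X_inliers.shape)
```

The `nu` parameter bounds the fraction of training records treated as outliers, so it directly controls how many records are discarded before model development.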

The inlier data resulting from outlier elimination, 102 sets of 150 kV transformer testing measurements, consist of three paper condition categories as shown in Table 2 and Figure 2. There are 54 transformers in the 'Healthy' category, 39 in the 'Moderate' category, and 9 in the 'Extensive' category.
The measurement data were then divided into two datasets, with 70 transformers in the training dataset and 32 transformers in the testing dataset. The configuration of the training and testing data is shown in Table 4.
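A split of this kind can be sketched with a per-class (stratified) draw. Whether the authors stratified the split is an assumption; the helper name `stratified_split` and the seed are hypothetical. Only the class sizes (54, 39, 9) and the 70/32 partition come from the text.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction, seed=0):
    """Split record indices into train/test sets, class by class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Class sizes of the 102 inlier records reported in the text
labels = ["Healthy"] * 54 + ["Moderate"] * 39 + ["Extensive"] * 9
train_idx, test_idx = stratified_split(labels, test_fraction=32 / 102)

test_counts = {c: sum(labels[i] == c for i in test_idx) for c in set(labels)}
print(len(train_idx), len(test_idx), test_counts)
```

With these class sizes the proportional draw places 17 'Healthy', 12 'Moderate', and 3 'Extensive' records in the testing set, which is consistent with the per-class testing totals reported for the selected model.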

SVM Classification Model Development
Three categories of transformer paper degradation level, 'Healthy', 'Moderate', and 'Extensive', were used as the target classes. The attributes included were dissolved gases and dielectric characteristics, 15 in total. The attribute selection is shown in Table 5 and is based on the rank discussed in subsection 3.1. The developed model is capable of classifying new transformer measurement data, with 16 out of 17 transformers in the 'Healthy' class correctly classified, 11 out of 12 in the 'Moderate' class, and 2 out of 3 in the 'Extensive' class.
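Comparing a linear and a quadratic SVM on a held-out set can be sketched as below. This assumes the two variants are realized as a linear kernel and a degree-2 polynomial kernel in scikit-learn's `SVC`, which is one interpretation and is not confirmed by the source. The synthetic data, the helper name `holdout_accuracy`, and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def holdout_accuracy(svc_params, X_train, y_train, X_test, y_test):
    """Fit a scaled SVM classifier and return its accuracy on held-out data."""
    model = make_pipeline(StandardScaler(), SVC(**svc_params))
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


# Synthetic stand-in: 102 records, 4 attributes, 3 classes (hypothetical data)
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 34)  # 0 = Healthy, 1 = Moderate, 2 = Extensive
X = 3.0 * y[:, None] + rng.normal(size=(102, 4))
order = rng.permutation(102)
train, test = order[:70], order[70:]

candidates = {
    "linear": {"kernel": "linear"},
    "quadratic": {"kernel": "poly", "degree": 2, "coef0": 1.0},
}
scores = {
    name: holdout_accuracy(params, X[train], y[train], X[test], y[test])
    for name, params in candidates.items()
}
print(scores)
```

Repeating this loop over different attribute subsets gives one accuracy per model, from which the best-performing combination can be selected.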

Application of the Model Developed
The model created has an accuracy of 90.63% in classifying transformer oil-paper condition into three classes: Healthy, Moderate, and Extensive. The model was then applied to the data of 19 transformers with no furan measurements to classify their oil-paper insulation condition. Table 7 shows the classification results, in which 8 transformers were classified as Healthy, 6 as Moderate Ageing, and 5 as Extensive Ageing.
The classification results were validated using conventional methods, such as the CO2/CO ratio, the individual levels of CO and CO2, and the limit of each oil characteristic.
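As a quick arithmetic check, the reported accuracy can be recomputed from the per-class testing counts given earlier (16 of 17, 11 of 12, and 2 of 3). The variable names are illustrative; half-up rounding is assumed in order to match the reported two-decimal figure.

```python
from decimal import Decimal, ROUND_HALF_UP

# (correctly classified, total) per class in the testing dataset, as reported in the text
results = {"Healthy": (16, 17), "Moderate": (11, 12), "Extensive": (2, 3)}

correct = sum(hit for hit, _ in results.values())
total = sum(n for _, n in results.values())
accuracy = correct / total  # overall fraction of correct classifications

# Per-class fraction of correctly classified transformers
per_class = {name: hit / n for name, (hit, n) in results.items()}

# Two-decimal percentage with half-up rounding
accuracy_percent = (Decimal(100 * correct) / Decimal(total)).quantize(
    Decimal("0.01"), rounding=ROUND_HALF_UP
)
print(correct, total, accuracy_percent)
```

The overall figure is 29 correct out of 32, while the per-class values show that the 'Extensive' class, with only three testing records, is the least reliably estimated.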
Among the dissolved gas parameters, Figure 5 shows that CO and CO2 are both caused by overheating of cellulose. Since the focus of this study is the condition of oil-immersed paper insulation in transformers, only these two gases were considered correlated, as supported by the attribute rank in Table 3. The polymeric chains of solid cellulosic insulation (paper, pressboard, wood blocks) contain a large number of anhydroglucose rings, and weak C-O molecular bonds and glycosidic bonds which are thermally less stable than the hydrocarbon bonds in oil, and which decompose at lower temperatures. Significant rates of polymer chain scission occur at temperatures higher than 105 °C, with complete decomposition and carbonization above 300 °C [20].
The CO2/CO ratio based on IEC 60599 [20] is an indicator of the thermal decomposition of cellulose. As the magnitude of CO increases, the CO2/CO ratio decreases, which may indicate an abnormality that is degrading the cellulosic insulation [32]. A CO2/CO ratio of less than 3 is generally considered an indication of a paper fault with some degree of carbonization [20]. According to [21], the 150 kV transformers observed in this study are in Category B, which covers power transformers with a nominal system voltage above 72.5 kV and up to and including 170 kV. Table 9 shows the recommended limits for the dielectric characteristics of mineral insulating oils.
Transformers number 18 and 19 (TRF #18 and #19), two of the oldest transformers in this population, were classified as E (Extensive Ageing). TRF #19 shows a CO2/CO ratio of 12.86. A ratio of more than 10 is an indication of a thermal fault in the paper insulation at a temperature below 150 °C; this temperature affects paper ageing in the long term. TRF #13 also shows a ratio higher than 10, with high levels of CO and CO2, which means that TRF #13 is also undergoing long-term ageing at a temperature below 150 °C. Both TRF #18 and #19, along with the other extensive-classified transformers, have high levels of CO and CO2, exceeding the major concern levels of CO and CO2 concentration in oil shown in Table 8. Besides CO and CO2, most of the other oil properties of these transformers are in poor condition, such as low interfacial tension and dark oil color. TRF #18 and #19 also have very high water content, up to 36.89 and 41.46 ppm respectively.
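The CO2/CO ratio check used in this validation can be sketched as a small helper. The thresholds (below 3 and above 10) are those quoted in the text; the function name, the wording of the returned notes, and the gas concentrations in the example are hypothetical.

```python
def interpret_co2_co_ratio(co2_ppm, co_ppm):
    """Return the CO2/CO ratio and a short interpretation based on the quoted thresholds."""
    if co_ppm <= 0:
        raise ValueError("CO concentration must be positive to form the ratio")
    ratio = co2_ppm / co_ppm
    if ratio < 3:
        note = "possible paper fault with some degree of carbonization"
    elif ratio > 10:
        note = "possible thermal fault in paper below 150 °C (long-term ageing)"
    else:
        note = "no indication from the ratio alone"
    return ratio, note


# Hypothetical concentrations chosen to give a ratio of 12.86, as reported for TRF #19
ratio, note = interpret_co2_co_ratio(co2_ppm=1286.0, co_ppm=100.0)
print(round(ratio, 2), note)
```

The ratio is only one of the conventional checks; the absolute CO and CO2 levels and the oil characteristic limits are assessed separately.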
At the early stage, TRF #1 to TRF #6, which have an operating life of 10 years or less, were classified as healthy. From the oil characteristics point of view, almost all healthy-classified transformers have relatively good oil parameters. This is in line with the study in [35], which found that the ageing process occurs throughout the life of the transformer, degrading its condition and changing certain parameters of the oil insulation.
Based on the training accuracy, the validation with the testing dataset, and the subsequent implementation and comparison with conventional methods and standards, the developed SVM model can successfully classify transformers with no furan measurement and recognize the declining trend of transformer oil-immersed paper insulation condition as operating time increases.

Conclusions
A classification analysis of the insulation condition of in-service 150 kV power transformers using Support Vector Machine (SVM) is presented in this article. The proposed method is able to recognize different categories of transformer oil-immersed paper insulation condition based on dissolved gas and dielectric characteristics measurement data. For training and testing, the measurement data were divided into two separate datasets. After selecting the best features and cross-validating with different models, the best-performing model was chosen, resulting in 90.63% overall accuracy in distinguishing the oil-immersed paper insulation condition into three categories: Healthy, Moderate, and Extensive. The model was then applied to classify the oil-paper condition of transformers for which furan data are not available, and the results were compared with conventional assessment methods and standards, confirming that the developed model is able to classify the current oil-paper condition based on dissolved gases and dielectric characteristics.

Figure 1. A sample of a 150 kV power transformer used in this study

Table 1. Guidelines for Oil Immersed Insulation Paper Degradation

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 June 2018 | doi:10.20944/preprints201806.0002.v1
Figure 2. Percentage of data in each category
Support Vector Machine Classifier
Vapnik [29] introduced the Linear SVM algorithm associated with a Mercer kernel K. It is based on the linear programming optimization problem in equation 1.

Where $\chi^2$ = Pearson's cumulative test statistic; $O_i$ = the number of observations of type $i$; $E_i = N p_i$ = the expected (theoretical) frequency of type $i$, asserted by the null hypothesis that the fraction of type $i$ in the population is $p_i$; $n$ = the number of cells in the table.
Table 3. Rank of attributes based on ANOVA and chi-squared

Table 4. Training and testing data separation

Table 5. Attributes selection
Table 6 shows 12 models created using Linear and Quadratic SVM. The training and testing datasets were used to evaluate each model, with the respective accuracy. The best model chosen was number 12, with the attributes CO, CO2, IFT, and Color. This model classified the testing dataset with 90.63% accuracy.

Table 6. Accuracy of different SVM models
The ability of the selected model to classify new data was examined. Figure 4 shows the confusion table of the selected model when checked using entirely different transformers with dissolved gas and oil characteristics measurement data.

Table 7. Assessment of 19 units of 150 kV power transformers without furan measurement data. Based on the SVM model developed, the category of oil-paper insulation is predicted. "H" is for Healthy Transformer, "M" is for Moderate Ageing, and "E" is for Extensive Ageing. Green-colored cells show transformers in the Healthy class, blue-colored cells show moderate-class transformers, and yellow cells show transformers in extensive condition. Red-colored cells show oil parameters which exceed the limits shown in Table 8 and Table 9.
Figure 5. Principal layout of key-gas characteristics [33]. CO and CO2 are the main gas indicators of overheating of cellulose in transformer oil.

Table 9. Application and interpretation of dielectric characteristics tests