Ensuring Sustainability of a Mobile Legacy Network by Improving Network KPIs using Six Sigma DMAIC Framework with AHP based Analysis

Abstract: Call setup success rate (CSSR) is an important quality parameter of a mobile network and directly influences the overall key performance indicators (KPIs) of network service providers. CSSR is the fraction of call attempts that result in a successful connection to the dialled number; not every call attempt ends in such a connection. In this research, six sigma methodology is applied to improve the call quality of a legacy mobile network and thereby raise the overall CSSR of a telecom service provider. The paper presents an empirical study of improving overall CSSR using the DMAIC methodology, which consists of five stages: Define, Measure, Analyze, Improve and Control. In addition, the analytic hierarchy process (AHP) is used to determine the vital causes among all the identified network parameters affecting overall CSSR. The identified vital parameters are then upgraded, and the system finally achieves a higher CSSR.


Introduction
In mobile communication services, regulators set the standards of service delivery and monitor compliance with them. Telecom service providers (TSPs), for their part, continuously look for ways to improve quality of service (QoS) in order to stay ahead of the competition. Better service delivery is essential for a TSP because it boosts user confidence and meets the demands of loyal users. However, several factors degrade the overall QoS delivered by mobile network operators (MNOs) [1]; among them are call setup failures, signal loss, congestion, jitter and call drops. In mobile networks, the KPIs comprise several indicators, e.g. CSSR, call drop rate (CDR), stand-alone dedicated control channel (SDCCH), call setup time (CST) and traffic channel (TCH) congestion [2]. These KPIs mirror the quality aspects of the network, and their records serve as evidence for achieving and maintaining QoS. Periodic observation of these KPIs is essential, as MNOs use the records to assess the QoS offered in a particular time period. Subscribers, on the other hand, use these KPIs to benchmark the services offered by the MNOs. The KPIs are mostly affected by increased congestion, which arises from various factors [3].
A mobile switching center (MSC) is a work unit larger than an individual workstation and is responsible for call setup, release and routing from the base station. Typically, it has 3-12 people and 15-45 workstations in a compact arrangement. An ideal MSC handles a narrow range of highly similar products or processes. Such an MSC is self-contained, with all the equipment and resources needed for processing 2G, 3G, LTE and IMS calls. Mobile calls (2G, 3G, LTE, VoLTE), SMS and data browsing sessions line up in an initial queue when they enter the department.
This group of sequential operations is organized so that user-initiated mobile operations are processed and handed over flawlessly throughout the sequence. Figure 1 depicts the detailed flow of the call setup process. Popoola et al. [4], in their research on GSM networks, selected four MNOs as their primary sources of data: Celtel, MTN, Glo and M-Tel. They studied three KPIs: CDR, CSSR and call completion success rate (CCSR). Pavan et al. [5] considered the SDCCH KPI as a factor affecting QoS and noticed that call initialization goes through three thematic processes, which also suffered inefficiencies.
Besides CSSR, paging success rate (PSR) and answer to seizure ratio (ASR) are critical parameters for measuring the performance of a network. ASR is the number of answered calls divided by the number of call attempts. The distinction from CSSR matters: a call counts as successful for CSSR once it reaches the destination, even if the number is busy or the call is not answered, so in that case CSSR is 1/1 whereas ASR is 0/1. If the call is answered and the conversation starts, ASR is 1/1 and CSSR is still 1/1. Together, ASR and CSSR reflect the network quality and the customer experience of the network. If these two parameters are benchmarked, tracked and kept high, the customer experience will be good and the mobile network can be considered best in class. The benefits of improving CSSR include a significant reduction in error rate, so the call success rate improves without any surge in workload. The reduced workload in turn allows errors and deficiencies to be identified more quickly.
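The difference between CSSR and ASR described above can be illustrated with a minimal sketch. The `CallAttempt` record and the four sample attempts are hypothetical and are not drawn from the case-study data; they only mirror the 1/1 versus 0/1 reasoning in the text.

```python
from dataclasses import dataclass


@dataclass
class CallAttempt:
    """Hypothetical record of one call attempt (illustration only)."""
    reached_destination: bool  # setup succeeded, even if busy or unanswered
    answered: bool             # called party picked up


def cssr(attempts):
    """Fraction of attempts that reached the dialled number."""
    return sum(a.reached_destination for a in attempts) / len(attempts)


def asr(attempts):
    """Fraction of attempts that were answered."""
    return sum(a.answered for a in attempts) / len(attempts)


# A single call that reaches a busy number: CSSR is 1/1, ASR is 0/1.
busy_call = [CallAttempt(reached_destination=True, answered=False)]

# Hypothetical mix of outcomes.
attempts = [
    CallAttempt(True, True),    # connected and answered
    CallAttempt(True, False),   # reached destination, busy or no answer
    CallAttempt(False, False),  # setup failed inside the network
    CallAttempt(True, True),    # connected and answered
]

print(f"CSSR = {cssr(attempts):.2f}, ASR = {asr(attempts):.2f}")
```

Because every answered call has necessarily reached its destination, ASR can never exceed CSSR for the same set of attempts.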
Our major contributions are listed below:
- KPI analysis and monitoring process.
- Use of six sigma tools and technology.
- Transmission connectivity with different entities.
- GSM release cause analysis.
The rest of the paper is organized as follows. Section 2 describes the six sigma approach. The detailed case study, along with the DMAIC implementation, is presented in Section 3. Section 4 concludes the paper and outlines future improvements.

Six Sigma Process
In statistics, the term sigma characterizes the distribution of a process around its mean: it measures the variation and thereby signifies the performance of the process. Accordingly, the six sigma methodology, equipped with the DMAIC roadmap, is being adopted by the manufacturing and service sectors to enhance process quality and in turn reduce defects [6], [7]. Six sigma is implemented once the potential for improving the existing process capability has been recognized. In manufacturing, process optimization involves controlling variation and defective products [6]-[8]. In the service sector, the application of six sigma focuses on reducing human error, updating existing methods and activities, minimizing system error, etc. Six sigma has been successfully implemented in various service domains, e.g. healthcare [9]-[13], education [14]-[17], banking [18], [19], telecommunication [20], [21] and logistics [22], [23].
Six sigma is a project-based approach that relies heavily on scientific techniques to improve the output variables of a system. It is applied through a framework known as DMAIC: Define, Measure, Analyze, Improve and Control [24]; its stacked representation is shown in Figure 2. The data of an existing process, in a cross-functional or uni-functional environment, are subjected to these five phases, after which the improvement is realized.
Figure 2. Six Sigma DMAIC Framework.

Problem Definition
The case study was conducted on the cellular network of the TSP Idea Cellular in the state of Odisha in the eastern region of India. The detailed architecture of the cellular network is shown in Figure 3, which illustrates the network connectivity of Idea Cellular Ltd for the Odisha circle. The MSC pool (MSC1 and MSC2) is connected to the controllers of 12 BSCs (10 TDM and 2 IP BSCs) by means of the media gateways MGW1 and MGW2. Multiple core network nodes with point codes NI2-5560 and 5561 are associated with BICC signalling through node NE-40. The 2G and 3G network is connected to the Evolved Packet Core (EPC) architecture, which serves users on the Long-Term Evolution (LTE) and IP Multimedia Subsystem (IMS) networks via IPRAN1 and IPRAN2. Further, the MSS is connected to the SMSC for the short message service (SMS) and to the Intelligent Network (IN) for prepaid services; their servers, located at Kolkata, are reached through a signal transfer point (STP) pair located at Durgapur and Kolkata.
For inter-circle (national long distance, NLD) and international long distance (ILD) calls originating from the PLMN in the Odisha telecom circle, NLD router arrangements are in place for call routing, handling and operations: MGWs from two different vendors, located at Bhubaneswar, carry the NLD and ILD calls and are connected to the respective MSCs located at Delhi. Within the service area, points of interconnection (POIs) are established with other operators such as Airtel, RJIO and BSNL to route intra-circle calls to the PSTN and to other TSPs. Steering of roaming traffic is provided by signalling-based methods through connectivity with the NTR and GLR. For security auditing, billing and reporting of both in-roamers and out-roamers, the network is connected to the operator's IT cloud via a Cisco firewall and a Cisco L3 switch. CSSR is an indicator of call connectivity in the mobile network and a critical parameter affecting customer satisfaction and revenue.
To meet this objective, it is important to work on the switch parameters and all complex parameters. Between 15 September 2019 and 15 December 2019, CSSR stood at 99.76%, which means that approximately 0.24% of calls (about 50K calls) failed in the network in a 24-hour span due to multiple reasons. The aim of the project was to increase the CSSR of the network to 99.97% by June 2020. In the current study, call setup failures due to technical issues within the network fall within the scope of the research. Call setup failures due to external networks or subscriber behaviour are outside the purview of our study.
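The arithmetic behind the baseline and target figures can be made explicit with a small sketch. The daily attempt volume is not stated in the text; it is inferred here from the approximate figure of 50K failed calls at a 0.24% failure rate, so both derived numbers are rough estimates rather than reported values.

```python
# Figures quoted in the problem definition.
baseline_cssr = 0.9976          # baseline CSSR, Sept-Dec 2019
target_cssr = 0.9997            # project target for June 2020
failed_calls_per_day = 50_000   # approximate, as stated in the text

# Inferred (not reported): daily call attempts implied by the baseline.
implied_attempts_per_day = failed_calls_per_day / (1 - baseline_cssr)

# Daily failures that would remain at the target CSSR,
# assuming the same traffic volume.
target_failures_per_day = implied_attempts_per_day * (1 - target_cssr)

print(f"Implied attempts per day: {implied_attempts_per_day:,.0f}")
print(f"Failures per day at target: {target_failures_per_day:,.0f}")
```

Under these assumptions, moving from 99.76% to 99.97% corresponds to cutting daily setup failures to one eighth of the baseline, since 0.03% is one eighth of 0.24%.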

Data Collection and Volume
Unit of measurement: monthly CSSR, calculated as a percentage on a daily basis.
Quantities recorded: % CSSR, number of successful calls, number of calls.
$$\mathrm{CSSR} = \frac{\text{No. of calls seized}}{\text{Total number of attempts}}$$
Sample size in months: mid-September to mid-December; 100% of the data is taken for the baseline.
From the performance reports of any mobile operator's switching centre equipment, the total daily call failures in the network can be analysed by going through the various raw reports, especially the cause code report, and preparing data from them. We consolidated data by analysing the following raw reports from the system:
1. CPU SEIZURE RATE_STATS: CPU seizure rate statistics
2. PAGESUCRATE_STAT: Paging success rate statistics
Furthermore, we analysed the cause code failure report, the LAC-wise PSR report, the destination dialling report and similar reports, inspected the analyser for further performance management and other alarms, reviewed the fault management records, and drew on other ancillary information. Such data are obtained by logging in to the performance management system of the MSS, MGW and UDR and issuing query commands with the relevant timeline.
It is important to handle the data volume in a guided manner by sampling fixed time windows rather than round the clock. It is equally important to understand callers' behaviour in different climates, environments and the like. It is sufficient to base the sampling on the season in which the maximum number of call seizures/attempts can be expected. Experience shows that call seizures peak during the time consistent busy hour (TCBH), which has to be determined for each location according to its time zone.
Figure 4 gives a concise representation of CSSR. The CSSR diagram is drawn in two phases: one represents setup failure and the other represents setup success. Setup failure is described more fully in the downward and sideward steps of the graphical representation. Accordingly, for CSSR failures we have identified the following issues: Intentional missed call

Root Cause Analysis of Call Drop
The fishbone diagram is widely used for cause-and-effect analysis [25]; it helps identify the possible contributing causes of a problem, which are grouped into categories [26]. The main problem is represented at the mouth of the fish [27]. In our case, CSSR is placed at the head of the fish. Next, the major cause categories are identified, namely machine, material, environment, method and human error, and they are drawn as branches from the main arrow. All individual causes are represented as bones and assigned to one of the five categories. For example, far-end congestion, level opening issues, circuit congestion, parameter mismatch and improper digit length are identified as members of the category "machine". Similarly, numerous causes are identified by brainstorming and by consulting team members with knowledge of the system and the underlying process; they are then assigned to the remaining branches of the fish, i.e. human error, method, material and environment. The detailed representation is shown in Figure 5.

Analytic Hierarchy Process (AHP)
AHP is suitable for analysing a problem whose variables have weights that must be judged by human decision makers [28]. This kind of investigation is known as the Multi-Criteria Decision Analysis (MCDA) approach [29]. Although quite a number of MCDA techniques are in practice, the popularity of AHP is attributed to its simple, structured and systematic evaluation roadmap [30]. The generic flow of AHP is shown in Figure 6. AHP passes through the phases of decomposition, prioritization and weight determination.

Decomposition
In this phase, a hierarchical structure is framed in which CSSR occupies Level 1, since attaining a high CSSR is the aim of the study.
Level 2 holds the general categories, i.e. method, machine, material, human error and environment. In Level 3, these categories are further decomposed into detailed causes; the full hierarchy is shown in Table 1. The decomposition was carried out by one of the authors together with his colleagues at Vodafone Idea.

Prioritization
First, the causes in level 2 are compared pairwise to create a pairwise comparison matrix (PCM), shown in Table 2. In the second row of the table, the cell at the intersection of method and machine has the value 9, indicating that method is nine times as significant as machine for CSSR. Likewise, the weight of method is three times that of material and five times that of environment. In the same way, all the causes in level 3 are prioritized, and the resulting PCM is shown in Table 3.

Weight Determination
This phase involves computing the weights of the identified causes in level 2 and level 3 by approximating the principal eigenvector of each PCM. First, the values in each column of the PCM are summed. Second, each element of the PCM is divided by its column total. Third, the average of the elements in each row of the normalized PCM is calculated. These three steps are executed for the entries in Table 2; the resulting weights of the categories machine, method, material, environment and human error are represented graphically in Figure 6 and also listed in Table 4. The result shows that the category method has the highest priority, 49.1%, followed by human error with 20.1%. The Consistency Ratio (CR) is 6.2% (0.062), which is below 0.1 and indicates that the judgments are acceptably consistent. The principal eigenvalue is 5.281. In the same way, the priorities of all the causes in level 3 are calculated from the pairwise comparisons and presented in Table 5. The obtained CR is 9.4%, which is less than 10% and again supports the validity of the result. The principal eigenvalue is 20.58. The results are also shown graphically in Figure 7.
Next, considering all the possible causes, i.e. technical problems, subscriber behaviour and external factors, we investigated the past data to determine the weight of each factor. Based on these data, we drew a Pareto chart [31] to identify the most frequent root causes. A Pareto chart shows the frequency with which the different signalling release causes occur. It is a bar graph in which each frequency (or frequency range) is shown in descending order of importance of the causes, from left to right. It is based on the Pareto principle, also called the 80-20 rule or the rule of the vital few.
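The three weight-determination steps described at the start of this section (column sums, normalization, row averages) and the consistency check can be sketched as follows. The 3x3 matrix is a hypothetical example, not the PCM of Table 2, and the random index values are the commonly tabulated ones from Saaty; the function names are chosen for illustration only.

```python
import numpy as np

# Saaty's random consistency index (RI) by matrix order n.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24,
                7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def consistency_ratio(lambda_max, n):
    """CR = CI / RI, with CI = (lambda_max - n) / (n - 1)."""
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


def ahp_weights(pcm):
    """Approximate AHP priorities by column normalization and row averaging."""
    pcm = np.asarray(pcm, dtype=float)
    column_totals = pcm.sum(axis=0)       # step 1: sum each column
    normalised = pcm / column_totals      # step 2: divide by column total
    weights = normalised.mean(axis=1)     # step 3: average each row
    # Estimate of the principal eigenvalue from A.w = lambda.w
    lambda_max = float(np.mean((pcm @ weights) / weights))
    return weights, lambda_max


# Hypothetical reciprocal PCM for three criteria (illustration only).
example_pcm = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]

weights, lambda_max = ahp_weights(example_pcm)
print("weights:", np.round(weights, 3))
print("lambda_max:", round(lambda_max, 3))
print("CR:", round(consistency_ratio(lambda_max, 3), 3))

# Cross-check against the reported level-2 figures (n = 5, lambda_max = 5.281).
print("CR for reported level 2:", round(consistency_ratio(5.281, 5), 3))
```

With RI = 1.12 for a 5x5 matrix, the reported principal eigenvalue of 5.281 gives a CR of about 0.063, which agrees with the 6.2% stated in the text up to rounding.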
The chart helps to decide which defects to prioritize in order to improve the overall CSSR of a mobile network. The Pareto chart in Figure 9 reveals that the first six causes account for 79% of the issues. The prominent root causes are Release before answer (38%), User decide to busy (11%), Release before ring (9%) and No acknowledge from callee (8%).
After all the possible causes had been identified, a brainstorming session was held with the team members. Following a full discussion, the root causes were tabulated together with their impact and whether they can be countered; the result is shown in Table 6. The impact control matrix captures the factors needed for one hundred percent success in call setup from source to sink, and the reasons for failure or degradation. Various factors are responsible for, and various factors control, call setup from source to sink. For example, even if the handsets of the caller and the callee are perfect in all respects, there is a chance of misdialling, such as dialling a 9-digit number instead of a 10-digit one, which causes the call to fail. The other factors described below are similarly responsible for call failures. The main causes are represented in matrix form in Figure 10 for better understanding.
Figure 10. Impact Control Matrix.
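The cumulative reading of a Pareto chart can be sketched with the four release causes whose shares are quoted above. The shares of the remaining causes are not itemized in the text, so the sketch covers only these four and does not reproduce the 79% figure for the first six causes; the helper names are hypothetical.

```python
# Shares (in percent) of the four prominent release causes quoted in the text.
release_cause_share = {
    "Release before ring": 9,
    "Release before answer": 38,
    "No acknowledge from callee": 8,
    "User decide to busy": 11,
}


def pareto_table(shares):
    """Sort causes by descending share and attach the cumulative share."""
    rows, cumulative = [], 0
    for cause, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        cumulative += share
        rows.append((cause, share, cumulative))
    return rows


def vital_few(shares, threshold):
    """Causes needed, in Pareto order, to reach the cumulative threshold."""
    selected = []
    for cause, _, cumulative in pareto_table(shares):
        selected.append(cause)
        if cumulative >= threshold:
            break
    return selected


for cause, share, cumulative in pareto_table(release_cause_share):
    print(f"{cause:<28} {share:>3}%  cumulative {cumulative:>3}%")
```

The four named causes together account for 66% of the issues, so the two further causes in the top six contribute the remaining 13 percentage points of the reported 79%.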

Strategic Improvement
Based on the root cause analysis, Table 5 and Figure 7, the causes with weights of more than 5% are targeted for improvement so that the overall CSSR is raised. There are seven such vital causes:
- Meet error in send routing info procedure
- No route or circuit applied available
- Failure due to termination error
- Assignment failure
- Level opening issues
- No route available
- Unsuccessful paging
No technology can be allowed to remain static without further improvement and gains in efficiency, and call setup in mobile communication is no exception. Mobile communication involves sending a message and recovering it at the far end to complete the cycle. The message is sent through the signalling system, i.e. by electromagnetic waves, using standard instruments that can send and receive such signals. In the process, signals are sent using binary (0, 1) coding and are decoded on reception. In any case, some errors in the instruments, the signalling process and the transmission are bound to occur for various reasons, and those reasons need to be eliminated to improve the system and its output efficiency. Observation, experiment and inference are therefore always necessary for developing the system further. The errors, their elimination and the measures undertaken to gain efficiency are described in Table 7.

System Performance and Control
After detecting the root causes of low CSSR and taking appropriate measures to fix them, this phase presents the end result of the process improvement and confirms that the implemented solution is long-lasting and has become usual practice in day-to-day operation. We analysed call failures before and after implementation as follows.
Meet Error due to send routing info (SRI) failure: To address this error, MAP reports are analysed and the DN flag is changed. After implementation, the call failures before and after the measures are compared, as shown in Figure 11.
Figure 11. Meet Error due to SRI Failure.
The X-axis of the graph indicates the date and the Y-axis the number of call failures. The graph shows that in November 2019 the daily call failures were 93584, 105015, 180307 and so on. After CAUSE115, the system code for SRI failure, was fixed, the call failures in July 2020 were recorded as 5355, 5127, 5493…, a reduction of around 95%.
No Route to Applied circuit available: After countering this issue, we compared the call failures before and after, as shown in Figure 12. In the first week of November 2019, the daily call failures were 44758, 40644, 43634, 43699… and so on. After the fix, in July 2020, the daily call failures were reduced to 31862, 31870, 35301…, an average drop in failures of about 25%.
Figure 12. Call Failure due to no route to applied circuit available.
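The reported reductions can be roughly cross-checked from the sample daily values quoted above. This is only a sketch: the quoted values are a small subset of the data plotted in Figures 11 and 12, so the computed percentages approximate, rather than reproduce, the reported figures.

```python
from statistics import mean


def percent_reduction(before, after):
    """Relative drop of the mean daily failures, in percent."""
    return 100 * (mean(before) - mean(after)) / mean(before)


# Sample daily call failures quoted in the text (subset of the plotted data).
sri_before = [93584, 105015, 180307]          # November 2019
sri_after = [5355, 5127, 5493]                # July 2020
route_before = [44758, 40644, 43634, 43699]   # first week of November 2019
route_after = [31862, 31870, 35301]           # July 2020

sri_drop = percent_reduction(sri_before, sri_after)
route_drop = percent_reduction(route_before, route_after)

print(f"SRI failure reduction: {sri_drop:.1f}%")
print(f"No-route failure reduction: {route_drop:.1f}%")
```

On these samples the SRI failures fall by roughly 96% and the no-route failures by roughly 24%, in line with the reported reductions of around 95% and 25%.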
Termination Error: Once measures had been taken against this error, the resulting effect was measured as the percentage of call failures before and afterwards, as shown in Figure 13. The Y-axis shows the call failure percentage and the X-axis the date of failure. In the first week of November, the failures were calculated as 0.212%, 0.209%, 0.204%… and so on. After fine tuning, in July 2020, the value was reduced to 0.165%, 0.168%, 0.171%… etc.
Similarly, measures based on the recommended solutions were taken for the other vital parameters in order to reduce call failures. Furthermore, an action plan setting out the improvement modalities for each parameter was drawn up for implementing and maintaining the process over the next three months. Figure 14 shows the control chart before and after implementation. The average CSSR increased from 99.71% (baseline) to 99.97% (after improvement), and the achieved CSSR remained constant for three consecutive months.
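To relate the improvement to six sigma terminology, the CSSR values can be converted to defects per million opportunities (DPMO) and an approximate sigma level. This conversion is not part of the original analysis; it is a sketch that assumes each call attempt is one opportunity and applies the conventional 1.5-sigma long-term shift.

```python
from statistics import NormalDist


def dpmo(success_rate):
    """Defects per million opportunities, one opportunity per call attempt."""
    return (1 - success_rate) * 1_000_000


def sigma_level(success_rate, shift=1.5):
    """Approximate sigma level; the 1.5 shift is a convention, not a measurement."""
    return NormalDist().inv_cdf(success_rate) + shift


for label, rate in [("baseline", 0.9971), ("after improvement", 0.9997)]:
    print(f"{label}: DPMO = {dpmo(rate):.0f}, sigma level = {sigma_level(rate):.2f}")
```

Under these assumptions the baseline corresponds to about 2900 DPMO and the improved process to about 300 DPMO, i.e. a move from a sigma level a little above 4 to one just under 5.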

CSSR -BEFORE & AFTER
The improvement of the average CSSR across phases is also shown as a boxplot in Figure 15 and in continuous cycles in Figure 16.

Conclusion
Every invention is inspired by the wish to ease the procedures faced in some kind of activity or field. In the process, machinery, instruments and other ancillary provisions are also developed. In other words, an invention is followed by the pursuit of faultless instrumentation and operating procedures. Along the way, faults, failures in mechanisms, instrumentation inefficiencies and related problems are bound to be experienced in every field. Hence it is necessary to look for their causes, consequences and remedies afresh and periodically. This applies to CSSR as well, since call setup often fails. The causes of failure therefore have to be searched for, and to do so it becomes necessary to record the procedure in a database system. This required a phase-wise experiment. Every day during the peak period, the different types of causes of CSSR failure were noted for a period of seven days, and the rectifications applied against these causes were noted at the same time. An average was then calculated for the different causes of failure, before and after rectification. The data were maintained accordingly, and the same procedure was repeated. It is important to mention that after rectification there was a decrease in the failure causes.
To achieve zero faults we have to strive hard, and a DevOps mindset is required for fault rectification, which is not easy. However, there is every possibility that faults may tend to zero. This will be possible when, with the innovation of new technology, 5G applications are improved to achieve fault-free operation through the introduction of Advanced Network, mmWave System, Multi-radio Access, Advanced MIMO, Multiple Access, Advanced D2D, Network Slicing and Advanced Small Cell. Clearly, improvement is a continuous journey, and it has to be continued with more and more innovations to reach a state of mobile communication in which there is no CSSR failure. By continuing this process, we can find the ways and means to put an error-free communication system in place.