Submitted: 13 September 2024
Posted: 17 September 2024
Abstract
Keywords:
1. Introduction
2. Background
2.1. Definitions of CDR & XDR
| Header Field Name | Value |
|---|---|
| Caller ID | 101032 |
| Caller IMSI | 12039810283012 |
| Caller Carrier | Verizon |
| Receiver ID | 101303 |
| Receiver IMSI | 04920398402302 |
| Receiver Carrier | Three |
| Duration (Minutes, Seconds) | 30:12 |
| Call Initiated | 2-50/18/02/2022 |
| Call Terminated | 3-20/18/02/2022 |
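
As an illustration only, the header fields in the table above could be held in a small typed record. This is a minimal sketch: the class name `CallDetailRecord`, the attribute names and the `duration_seconds` helper are assumptions made for this example and do not come from any CDR standard or from the dataset itself. Identifiers are kept as strings so that leading zeros survive, and the two timestamps are stored verbatim because their format is not specified in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallDetailRecord:
    """Hypothetical container mirroring the example CDR header fields."""
    caller_id: str
    caller_imsi: str
    caller_carrier: str
    receiver_id: str
    receiver_imsi: str          # string, so a leading zero is preserved
    receiver_carrier: str
    duration: str               # "MM:SS", as shown in the table
    call_initiated: str         # kept verbatim; format not specified
    call_terminated: str        # kept verbatim; format not specified

    def duration_seconds(self) -> int:
        """Convert the "MM:SS" duration into a number of seconds."""
        minutes, seconds = self.duration.split(":")
        return int(minutes) * 60 + int(seconds)


# The example record from the table.
example = CallDetailRecord(
    caller_id="101032",
    caller_imsi="12039810283012",
    caller_carrier="Verizon",
    receiver_id="101303",
    receiver_imsi="04920398402302",
    receiver_carrier="Three",
    duration="30:12",
    call_initiated="2-50/18/02/2022",
    call_terminated="3-20/18/02/2022",
)

print(example.duration_seconds())
```

Converting the duration to seconds gives a single numeric feature that is easier to pass to clustering or regression methods than the raw "MM:SS" string.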
2.2. Dataset Availability and Considerations
2.3. Wider Research Using CDR Data
3. Analysis of CDR/XDR
3.1. Testing Environment
3.2. Machine Learning Methods
- K-Means Clustering: This unsupervised clustering method groups similar data points together based on underlying assumptions about the data. In k-means clustering, the assumption is that k groups exist, where k is either specified by the user or chosen from a range of tested values. In Figure 4 we show a simple example of applying k-means clustering to the NoDoBo calls dataset to establish threshold boundaries between groups of data, where .
- Random Forests & Decision Trees: Random Forests (RF) and Decision Trees (DT) can provide predictions from data points in CDR and XDR data. A DT can model call data records by examining features such as user ID or telecom network and then attempting to estimate call length or receiver ID. The success of this depends primarily on the quality of the training stage. With RF used to mitigate common problems such as overfitting, these predictive methods can be compared with other predictive ML techniques for call data record testing.
- Linear Regression: Linear regression is one of the most common machine learning methods and readily provides predictive analytics when trained on existing CDR and XDR data. It can be applied to anomaly detection in call data records by building a predictive model of expected traffic and comparing new traffic against this model to estimate the likelihood of anomalous activity. Again, this can be compared with other predictive ML methods to determine which provides higher accuracy.
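
Two of the methods listed above can be sketched in a few lines of standard-library Python. The sketch below is not the implementation used in this work. It assumes a one-dimensional k-means (Lloyd's algorithm) over call durations in seconds, with thresholds taken as midpoints between neighbouring centroids, and a least-squares trend line over call counts, with new observations flagged when their residual exceeds `z` standard deviations of the training residuals. All function names, the `z` parameter and the sample values are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def kmeans_1d(values, k, iterations=50):
    """Lloyd's algorithm on scalar values; returns the k centroids, sorted."""
    ordered = sorted(values)
    # Deterministic start: pick k points spread evenly through the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def cluster_boundaries(centroids):
    """Threshold between neighbouring clusters: midpoint of adjacent centroids."""
    return [(a + b) / 2 for a, b in zip(centroids, centroids[1:])]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def flag_anomalies(train_x, train_y, new_x, new_y, z=3.0):
    """Flag new points whose residual exceeds z times the training residual spread."""
    slope, intercept = fit_line(train_x, train_y)
    residuals = [y - (slope * x + intercept) for x, y in zip(train_x, train_y)]
    spread = pstdev(residuals)
    return [abs(y - (slope * x + intercept)) > z * spread
            for x, y in zip(new_x, new_y)]


# Hypothetical call durations in seconds (not taken from any dataset).
durations = [20, 35, 40, 30, 600, 640, 580, 1800, 1900, 1750]
print(cluster_boundaries(kmeans_1d(durations, k=3)))

# Hypothetical call counts per time step, followed by two new observations.
counts = [100, 111, 119, 131, 140, 149, 161, 170, 179, 190]
print(flag_anomalies(list(range(10)), counts, [10, 11], [200, 400]))
```

A production analysis would more likely rely on an established library and multi-dimensional features; the point here is only to show how cluster thresholds and a residual-based anomaly test follow from the descriptions above.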
3.3. Applying ML Methods to CDR/XDR Data
4. Key Research Challenges In CDR Analysis
- Contextualization of Call Activity: Whilst CDR and XDR data provide a wealth of information about call records, classifications of users' calling behaviour can easily lack the contextual information needed to explain why those behaviours are exhibited (e.g., [19]). Although existing research has attempted to use CDR data to predict house prices [8] and criminal activity [9], researchers would need to couple multiple data sources together to gain a suitable level of understanding of users' underlying intentions.
- Real-Time Results Analysis: Given the sheer volume of information being captured globally, managing CDR data in real time remains an ongoing research challenge (e.g., [20]). Proposed solutions include a distributed approach to mining call detail records; however, this requires further investigation to understand issues of scalability. Alternatively, a hierarchical approach that provides levels of analysis based on the real-time requirements of each analysis could be explored in further research.
- CDR with VoIP: As technologies continue to evolve, including Voice over IP and, more recently, video conferencing services such as Microsoft Teams, Google Meet and Zoom, researchers have noted that the changing distribution of call durations [21], driven by changes in user behaviour, will remain an open research challenge.
- VoIP Classification Challenges: Focusing on the classification of VoIP traffic, Anwar et al. [22] note that there may currently be a lack of application-layer techniques for detecting certain protocols and traffic within VoIP. As that research attempts to classify “illegal” traffic, understanding the context of this data is considered a crucial goal.
- Finding User Habits: Research on data mining of call data records includes active attempts to identify user habits. Bianchi et al. [9] highlight the key aim of "automatically identifying meaningful information" from meaningful data [23]. Evidence of this can be found in machine learning experiments that derive results in the form of abstract data.
4.1. Future Developing Usage of Mobile Technology
5. Conclusions
Acknowledgments
References
1. Bell, S.; McDiarmid, A.; Irvine, J. Nodobo: Mobile Phone as a Software Sensor for Social Network Research. 2011 IEEE 73rd Vehicular Technology Conference (VTC Spring), 2011, pp. 1–5.
2. Sultan, K.; Ali, H.; Zhang, Z. Call Detail Records Driven Anomaly Detection and Traffic Prediction in Mobile Cellular Networks. IEEE Access 2018, 6, 41728–41737.
3. Brandusoiu, I.B.; Toderean, G. Applying Principal Component Analysis on Call Detail Records. Acta Technica Napocensis: Electronica - Telecomunicatii 2014, 55, 25–28.
4. Jafari-Marandi, R.; Denton, J.; Idris, A.; Smith, B.K.; Keramati, A. Optimum profit-driven churn decision making: innovative artificial neural networks in telecom industry. Neural Computing and Applications 2020, 32, 14929–14962.
5. Songailaitee, M.; Krilaviciusb, T. IVUS2021: Information Society and University Studies 2021, April 23, 2021, Kaunas, Lithuania. 2021.
6. Leo, Y.; Busson, A.; Sarraute, C.; Fleury, E. Call detail records to characterize usages and mobility events of phone users. Computer Communications 2016, 95, 43–53. Mobile Traffic Analytics.
7. Rhoads, D.; Serrano, I.; Borge-Holthoefer, J.; Solé-Ribalta, A. Measuring and mitigating behavioural segregation using Call Detail Records. EPJ Data Sci. 2020, 9, 5.
8. Pinter, G.; Mosavi, A.; Felde, I. Artificial Intelligence for Modeling Real Estate Price Using Call Detail Records and Hybrid Machine Learning Approach. Entropy 2020, 22.
9. Kozik, R.; Choraś, M.; Pawlicki, M.; Pawlicka, A.; Warczak, W.; Mazgaj, G. Proposition of Innovative and Scalable Information System for Call Detail Records Analysis and Visualisation. 13th International Conference on Computational Intelligence in Security for Information Systems (CISIS 2020); Herrero, Á.; Cambra, C.; Urda, D.; Sedano, J.; Quintián, H.; Corchado, E., Eds.; Springer International Publishing: Cham, 2021; pp. 174–183.
10. Abba, E.; Aibinu, A.; Alhassan, J. Development of multiple mobile networks call detailed records and its forensic analysis. Digital Communications and Networks 2019, 5, 256–265.
11. Nair, S.C.; Elayidom, M.S.; Gopalan, S. Impact of CDR data analysis using big data technologies for the public: An analysis. 2017 4th International Conference on Advanced Computing and Communication Systems (ICACCS), 2017, pp. 1–6.
12. Lu, Y.; Ma, Y.; Shi, L.; Chen, L. A Deep Learning Approach for M2M Traffic Classification Using Call Detail Records. 2021 IEEE 6th International Conference on Signal and Image Processing (ICSIP), 2021, pp. 964–968.
13. Mokhtari, A.; Sadighi, L.; Bahrak, B.; Eshghie, M. Hybrid Model for Anomaly Detection on Call Detail Records by Time Series Forecasting, 2021, [arXiv:cs.LG/2006.04101].
14. Kaur, N.; Ojha, N. Robust fuzzy based clustering approach in data mining using on call data records. 2017 International Conference on Intelligent Computing and Control Systems (ICICCS), 2017, pp. 1111–1117.
15. Zhang, S.; Yin, D.; Zhang, Y.; Zhou, W. Computing on Base Station Behavior Using Erlang Measurement and Call Detail Record. IEEE Transactions on Emerging Topics in Computing 2015, 3, 444–453.
16. Khaefi, M.R.; Hendrik.; Burra, D.D.; Dianco, R.F.; Alkarisya, D.M.P.; Muztahid, M.R.; Zahara, A.; Hodge, G.; Idzalika, R. Modelling Wealth from Call Detail Records and Survey Data with Machine Learning: Evidence from Papua New Guinea. 2019 IEEE International Conference on Big Data (Big Data), 2019, pp. 2855–2864.
17. Jaffry, S.; Shah, S.T.; Hasan, S.F. Data-Driven Semi-Supervised Anomaly Detection Using Real-World Call Data Record. 2020 IEEE Wireless Communications and Networking Conference Workshops (WCNCW), 2020, pp. 1–6.
18. Ficek, M.; Kencl, L. Inter-Call Mobility model: A spatio-temporal refinement of Call Data Records using a Gaussian mixture model. 2012 Proceedings IEEE INFOCOM, 2012, pp. 469–477.
19. A Gender-Centric Analysis of Calling Behaviour in a Developing Economy Using Call Detail Records.
20. Goergen, D.; Mendiratta, V.; State, R.; Engel, T. Analysis of Large Call Data Records with Big Data. Proceedings of the Conference on Principles, Systems and Applications of IP Telecommunications; Association for Computing Machinery: New York, NY, USA, 2014; IPTComm ’14.
21. Aziz, Z.; Bestak, R. Analysis of Call Detail Records of International Voice Traffic in Mobile Networks. 2018 Tenth International Conference on Ubiquitous and Future Networks (ICUFN), 2018, pp. 475–480.
22. Anwar, U.; Shabbir, G.; Ali, M. Data Analysis and Summarization to Detect Illegal VOIP Traffic with Call Detail Records. International Journal of Computer Applications 2014, 89.
23. Bianchi, F.M.; Rizzi, A.; Sadeghian, A.; Moiso, C. Identifying user habits through data mining on call data records. Engineering Applications of Artificial Intelligence 2016, 54, 49–61.
24. Rao, R.M.; Fontaine, M.; Veisllari, R. A Reconfigurable Architecture for Packet Based 5G Transport Networks. 2018 IEEE 5G World Forum (5GWF), 2018, pp. 474–477.
25. Pereira, O.M.; Capitão, M.; Regateiro, D.D.; Aguiar, R.L.; Osório, J.B. Mediator framework for inserting xDRs into Hadoop. 2016 IEEE Symposium on Computers and Communication (ISCC), 2016, pp. 547–554.
26. Ling, F.; Sun, T.; Zhu, X.; Chen, Q.; Tang, X.; Ke, X. Mining travel behaviors of tourists with mobile phone data: A case study in Hainan. 2016, pp. 1524–1529.
27. Li, J.Y.; Yeh, M.Y.; Chen, M.S.; Lin, J.H. Modeling social influences from call records and mobile web browsing histories. 2015 IEEE International Conference on Big Data (Big Data), 2015, pp. 1357–1361.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).