Laboratorio Geoespacial, Facultad de Ciencias de la Ingenieria, Universidad Catolica del Maule, Talca 3605, Chile
Received: 12 August 2016 / Approved: 13 August 2016 / Online: 13 August 2016 (11:28:39 CEST)
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
How to cite:
Hernandez, S.; Vergara, D.; Jorquera, F. Incremental Learning for Large Scale Churn Prediction. Preprints 2016, 2016080142 (doi: 10.20944/preprints201608.0142.v1).
Modern companies accumulate vast amounts of customer data that can be used to create a personalized experience. Analyzing this data is difficult, and most business intelligence tools cannot cope with its volume. One example is churn prediction, which matters because retaining existing customers costs less than acquiring new ones. Several data mining and machine learning approaches can be used, but there is still little information about which algorithm settings to use when the dataset does not fit into the memory of a single computer. Because feature selection techniques are difficult to apply at large scale, Incremental Principal Component Analysis (IPCA) is proposed as a data preprocessing technique. We also present a new approach to large-scale churn prediction problems based on the mini-batch Stochastic Gradient Descent (SGD) algorithm. Compared to other techniques, the new method facilitates training on large data volumes with a small memory footprint while achieving good prediction results.
churn prediction; incremental principal component analysis; stochastic gradient descent
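The mini-batch idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a logistic-regression model, uses synthetic data with a toy "churn" rule, and leaves out the IPCA preprocessing step entirely. The names `partial_fit`, `stream_batches`, and `predict` are hypothetical helpers introduced only for this illustration. The point is that only one batch is held in memory at a time.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def partial_fit(weights, bias, batch, lr=0.1):
    """One mini-batch SGD step on the logistic (log) loss."""
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for x, y in batch:
        err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
        for j, xi in enumerate(x):
            grad_w[j] += err * xi
        grad_b += err
    n = len(batch)
    new_weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return new_weights, bias - lr * grad_b / n


def stream_batches(n_batches, batch_size, rng):
    """Hypothetical data source: yields synthetic (features, label) batches."""
    for _ in range(n_batches):
        batch = []
        for _ in range(batch_size):
            x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
            y = 1 if x[0] + x[1] > 0 else 0  # toy "churn" rule (assumption)
            batch.append((x, y))
        yield batch


def predict(weights, bias, x):
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


# Train incrementally: each batch is consumed once and then discarded.
rng = random.Random(0)
weights, bias = [0.0, 0.0], 0.0
for batch in stream_batches(n_batches=200, batch_size=32, rng=rng):
    weights, bias = partial_fit(weights, bias, batch, lr=0.5)

# Evaluate on a separate synthetic batch.
held_out = next(stream_batches(1, 500, random.Random(1)))
accuracy = sum(predict(weights, bias, x) == y for x, y in held_out) / len(held_out)
print(f"held-out accuracy: {accuracy:.3f}")
```

Because the model state is just the weight vector and bias, memory use depends on the batch size and the number of features, not on the total number of customer records.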