Preprint | Article | Version 2 | Preserved in Portico | This version is not peer-reviewed

# Conditional Mixture Model and Its Application for Regression Model

Version 1 : Received: 24 October 2020 / Approved: 27 October 2020 / Online: 27 October 2020 (11:41:42 CET)
Version 2 : Received: 28 October 2020 / Approved: 28 October 2020 / Online: 28 October 2020 (11:18:04 CET)

How to cite: Nguyen, L. Conditional Mixture Model and Its Application for Regression Model. Preprints 2020, 2020100550. https://doi.org/10.20944/preprints202010.0550.v2

## Abstract

The expectation maximization (EM) algorithm is a powerful mathematical tool for estimating statistical parameters when a data sample contains both a hidden part and an observed part. EM is applied to learn a finite mixture model, in which the whole distribution of the observed variable is a weighted average of partial distributions. The coverage ratio of every partial distribution is specified by the probability of the hidden variable. One application of the mixture model is soft clustering, in which each cluster is modeled by the hidden variable, each data point can be assigned to more than one cluster, and the degree of such assignment is represented by the probability of the hidden variable. However, in the traditional mixture model this probability is simplified to a single parameter, which can cause loss of valuable information. Therefore, in this research I propose a so-called conditional mixture model (CMM), in which the probability of the hidden variable is modeled as a full probability density function (PDF) with its own parameters. CMM thus aims to extend the mixture model. I also propose an application of CMM called the adaptive regressive model (ARM). A traditional regression model is effective when the data sample is scattered evenly. If data points are grouped into clusters, a regression model tries to learn a single unified regression function that goes through all data points. Such a unified function is not effective for evaluating the response variable on grouped data points. The term “adaptive” means that ARM solves this ineffectiveness problem by first selecting the best cluster of data points and then evaluating the response variable within that cluster. In other words, ARM reduces the estimation space of the regression model so as to gain high accuracy in calculation.
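
As a rough illustration of the traditional finite mixture model described above (not the CMM or ARM proposed in this paper), the following Python sketch runs EM on a one-dimensional, two-component Gaussian mixture. The function names, the toy data, and the starting values are assumptions made for this example only. The mixing weights here are plain parameters, which is the simplification that CMM is meant to replace.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def responsibilities(data, means, variances, weights):
    """E-step: soft assignment of every data point to every component.

    Row i holds P(component j | x_i) for each j, so each row sums to 1.
    """
    rows = []
    for x in data:
        joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        rows.append([p / total for p in joint])
    return rows


def fit_mixture(data, means, variances, weights, iterations=50):
    """Plain EM for a 1-D Gaussian mixture (illustrative, hypothetical helper)."""
    # Work on copies so the caller's starting values are left untouched.
    means, variances, weights = list(means), list(variances), list(weights)
    n = len(data)
    for _ in range(iterations):
        resp = responsibilities(data, means, variances, weights)
        # M-step: re-estimate each component from its weighted data points.
        for j in range(len(means)):
            nj = sum(row[j] for row in resp)
            weights[j] = nj / n
            means[j] = sum(row[j] * x for row, x in zip(resp, data)) / nj
            spread = sum(row[j] * (x - means[j]) ** 2 for row, x in zip(resp, data)) / nj
            variances[j] = max(spread, 1e-6)  # floor keeps the variance positive
    return means, variances, weights, responsibilities(data, means, variances, weights)


if __name__ == "__main__":
    # Toy sample with two visible groups (assumed data, not from the paper).
    sample = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
    fitted = fit_mixture(sample, [0.0, 6.0], [1.0, 1.0], [0.5, 0.5])
    print("means:", fitted[0])
    print("weights:", fitted[2])
```

Each row returned by `responsibilities` is the soft cluster assignment of one data point, and `weights` plays the role of the hidden-variable probability that the traditional model treats as a bare parameter.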

## Keywords

expectation maximization (EM) algorithm; finite mixture model; conditional mixture model; regression model; adaptive regressive model (ARM)

## Subject

Computer Science and Mathematics, Probability and Statistics

Comment 1
Commenter: Loc Nguyen
Commenter's Conflict of Interests: Author
Comment: Fixing bugs in mathematical formulas and adding equation 3.17 as alternate regressive evaluation.