Submitted: 31 January 2024
Posted: 01 February 2024
Abstract
Keywords:
MSC: 62H12; 62F15
1. Introduction
2. Model Description
2.1. Two-Part Latent Variable Model
2.2. Bayesian Feature Selection
3. Bayesian Inference
3.1. Prior Specification and MCMC Sampling
3.2. MCMC Sampling
- draw from ,
- draw from ,
- draw from ,
- draw from , and
- draw from .
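
As a rough illustration of how a sampler of this kind cycles through full conditional draws, the Python sketch below runs a Gibbs sampler with a spike-and-slab (SS) prior on an ordinary linear regression. This is a deliberately simpler setting than the two-part latent variable model and is not the sampler used in this paper. The function name `ssvs_gibbs`, the normal spike and slab scales `tau0` and `tau1`, the fixed inclusion probability `pi`, and the inverse-gamma hyperparameters `a` and `b` are all assumptions made for the example.

```python
import numpy as np

def ssvs_gibbs(X, y, n_iter=2000, burn_in=500, tau0=0.1, tau1=3.0,
               pi=0.5, a=1.0, b=1.0, seed=0):
    """Gibbs sampler for linear regression with a two-component normal
    spike-and-slab prior (hypothetical helper, illustrative only).

    beta_j | gamma_j ~ N(0, tau1^2) if gamma_j = 1, else N(0, tau0^2)
    gamma_j ~ Bernoulli(pi),  sigma^2 ~ Inverse-Gamma(a, b)
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    gamma = np.ones(p, dtype=int)
    sigma2 = 1.0
    keep_beta = np.zeros((n_iter - burn_in, p))
    keep_gamma = np.zeros((n_iter - burn_in, p))

    for it in range(n_iter):
        # 1) draw beta from its multivariate normal full conditional
        prior_var = np.where(gamma == 1, tau1 ** 2, tau0 ** 2)
        V = np.linalg.inv(XtX / sigma2 + np.diag(1.0 / prior_var))
        V = 0.5 * (V + V.T)  # symmetrize against round-off
        m = V @ Xty / sigma2
        beta = rng.multivariate_normal(m, V)

        # 2) draw each indicator gamma_j from its Bernoulli full conditional
        log_slab = np.log(pi) - np.log(tau1) - 0.5 * (beta / tau1) ** 2
        log_spike = np.log1p(-pi) - np.log(tau0) - 0.5 * (beta / tau0) ** 2
        prob_slab = 1.0 / (1.0 + np.exp(log_spike - log_slab))
        gamma = (rng.random(p) < prob_slab).astype(int)

        # 3) draw sigma^2 from its inverse-gamma full conditional
        resid = y - X @ beta
        rate = b + 0.5 * resid @ resid
        sigma2 = 1.0 / rng.gamma(a + 0.5 * n, 1.0 / rate)

        if it >= burn_in:
            keep_beta[it - burn_in] = beta
            keep_gamma[it - burn_in] = gamma

    # posterior means of the coefficients and posterior inclusion frequencies
    return keep_beta.mean(axis=0), keep_gamma.mean(axis=0)

# Usage on simulated data: two active and two inactive predictors.
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.normal(size=(200, 4))
y_demo = X_demo @ np.array([2.0, 0.0, -1.5, 0.0]) + demo_rng.normal(size=200)
beta_hat, inclusion = ssvs_gibbs(X_demo, y_demo)
print(np.round(beta_hat, 2), np.round(inclusion, 2))
```

In this sketch a predictor would typically be flagged as selected when its posterior inclusion frequency exceeds 0.5; that threshold is a common convention and is assumed here rather than taken from the text.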
4. Simulation Study
5. China Household Finance Survey Data
| Par | SS Est. | SS SD | BaLsso Est. | BaLsso SD | Par | SS Est. | SS SD | BaLsso Est. | BaLsso SD |
|---|---|---|---|---|---|---|---|---|---|
|  | -0.835 | 0.078 | -0.838 | 0.080 |  | 9.782 | 0.152 | 9.670 | 0.125 |
|  | 0.050 | 0.063 | 0.076 | 0.070 |  | -0.137 | 0.103 | -0.107 | 0.088 |
|  | -0.750 | 0.099 | -0.757 | 0.102 |  | -0.147 | 0.141 | -0.015 | 0.081 |
|  | 0.107 | 0.085 | 0.147 | 0.088 |  | -0.022 | 0.065 | -0.006 | 0.075 |
|  | 0.428 | 0.062 | 0.072 | 0.070 |  | -0.019 | 0.060 | -0.029 | 0.069 |
|  | 0.577 | 0.070 | 0.082 | 0.081 |  | 0.259 | 0.123 | 0.322 | 0.107 |
|  | 0.004 | 0.040 | 0.005 | 0.052 |  | 0.035 | 0.058 | 0.053 | 0.067 |
|  | 0.118 | 0.079 | 0.130 | 0.079 |  | 0.043 | 0.072 | 0.281 | 0.113 |
|  | 0.747 | 0.073 | 0.092 | 0.077 |  | 0.384 | 0.132 | 0.188 | 0.118 |
|  | -0.059 | 0.112 | -0.039 | 0.092 |  | 1.205 | 0.106 | 1.910 | 0.104 |
|  | 0.312 | 0.150 | 0.300 | 0.152 |  |  |  |  |  |
|  | -0.791 | 0.062 | -0.714 | 0.057 |  |  |  |  |  |
|  | -0.865 | 0.067 | -0.625 | 0.068 |  |  |  |  |  |
6. Discussion
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
| TPM | Two-part model |
| TPLVM | Two-part latent variable model |
| SS | Spike and slab bimodal prior |
| BaLsso | Bayesian lasso |
| MCMC | Markov chain Monte Carlo |
| CHFS | China Household Finance Survey |
Appendix A




| PAR | SS BIAS | SS RMS | SS SD | BaLsso BIAS | BaLsso RMS | BaLsso SD |
|---|---|---|---|---|---|---|
|  | -0.015 | 0.097 | 0.129 | 0.028 | 0.150 | 0.134 |
|  | -0.056 | 0.143 | 0.142 | -0.152 | 0.217 | 0.136 |
|  | -0.001 | 0.021 | 0.061 | -0.019 | 0.042 | 0.079 |
|  | -0.144 | 0.216 | 0.145 | -0.122 | 0.251 | 0.148 |
|  | 0.005 | 0.030 | 0.064 | -0.008 | 0.040 | 0.078 |
|  | -0.091 | 0.147 | 0.137 | -0.045 | 0.135 | 0.137 |
|  | 0.017 | 0.028 | 0.075 | 0.026 | 0.055 | 0.096 |
|  | -0.187 | 0.237 | 0.184 | -0.126 | 0.209 | 0.184 |
|  | 0.010 | 0.079 | 0.084 | 0.008 | 0.063 | 0.085 |
|  | -0.035 | 0.079 | 0.077 | -0.011 | 0.065 | 0.074 |
|  | 0.005 | 0.032 | 0.051 | -0.018 | 0.031 | 0.054 |
|  | -0.007 | 0.061 | 0.070 | -0.021 | 0.085 | 0.069 |
|  | -0.007 | 0.029 | 0.049 | -0.003 | 0.031 | 0.053 |
|  | -0.070 | 0.093 | 0.077 | -0.018 | 0.082 | 0.075 |
|  | -0.040 | 0.086 | 0.089 | -0.02 | 0.069 | 0.088 |
|  | -0.011 | 0.033 | 0.062 | 0.014 | 0.036 | 0.069 |
|  | 0.085 | 0.129 | 0.117 | 0.038 | 0.082 | 0.111 |
|  | 0.042 | 0.078 | 0.073 | 0.058 | 0.098 | 0.071 |
|  | 0.030 | 0.072 | 0.071 | 0.034 | 0.063 | 0.072 |
|  | 0.058 | 0.079 | 0.072 | 0.052 | 0.090 | 0.073 |
|  | 0.031 | 0.060 | 0.072 | 0.037 | 0.064 | 0.073 |
|  | 0.014 | 0.041 | 0.074 | 0.018 | 0.058 | 0.076 |
| Total | - | 1.870 | 1.975 | - | 2.016 | 2.035 |
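
The BIAS, RMS, and SD columns can be read with the help of a small Python sketch. It assumes the definitions commonly used in simulation studies of this kind: BIAS is the mean estimate minus the true value, RMS is the root mean square error over replications, and SD is the average posterior standard deviation. These definitions, and the helper name `summarize_replications`, are assumptions rather than statements from this section. The last lines check that the Total row behaves like a column sum, using the SS RMS values from the table above.

```python
import numpy as np

def summarize_replications(estimates, true_value, posterior_sds):
    """Summarize one parameter over simulation replications
    (hypothetical helper; definitions are assumed, see the lead-in)."""
    estimates = np.asarray(estimates, dtype=float)
    bias = estimates.mean() - true_value                      # BIAS
    rms = np.sqrt(np.mean((estimates - true_value) ** 2))     # RMS
    sd = float(np.mean(posterior_sds))                        # SD
    return bias, rms, sd

# Toy usage with made-up replication results for a parameter whose true value is 1.
bias, rms, sd = summarize_replications([0.9, 1.1, 1.0, 1.2], 1.0,
                                       [0.1, 0.1, 0.1, 0.1])
print(bias, rms, sd)

# SS RMS column of the table above; its sum reproduces the reported Total.
ss_rms = [0.097, 0.143, 0.021, 0.216, 0.030, 0.147, 0.028, 0.237,
          0.079, 0.079, 0.032, 0.061, 0.029, 0.093, 0.086, 0.033,
          0.129, 0.078, 0.072, 0.079, 0.060, 0.041]
total_ss_rms = float(np.sum(ss_rms))
print(round(total_ss_rms, 3))
```

A smaller Total therefore indicates better overall accuracy across all parameters for the given prior.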

| PAR | SS BIAS | SS RMS | SS SD | BaLsso BIAS | BaLsso RMS | BaLsso SD |
|---|---|---|---|---|---|---|
|  | 0.052 | 0.096 | 0.087 | 0.009 | 0.092 | 0.087 |
|  | 0.005 | 0.069 | 0.089 | 0.055 | 0.117 | 0.090 |
|  | 0.003 | 0.048 | 0.058 | 0.032 | 0.052 | 0.060 |
|  | 0.007 | 0.086 | 0.093 | -0.045 | 0.076 | 0.091 |
|  | 0.004 | 0.015 | 0.049 | -0.020 | 0.043 | 0.060 |
|  | 0.010 | 0.071 | 0.086 | 0.013 | 0.074 | 0.085 |
|  | -0.003 | 0.029 | 0.059 | 0.032 | 0.064 | 0.077 |
|  | 0.002 | 0.102 | 0.120 | -0.042 | 0.108 | 0.114 |
|  | 0.017 | 0.042 | 0.053 | 0.030 | 0.056 | 0.054 |
|  | -0.023 | 0.038 | 0.046 | -0.016 | 0.039 | 0.047 |
|  | -0.007 | 0.019 | 0.033 | -0.005 | 0.018 | 0.037 |
|  | -0.028 | 0.060 | 0.042 | -0.014 | 0.026 | 0.043 |
|  | -0.007 | 0.023 | 0.033 | 0.000 | 0.018 | 0.036 |
|  | -0.005 | 0.035 | 0.046 | 0.003 | 0.043 | 0.047 |
|  | -0.031 | 0.058 | 0.053 | -0.039 | 0.063 | 0.054 |
|  | -0.001 | 0.031 | 0.045 | -0.025 | 0.081 | 0.053 |
|  | 0.018 | 0.049 | 0.068 | 0.041 | 0.053 | 0.071 |
|  | 0.021 | 0.041 | 0.045 | 0.033 | 0.038 | 0.045 |
|  | 0.016 | 0.049 | 0.045 | 0.028 | 0.038 | 0.045 |
|  | 0.032 | 0.049 | 0.045 | 0.054 | 0.057 | 0.045 |
|  | 0.043 | 0.059 | 0.046 | 0.043 | 0.054 | 0.046 |
|  | 0.016 | 0.043 | 0.049 | 0.005 | 0.037 | 0.048 |
| Total | - | 1.112 | 1.29 | - | 1.247 | 1.335 |

| PAR | SS | SS | SS | BaLsso | BaLsso | BaLsso |
|---|---|---|---|---|---|---|
|  | 100 | 100 | 100 | 100 | 100 | 100 |
|  | 98 | 96 | 85 | 88 | 86 | 76 |
|  | 100 | 100 | 100 | 100 | 100 | 100 |
|  | 96 | 95 | 86 | 93 | 93 | 85 |
|  | 100 | 100 | 100 | 100 | 100 | 100 |
|  | 96 | 94 | 93 | 97 | 92 | 87 |
|  | 100 | 100 | 100 | 100 | 100 | 100 |
|  | 99 | 100 | 100 | 100 | 100 | 100 |
|  | 100 | 99 | 95 | 100 | 98 | 93 |
|  | 100 | 100 | 100 | 100 | 100 | 100 |
|  | 100 | 100 | 97 | 98 | 100 | 91 |
|  | 100 | 100 | 100 | 100 | 100 | 100 |
|  | 100 | 100 | 100 | 100 | 100 | 100 |
|  | 100 | 98 | 97 | 97 | 96 | 96 |

| Variable | Description | Mean | Max | Min | SD |
|---|---|---|---|---|---|
| Gender | =1, male; =0, otherwise | 0.756 | 1 | 0 | 0.430 |
| Age |  | 51.81 | 91 | 19 | 14.931 |
| Marital status | =1, married; =0, otherwise | 0.863 | 1 | 0 | 0.344 |
| Health condition | =1, good; =0, otherwise | 0.833 | 1 | 0 | 0.373 |
| Education degree | =1, high school or above; =0, otherwise | 0.352 | 1 | 0 | 0.478 |
| Employment | =1, yes; =0, otherwise | 0.092 | 1 | 0 | 0.290 |
| No. of adults |  | 3.002 | 3 | 0 | 1.301 |
| Annual income (CNY) | 0 |

| VAR | Part one: SS | Part one: BaLsso | Part two: SS | Part two: BaLsso |
|---|---|---|---|---|
| Gender | 0 | 0 | 1 | 1 |
| Age | 1 | 1 | 1 | 0 |
| Marital status | 1 | 1 | 0 | 0 |
| Health condition | 1 | 0 | 0 | 0 |
| Education | 1 | 0 | 1 | 1 |
| Employment | 0 | 0 | 0 | 0 |
| No. of adults | 1 | 1 | 0 | 1 |
| Income | 1 | 0 | 1 | 1 |
| Family culture | 0 | 0 | 1 | 1 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).