Clustering and Prediction With Variable Dimension Covariates

Date
2022
Abstract
In many applied fields, incomplete covariate vectors are commonly encountered. It is well known that this can be problematic when making inference on model parameters, but its impact on prediction performance is less understood. We develop a method based on covariate dependent random partition models that seamlessly handles missing covariates while completely avoiding any type of imputation. The method allows in-sample as well as out-of-sample predictions, even if the missing pattern in the new subjects' incomplete covariate vector was not seen in the training data. Covariates of any data type, including categorical and continuous, are permitted. In simulation studies, the proposed method compares favorably. We illustrate the method in two application examples. Supplementary materials for this article are available here.
Keywords
Bayesian nonparametrics, Dependent random partition models, Indicator-missing, Pattern missing