Browsing by Author "Quintana, Fernando Andres"
- A Projection Approach to Local Regression with Variable-Dimension Covariates (2024). Heiner, Matthew J.; Page, Garritt L.; Quintana, Fernando Andres. Incomplete covariate vectors are known to be problematic for estimation and inferences on model parameters, but their impact on prediction performance is less understood. We develop an imputation-free method that builds on a random partition model admitting variable-dimension covariates. Cluster-specific response models further incorporate covariates via linear predictors, facilitating estimation of smooth prediction surfaces with relatively few clusters. We exploit marginalization techniques of Gaussian kernels to analytically project response distributions according to any pattern of missing covariates, yielding a local regression with internally consistent uncertainty propagation that uses only one set of coefficients per cluster. Aggressive shrinkage of these coefficients regulates uncertainty due to missing covariates. The method allows in- and out-of-sample prediction for any missingness pattern, even if the pattern in a new subject's incomplete covariate vector was not seen in the training data. We develop an MCMC algorithm for posterior sampling that improves a computationally expensive update for latent cluster allocation. Finally, we demonstrate the model's effectiveness for nonlinear point and density prediction under various circumstances by comparing with other recent methods for regression of variable dimensions on synthetic and real data. Supplemental materials for this article are available online.
- Biclustering via Semiparametric Bayesian Inference (2022). Murua, Alejandro; Quintana, Fernando Andres. Motivated by classes of problems frequently found in the analysis of gene expression data, we propose a semiparametric Bayesian model to detect biclusters, that is, subsets of individuals sharing similar patterns over a set of conditions. Our approach is based on the well-known plaid model by Lazzeroni and Owen (2002). By assuming a truncated stick-breaking prior we also find the number of biclusters present in the data as part of the inference. Evidence from a simulation study shows that the model is capable of correctly detecting biclusters and performs well compared to some competing approaches. The flexibility of the proposed prior is demonstrated with applications to the analysis of gene expression data (continuous responses) and histone modifications data (count responses).
- Childhood obesity in Singapore: A Bayesian nonparametric approach (2024). Beraha, Mario; Guglielmi, Alessandra; Quintana, Fernando Andres; De Iorio, Maria; Eriksson, Johan Gunnar; Yap, Fabian. Overweight and obesity in adults are known to be associated with increased risk of metabolic and cardiovascular diseases. Obesity has now reached epidemic proportions, increasingly affecting children. Therefore, it is important to understand if this condition persists from early life to childhood and if different patterns can be detected to inform intervention policies. Our motivating application is a study of temporal patterns of obesity in children from South Eastern Asia. Our main focus is on clustering obesity patterns after adjusting for the effect of baseline information. Specifically, we consider a joint model for height and weight over time. Measurements are taken every six months from birth. To allow for data-driven clustering of trajectories, we assume a vector autoregressive sampling model with a dependent logit stick-breaking prior. Simulation studies show good performance of the proposed model to capture overall growth patterns, as compared to other alternatives. We also fit the model to the motivating dataset, and discuss the results, in particular highlighting cluster differences. We have found four large clusters, corresponding to children sub-groups, though two of them are similar in terms of both height and weight at each time point. We provide interpretation of these clusters in terms of combinations of predictors.
- Multipartition model for multiple change point identification (2023). Pedroso, Ricardo C.; Loschi, Rosangela H.; Quintana, Fernando Andres. The product partition model (PPM) is widely used for detecting multiple change points. Because changes in different parameters may occur at different times, the PPM fails to identify which parameters experienced the changes. To solve this limitation, we introduce a multipartition model to detect multiple change points occurring in several parameters. It assumes that changes experienced by each parameter generate a different random partition along the time axis, which facilitates identifying those parameters that changed and the time when they do so. We apply our model to detect multiple change points in Normal means and variances. Simulations and data illustrations show that the proposed model is competitive and enriches the analysis of change point problems.
- Regression with Variable Dimension Covariates (2024). Mueller, Peter; Quintana, Fernando Andres; Page, Garritt L. Regression is one of the most fundamental statistical inference problems. A broad definition of regression problems is as estimation of the distribution of an outcome using a family of probability models indexed by covariates. Despite the ubiquitous nature of regression problems and the abundance of related methods and results, there is a surprising gap in the literature. There are no well-established methods for regression with varying-dimension covariate vectors, despite the common occurrence of such problems. In this paper we review some recent related papers proposing varying-dimension regression by way of random partitions.