An accelerated algorithm for density estimation in large databases using Gaussian mixtures

dc.contributor.author: Soto, Alvaro
dc.contributor.author: Zavala, Felipe
dc.contributor.author: Araneda, Anita
dc.date.accessioned: 2025-01-21T01:05:38Z
dc.date.available: 2025-01-21T01:05:38Z
dc.date.issued: 2007
dc.description.abstract: Today, advances in computer storage and technology have made huge datasets available, offering an opportunity to extract valuable information. Probabilistic approaches are especially suited to learning from data by representing knowledge as density functions. In this paper, we choose Gaussian mixture models (GMMs) to represent densities, as they are flexible enough to adapt to a wide class of problems. The classical estimation approach for GMMs is the iterative expectation-maximization (EM) algorithm. This approach, however, does not scale well to the highly demanding processing requirements of large databases. In this paper we introduce an EM-based algorithm that solves the scalability problem. Our approach is based on the concept of data condensation, which, in addition to substantially reducing the computational load, provides sound starting values that allow the algorithm to converge faster. We also address the model selection problem. We test our algorithm on synthetic and real databases and find several advantages over other standard procedures.
dc.fuente.origen: WOS
dc.identifier.doi: 10.1080/01969720601138928
dc.identifier.issn: 0196-9722
dc.identifier.uri: https://doi.org/10.1080/01969720601138928
dc.identifier.uri: https://repositorio.uc.cl/handle/11534/95979
dc.identifier.wosid: WOS:000244685800001
dc.issue.numero: 2
dc.language.iso: en
dc.pagina.final: 139
dc.pagina.inicio: 123
dc.revista: Cybernetics and Systems
dc.rights: restricted access
dc.subject.ods: 03 Good Health and Well-being
dc.subject.odspa: 03 Salud y bienestar
dc.title: An accelerated algorithm for density estimation in large databases using Gaussian mixtures
dc.type: article
dc.volumen: 38
sipa.index: WOS
sipa.trazabilidad: WOS;2025-01-12