Regularizing conjunctive features for classification

dc.contributor.author: Barcelo, Pablo
dc.contributor.author: Baumgartner, Alexander
dc.contributor.author: Dalmau, Victor
dc.contributor.author: Kimelfeld, Benny
dc.date.accessioned: 2025-01-20T23:51:29Z
dc.date.available: 2025-01-20T23:51:29Z
dc.date.issued: 2021
dc.description.abstract: We consider the feature-generation task wherein we are given a database with entities labeled as positive and negative examples, and we want to find feature queries that linearly separate the two sets of examples. We focus on conjunctive feature queries, and explore two problems: (a) deciding if separating feature queries exist (separability), and (b) generating such queries when they exist. To restrict the complexity of the generated classifiers, we explore various ways of regularizing them by limiting their dimension, the number of joins in feature queries, and their generalized hypertree width (ghw). We show that the separability problem is tractable for bounded ghw; yet, the generation problem is not, because feature queries might be too large. So, we explore a third problem: classifying new entities without necessarily generating the feature queries. Interestingly, in the case of bounded ghw we can efficiently classify without explicitly generating such queries. © 2021 Elsevier Inc. All rights reserved.
dc.fuente.origen: WOS
dc.identifier.doi: 10.1016/j.jcss.2021.01.003
dc.identifier.eissn: 1090-2724
dc.identifier.issn: 0022-0000
dc.identifier.uri: https://doi.org/10.1016/j.jcss.2021.01.003
dc.identifier.uri: https://repositorio.uc.cl/handle/11534/94822
dc.identifier.wosid: WOS:000634149800007
dc.language.iso: en
dc.pagina.final: 124
dc.pagina.inicio: 97
dc.revista: Journal of Computer and System Sciences
dc.rights: restricted access
dc.subject: Classification
dc.subject: Feature generation
dc.subject: Conjunctive queries
dc.subject: Separability
dc.subject: Generalized hypertree width
dc.subject.ods: 03 Good Health and Well-being
dc.subject.odspa: 03 Salud y bienestar
dc.title: Regularizing conjunctive features for classification
dc.type: article
dc.volumen: 119
sipa.index: WOS
sipa.trazabilidad: WOS;2025-01-12