Learning Sentence-Level Representations with Predictive Coding

dc.contributor.author: Araujo, Vladimir
dc.contributor.author: Moens, Marie-Francine
dc.contributor.author: Soto, Alvaro
dc.date.accessioned: 2025-01-20T20:15:54Z
dc.date.available: 2025-01-20T20:15:54Z
dc.date.issued: 2023
dc.description.abstract: Learning sentence representations is an essential and challenging topic in the deep learning and natural language processing communities. Recent methods pre-train large models on massive text corpora, focusing mainly on learning representations of contextualized words. As a result, these models cannot generate informative sentence embeddings, since they do not explicitly exploit the structure and discourse relationships that exist between contiguous sentences. Drawing inspiration from human language processing, this work explores how to improve the sentence-level representations of pre-trained models by borrowing ideas from predictive coding theory. Specifically, we extend BERT-style models with bottom-up and top-down computation to predict future sentences in latent space at each intermediate layer of the network. We conduct extensive experiments on various benchmarks for English and Spanish, designed to assess sentence- and discourse-level representations as well as pragmatics knowledge. Our results show that our approach consistently improves sentence representations for both languages. Furthermore, the experiments indicate that our models capture discourse and pragmatics knowledge. Finally, to validate the proposed method, we carried out an ablation study and a qualitative study, which verify that the predictive mechanism helps to improve the quality of the representations.
dc.fuente.origen: WOS
dc.identifier.doi: 10.3390/make5010005
dc.identifier.eissn: 2504-4990
dc.identifier.uri: https://doi.org/10.3390/make5010005
dc.identifier.uri: https://repositorio.uc.cl/handle/11534/92292
dc.identifier.wosid: WOS:000959706300001
dc.issue.numero: 1
dc.language.iso: en
dc.pagina.final: 77
dc.pagina.inicio: 59
dc.revista: Machine Learning and Knowledge Extraction
dc.rights: restricted access
dc.subject: deep learning
dc.subject: representation learning
dc.subject: natural language processing
dc.subject: language models
dc.subject: BERT
dc.subject: predictive coding
dc.title: Learning Sentence-Level Representations with Predictive Coding
dc.type: article
dc.volumen: 5
sipa.index: WOS
sipa.trazabilidad: WOS;2025-01-12
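The abstract describes extending BERT-style models so that each intermediate layer predicts the representation of the next sentence in latent space, with the prediction error driving learning as in predictive coding theory. A minimal, hypothetical sketch of that core idea follows; it is toy NumPy code with an invented hidden size and a single linear predictor, not the authors' implementation, which operates on real BERT layers with bottom-up and top-down computation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # hypothetical hidden size; the paper works with BERT-style layer widths

# Toy "sentence representations" at one intermediate layer:
# each row stands in for the latent vector of one sentence in a document.
sentences = rng.normal(size=(5, d))

# Hypothetical predictor: a linear map trained so that the representation
# of sentence t predicts the representation of sentence t+1 in latent space.
W = np.zeros((d, d))
lr = 0.05
for _ in range(200):
    pred = sentences[:-1] @ W            # predicted next-sentence representations
    err = pred - sentences[1:]           # prediction error, the predictive-coding signal
    W -= lr * sentences[:-1].T @ err / len(err)  # gradient step on the MSE of the error

mse = float(np.mean((sentences[:-1] @ W - sentences[1:]) ** 2))
print(f"final next-sentence prediction MSE: {mse:.6f}")
```

In the actual model this prediction happens at every intermediate layer, so minimizing the prediction error shapes the representations themselves rather than only a separate predictor, which is what the abstract credits for the improved sentence embeddings.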