On the Unexpected Effectiveness of Reinforcement Learning for Sequential Recommendation

dc.catalogador: pva
dc.contributor.author: Labarca Silva, Álvaro
dc.contributor.author: Parra Santander, Denis
dc.contributor.author: Toro Icarte, Rodrigo Andrés
dc.date.accessioned: 2025-06-12T20:01:50Z
dc.date.available: 2025-06-12T20:01:50Z
dc.date.issued: 2024
dc.description.abstract: In recent years, Reinforcement Learning (RL) has shown great promise in session-based recommendation. Sequential models that use RL have reached state-of-the-art performance on the Next-item Prediction (NIP) task. This result is intriguing, since the NIP task only evaluates how well the system can recommend the next item to the user, while the goal of RL is to find a policy that optimizes rewards in the long term, sometimes at the expense of short-term performance. How, then, can RL improve the system's performance on short-term metrics? This article investigates this question by exploring proxy learning objectives: goals the RL models might actually be pursuing that would explain the performance boost. We find that RL, when used as an auxiliary loss, promotes the learning of embeddings that capture information about the items the user has previously interacted with. We then replace the RL objective with a straightforward auxiliary loss designed to predict the number of items the user has interacted with, a substitution that yields performance gains comparable to RL (an illustrative sketch follows this record). These findings pave the way to improving both the performance and the understanding of RL methods for recommender systems.
dc.description.funder: National Center for Artificial Intelligence CENIA
dc.description.funder: Fondecyt
dc.fechaingreso.objetodigital: 2025-06-12
dc.format.extent: 19 pages
dc.fuente.origen: SCOPUS
dc.identifier.issn: 2640-3498
dc.identifier.scopusid: SCOPUS_ID:85203845318
dc.identifier.uri: https://proceedings.mlr.press/v235/silva24b.html
dc.identifier.uri: https://repositorio.uc.cl/handle/11534/104663
dc.information.autoruc: Escuela de Ingeniería; Labarca Silva, Álvaro; S/I; 1025772
dc.information.autoruc: Escuela de Ingeniería; Parra Santander, Denis; 0000-0001-9878-8761; 1011554
dc.information.autoruc: Escuela de Ingeniería; Toro Icarte, Rodrigo Andrés; 0000-0002-7734-099X; 170373
dc.language.iso: en
dc.nota.acceso: full text
dc.pagina.final: 45450
dc.pagina.inicio: 45432
dc.publisher: ML Research Press
dc.relation.ispartof: Proceedings of the 41st International Conference on Machine Learning
dc.revista: PMLR
dc.rights: open access
dc.rights.license: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject.ddc: 620
dc.subject.dewey: Engineering
dc.title: On the Unexpected Effectiveness of Reinforcement Learning for Sequential Recommendation
dc.type: conference paper
dc.volumen: 235
sipa.codpersvinculados: 1025772
sipa.codpersvinculados: 1011554
sipa.codpersvinculados: 170373
sipa.trazabilidad: SCOPUS;2024-09-22
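
The abstract's central idea lends itself to a short illustration. Below is a minimal sketch, assuming a PyTorch GRU session encoder: a next-item prediction (NIP) head and an auxiliary head that regresses the number of items the user has interacted with share one session embedding, and the auxiliary term stands in for the RL loss. The names NextItemRecommender, joint_loss, and the weight alpha are hypothetical, not details from the paper.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class NextItemRecommender(nn.Module):
        # Hypothetical sketch, not the paper's implementation: a GRU session
        # encoder whose final state feeds two heads sharing one embedding,
        # a next-item prediction (NIP) head and an auxiliary head regressing
        # the number of items the user has interacted with.
        def __init__(self, num_items, dim=64):
            super().__init__()
            self.item_emb = nn.Embedding(num_items, dim)
            self.encoder = nn.GRU(dim, dim, batch_first=True)
            self.nip_head = nn.Linear(dim, num_items)  # logits over the catalog
            self.count_head = nn.Linear(dim, 1)        # interaction-count regressor

        def forward(self, item_seq):  # item_seq: (batch, seq_len) of item ids
            hidden, _ = self.encoder(self.item_emb(item_seq))
            session = hidden[:, -1]   # last-step session embedding
            return self.nip_head(session), self.count_head(session).squeeze(-1)

    def joint_loss(nip_logits, count_pred, next_item, true_count, alpha=0.5):
        # Cross-entropy for NIP plus the count-regression auxiliary term that
        # replaces the RL loss; alpha is an assumed weighting factor.
        return (F.cross_entropy(nip_logits, next_item)
                + alpha * F.mse_loss(count_pred, true_count.float()))

Any sequential encoder could stand in for the GRU; the point is only that the count head shapes the shared session embedding to encode how many items came before, which is the proxy signal the paper identifies behind RL's short-term gains.
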
Files
Original bundle
Name: On the Unexpected Effectiveness of Reinforcement Learning.pdf
Size: 342.36 KB
Format: Adobe Portable Document Format