Imitating Human Reasoning to Extract 5W1H in News

Date
2025
Publisher
ACM Digital Library
Abstract
Extracting key information from news articles is crucial for advancing search systems. Historically, the 5W1H framework, which organises information by 'Who', 'What', 'When', 'Where', 'Why', and 'How', has been a predominant method in digital journalism for empowering search tools. The rise of Large Language Models (LLMs) has sparked new research into their potential for performing such information extraction tasks effectively. Our study examines a novel approach to employing LLMs in the 5W1H extraction process, focusing in particular on their capacity to mimic human reasoning. We introduce two Chain-of-Thought (CoT) prompting techniques for extracting 5W1H in news: extractive reasoning and question-level reasoning. The former directs the LLM to pinpoint and highlight essential details from the text, while the latter encourages the model to emulate human-like reasoning at the question-response level. Our methodology includes experiments with leading LLMs across prompting strategies to identify the most effective approach. The results indicate that CoT prompting significantly outperforms the other methods. In addition, we show that the effectiveness of LLMs on such tasks depends greatly on the nature of the question posed.
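The abstract does not reproduce the paper's prompt wording, but the two CoT styles it names can be sketched as templates. The following Python sketch is purely illustrative: the function names, template text, and `FIVE_W1H` list are assumptions, not the authors' actual prompts.

```python
# Illustrative sketch of the two CoT prompt styles named in the abstract.
# All wording below is a hypothetical reconstruction, not the paper's prompts.

FIVE_W1H = ["Who", "What", "When", "Where", "Why", "How"]


def build_extractive_prompt(article: str) -> str:
    """Extractive reasoning: ask the LLM to first quote the key spans
    from the text, then answer each 5W1H question from those spans."""
    questions = "\n".join(f"- {q}?" for q in FIVE_W1H)
    return (
        "Read the news article below. First, quote the sentences that "
        "contain the key facts. Then, using only those quotes, answer:\n"
        f"{questions}\n\nArticle:\n{article}"
    )


def build_question_level_prompt(article: str) -> str:
    """Question-level reasoning: ask the LLM to reason step by step
    about each 5W1H question before committing to an answer."""
    steps = "\n".join(
        f"{i}. Consider what '{q}' refers to in this story, explain "
        "your reasoning, then state the answer."
        for i, q in enumerate(FIVE_W1H, 1)
    )
    return (
        f"Article:\n{article}\n\n"
        f"For each question below, reason step by step first:\n{steps}"
    )


if __name__ == "__main__":
    sample = "A fire broke out in downtown Lisbon on Monday."
    print(build_extractive_prompt(sample))
```

Either prompt string would then be sent to the LLM of choice; the study compares such strategies across several leading models.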
Keywords
5W1H, LLM, Imitative reasoning, News