Imitating Human Reasoning to Extract 5W1H in News

Date
2025
Publisher
ACM Digital Library
Abstract
Extracting key information from news articles is crucial for advancing search systems. Historically, the 5W1H framework, which organises information by 'Who', 'What', 'When', 'Where', 'Why', and 'How', has been a predominant method in digital journalism for empowering search tools. The rise of Large Language Models (LLMs) has sparked new research into their potential for performing such information extraction tasks effectively. Our study examines a novel approach to employing LLMs in the 5W1H extraction process, focusing in particular on their capacity to mimic human reasoning. We introduce two Chain-of-Thought (CoT) prompting techniques for extracting 5W1H in news: extractive reasoning and question-level reasoning. The former directs the LLM to pinpoint and highlight essential details from the text, while the latter encourages the model to emulate human-like reasoning at the question-response level. Our methodology includes experiments with leading LLMs across prompting strategies to identify the most effective approach. The results indicate that CoT prompting significantly outperforms the other methods. In addition, we show that the effectiveness of LLMs on such tasks depends greatly on the nature of the question posed.
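The abstract does not reproduce the paper's prompt wording, but the two CoT styles it names can be sketched as templates. The following Python sketch is purely illustrative: the function names, template text, and `FIVE_W1H` list are assumptions, not the authors' actual prompts.

```python
# Illustrative sketch of the two CoT prompt styles named in the abstract.
# All wording below is a hypothetical reconstruction, not the paper's prompts.

FIVE_W1H = ["Who", "What", "When", "Where", "Why", "How"]


def build_extractive_prompt(article: str) -> str:
    """Extractive reasoning: ask the LLM to first quote the key spans
    from the text, then answer each 5W1H question from those spans."""
    questions = "\n".join(f"- {q}?" for q in FIVE_W1H)
    return (
        "Read the news article below. First, quote the sentences that "
        "contain the key facts. Then, using only those quotes, answer:\n"
        f"{questions}\n\nArticle:\n{article}"
    )


def build_question_level_prompt(article: str) -> str:
    """Question-level reasoning: ask the LLM to reason step by step
    about each 5W1H question before committing to an answer."""
    steps = "\n".join(
        f"{i}. Consider what '{q}' refers to in this story, explain "
        "your reasoning, then state the answer."
        for i, q in enumerate(FIVE_W1H, 1)
    )
    return (
        f"Article:\n{article}\n\n"
        f"For each question below, reason step by step first:\n{steps}"
    )


if __name__ == "__main__":
    sample = "A fire broke out in downtown Lisbon on Monday."
    print(build_extractive_prompt(sample))
```

Either prompt string would then be sent to the LLM of choice; the study compares such strategies across several leading models.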
Keywords
5W1H, LLM, Imitative reasoning, News