A comparative dataset: Bridging COVID-19 and other diseases through Epistemonikos and CORD-19 evidence

dc.article.number109720
dc.contributor.authorCarvallo A.
dc.contributor.authorParra D.
dc.contributor.authorLobel H.
dc.contributor.authorRada G.
dc.contributor.otherCEDEUS (Chile)
dc.date.accessioned2025-05-01T10:33:19Z
dc.date.available2025-05-01T10:33:19Z
dc.date.issued2023
dc.description.abstract© 2023 The Author(s). The COVID-19 pandemic has underlined the need for reliable information for clinical decision-making and public health policy. Evidence-based medicine (EBM) is essential for identifying and evaluating scientific documents pertinent to novel diseases, and accurate classification of biomedical text is integral to that process. In this context, we introduce a curated dataset of COVID-19-related documents. It includes 20,047 labeled documents classified into five categories: systematic reviews (SR), primary study randomized controlled trials (PS-RCT), primary study non-randomized controlled trials (PS-NRCT), broad synthesis (BS), and excluded (EXC). The documents, labeled by collaborators from the Epistemonikos Foundation, include the document type, title, and abstract, along with metadata including PubMed ID, authors, journal, and publication date. Because the dataset was curated by the Epistemonikos Foundation, it is not readily obtainable through conventional web scraping. The dataset also includes an evidence repository of 427,870 non-COVID-19 documents, categorized into the same five classes (SR, PS-RCT, PS-NRCT, BS, and EXC), which can serve as a benchmark for subsequent research. This open-access dataset and its accompanying resources are intended to support evidence-based medicine and facilitate further research in the domain.
dc.description.funderCENIA
dc.description.funderEpistemonikos Foundation
dc.description.funderIMFD
dc.description.funderNational Center of Artificial Intelligence, Chile
dc.format.extent18 pages
dc.fuente.origenScopus
dc.identifier.doi10.1016/j.dib.2023.109720
dc.identifier.eisbn9780128194706
dc.identifier.eissn2046-2069
dc.identifier.isbn9783031764011
dc.identifier.issn2352-3409
dc.identifier.pubmedid40156360
dc.identifier.scieloidS0718-69242020000300109
dc.identifier.scopusidSCOPUS_ID:85175453006
dc.identifier.urihttps://doi.org/10.1016/j.dib.2023.109720
dc.identifier.urihttps://repositorio.uc.cl/handle/11534/103988
dc.identifier.wosidWOS:001105270200001
dc.information.autorucEscuela de Ingeniería; Lobel Diaz Hans Albert; 0000-0003-3514-9414; 131278
dc.issue.numero40
dc.language.isoen
dc.nota.accesoFull content
dc.pagina.final474
dc.pagina.inicio467
dc.publisherSpringer
dc.relation.ispartofIntersections Interdisciplinary Research on Architecture, Design, City and Territory
dc.revistaData in Brief
dc.rightsopen access
dc.subjectBiomedical text classification
dc.subjectCovid-19
dc.subjectEvidence based medicine
dc.subjectNatural language processing
dc.subject.ddc610
dc.subject.deweyMedicine and health
dc.subject.ods03 Good health and well-being
dc.subject.odspa03 Good health and well-being
dc.titleA comparative dataset: Bridging COVID-19 and other diseases through Epistemonikos and CORD-19 evidence
dc.typearticle
dc.volumen51
sipa.codpersvinculados131278
sipa.indexScopus
sipa.trazabilidadWOS-SCOPUS upload;01-05-2025