From Web to RheumaLpack: Creating a Linguistic Corpus for Exploitation and Knowledge Discovery in Rheumatology (Creative Commons)
Alfredo Madrid-García, Beatriz Merino‐Barbancho, D. Freites, et al.

Computers in Biology and Medicine, Journal Year: 2024, Volume and Issue: 179, P. 108920 - 108920

Published: July 23, 2024

This study introduces RheumaLinguisticpack (RheumaLpack), the first specialised linguistic web corpus designed for the field of musculoskeletal disorders. By combining web mining (i.e., web scraping) and natural language processing (NLP) techniques, as well as clinical expertise, RheumaLpack systematically captures and curates structured and unstructured data across a spectrum of web sources, including clinical trials registers (i.e., ClinicalTrials.gov), bibliographic databases (i.e., PubMed), medical agencies (i.e., European Medicines Agency), social media (i.e., Reddit), and accredited health websites (i.e., MedlinePlus, Harvard Health Publishing, and Cleveland Clinic). Given the complexity of rheumatic and musculoskeletal diseases (RMDs) and their significant impact on quality of life, this resource can be proposed as a useful tool to train algorithms that could mitigate the diseases' effects. Therefore, the corpus aims to improve the training of artificial intelligence (AI) algorithms and facilitate knowledge discovery in RMDs. The development of RheumaLpack involved a systematic six-step methodology covering data identification, characterisation, selection, collection, processing, and corpus description. The result is a non-annotated, monolingual, and dynamic corpus, featuring almost 3 million records spanning from 2000 to 2023. RheumaLpack represents a pioneering contribution to rheumatology research, providing a useful resource for the development of advanced AI and NLP applications. This corpus highlights the value of web data to address the challenges posed by musculoskeletal diseases, illustrating the corpus's potential to improve research and treatment paradigms in rheumatology. Finally, the methodology shown can be replicated to obtain data from other medical specialities. The code and details on how to build RheumaLpack are also provided to facilitate the dissemination of such a resource.

Language: English

Review or perish, regardless of your attempts to publish
Cesar Ramos‐Remus, Aldo Barajas‐Ochoa

The Lancet Rheumatology, Journal Year: 2025, Volume and Issue: unknown

Published: Jan. 1, 2025

Language: English

Citations

0

From Web to RheumaLpack: Creating a Linguistic Corpus for Exploitation and Knowledge Discovery in Rheumatology (Creative Commons)
Alfredo Madrid-García, Beatriz Merino‐Barbancho, D. Freites, et al.

Computers in Biology and Medicine, Journal Year: 2024, Volume and Issue: 179, P. 108920 - 108920

Published: July 23, 2024

Language: English

Citations

1