A quantitative analysis of the use of anonymization in biomedical research DOI Creative Commons
Thierry Meurers, Karen Otte, Hammam Abu Attieh

et al.

npj Digital Medicine, Journal Year: 2025, Volume and Issue: 8(1)

Published: May 14, 2025

Summary Anonymized biomedical data sharing faces several challenges. This systematic review analyzes 1084 PubMed-indexed studies (2018–2022) using anonymized biomedical data to quantify usage trends across geographic, regulatory, and cultural regions, to identify effective approaches, and to inform implementation agendas. We identified a significant yearly increase in such studies, with a slope of 2.16 articles per 100,000 when normalized against the total number of PubMed-indexed articles (p = 0.021). Most studies used data from the US, UK, and Australia (78.2%). This trend remained when normalized by country-specific research output. Cross-border sharing was rare (10.5% of studies). We identified twelve common data sources, primarily in the US (seven) and UK (three), including commercial (seven) and public entities (five). The prevalence of anonymization in the US, UK, and Australia suggests their practices could guide broader adoption. Rare cross-border sharing and differences between countries with comparable regulations underscore the need for global standards.
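The normalization behind the reported slope can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes the slope comes from an ordinary least-squares fit of the yearly rate (matching articles per 100,000 indexed articles) against publication year. The function names are hypothetical, and no real counts from the study are used.

```python
def per_100k(matching_counts, total_counts):
    """Convert yearly counts of matching articles into rates per 100,000 indexed articles."""
    return [100_000 * m / t for m, t in zip(matching_counts, total_counts)]


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def yearly_trend(years, matching_counts, total_counts):
    """Slope of the normalized rate, in articles per 100,000 per year."""
    return ols_slope(years, per_100k(matching_counts, total_counts))
```

Normalizing before fitting keeps overall growth in the literature from being mistaken for growth in the use of anonymized data.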

Language: English

From data to diagnosis: skin cancer image datasets for artificial intelligence DOI Creative Commons
David Wen, Andrew A. S. Soltan, Emanuele Trucco

et al.

Clinical and Experimental Dermatology, Journal Year: 2024, Volume and Issue: 49(7), P. 675 - 685

Published: March 29, 2024

Abstract Artificial intelligence (AI) solutions for skin cancer diagnosis continue to gain momentum, edging closer towards broad clinical use. These AI models, particularly deep-learning architectures, require large digital image datasets for development. This review provides an overview of the datasets used to develop AI algorithms and highlights the importance of dataset transparency for the evaluation of algorithm generalizability across varying populations and settings. Current challenges for the curation of clinically valuable datasets are detailed, which include dataset shifts arising from demographic variations and differences in data collection methodologies, along with inconsistencies in labelling. These shifts can lead to differential algorithm performance, compromise of clinical utility, and the propagation of discriminatory biases when developed algorithms are implemented in mismatched populations. Limited representation of rare skin cancers and minoritized groups in existing datasets is highlighted, which can further skew algorithm performance. Strategies to address these challenges are presented, which include improving dataset transparency and interoperability. Federated learning and generative methods, which may improve dataset size and diversity without compromising privacy, are also examined. Lastly, we discuss model-level techniques that may address biases entrained through the use of datasets derived from routine clinical care. As the role of AI in skin cancer diagnosis becomes more prominent, ensuring the robustness of the underlying datasets is increasingly important.

Language: English

Citations

8

Landscape analysis of available European data sources amenable for machine learning and recommendations on usability for rare diseases screening DOI Creative Commons
Ralitsa Raycheva, Костадин Костадинов, Elena Mitova

et al.

Orphanet Journal of Rare Diseases, Journal Year: 2024, Volume and Issue: 19(1)

Published: April 6, 2024

Abstract Background Patient registries and databases are essential tools for advancing clinical research in the area of rare diseases, as well as for enhancing patient care and healthcare planning. The primary aim of this study is a landscape analysis of available European data sources amenable to machine learning (ML) and their usability for rare diseases screening, in terms of findable, accessible, interoperable, reusable (FAIR), legal, and business considerations. Second, recommendations will be proposed to provide a better understanding of the health data ecosystem. Methods In the period of March 2022 to December 2022, a cross-sectional study using a semi-structured questionnaire was conducted among potential respondents, identified as the main contact person of health-related databases. The design of the self-completed survey instrument was based on information drawn from relevant scientific publications, quantitative and qualitative research, and a scoping review on the challenges in mapping rare disease (RD) databases. To determine database characteristics associated with adherence to the FAIR principles and with the legal and business aspects of database management, Bayesian models were fitted. Results In total, 330 unique replies were processed and analyzed, reflecting the same number of distinct databases (no duplicates included). In terms of geographical scope, we observed 24.2% (n = 80) national, 10.0% (n = 33) regional, 8.8% (n = 29) European, and 5.5% (n = 18) international databases coordinated in Europe. Over 80.0% (n = 269) of the databases were still active, with approximately 60.0% (n = 191) established after the year 2000 and 71.0% having last collected new data in 2022. Regarding geographical scope, European registries were associated with the highest overall FAIR adherence, while registries with regional and “other” geographical scope ranked at the bottom of the list with the lowest proportion. Responders’ willingness to share data as a contribution to the goals of the Screen4Care project was evaluated at the end of the survey. This question was completed by 108 respondents; however, only 18 of them (16.7%) expressed a direct willingness to contribute to the project by sharing their databases. Among them, an equal split between pro-bono and paid services was observed. Conclusions The most important results of our study demonstrate insufficient adherence to the FAIR principles and low willingness of EU health databases to share patient information, combined with some legislation incapacities, resulting in barriers to the secondary use of data.

Language: English

Citations

5

Competing interests: digital health and indigenous data sovereignty DOI Creative Commons
Ashley Cordes, Marieke Bak, Mataroria Lyndon

et al.

npj Digital Medicine, Journal Year: 2024, Volume and Issue: 7(1)

Published: July 4, 2024

Digital health is increasingly promoting open data. Although this approach promises a number of benefits, it also leads to tensions with Indigenous data sovereignty movements led by Indigenous peoples around the world who are asserting control over the use of data as part of self-determination. Digital health has a role in improving access to services and delivering improved health outcomes for Indigenous communities. However, we argue that in order to be effective and ethical, it is essential that the field engages more with Indigenous peoples' rights and interests. We discuss challenges and possible improvements for data acquisition, management, analysis, and integration as they pertain to the health of Indigenous communities around the world.

Language: English

Citations

5

SAGES video acquisition framework—analysis of available OR recording technologies by the SAGES AI task force DOI
Filippo Filicori, Daniel P. Bitner, Hans F. Fuchs

et al.

Surgical Endoscopy, Journal Year: 2023, Volume and Issue: 37(6), P. 4321 - 4327

Published: Feb. 2, 2023

Language: English

Citations

13

The MAIDA initiative: establishing a framework for global medical-imaging data sharing DOI Creative Commons

Agustina Saenz, Emma Chen, Henrik Marklund

et al.

The Lancet Digital Health, Journal Year: 2023, Volume and Issue: 6(1), P. e6 - e8

Published: Nov. 16, 2023

A central question in developing artificial intelligence (AI) for the interpretation of medical images is whether these algorithms will work safely and effectively across diverse patient populations clinical settings.1Rajpurkar P Lungren MP The current future state AI images.N Engl J Med. 2023; 388: 1981-1990Crossref PubMed Scopus (22) Google Scholar Public datasets are basis training validating models, making them essential rigorous assessment performance reliability that required by regulatory bodies such as US Food Drug Administration.2Seastedt KP Schwab O'Brien Z et al.Global healthcare fairness: we should be sharing more, not less, data.PLOS Digit Health. 2022; 1e0000102Crossref Scholar, 3Wu E Wu K Daneshjou R Ouyang D Ho DE Zou How devices evaluated: limitations recommendations from an analysis FDA approvals.Nat 2021; 27: 582-584Crossref (166) However, public seldom have diversity to adequately evaluate algorithmic generalisability.4Kaushal Altman Langlotz C Geographic distribution cohorts used train deep learning algorithms.JAMA. 2020; 324: 1212-1213Crossref (97) More comprehensive varied would improve models their ability generalise demographics, environments, imaging equipment, geographical regions. scarcity data also impedes optimal deployment strategies specific settings. For example, there undergo so-called site-specific fine-tuning, which refers process further a pre-trained model on local target site. This additional can help avoid decreased could result differences population between original site before approval new hospitals. issue has broad implications immediate care regulation. Comprehensive evaluations needed inform guidelines regarding dataset composition validation requirements regional adaptation.5Glocker B Robinson Castro DC Dou Q Konukoglu Machine with multi-site data: empirical study impact scanner effects.arXiv. 
2019; (published online Oct 10) (preprint).https://doi.org/10.48550/arXiv.1910.04597Google Here, introduce Medical Data All (MAIDA) initiative, pioneering framework global medical-imaging address shortage health enable evaluation all populations. Similar its Hindi namesake flour, MAIDA aims provide key ingredients thoroughly assess through rich, datasets. initiative development coordinating partners while remaining locally adaptable curate comprehensive, representative collaborative effort assembling at scale assessments collaboratively engaged range hospitals worldwide release focused but dataset. Our collection strategy was acquire 100 scans per setting, number balance obtaining sample each institution managing logistical considerations. size enabled us test robustness our shifts. outreach, individual researchers radiology departments worldwide. Furthermore, attracted interest individuals who learnt about various ways, including website, social media, or conference presentations. We refer individuals, whom directly collaborate institutions, champions. Although most participants were doctors, welcomed computer scientists primarily affiliated large academic centres. discovered knowledge invaluable maintaining quality. To project goals, conducted 30-min meetings outlined offered standard templates institutional review boards (IRBs) transfer use agreements (DTUAs) simplify compliance (figure). Despite efforts, protocols differed among institutions. Federal Demonstration Partnership (FDP) template standardise sharing. Challenges emerged, institutions unwilling publicly share modifications DTUAs, resulted delays. Local champions organisations often pivotal advancing process. majority accepted templates, taught flexible transparent communication consider timelines data-sharing preferences. provided de-identification. 
documentation delineated detailed inclusion exclusion criteria manual downloading data-recording worksheets, randomising selections, collecting samples met stipulations. guidelines, fully written documentation. this issue, detail. practices meaning universally applicable. countries other than USA rarely collect race patients, some chest x-rays patients indiscriminately, reports might absent settings.6Pinto AD Eissa Kiran T Mashford-Pringle Needham Dhalla I Considerations Indigenous identity during card renewal Canadian jurisdictions.CMAJ. 195: E880-E882Crossref (0) 7Burute N Jankharia Teleradiology: Indian perspective.Indian Radiol Imaging. 2009; 19: 16-18Crossref Early detection issues allowed adapt solutions cases instructional meetings. mitigate risk exposure protected information (PHI) transmission, complete de-identification adapted different types. structured data, worksheets designed omit variables classified PHI instructed modify any potentially identifying data. free-form reports, asked manually remove identifiers, names dates. images, specified displaying jaws teeth, partial skulls, jewellery. developed tool enhance digital communications medicine (DICOM) files, commonly contain extensive metadata—some PHI. preclude inadvertent disclosure PHI, extracted pixel values non-PHI metadata saved separately PNG CSV files. Partners only shared files metadata, thereby eliminating embedded DICOM metadata. permitted hospital security teams both content operational logic via Python scripts, if required. Before sharing, in-person partner validate correct execution data-collection data-de-identification procedures. Upon receipt meticulous research team confirm absence identifiers completeness Substantial time invested liaising partners. efforts streamline offering IRB DTUA well limiting requests records timeframe completion several months. Most delays attributable waiting approvals signatures, especially monthly ethics-committee complex legal-review mechanisms. 
According feedback multiple partners, actual tasks typically accomplished within 1 week. Chest widely radiological tests worldwide, yet prone error.8Gefter WB Post BA Hatabu H Commonly missed findings radiographs: causes consequences.Chest. 163: 650-661Summary Full Text PDF substantial focus research, generalisability existing settings remains insufficiently evaluated insufficient size, diversity, scope reliable annotations (appendix). quality breadth three settings: intensive unit (ICU), neonatal ICU, emergency department. In adult focuses automating endotracheal-tube frequent misplacements severe complications. precise placements, considering minimal error margin vulnerable group. department, targets quick consistent pneumonia machine efficiency collaborations clinicians AI.9Agarwal Moehring Rajpurkar Salz Combining human expertise intelligence: experimental evidence radiology.https://www.nber.org/system/files/working_papers/w31422/w31422.pdfDate: 2023Date accessed: November 9, 2023Google assemble dataset, collects includes reason x-ray, available demographic details (eg, age, race, sex). nuanced understanding, clinically relevant vital signs specialised results, included available. Through plan progressively been collected regions environments. first expected early 2024, partnerships expand. aim make open into assessing improving imaging. On insights gained suggest advance. Engaging legal administrative processes. Anticipating adequate secure finalise contracts prudent, processes require time. Maintaining delineates emphasises removal promote consistency minimising potential confounders subsequent analyses. recognising vary being when applying prove beneficial. Thus, proactively working unique challenges advisable. Direct interactions useful questions show proper techniques, less acquainted complexities declare no competing interests. 
thank Wendy Erselius (Department Biomedical Informatics, Harvard School, University, Boston, MA, USA) instrumental role logistics establish coalition initiative. Cassandra Perry Jennifer Sullivan helping develop dedicated partnering datasets, benefit people
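The de-identification step described above, in which pixel values and non-PHI metadata are extracted from DICOM files and saved separately as PNG and CSV files, can be sketched for the metadata half as follows. This is a minimal illustration under stated assumptions, not the MAIDA tool itself: the allowlist `NON_PHI_FIELDS` and both function names are hypothetical, the metadata is assumed to have already been read into plain dictionaries, and pixel export to PNG is omitted.

```python
import csv
import io

# Hypothetical allowlist of metadata fields treated as non-PHI.
# A real project would define this list with its privacy and security teams.
NON_PHI_FIELDS = ("Modality", "Rows", "Columns", "PixelSpacing")


def keep_non_phi(metadata, allowed=NON_PHI_FIELDS):
    """Return only the allowlisted fields; everything else is dropped."""
    return {key: metadata[key] for key in allowed if key in metadata}


def metadata_to_csv(records, allowed=NON_PHI_FIELDS):
    """Serialize allowlisted metadata for a list of images as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(allowed), lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Missing allowlisted fields are written as empty cells.
        writer.writerow(keep_non_phi(record, allowed))
    return buffer.getvalue()
```

An allowlist is used here rather than a blocklist so that any field nobody anticipated is dropped by default, which matches the stated goal of precluding inadvertent disclosure.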

Language: English

Citations

13

Good practices for clinical data warehouse implementation: A case study in France DOI Creative Commons
Matthieu Doutreligne, Adeline Degrémont, Pierre-Alain Jachiet

et al.

PLOS Digital Health, Journal Year: 2023, Volume and Issue: 2(7), P. e0000298 - e0000298

Published: July 6, 2023

Real-world data (RWD) holds great promise to improve the quality of care. However, specific infrastructures and methodologies are required to derive robust knowledge and bring innovations to the patient. Drawing upon a national case study of the governance of the 32 French regional and university hospitals, we highlight key aspects of modern clinical data warehouses (CDWs): governance, transparency, types of data, data reuse, technical tools, documentation, and data quality control processes. Semi-structured interviews, as well as a review of reported studies on French CDWs, were conducted from March to November 2022. Out of the 32 regional and university hospitals in France, 14 have a CDW in production, 5 are experimenting, 5 have a prospective CDW project, and 8 did not have any CDW project at the time of writing. The implementation of CDWs in France dates from 2011 and accelerated in late 2020. From this case study, we draw some general guidelines for CDWs. The current orientation of CDWs towards research requires efforts in governance stabilization, standardization of the data schema, and development of data quality and data documentation. Particular attention must be paid to the sustainability of the warehouse teams and to multilevel governance. The transparency of the studies and of the data transformation tools must improve to allow successful multicentric data reuses as well as innovations in routine care.

Language: English

Citations

12

Image quality assessment of retinal fundus photographs for diabetic retinopathy in the machine learning era: a review DOI
Mariana Batista Gonçalves, Luis Filipe Nakayama, Daniel Ferraz

et al.

Eye, Journal Year: 2023, Volume and Issue: 38(3), P. 426 - 433

Published: Sept. 4, 2023

Language: English

Citations

11

Strengthening health data governance: new equity and rights-based principles DOI Creative Commons
Louise Holly, Shannon Thom, Mohamed Elzemety

et al.

International Journal of Health Governance, Journal Year: 2023, Volume and Issue: 28(3), P. 225 - 237

Published: March 7, 2023

Purpose This paper introduces a new set of equity and rights-based principles for health data governance (HDG) and makes the case for their adoption into global, regional and national policy and practice. Design/methodology/approach This paper discusses the need for a unified approach to HDG that maximises the value of data for whole populations. It describes the unique process employed to develop a set of HDG principles. The paper highlights lessons learned from the principle development process and proposes steps to incorporate them into data governance policies and practice. Findings More than 200 individuals from 130 organisations contributed to the development of the principles, which are clustered around three interconnected objectives of protecting people, promoting health value and prioritising equity. The principles build on existing norms and guidelines by bringing a human rights and equity lens to HDG. Practical implications The principles offer a strong vision for HDG that reaps the public good benefits of health data whilst safeguarding individual rights. They can be used by governments and other actors as a guide for the equitable collection and use of health data. The inclusive model used to develop the principles can be replicated to strengthen future data governance approaches. Originality/value The article describes the first bottom-up effort to develop a set of principles for HDG.

Language: English

Citations

10

Operationalizing digital self-determination DOI Creative Commons
Stefaan Verhulst

Data & Policy, Journal Year: 2023, Volume and Issue: 5

Published: Jan. 1, 2023

Abstract A proliferation of data-generating devices, sensors, and applications has led to unprecedented amounts of digital data. We live in an era of datafication, one in which life is increasingly quantified and transformed into intelligence for private or public benefit. When used responsibly, this offers new opportunities for public good. The potential of data is evident in the possibilities offered by open data and data collaboratives, both instances of how wider access to data can lead to positive and often dramatic social transformation. However, three key forms of asymmetry currently limit this potential, especially for already vulnerable and marginalized groups: data asymmetries, information asymmetries, and agency asymmetries. These asymmetries limit human potential, both in a practical and in a psychological sense, leading to feelings of disempowerment and eroding public trust in technology. Existing methods (such as consent), as well as some alternatives under consideration (data ownership, collective ownership, personal information management systems), have limitations in adequately addressing the challenges at hand. A new principle and practice of digital self-determination (DSD) is therefore required. The study and practice of DSD remain in their infancy. The characteristics we have outlined here are only exploratory, and much work remains to be done so as to better understand what works and what does not. We suggest the need for a new research framework or agenda to explore DSD and how it can address the asymmetries, imbalances, and inequalities, both in data and in society more generally, that are emerging as key public policy challenges of our era.

Language: English

Citations

10

A new tool for evaluating health equity in academic journals; the Diversity Factor DOI Creative Commons
Jack Gallifant, Joe Zhang, Stephen Whebell

et al.

PLOS Global Public Health, Journal Year: 2023, Volume and Issue: 3(8), P. e0002252 - e0002252

Published: Aug. 14, 2023

Current methods to evaluate a journal's impact rely on the downstream citation mapping used to generate the Impact Factor. This approach is a fragile metric prone to being skewed by outlier values and does not speak to a researcher's contribution to furthering health outcomes for all populations. Therefore, we propose the implementation of a Diversity Factor to fulfill this need and supplement the current metrics. It is composed of four key elements: dataset properties, author country, author gender and departmental affiliation. Due to the significance of each individual element, they should be assessed independently of each other as opposed to being combined into a simplified score to be optimized. Herein, we discuss the necessity of such metrics, provide a framework to build upon, evaluate the current landscape through the lens of each key element and publish the findings on a freely available website that enables further evaluation. The OpenAlex database was used to extract the metadata of all papers published from 2000 until August 2022, and natural language processing was used to identify individual elements. Features were then displayed individually on a static dashboard developed using TableauPublic, which is available at www.equitablescience.com. In total, 130,721 papers were identified from 7,462 journals, where significant underrepresentation of LMIC and female authors was demonstrated. These findings are pervasive and show no positive correlation with the journal's Impact Factor. The systematic collection of the Diversity Factor concept would allow for more detailed analysis, highlight gaps in knowledge, and reflect confidence in the translation of related research. Conversion of this metric to an active pipeline would account for the fact that how we define those most at risk will change over time and would quantify responses to particular initiatives. Therefore, continuous measurement of outcomes across groups and of those investigating those outcomes will never lose importance. Moving forward, we encourage further revision and improvement by diverse author groups in order to better refine this concept.

Language: English

Citations

10