Imputation of cancer proteomics data with a deep model that learns from many datasets
Lincoln Harris, William Stafford Noble

bioRxiv (Cold Spring Harbor Laboratory), Year: 2024, Issue: unknown

Published: Aug. 28, 2024

Abstract: Missing values are a major challenge in the analysis of mass spectrometry proteomics data. Missing values hinder reproducibility, decrease statistical power for identifying differentially expressed (DE) proteins, and make it challenging to analyze low-abundance proteins. We present Lupine, a deep learning-based method for imputing, or estimating, missing values in tandem mass tag (TMT) proteomics data. Lupine is, to our knowledge, the first imputation method that is designed to learn jointly from many datasets, and we provide evidence that this approach leads to more accurate predictions. We validated Lupine by applying it to TMT data from > 1,000 cancer patient samples spanning ten cancer types from the Clinical Proteomics Tumor Atlas Consortium (CPTAC). Lupine outperforms the state of the art for TMT imputation, identifies more DE proteins than other methods, corrects for TMT batch effects, and learns a meaningful representation of proteins and patient samples. Lupine is implemented as an open source Python package.

Language: English

Nomination of a novel plasma protein biomarker panel capable of classifying Alzheimer’s disease dementia with high accuracy in an African American cohort
Lindsey A. Kuchenbecker, Kevin J. Thompson, Cheyenne Hurst, et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2024, Issue: unknown

Published: July 29, 2024

Abstract. Introduction: African Americans (AA) are widely underrepresented in plasma biomarker studies for Alzheimer’s disease (AD), and current diagnostic biomarker candidates do not reflect the heterogeneity of AD. Methods: Untargeted plasma proteome measurements were obtained using the SomaScan 7k platform to identify novel plasma biomarkers for AD in a cohort of AA clinically diagnosed as AD dementia (n=183) or cognitively unimpaired (CU, n=145). Machine learning approaches were implemented to identify the set of plasma proteins that yields the best classification accuracy. Results: A plasma protein panel achieved an area under the curve (AUC) of 0.91 to classify AD dementia vs CU. The reproducibility of this finding was observed in the ANMerge plasma and AMP-AD Diversity brain datasets (AUC=0.83; AUC=0.94). Discussion: This study demonstrates the potential of biomarker discovery through untargeted plasma proteomics and machine learning approaches. Our findings also highlight the potential importance of the matrisome and cerebrovascular dysfunction in AD pathophysiology.

Language: English

Cited by: 1

Structural variants linked to Alzheimer’s Disease and other common age-related clinical and neuropathologic traits
Ricardo A. Vialle, Kátia de Paiva Lopes, Yan Li, et al.

medRxiv (Cold Spring Harbor Laboratory), Year: 2024, Issue: unknown

Published: Aug. 13, 2024

Abstract: Advances have led to a greater understanding of the genetics of Alzheimer's Disease (AD). However, a gap between the predicted and observed genetic heritability estimates when using single nucleotide polymorphisms (SNPs) and small indel data remains. Large genomic rearrangements, known as structural variants (SVs), have the potential to account for this missing genetic heritability. By leveraging data from two ongoing cohort studies of aging and dementia, the Religious Orders Study and Rush Memory and Aging Project (ROS/MAP), we performed a genome-wide association analysis testing around 20,000 common SVs from 1,088 participants with whole genome sequencing (WGS) data. A range of Alzheimer's Disease and Related Disorders (AD/ADRD) clinical and pathologic traits were examined. Given the limited sample size, no significant association was found, but we mapped SVs across 81 AD risk loci and discovered 22 SVs in linkage disequilibrium (LD) with GWAS lead variants and directly associated with AD/ADRD phenotypes (nominal

Language: English

Cited by: 0
