Leveraging genetic correlations and multiple populations to improve genetic risk prediction for non-European populations
Hongyu Zhao, Leqi Xu, Geyu Zhou, et al.

Research Square, Journal Year: 2023, Volume and Issue: unknown

Published: Dec. 25, 2023

The disparity in genetic risk prediction accuracy between European and non-European individuals highlights a critical challenge for health inequality. To bridge this gap, we introduce JointPRS, a novel method that models multiple populations jointly to improve genetic risk predictions for non-European individuals. JointPRS has three key features. First, it encompasses all diverse populations to improve prediction accuracy, rather than relying solely on the target population with a singular auxiliary group. Second, it autonomously estimates and leverages chromosome-wise cross-population genetic correlations to infer the effect sizes of genetic variants. Lastly, it provides an auto version with performance comparable to the tuning version, to accommodate the situation in which no validation dataset is available. Through extensive simulations and real data applications to 22 quantitative traits and four binary traits in East Asian populations, nine quantitative traits and one binary trait in African populations, and traits in South Asian populations, we demonstrate that JointPRS outperforms state-of-the-art methods, improving prediction accuracy for both quantitative and binary traits in non-European populations.

Language: English

Polygenic risk alters the penetrance of monogenic kidney disease
Atlas Khan, Ning Shang, Jordan G. Nestor, et al.

Nature Communications, Journal Year: 2023, Volume and Issue: 14(1)

Published: Dec. 14, 2023

Chronic kidney disease (CKD) is determined by an interplay of monogenic, polygenic, and environmental risks. Autosomal dominant polycystic kidney disease (ADPKD) and COL4A-associated nephropathy (COL4A-AN) represent the most common forms of monogenic kidney diseases. These disorders have incomplete penetrance and variable expressivity, and we hypothesize that polygenic factors explain some of this variability. By combining SNP array, exome/genome sequence, and electronic health record data from the UK Biobank and All-of-Us cohorts, we demonstrate that the genome-wide polygenic score (GPS) significantly predicts CKD among ADPKD variant carriers. Compared to the middle tertile of the GPS for noncarriers, ADPKD variant carriers in the top tertile have a 54-fold increased risk of CKD, while carriers in the bottom tertile have only a 3-fold increased risk of CKD. Similarly, the GPS significantly predicts CKD among COL4A-AN carriers. The carriers in the top tertile have a 2.5-fold higher risk of CKD, while the risk for carriers in the bottom tertile is not different from the average population risk. These results suggest that accounting for polygenic risk improves risk stratification in monogenic kidney disease.

Language: English

Citations

26

Tuning parameters for polygenic risk score methods using GWAS summary statistics from training data
Wei Jiang, Ling Chen, Matthew J. Girgenti, et al.

Nature Communications, Journal Year: 2024, Volume and Issue: 15(1)

Published: Jan. 2, 2024

Various polygenic risk score (PRS) methods have been proposed to combine the estimated effects of single nucleotide polymorphisms (SNPs) to predict genetic risks for common diseases, using data collected from genome-wide association studies (GWAS). Some methods require an external individual-level GWAS dataset for parameter tuning, posing privacy and security-related concerns. Leaving out partial data for parameter tuning can also reduce model prediction accuracy. In this article, we propose PRStuning, a method that tunes parameters for different PRS methods using GWAS summary statistics from the training data. PRStuning predicts the PRS performance with different parameters, and then selects the best-performing parameters. Because directly using the training data tends to overestimate the performance in the testing data, we adopt an empirical Bayes approach to shrinking the predicted performance in accordance with the genetic architecture of the disease. Extensive simulations and real data applications demonstrate PRStuning's accuracy across …

Language: English

Citations

9

Optimizing and benchmarking polygenic risk scores with GWAS summary statistics
Zijie Zhao, Tim Gruenloh, Meiyi Yan, et al.

Genome Biology, Journal Year: 2024, Volume and Issue: 25(1)

Published: Oct. 8, 2024

Polygenic risk score (PRS) is a major research topic in human genetics. However, a significant gap exists between PRS methodology and applications in practice due to often unavailable individual-level data for various PRS tasks, including model fine-tuning, benchmarking, and ensemble learning.

Language: English

Citations

4

The large-scale whole-genome sequencing era expedited medical discovery and clinical translation
Qingxin Yang, Shuhan Duan, Yuguo Huang, et al.

Deleted Journal, Journal Year: 2025, Volume and Issue: 2(1), P. 100055 - 100055

Published: Feb. 8, 2025

Language: English

Citations

0

Evaluating Multi-Ancestry Genome-Wide Association Methods: Statistical Power, Population Structure, and Practical Implications
Julie-Alexia Dias, Tony Chen, Xing Hua, et al.

medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2025, Volume and Issue: unknown

Published: March 12, 2025

The increasing availability of diverse biobanks has enabled multi-ancestry genome-wide association studies (GWAS), enhancing the discovery of genetic variants across traits and diseases. However, the choice of an optimal method remains debated due to challenges in statistical power differences across ancestral groups and approaches to account for population structure. Two primary strategies exist: (1) Pooled analysis, which combines individuals from all genetic backgrounds into a single dataset while adjusting for population stratification using principal components, increasing the sample size but requiring careful control of population stratification. (2) Meta-analysis, which performs ancestry-group-specific GWAS and subsequently combines summary statistics, potentially capturing fine-scale population structure, but facing limitations in handling admixed individuals. Using large-scale simulations with varying sample sizes and ancestry compositions, we compare these methods alongside real data analyses of eight continuous and five binary traits from the UK Biobank (N≈324,000) and the All of Us Research Program (N≈207,000). Our results demonstrate that pooled analysis generally exhibits better statistical power while effectively adjusting for population stratification. We further present a theoretical framework linking power differences to allele frequency variations across populations. These findings, validated across both biobanks, highlight pooled analysis as a robust and scalable strategy for multi-ancestry GWAS, improving discovery while maintaining rigorous population structure control.

Language: English

Citations

0

Fast and scalable ensemble learning method for versatile polygenic risk prediction
Tony Chen, Haoyu Zhang, Rahul Mazumder, et al.

Proceedings of the National Academy of Sciences, Journal Year: 2024, Volume and Issue: 121(33)

Published: Aug. 7, 2024

Polygenic risk scores (PRS) enhance population risk stratification and advance personalized medicine, but existing methods face several limitations, encompassing issues related to computational burden, predictive accuracy, and adaptability to a wide range of genetic architectures. To address these issues, we propose Aggregated L0Learn using Summary-level data (ALL-Sum), a fast and scalable ensemble learning method for computing PRS using summary statistics from genome-wide association studies (GWAS). ALL-Sum leverages L0L2 penalized regression and ensemble learning across tuning parameters to flexibly model traits with diverse genetic architectures. In extensive large-scale simulations across a wide range of polygenicity and GWAS sample sizes, ALL-Sum consistently outperformed popular alternative methods in terms of prediction accuracy, runtime, and memory usage by 10%, 20-fold, and threefold, respectively, and demonstrated robustness to diverse genetic architectures. We validated the performance of ALL-Sum in real data analysis of 11 complex traits using GWAS summary statistics from nine data sources, including the Global Lipids Genetics Consortium, the Breast Cancer Association Consortium, and FinnGen Biobank, with validation in the UK Biobank. Our results show that, on average, ALL-Sum obtained PRS with 25% higher accuracy, with 15 times faster computation and half the memory of the current state-of-the-art methods, and had robust performance across a wide range of traits and diseases. Furthermore, our method demonstrates stable prediction when linkage disequilibrium is computed from different data sources. ALL-Sum is available as a user-friendly R software package with publicly available reference data for streamlined analysis.

Language: English

Citations

3

Genomic Insights for Personalized Care: Motivating At-Risk Individuals Toward Evidence-Based Health Practices
Tony Chen, Giang Pham, Louis Fox, et al.

medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: March 20, 2024

Lung cancer and tobacco use pose significant global health challenges, necessitating a comprehensive translational roadmap for improved prevention strategies. Polygenic risk scores (PRSs) are powerful tools for patient stratification but have not yet been widely used in primary care for lung cancer, particularly in diverse populations.

Language: English

Citations

2

Comparison of Methods for Building Polygenic Scores for Diverse Populations
Sophia Gunn, Xin Wang, Daniel Posner, et al.

Human Genetics and Genomics Advances, Journal Year: 2024, Volume and Issue: 6(1), P. 100355 - 100355

Published: Sept. 25, 2024

Polygenic scores (PGSs) are a promising tool for estimating individual-level genetic risk of disease based on the results of genome-wide association studies (GWASs). However, their promise has yet to be fully realized because most currently available PGSs were built with data from predominantly European-ancestry populations, and PGS performance declines when scores are applied to target populations different from the populations from which they were derived. Thus, there is a great need to improve PGS performance in under-studied populations. In this work we leverage data from two large and diverse cohorts, the Million Veterans Program (MVP) and All of Us (AoU), providing us the unique opportunity to compare methods for building PGSs for multi-ancestry populations across multiple traits. We build PGSs for five continuous traits and for binary traits using both multi-ancestry and single-ancestry approaches with popular Bayesian PGS methods, drawing on both MVP META GWAS results and population-specific GWAS results from the respective African, European, and Hispanic MVP populations. We evaluate these PGSs in three AoU populations genetically similar to the respective African, Admixed American, and European 1000 Genomes Project superpopulations. Using correlation-based tests, we make formal comparisons of PGS performance and conclude that approaches that combine GWAS data from multiple populations produce PGSs that perform better than approaches that utilize smaller single-population GWAS results matched to the target population, and specifically that scores built with PRS-CSx outperform other …

Language: English

Citations

2

Polygenic scores and their applications in kidney disease
Atlas Khan, Krzysztof Kiryluk

Nature Reviews Nephrology, Journal Year: 2024, Volume and Issue: unknown

Published: Sept. 13, 2024

Language: English

Citations

1

Benchmarking multi-ancestry prostate cancer polygenic risk scores in a real-world cohort
Yajas Shah, Scott Kulm, Jones T. Nauseef, et al.

PLoS Computational Biology, Journal Year: 2024, Volume and Issue: 20(4), P. e1011990 - e1011990

Published: April 10, 2024

Prostate cancer is a heritable disease with ancestry-biased incidence and mortality. Polygenic risk scores (PRSs) offer promising advancements in predicting disease risk, including prostate cancer. While their accuracy continues to improve, research aimed at enhancing their effectiveness within African and Asian populations remains key for equitable use. Recent algorithmic developments for PRS derivation have resulted in improved pan-ancestral risk prediction for several diseases. In this study, we benchmark the predictive power of six widely used PRS derivation algorithms, four of which adjust for ancestry, against prostate cancer cases and controls from the UK Biobank and All of Us cohorts. We find modest improvement in discriminatory ability when compared with a simple method that prioritizes variants, clumping, and published polygenic risk scores. Our findings underscore the importance of improving upon risk prediction algorithms and the sampling of diverse …

Language: English

Citations

0