Research Square, Journal Year: 2023, Volume and Issue: unknown
Published: Dec. 25, 2023
The disparity in genetic risk prediction accuracy between European and non-European individuals highlights a critical challenge in health inequality. To bridge this gap, we introduce JointPRS, a novel method that models multiple populations jointly to improve genetic risk predictions for non-European individuals. JointPRS has three key features. First, it encompasses all diverse populations to improve prediction accuracy, rather than relying solely on the target population with a singular auxiliary group. Second, it autonomously estimates and leverages chromosome-wise cross-population correlations to infer the effect sizes of variants. Lastly, it provides an auto version that has comparable performance to the tuning version to accommodate the situation with no validation dataset. Through extensive simulations and real data applications to 22 quantitative traits and four binary traits in East Asian populations, nine quantitative traits and one binary trait in African populations, and traits in South Asian populations, we demonstrate that JointPRS outperforms state-of-the-art methods, improving the prediction accuracy for both quantitative and binary traits in non-European populations.
Nature Communications, Journal Year: 2023, Volume and Issue: 14(1)
Published: Dec. 14, 2023
Chronic kidney disease (CKD) is determined by an interplay of monogenic, polygenic, and environmental risks. Autosomal dominant polycystic kidney disease (ADPKD) and COL4A-associated nephropathy (COL4A-AN) represent the most common forms of monogenic kidney diseases. These disorders have incomplete penetrance and variable expressivity, and we hypothesize that polygenic factors explain some of this variability. By combining SNP array, exome/genome sequence, and electronic health record data from the UK Biobank and All-of-Us cohorts, we demonstrate that the genome-wide polygenic score (GPS) significantly predicts CKD among ADPKD variant carriers. Compared to the middle tertile of the GPS for noncarriers, ADPKD variant carriers in the top tertile have a 54-fold increased risk of CKD, while ADPKD variant carriers in the bottom tertile have only a 3-fold increased risk of CKD. Similarly, the GPS significantly predicts CKD in COL4A-AN carriers. The carriers in the top tertile of the GPS have a 2.5-fold higher risk of CKD, while the risk for carriers in the bottom tertile is not different from the average population risk. These results suggest that accounting for polygenic risk improves risk stratification in monogenic kidney disease.
Nature Communications, Journal Year: 2024, Volume and Issue: 15(1)
Published: Jan. 2, 2024
Various polygenic risk scores (PRS) methods have been proposed to combine the estimated effects of single nucleotide polymorphisms (SNPs) to predict genetic risks for common diseases, using data collected from genome-wide association studies (GWAS). Some methods require an external individual-level GWAS dataset for parameter tuning, posing privacy and security-related concerns. Leaving out partial data for parameter tuning can also reduce model prediction accuracy. In this article, we propose PRStuning, a method that tunes parameters for different PRS methods using GWAS summary statistics from the training data. PRStuning predicts the PRS performance with different parameters, and then selects the best-performing parameters. Because directly using the training data tends to overestimate the performance in the testing data, we adopt an empirical Bayes approach, shrinking the predicted performance in accordance with the genetic architecture of the disease. Extensive simulations and real data applications demonstrate PRStuning's accuracy across PRS methods and parameters.
Genome Biology, Journal Year: 2024, Volume and Issue: 25(1)
Published: Oct. 8, 2024
Polygenic risk score (PRS) is a major research topic in human genetics. However, a significant gap exists between PRS methodology and applications in practice due to often unavailable individual-level data for various PRS tasks including model fine-tuning, benchmarking, and ensemble learning.
medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2025, Volume and Issue: unknown
Published: March 12, 2025
The increasing availability of diverse biobanks has enabled multi-ancestry genome-wide association studies (GWAS), enhancing the discovery of genetic variants across traits and diseases. However, the choice of an optimal method remains debated due to challenges in statistical power differences across ancestral groups and approaches to account for population structure. Two primary strategies exist: (1) Pooled analysis, which combines individuals from all genetic backgrounds into a single dataset while adjusting for population stratification using principal components, increasing the sample size but requiring careful control of population stratification. (2) Meta-analysis, which performs ancestry-group-specific GWAS and subsequently combines summary statistics, potentially capturing fine-scale population structure, but facing limitations in handling admixed individuals. Using large-scale simulations with varying sample sizes and ancestry compositions, we compare these methods alongside real data analyses of eight continuous and five binary traits from the UK Biobank (N≈324,000) and All of Us Research Program (N≈207,000). Our results demonstrate that pooled analysis generally exhibits better statistical power while effectively adjusting for population stratification. We further present a theoretical framework linking statistical power to allele frequency variations across populations. These findings, validated across both biobanks, highlight pooled analysis as a robust and scalable strategy for multi-ancestry GWAS, improving power while maintaining rigorous population structure control.
Proceedings of the National Academy of Sciences, Journal Year: 2024, Volume and Issue: 121(33)
Published: Aug. 7, 2024
Polygenic risk scores (PRS) enhance population risk stratification and advance personalized medicine, but existing methods face several limitations, encompassing issues related to computational burden, predictive accuracy, and adaptability to a wide range of genetic architectures. To address these issues, we propose Aggregated L0Learn using Summary-level data (ALL-Sum), a fast and scalable ensemble learning method for computing PRS using summary statistics from genome-wide association studies (GWAS). ALL-Sum leverages L0L2 penalized regression and ensemble learning across tuning parameters to flexibly model traits with diverse genetic architectures. In extensive large-scale simulations across a wide range of polygenicity and GWAS sample sizes, ALL-Sum consistently outperformed popular alternative methods in terms of prediction accuracy, runtime, and memory usage by 10%, 20-fold, and threefold, respectively, and demonstrated robustness to diverse genetic architectures. We validated the performance of ALL-Sum in real data analysis of 11 complex traits using GWAS summary statistics from nine data sources, including the Global Lipids Genetics Consortium, Breast Cancer Association Consortium, and FinnGen Biobank, with validation in the UK Biobank. Our results show that on average, ALL-Sum obtained 25% higher accuracy, 15 times faster computation, and half the memory usage than current state-of-the-art methods, and had robust performance across traits and diseases. Furthermore, our method demonstrates stable performance when using linkage disequilibrium computed from different sources. ALL-Sum is available as a user-friendly R software package with publicly available reference data for streamlined analysis.
medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown
Published: March 20, 2024
Lung cancer and tobacco use pose significant global health challenges, necessitating a comprehensive translational roadmap for improved prevention strategies. Polygenic risk scores (PRSs) are powerful tools for patient risk stratification but have not yet been widely used in primary care for lung cancer, particularly in diverse patient populations.
Human Genetics and Genomics Advances, Journal Year: 2024, Volume and Issue: 6(1), P. 100355
Published: Sept. 25, 2024
Polygenic scores (PGSs) are a promising tool for estimating individual-level genetic risk of disease based on the results of genome-wide association studies (GWASs). However, their promise has yet to be fully realized because most currently available PGSs were built with genetic data from predominantly European-ancestry populations, and PGS performance declines when applied to target populations different from the populations from which they were derived. Thus, there is a great need to improve PGS performance in currently under-studied populations. In this work we leverage data from two large and diverse cohorts, the Million Veterans Program (MVP) and All of Us (AoU), providing us the unique opportunity to compare methods for building PGSs for multi-ancestry populations across multiple traits. We build PGSs for five continuous traits and for binary traits using both multi-ancestry and single-ancestry approaches with popular Bayesian PGS methods and both MVP META GWAS results and population-specific GWAS results from the respective African, European, and Hispanic MVP populations. We evaluate these PGSs in three AoU populations genetically similar to the respective African, Admixed American, and European 1000 Genomes Project superpopulations. Using correlation-based tests, we make formal comparisons of PGS performance across the AoU populations. We conclude that approaches that combine GWAS data from multiple populations produce PGSs that perform better than approaches that utilize smaller single-population GWAS results matched to the target population, and specifically that PGSs built with PRS-CSx outperform the other approaches.
PLoS Computational Biology, Journal Year: 2024, Volume and Issue: 20(4), P. e1011990
Published: April 10, 2024
Prostate cancer is a heritable disease with ancestry-biased incidence and mortality. Polygenic risk scores (PRSs) offer promising advancements in predicting disease risk, including prostate cancer. While their accuracy continues to improve, research aimed at enhancing their effectiveness within African and Asian populations remains key for equitable use. Recent algorithmic developments for PRS derivation have resulted in improved pan-ancestral risk prediction for several diseases. In this study, we benchmark the predictive power of six widely used PRS derivation algorithms, four of which adjust for ancestry, against prostate cancer cases and controls from the UK Biobank and All of Us cohorts. We find modest improvement in discriminatory ability when compared with a simple method that prioritizes variants, clumping, and published polygenic risk scores. Our findings underscore the importance of improving upon risk prediction algorithms and the sampling of diverse cohorts.