Genome Biology, Journal Year: 2024, Volume and Issue: 25(1)
Published: Dec. 18, 2024
Cloud computing allows storing the ever-growing genotype-phenotype datasets crucial for precision medicine. Due to the sensitive nature of this data and varied laws and regulations, additional security measures are needed to ensure data privacy. We develop SQUiD, a secure queryable database for storing and analyzing genotype-phenotype data. SQUiD allows storage and secure querying of data in a low-security, low-cost public cloud using homomorphic encryption in a multi-client setting. We demonstrate SQUiD's practical usability and scalability using synthetic and UK Biobank data.
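To make the idea of querying encrypted data concrete, the following is a minimal sketch of additively homomorphic encryption. It uses a toy Paillier scheme, which is an assumption chosen for brevity and is not claimed to be the scheme SQUiD uses; the function names, the tiny primes, and the genotype values are all hypothetical and unsuitable for real use.

```python
import math
import random


def keygen(p, q):
    """Build a toy Paillier key pair from two primes (far too small for real use)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    mu = pow(phi, -1, n)  # modular inverse (Python 3.8+); valid because g = n + 1
    return n, (phi, mu)


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) as (1 + n)^m * r^n mod n^2."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(1 + n, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, private_key, c):
    """Recover the plaintext from ciphertext c."""
    phi, mu = private_key
    n2 = n * n
    return ((pow(c, phi, n2) - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


# Hypothetical allele counts (0, 1 or 2) for one variant across six samples.
n, private_key = keygen(293, 433)
genotypes = [0, 1, 2, 1, 0, 2]
ciphertexts = [encrypt(n, g) for g in genotypes]

# An untrusted server can sum the counts without seeing any genotype.
encrypted_total = ciphertexts[0]
for c in ciphertexts[1:]:
    encrypted_total = add_encrypted(n, encrypted_total, c)

allele_count = decrypt(n, private_key, encrypted_total)
print(allele_count)
```

Only the key holder can read the total; the server handles ciphertexts throughout, which is the property that makes a low-security public cloud usable for this kind of count query.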

Nucleic Acids Research, Journal Year: 2024, Volume and Issue: 53(D1), P. D998 - D1005
Published: Nov. 12, 2024
The NHGRI-EBI GWAS Catalog serves as a vital resource for the genetic research community, providing access to the most comprehensive database of human GWAS results. Currently, it contains close to 7 000 publications for >15 000 traits, from which more than 625 000 lead associations have been curated. Additionally, 85 000 full genome-wide summary statistics datasets-containing association data for all variants in the analysis-are available for downstream analyses such as meta-analysis, fine-mapping, Mendelian randomisation or development of polygenic risk scores. As a centralised repository for GWAS results, the GWAS Catalog sets and implements standards for data submission and harmonisation, and encourages the use of consistent descriptors for traits, samples and methodologies. We share processes and vocabulary with the PGS Catalog, improving interoperability for a growing user group. Here, we describe the latest changes in data content, improvements in our user interface, and the implementation of the GWAS-SSF standard format for summary statistics. We address the challenges of handling the rapid increase in large-scale molecular quantitative trait GWAS and the need for sensitivity in the use of population and cohort descriptors while maintaining data interoperability and reusability.

JAMA, Journal Year: 2024, Volume and Issue: unknown
Published: Nov. 16, 2024
Importance: Polygenic risk scores (PRSs) for coronary heart disease (CHD) are a growing clinical and commercial reality. Whether existing scores provide similar individual-level assessments of disease susceptibility remains incompletely characterized.
Objective: To characterize the individual-level agreement of CHD PRSs that perform similarly at the population level.
Design, Setting, and Participants: Cross-sectional study of participants from diverse backgrounds enrolled in the All of Us Research Program (AOU), Penn Medicine BioBank (PMBB), and University of California, Los Angeles (UCLA) ATLAS Precision Health Biobank with electronic health record and genotyping data.
Exposures: Polygenic risk for CHD from published PRSs and new PRSs developed separately from testing samples.
Main Outcomes and Measures: PRSs that performed similarly at population-level prediction were identified by comparing calibration and discrimination of models of prevalent CHD. Individual-level agreement was tested with intraclass correlation coefficient (ICC) and Light κ.
Results: A total of 48 PRSs were calculated for 171 095 AOU participants. The mean (SD) age was 56.4 (16.8) years. A total of 104 947 participants (61.3%) were female. A total of 35 590 participants (20.8%) were most genetically similar to an African reference population, 29 801 (17.4%) to an admixed American reference population, 100 493 (58.7%) to a European reference population, and the remaining to Central/South Asian, East Asian, and Middle Eastern reference populations. There were 17 589 participants (10.3%) with and 153 506 participants without (89.7%) CHD. When included in a model of prevalent CHD, 46 scores had practically equivalent Brier scores and area under the receiver operator curves (region of practical equivalence ±0.02). Twenty percent of participants had at least 1 score in both the top and bottom 5% of risk. Continuous agreement of individual predictions was poor (ICC, 0.373 [95% CI, 0.372-0.375]). Light κ, used to evaluate consistency of risk assignment, did not exceed 0.56. Analysis among 41 193 PMBB and 53 092 ATLAS participants yielded different sets of equivalent scores, which also lacked individual-level agreement.
Conclusions and Relevance: CHD PRSs that performed similarly at the population level demonstrated highly variable individual-level estimates of risk. Recognizing that CHD PRSs may generate incongruent individual-level risk estimates, effective clinical implementation will require refined statistical methods to quantify uncertainty and new strategies to communicate this uncertainty to patients and clinicians.
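As a rough illustration of the agreement statistic named above, Light's κ can be computed as the mean of Cohen's κ over all pairs of raters (here, risk scores). The sketch below assumes categorical risk assignments (1 = flagged high risk, 0 = not); the three score vectors are made-up values, not data from the study.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    # Agreement expected by chance from each rater's marginal label frequencies.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


def light_kappa(ratings):
    """Light's kappa: average pairwise Cohen's kappa across all raters."""
    pairs = list(combinations(ratings, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical high-risk flags from three scores for the same eight people.
score_a = [1, 1, 0, 0, 0, 0, 0, 0]
score_b = [1, 0, 1, 0, 0, 0, 0, 0]
score_c = [1, 1, 0, 0, 0, 0, 0, 0]

kappa = light_kappa([score_a, score_b, score_c])
print(round(kappa, 3))
```

In this toy case two scores agree perfectly and the third disagrees on two people, so the average falls well below 1 even though each score flags the same number of individuals.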

medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2025, Volume and Issue: unknown
Published: Jan. 6, 2025
The growing availability of pre-trained polygenic risk score (PRS) models has enabled their integration into real-world applications, reducing the need for extensive data labeling, training, and calibration. However, selecting the most suitable PRS model for a specific target population remains challenging, due to issues such as limited transferability, heterogeneity, and the scarcity of observed phenotype data in real-world settings. Ensemble learning offers a promising avenue to enhance the predictive accuracy of genetic risk assessments, but existing methods often rely on observed phenotype data or additional genome-wide association studies (GWAS) from the target population to optimize ensemble weights, limiting their utility in real-time implementation. Here, we present the UNsupervised enSemble PRS (UNSemblePRS), an unsupervised ensemble learning framework, that combines pre-trained PRS models without requiring phenotype data or summaries from the target population. Unlike traditional approaches, UNSemblePRS aggregates models based on prediction concordance across a curated subset of candidate PRS models. We evaluated UNSemblePRS using both continuous and binary traits in the All of Us database, demonstrating its scalability and robust performance across diverse populations. These results underscore UNSemblePRS as an accessible tool for integrating PRS models into real-world contexts, offering broad applicability as the availability of PRS models continues to expand.

PeerJ, Journal Year: 2025, Volume and Issue: 13, P. e18985 - e18985
Published: Feb. 12, 2025
Background: The Polygenic Score (PGS) Catalog is a public database dedicated to storing polygenic risk scores. To date, the database has included 5,022 polygenic scores associated with 656 different traits. Although the PGS Catalog offers an official resource representational state transfer (REST) application programming interface (API), there is no ready-made data client tailored for any specific programming language. Researchers are thus required to invest time in becoming familiar with the structure of the REST API and to implement the corresponding client in their language of choice to integrate the PGS Catalog data into their analytical workflows.
Methods: In this work we introduce pandasPGS, a Python package that provides programmatic access to the PGS Catalog data. After being called by the researcher, pandasPGS will automatically select the appropriate uniform resource locator (URL) to request data based on the name and parameters of the function, and merge the data obtained from pagination. In addition, it also offers further pre-processing functions. According to the structure of the data, it can convert the hierarchical data into several pandas.DataFrame objects, which is convenient for analysis by researchers.
Results: This tool allows researchers to easily analyze the PGS Catalog data using Python. It alleviates the cost for researchers to learn the APIs of the PGS Catalog. The source codes can be found at https://github.com/tianzelab/pandaspgs, and the documentations can be found at https://tianzelab.github.io/pandaspgs/.
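The two steps described in the Methods (merging paginated responses, then flattening hierarchical records into table-like rows) can be sketched in plain Python. This is not the pandasPGS API: the function names, the `next`/`results` page layout, and the fake pages are assumptions used only for illustration, and no network request is made.

```python
def fetch_all_pages(first_url, get_json):
    """Follow 'next' links and merge every page's 'results' into one list.

    get_json is any callable mapping a URL to a decoded JSON dict; the
    'next' and 'results' keys are an assumed, commonly used page layout.
    """
    merged = []
    url = first_url
    while url:
        page = get_json(url)
        merged.extend(page.get("results", []))
        url = page.get("next")
    return merged


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with sep."""
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical two-page response standing in for a real REST endpoint.
FAKE_PAGES = {
    "page1": {"next": "page2",
              "results": [{"id": "PGS_A", "publication": {"year": 2020}}]},
    "page2": {"next": None,
              "results": [{"id": "PGS_B", "publication": {"year": 2021}}]},
}

rows = [flatten(r) for r in fetch_all_pages("page1", FAKE_PAGES.__getitem__)]
print(rows)
```

Rows in this flat form can be handed to `pandas.DataFrame(rows)` to obtain a table, which is the kind of conversion the abstract describes.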

Alzheimer's & Dementia, Journal Year: 2025, Volume and Issue: 21(4)
Published: April 1, 2025
Africa, home to 1.4 billion people and the highest genetic diversity globally, harbors unique genetic variants crucial for understanding complex diseases like neurodegenerative disorders. However, African populations remain underrepresented in induced pluripotent stem cell (iPSC) collections, limiting the exploration of population-specific disease mechanisms and therapeutic discoveries. To address this gap, we established an open-access Somatic Stem Cell Bank. In this initial phase, we generated 10 rigorously characterized iPSC lines from fibroblasts representing five Nigerian ethnic groups and both sexes. These lines underwent extensive profiling for pluripotency, stability, differentiation potential, and Alzheimer's and Parkinson's disease risk variants. Clustered regularly interspaced palindromic repeats (CRISPR)/CRISPR-associated protein 9 technology was used to introduce frontotemporal dementia-associated MAPT mutations (P301L and R406W). This collection offers a renewable, genetically diverse resource to investigate disease pathogenicity in African populations, facilitating breakthroughs in neurodegenerative research, drug discovery, and regenerative medicine.
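We established an open-access Somatic Stem Cell Bank. Ten iPSC lines from five Nigerian ethnic groups were rigorously characterized. CRISPR/CRISPR-associated protein 9 technology was used to introduce frontotemporal dementia-causing MAPT mutations. The Bank is a resource for neurodegenerative research.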

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown
Published: Oct. 23, 2024
Abstract
The NHGRI-EBI GWAS Catalog serves as a vital resource for the genetic research community, providing access to the most comprehensive database of human GWAS results. Currently, it contains close to 7,000 publications for more than 15,000 traits, from which more than 625,000 lead associations have been curated. Additionally, 85,000 full genome-wide summary statistics datasets - containing association data for all variants in the analysis - are available for downstream analyses such as meta-analysis, fine-mapping, Mendelian randomisation or development of polygenic risk scores. As a centralised repository for GWAS results, the GWAS Catalog sets and implements standards for data submission and harmonisation, and encourages the use of consistent descriptors for traits, samples and methodologies. We share processes and vocabulary with the PGS Catalog, improving interoperability for a growing user group. Here, we describe the latest changes in data content, improvements in our user interface, and the implementation of the GWAS-SSF standard format for summary statistics. We address the challenges of handling the rapid increase in large-scale molecular quantitative trait GWAS and the need for sensitivity in the use of population and cohort descriptors while maintaining data interoperability and reusability.

medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown
Published: Nov. 2, 2024
Abstract
Background: Individual weight loss response to the GLP-1 receptor agonist semaglutide varies considerably, with many possible contributing factors. Leveraging multiple clinico-genomic cohorts, we analyzed differences in weight loss trajectories according to patient characteristics, including a polygenic score (PGS) and metabolic risk factors, in semaglutide initiators with BMI ≥27 kg/m².
Methods: This longitudinal study utilized clinical-grade exome sequencing and electronic health record data from six U.S. cohorts within the Helix Research Network (n=134,806). A BMI PGS was calculated using 26,941 variants. Twelve-month weight loss trajectories were modeled using mixed effects models, and associations with demographics, PGS, comorbidities, medications, and laboratory results were evaluated.
Findings: Among 1,923 semaglutide users, the mean pretreatment BMI was 38.4 kg/m². For those on doses ≥1.7 mg, the mean body weight reduction was 7.3% at 6 months and 9.9% at 12 months. Over 12 months, low PGS was associated with an adjusted 1.5% and 1.8% additional weight loss compared to intermediate and high PGS, respectively (both p<0.01). Male sex, type 2 diabetes, hypertension, obstructive sleep apnea, and non-alcoholic fatty liver disease were each associated with 1.2%-1.9% less weight loss (all p<0.05). In [...], a 1%-increase in hemoglobin A1c was associated with 0.6% less weight loss (p=0.0019).
Interpretation: In adults with overweight or obesity, lower genetic predisposition to obesity is linked to greater weight loss with semaglutide. Additionally, metabolic health significantly impacts the drug’s effectiveness. These findings underscore the importance of precision medicine in obesity management.
Funding: Renown Health Foundation. Nevada Governor’s Office of Economic Development. HealthPartners.

Nucleic Acids Research, Journal Year: 2024, Volume and Issue: 53(D1), P. D10 - D19
Published: Nov. 28, 2024
The European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) is one of the world's leading sources of public biomolecular data. Based at the Wellcome Genome Campus in Hinxton, UK, EMBL-EBI is one of six sites of the European Molecular Biology Laboratory, Europe's only intergovernmental life sciences organization. This overview summarizes the latest developments in the services that EMBL-EBI data resources provide to scientific communities globally (https://www.ebi.ac.uk/services).

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown
Published: Oct. 29, 2024
ABSTRACT
Genome-wide association studies (GWAS) and polygenic score (PGS) development are typically constrained by the data available in biobank repositories in which European cohorts are vastly overrepresented. Here, we increase the utility of non-European participant data within the UK Biobank (UKB) by characterizing the genetic affinities of UKB participants who self-identify as Bangladeshi, Indian, Pakistani, “White and Asian” (WA), and “Any Other Asian” (AOA), towards creating a more robust South Asian sample size for future genetic analyses. We assess the relationships between genetic structure and self-selected ethnic identities, resulting in consistent patterns of clustering used to train a support vector machine (SVM). The SVM model was utilized to reassign n = 1,853 AOA and WA participants at the subcontinental level, to increase the South Asian group by 1,381 additional participants. We then leverage these samples to assess GWAS performance and PGS development. We further include environmental covariates in the height GWAS by implementing a rigorous covariate selection procedure, and compare the outputs of two GWAS models: GWAS_null and GWAS_env. We show that PGS derived from the environmentally adjusted GWAS yields comparable prediction to models developed with an order of magnitude larger training dataset (R² = 0.021 vs 0.026). Models with 7 - 8 environmental covariates double the variance explained by PGS alone. In summary, we demonstrate how PGS can be improved by leveraging ambiguous ethnicity codes, ancestry matched imputation panels, and including environmental covariates.