Annual Review of Biomedical Data Science,
Journal Year:
2023,
Volume and Issue:
6(1), P. 153 - 171
Published: April 27, 2023
Artificial intelligence (AI) and other data-driven technologies hold great promise to transform healthcare and confer the predictive power essential to precision medicine. However, the existing biomedical data, which are a vital resource and foundation for developing medical AI models, do not reflect the diversity of the human population. The low representation in biomedical data has become a significant health risk for non-European populations, and the growing application of AI opens a new pathway for this health risk to manifest and amplify. Here we review the current status of biomedical data inequality and present a conceptual framework for understanding its impacts on machine learning. We also discuss the recent advances in algorithmic interventions for mitigating health disparities arising from biomedical data inequality. Finally, we briefly discuss the newly identified disparity in data quality among ethnic groups and its potential impacts on machine learning.
Nature,
Journal Year:
2022,
Volume and Issue:
609(7925), P. 109 - 118
Published: Aug. 24, 2022
Abstract
Individual differences in brain functional organization track a range of traits, symptoms and behaviours [1–12]. So far, work modelling linear brain–phenotype relationships has assumed that a single such relationship generalizes across all individuals, but models do not work equally well in all participants [13,14]. A better understanding of in whom models fail and why is crucial to revealing robust, useful and unbiased brain–phenotype relationships. To this end, here we related brain activity to phenotype using predictive models (trained and tested on independent data to ensure generalizability [15]) and examined model failure. We applied this data-driven approach to a range of neurocognitive measures in a new, clinically and demographically heterogeneous dataset, with the results replicated in two independent, publicly available datasets [16,17]. Across all three datasets, we find that models reflect not unitary cognitive constructs, but rather neurocognitive scores intertwined with sociodemographic and clinical covariates; that is, models reflect stereotypical profiles, and fail when applied to individuals who defy them. Model failure is reliable, phenotype specific and generalizable across datasets. Together, these results highlight the pitfalls of a one-size-fits-all modelling approach and the effect of biased phenotypic measures [18–20] on the interpretation and utility of resulting brain–phenotype models. We present a framework to address these issues so that such models may reveal the neural circuits that underlie specific phenotypes and ultimately identify individualized neural targets for clinical intervention.
In this work, we expand the normative model repository introduced in Rutherford et al., 2022a to include normative models charting lifespan trajectories of structural surface area and brain functional connectivity, measured using two unique resting-state network atlases (Yeo-17 and Smith-10), and an updated online platform for transferring these models to new data sources. We showcase the value of these models with a head-to-head comparison between the features output by normative modeling and raw data features in several benchmarking tasks: mass univariate group difference testing (schizophrenia versus control), classification (schizophrenia versus control), and regression (predicting general cognitive ability). Across all benchmarks, we show the advantage of using normative modeling features, with the strongest statistically significant results demonstrated in the group difference testing and classification tasks. We intend for these accessible resources to facilitate the wider adoption of normative modeling across the neuroimaging community.
NeuroImage,
Journal Year:
2023,
Volume and Issue:
273, P. 120010 - 120010
Published: March 12, 2023
Resting-state fMRI is commonly used to derive brain parcellations, which are widely used for dimensionality reduction and interpreting human neuroscience studies. We previously developed a model that integrates local and global approaches for estimating areal-level cortical parcellations. The resulting local-global parcellations are often referred to as the Schaefer parcellations. However, the lack of homotopic correspondence between left and right Schaefer parcels has limited their use for brain lateralization studies. Here, we extend our previous model to derive homotopic areal-level parcellations. Using resting-fMRI and task-fMRI across diverse scanners, acquisition protocols, preprocessing and demographics, we show that the resulting homotopic parcellations are as homogeneous as the Schaefer parcellations, while being more homogeneous than five publicly available parcellations. Furthermore, weaker correlations between homotopic parcels are associated with greater lateralization in resting network organization, as well as lateralization in language and motor task activation. Finally, the homotopic parcellations agree with the boundaries of a number of cortical areas estimated from histology and visuotopic fMRI, while capturing sub-areal (e.g., somatotopic and visuotopic) features. Overall, these results suggest that the homotopic local-global parcellations represent neurobiologically meaningful subdivisions of the human cerebral cortex and will be a useful resource for future studies. Multi-resolution parcellations estimated from 1479 participants are publicly available (https://github.com/ThomasYeoLab/CBIG/tree/master/stable_projects/brain_parcellation/Yan2023_homotopic).
Proceedings of the National Academy of Sciences,
Journal Year:
2023,
Volume and Issue:
120(6)
Published: Jan. 30, 2023
Despite the great promise that machine learning has offered in many fields of medicine, it has also raised concerns about potential biases and poor generalization across genders, age distributions, races and ethnicities, hospitals, and data acquisition equipment and protocols. In the current study, and in the context of three brain diseases, we provide evidence which suggests that when properly trained, machine learning models can generalize well across diverse conditions and do not necessarily suffer from bias. Specifically, by using multi-study magnetic resonance imaging consortia for diagnosing Alzheimer's disease, schizophrenia, and autism spectrum disorder, we find that well-trained models have a high area-under-the-curve (AUC) on subjects across different subgroups pertaining to attributes such as gender, age, racial groups, and different clinical studies, and are unbiased under multiple fairness metrics such as demographic parity difference, equalized odds difference, equal opportunity difference, etc. We find that models that incorporate multi-source data from demographic, clinical, genetic factors and cognitive scores are also unbiased. These models have a better predictive AUC across subgroups than those trained only with imaging features, but there are also situations when these additional features do not help.
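Two of the fairness metrics named in this abstract can be illustrated with a minimal sketch. The function names, the toy labels and the two-group setup below are assumptions made for illustration only; they are not taken from the study. Demographic parity difference is computed here as the gap between the highest and lowest positive-prediction rate across groups, and equal opportunity difference as the corresponding gap in true-positive rate.

```python
from collections import defaultdict


def selection_rates(y_pred, groups):
    """Fraction of positive (1) predictions within each group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positive predictions, total]
    for pred, group in zip(y_pred, groups):
        counts[group][0] += pred
        counts[group][1] += 1
    return {g: pos / total for g, (pos, total) in counts.items()}


def demographic_parity_difference(y_pred, groups):
    """Largest gap in positive-prediction rate between any two groups."""
    rates = selection_rates(y_pred, groups)
    return max(rates.values()) - min(rates.values())


def true_positive_rates(y_true, y_pred, groups):
    """Per-group recall: correctly flagged positives over actual positives."""
    counts = defaultdict(lambda: [0, 0])  # group -> [true positives, actual positives]
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            counts[group][0] += pred
            counts[group][1] += 1
    return {g: tp / pos for g, (tp, pos) in counts.items()}


def equal_opportunity_difference(y_true, y_pred, groups):
    """Largest gap in true-positive rate between any two groups."""
    tprs = true_positive_rates(y_true, y_pred, groups)
    return max(tprs.values()) - min(tprs.values())


# Hypothetical toy data: 1 = patient, 0 = control; "A" and "B" are made-up subgroups.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

dpd = demographic_parity_difference(y_pred, groups)
eod = equal_opportunity_difference(y_true, y_pred, groups)
print(f"demographic parity difference: {dpd:.2f}")
print(f"equal opportunity difference:  {eod:.2f}")
```

In this toy example both groups receive positive predictions at the same rate, so the demographic parity difference is zero, while the true-positive rate differs between groups. This is one reason a model is usually checked under several fairness metrics rather than one.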
Nature Communications,
Journal Year:
2024,
Volume and Issue:
15(1)
Published: Feb. 28, 2024
Abstract
Predictive modeling is a central technique in neuroimaging to identify brain-behavior relationships and test their generalizability to unseen data. However, data leakage undermines the validity of predictive models by breaching the separation between training and test data. Leakage is always an incorrect practice but still pervasive in machine learning. Understanding its effects on neuroimaging predictive models can inform how leakage affects existing literature. Here, we investigate the effects of five forms of leakage (involving feature selection, covariate correction, and dependence between subjects) on functional and structural connectome-based machine learning models across four datasets and three phenotypes. Leakage via feature selection and repeated subjects drastically inflates prediction performance, whereas other forms of leakage have minor effects. Furthermore, small datasets exacerbate the effects of leakage. Overall, our results illustrate the variable effects of leakage and underscore the importance of avoiding data leakage to improve the validity and reproducibility of predictive modeling.
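Leakage via feature selection can be sketched on synthetic data. Everything below is an illustrative assumption rather than the study's pipeline: the features are pure random noise standing in for connectome edges, the helper names are hypothetical, and the predictive model is a simplified one (a sum of sign-aligned selected features followed by a linear fit). Because the phenotype is unrelated to the features, any apparent prediction performance in the leaky variant comes from selecting features on the full dataset before cross-validation.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def select_features(X, y, rows, k):
    """Pick the k features most correlated with y, using only `rows`."""
    y_sub = [y[i] for i in rows]
    scored = []
    for j in range(len(X[0])):
        r = pearson([X[i][j] for i in rows], y_sub)
        scored.append((abs(r), j, 1.0 if r >= 0 else -1.0))
    scored.sort(reverse=True)
    return [(j, sign) for _, j, sign in scored[:k]]


def summary_score(row, feats):
    """Sum of the selected features, each aligned to its correlation sign."""
    return sum(sign * row[j] for j, sign in feats)


def cross_validated_r(X, y, k_feats, n_folds, leaky):
    """Correlation between held-out predictions and the observed phenotype."""
    all_rows = list(range(len(y)))
    preds = [0.0] * len(y)
    # Leaky variant: features are chosen once, with the test subjects included.
    leaky_feats = select_features(X, y, all_rows, k_feats) if leaky else None
    for fold in range(n_folds):
        test = all_rows[fold::n_folds]
        train = [i for i in all_rows if i not in test]
        # Clean variant: features are chosen inside the fold, on training rows only.
        feats = leaky_feats if leaky else select_features(X, y, train, k_feats)
        xs = [summary_score(X[i], feats) for i in train]
        ys = [y[i] for i in train]
        mx, my = statistics.fmean(xs), statistics.fmean(ys)
        slope = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
                 / sum((a - mx) ** 2 for a in xs))
        for i in test:
            preds[i] = my + slope * (summary_score(X[i], feats) - mx)
    return pearson(preds, y)


rng = random.Random(0)
n_subjects, n_features = 40, 500
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_subjects)]
y = [rng.gauss(0, 1) for _ in range(n_subjects)]  # phenotype unrelated to X

r_leaky = cross_validated_r(X, y, k_feats=10, n_folds=5, leaky=True)
r_clean = cross_validated_r(X, y, k_feats=10, n_folds=5, leaky=False)
print(f"leaky feature selection: r = {r_leaky:.2f}")
print(f"clean feature selection: r = {r_clean:.2f}")
```

The only difference between the two runs is where `select_features` is called. Keeping every data-dependent step inside the training fold is the design choice that prevents this form of leakage.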
Nature Mental Health,
Journal Year:
2024,
Volume and Issue:
2(1), P. 63 - 75
Published: Jan. 2, 2024
Abstract
Aging diminishes social cognition, and changes in this capacity can indicate brain diseases. However, the relative contribution of age, diagnosis and brain reserve to social cognition, especially among older adults and in global settings, remains unclear when considering other factors. Here, using a computational approach, we combined predictors of social cognition from a diverse sample of 1,063 older adults across nine countries. Emotion recognition, mentalizing and overall social cognition were predicted via support vector regressions from various factors, including diagnosis (subjective cognitive complaints, mild cognitive impairment, Alzheimer’s disease and behavioral variant frontotemporal dementia), demographics, cognition/executive function, brain reserve and motion artifacts from functional magnetic resonance imaging recordings. Higher cognitive/executive functions and education ranked among the top predictors, outweighing age, diagnosis and brain reserve. Network connectivity did not show predictive values. The results challenge traditional interpretations of age-related decline, patient–control differences and brain associations of social cognition, emphasizing the importance of heterogeneous factors.
bioRxiv (Cold Spring Harbor Laboratory),
Journal Year:
2024,
Volume and Issue:
unknown
Published: Feb. 18, 2024
Abstract
A pervasive dilemma in neuroimaging is whether to prioritize sample size or scan time given fixed resources. Here, we systematically investigate this trade-off in the context of brain-wide association studies (BWAS) using functional magnetic resonance imaging (fMRI). We find that total scan duration (sample size × scan time per participant) robustly explains individual-level phenotypic prediction accuracy via a logarithmic model, suggesting that sample size and scan time are broadly interchangeable up to 20-30 min of data. However, the returns of scan time diminish relative to sample size, which we explain with principled theoretical derivations. When accounting for overhead costs associated with each participant (e.g., recruitment, non-imaging measures), many small-scale and some large-scale BWAS might benefit from longer scan time than typically assumed. These results generalize across phenotypic domains, scanners, acquisition protocols, racial groups, mental disorders, age groups, as well as resting-state and task-state functional connectivity. Overall, our study emphasizes the importance of scan time, which is ignored in standard power calculations. Standard power calculations maximize sample size at the expense of scan time, which can result in sub-optimal prediction accuracies and inefficient use of resources. Our empirically informed reference is available for future study design: WEB_APPLICATION_LINK
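The logarithmic relationship described in this abstract can be sketched as a simple least-squares fit. The functional form accuracy = a + b · ln(N × T), the coefficient values and the study designs below are assumptions chosen for illustration; the data are synthetic and noise-free, and the helper names are hypothetical. The sketch also ignores the reported diminishing returns of scan time beyond roughly 20-30 min.

```python
import math


def fit_log_model(sample_sizes, scan_minutes, accuracies):
    """Least-squares fit of accuracy = a + b * ln(N * T)."""
    xs = [math.log(n * t) for n, t in zip(sample_sizes, scan_minutes)]
    mx = sum(xs) / len(xs)
    my = sum(accuracies) / len(accuracies)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, accuracies))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b


def predict(a, b, n_participants, minutes_per_participant):
    """Predicted accuracy for a design with total duration N * T minutes."""
    return a + b * math.log(n_participants * minutes_per_participant)


# Synthetic designs (N participants, T minutes each); the accuracies are
# generated from made-up coefficients a = 0.05 and b = 0.04.
designs = [(100, 5), (100, 20), (400, 5), (400, 20), (1000, 10)]
ns = [n for n, _ in designs]
ts = [t for _, t in designs]
acc = [0.05 + 0.04 * math.log(n * t) for n, t in designs]

a, b = fit_log_model(ns, ts, acc)
print(f"fitted a = {a:.3f}, b = {b:.3f}")

# Under this model, designs with the same total scan duration are interchangeable.
print(predict(a, b, 200, 20), predict(a, b, 400, 10))
```

Because the model depends only on the product N × T, halving the sample size while doubling the scan time leaves the predicted accuracy unchanged, which is the interchangeability the abstract describes for shorter scans.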