Foundation models in bioinformatics
National Science Review, Journal Year: 2025, Volume and Issue: unknown
Published: Jan. 25, 2025
With the adoption of foundation models (FMs), artificial intelligence (AI) has become increasingly significant in bioinformatics and has successfully addressed many historical challenges, such as pre-training frameworks, model evaluation and interpretability. FMs demonstrate notable proficiency in managing large-scale, unlabeled datasets, because experimental procedures are costly and labor intensive. In various downstream tasks, FMs have consistently achieved noteworthy results, demonstrating high levels of accuracy in representing biological entities. A new era in computational biology has been ushered in by the application of FMs, focusing on both general and specific biological issues. In this review, we introduce recent advancements in bioinformatics FMs employed in a variety of downstream tasks, including genomics, transcriptomics, proteomics, drug discovery and single-cell analysis. Our aim is to assist scientists in selecting appropriate FMs in bioinformatics, according to four model types: language FMs, vision FMs, graph FMs and multimodal FMs. In addition to understanding molecular landscapes, AI technology can establish the theoretical and practical foundation for continued innovation in molecular biology.
Language: English
Towards the Next Generation of Data‐Driven Therapeutics Using Spatially Resolved Single‐Cell Technologies and Generative AI
Avital Rodov, Hosna Baniadam, Robert Zeiser et al.
European Journal of Immunology, Journal Year: 2025, Volume and Issue: 55(2)
Published: Feb. 1, 2025
ABSTRACT
Recent advances in multi‐omics and spatially resolved single‐cell technologies have revolutionised our ability to profile millions of cellular states, offering unprecedented opportunities to understand the complex molecular landscapes of human tissues in both health and disease. These developments hold immense potential for precision medicine, particularly in the rational design of novel therapeutics for treating inflammatory and autoimmune diseases. However, the vast, high‐dimensional data generated by these technologies present significant analytical challenges, such as distinguishing technical variation from biological variation or defining relevant questions that leverage the added spatial dimension to improve our understanding of tissue organisation. Generative artificial intelligence (AI), specifically variational autoencoder‐ or transformer‐based latent variable models, provides a powerful and flexible approach to addressing these challenges. These models make inferences about a cell's intrinsic state by effectively identifying complex patterns, reducing data dimensionality and modelling the biological variability in single‐cell datasets. This review explores the current landscape of single‐cell and spatial multi‐omics technologies, the application of generative AI in data analysis and modelling, and their transformative impact on our understanding of autoimmune diseases. By combining spatial and single‐cell data with advanced AI methodologies, we highlight novel insights into the pathogenesis of autoimmune disorders and outline future directions for leveraging these technologies to achieve the goal of AI‐powered personalised medicine.
Language: English
Unified integration of spatial transcriptomics across platforms
Eldad Haber, Ajinkya Deshpande, Jian Ma et al.
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2025, Volume and Issue: unknown
Published: April 5, 2025
Spatial transcriptomics (ST) has transformed our understanding of tissue architecture and cellular interactions, but integrating ST data across platforms remains challenging due to differences in gene panels, data sparsity, and technical variability. Here, we introduce LLOKI, a novel framework for integrating imaging-based ST data from diverse platforms without requiring shared gene panels. LLOKI addresses ST integration through two key alignment tasks: feature alignment across technologies and batch alignment across datasets. Feature alignment constructs a graph based on spatial proximity and expression to propagate features and impute missing values. Optimal transport adjusts data sparsity to match scRNA-seq references, enabling single-cell foundation models such as scGPT to generate unified features. Batch alignment then refines scGPT-transformed embeddings, mitigating batch effects while preserving biological variability. Evaluations on mouse brain samples from five different technologies demonstrate that LLOKI outperforms existing methods and is effective for cross-technology spatial gene program identification and tissue slice alignment. Applying LLOKI to ovarian cancer datasets, we identify an integrated gene program indicative of tumor-infiltrating T cells. Together, LLOKI provides a robust framework for cross-platform ST studies, with the potential to scale to large atlas datasets, enabling deeper insights into cellular organization and tissue environments.
Language: English
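The feature-alignment idea in the LLOKI abstract above (propagating features over a graph to impute missing values) can be illustrated with a minimal sketch. This is not the LLOKI implementation: the function names `knn_neighbors` and `propagate_features` are hypothetical, the graph here uses spatial proximity only (the abstract also mentions expression), and plain neighbour averaging is an assumed stand-in for whatever propagation rule the authors use.

```python
import math

def knn_neighbors(coords, k):
    """Return, for each cell, the indices of its k nearest spatial neighbours."""
    neighbors = []
    for i, p in enumerate(coords):
        others = [j for j in range(len(coords)) if j != i]
        # Sort the other cells by Euclidean distance to cell i.
        others.sort(key=lambda j: math.dist(p, coords[j]))
        neighbors.append(others[:k])
    return neighbors

def propagate_features(values, neighbors, n_iter=20):
    """Impute missing entries (None) of one gene by neighbour averaging.

    Measured values are held fixed; missing ones start at 0.0 and are
    repeatedly replaced by the mean of their neighbours' current values.
    """
    observed = [v is not None for v in values]
    current = [v if ok else 0.0 for v, ok in zip(values, observed)]
    for _ in range(n_iter):
        updated = []
        for i, nbrs in enumerate(neighbors):
            if observed[i]:
                updated.append(values[i])
            else:
                updated.append(sum(current[j] for j in nbrs) / len(nbrs))
        current = updated
    return current

# Four cells on a line; the second cell lacks a measurement for this gene.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
gene = [1.0, None, 3.0, 3.0]
imputed = propagate_features(gene, knn_neighbors(coords, k=2))
```

In this toy case the missing value is filled with the average of its two measured neighbours, while measured values are left untouched. A real pipeline would operate on full cell-by-gene matrices and a sparse graph.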
New horizons at the interface of artificial intelligence and translational cancer research
Cancer Cell, Journal Year: 2025, Volume and Issue: 43(4), P. 708 - 727
Published: April 1, 2025
Language: English
Foundation models for bioinformatics
Ziyu Chen, Lin Wei, Ge Gao et al.
Quantitative Biology, Journal Year: 2024, Volume and Issue: 12(4), P. 339 - 344
Published: July 24, 2024
Abstract
Transformer‐based foundation models such as ChatGPT have revolutionized our daily life and affected many fields, including bioinformatics. In this perspective, we first discuss the direct application of textual foundation models to bioinformatics tasks, focusing on how to make the most of canonical large language models and mitigate their inherent flaws. Meanwhile, we go through the transformer‐based, bioinformatics‐tailored foundation models for both sequence and non‐sequence data. In particular, we envision further development directions as well as challenges for bioinformatics foundation models.
Language: English
scProAtlas: an atlas of multiplexed single-cell spatial proteomics imaging in human tissues
Tiangang Wang, Xuanmin Chen, Yujuan Han et al.
Nucleic Acids Research, Journal Year: 2024, Volume and Issue: 53(D1), P. D582 - D594
Published: Nov. 11, 2024
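Spatial proteomics can visualize and quantify protein expression profiles within tissues at single-cell resolution. Although spatial proteomics can only detect a limited number of proteins compared to spatial transcriptomics, it provides comprehensive spatial information with single-cell resolution. By studying the spatial distribution of cells, we can clearly obtain the spatial context within tissues at multiple scales. Spatial context includes the composition of cell types, the distribution of functional structures, and the spatial communication between functional regions, all of which are crucial for the patterns of cellular distribution. Here, we constructed a comprehensive spatial proteomics functional annotation knowledgebase, scProAtlas (https://relab.xidian.edu.cn/scProAtlas/#/), which is designed to help users comprehensively understand the spatial context within different tissue types at single-cell resolution and across multiple scales. scProAtlas contains multiple modules, including neighborhood analysis, proximity analysis and neighborhood network, to comprehensively construct spatial cell maps of tissues, as well as multi-modal integration, spatial gene identification, cell-cell interaction and spatial pathway analysis to display spatial variable genes. scProAtlas includes data from eight spatial protein imaging techniques across 15 tissues and provides detailed functional annotation information for 17 468 394 cells from 945 regions of interest. The aim of scProAtlas is to offer new insight into the spatial structure of various tissues and to provide detailed spatial functional annotation.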
Language: English
Review: Single Cell Advances in investigating and understanding Chronic Kidney Disease and Diabetic Kidney Disease
American Journal of Pathology, Journal Year: 2024, Volume and Issue: unknown
Published: Aug. 1, 2024
Language: English
Identifying Differential Spatial Expression Patterns across Different Slices, Conditions and Developmental Stages with Interpretable Deep Learning
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown
Published: Aug. 7, 2024
Abstract
Spatially resolved transcriptomics technologies enable the mapping of multiplexed gene expression profiles within tissue contexts. To explore spatial patterns in complex tissues, computational methods have been developed to identify spatially variable genes within single tissue slices. However, there is a lack of methods designed to identify genes with differential spatial expression patterns (DSEPs) across multiple slices or conditions, which are becoming increasingly common in complex experimental designs. The challenges include the complexity of cross-slice gene expression and spatial information modeling, scalability issues in constructing large-scale cell graphs, and mixed factors of inter-slice heterogeneity. We propose DSEP gene identification as a new task and develop River, an interpretable deep learning-based method, to solve this task. River comprises a two-branch prediction model architecture and a post-hoc attribution method to prioritize genes that explain condition differences. River’s special design for modeling spatial-informed gene expression makes it scalable to large-scale spatial omics datasets. We proposed strategies to decouple the spatial and non-spatial components of River’s outcomes. We validated River’s performance using simulated datasets and applied it to identify DSEP genes/proteins in diverse biological contexts, including embryo development, diabetes-induced alterations in spermatogenesis, and lupus-induced splenic changes. In a human triple-negative breast cancer dataset, River identified generalizable survival-related DSEPs, validated across unseen patient groups. River does not rely on specific data distribution assumptions and is compatible with various spatial omics data types, making it a versatile method for analyzing complex tissue architectures across multiple biological conditions.
Language: English
PathOmCLIP: Connecting tumor histology with spatial gene expression via locally enhanced contrastive learning of Pathology and Single-cell foundation model
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown
Published: Dec. 11, 2024
Abstract
Tumor morphological features from histology images are a cornerstone of clinical pathology, diagnostic biomarkers, and basic cancer biology research. Spatial transcriptomics, which provides spatially resolved gene expression profiles overlaid on histology images, offers a unique opportunity to integrate morphological and expression features, thereby deepening our understanding of tumor biology. However, spatial transcriptomics experiments with patient samples in either clinical trials or clinical care are costly and challenging, whereas histology images are generated routinely and available for many legacy and prospective cohorts of disease progression and outcomes in well-annotated cohorts. Inferring spatial transcriptomics profiles computationally from these histology images would significantly expand our understanding of tumor biology, but paired data for training multi-modal spatial-histology models remains limited. Here, we tackle this challenge by incorporating performant foundation models pre-trained on massive datasets of pathology images and single-cell RNA-Seq, respectively, which provide useful embeddings to underpin multi-modal models. To this end, we developed PathOmCLIP, a model trained with contrastive loss to create a joint-embedding space between a histopathology foundation model and a single-cell RNA-seq foundation model. We incorporate a set transformer to gather localized neighborhood architecture following contrastive training, which further enhances performance and is necessary to obtain robust results. We validate PathOmCLIP across five tumor types and achieve significant improvements in gene expression prediction tasks over other methods. PathOmCLIP can be applied to archived histology images, unlocking valuable clinical information and facilitating new biomarker discoveries.
Language: English
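The PathOmCLIP abstract above describes training with a contrastive loss to build a joint embedding space between histology and expression embeddings. The sketch below shows one common form of such a loss, the symmetric CLIP-style cross-entropy over cosine similarities. Whether PathOmCLIP uses exactly this form is an assumption; the function names and the temperature of 0.07 are illustrative choices and are not taken from the paper.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the row maximum for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def clip_style_loss(image_emb, expr_emb, temperature=0.07):
    """Symmetric contrastive loss over paired image/expression embeddings."""
    img = [normalize(v) for v in image_emb]
    rna = [normalize(v) for v in expr_emb]
    # logits[i][j]: scaled similarity between image patch i and expression profile j.
    logits = [[sum(a * b for a, b in zip(u, w)) / temperature for w in rna]
              for u in img]
    transposed = [list(col) for col in zip(*logits)]
    # Average the image-to-expression and expression-to-image directions.
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(transposed))

# Toy check: correctly paired embeddings should score a lower loss than swapped ones.
paired = [[1.0, 0.0], [0.0, 1.0]]
matched = clip_style_loss(paired, paired)
mismatched = clip_style_loss(paired, paired[::-1])
```

The loss is low when each image embedding is most similar to its own paired expression embedding and high when the pairs are swapped, which is the property a joint embedding space relies on.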