Deep learning model to discriminate diverse infection types based on pairwise analysis of host gene expression
Jize Xie, Xubin Zheng, Jianlong Yan, et al.
iScience, Journal Year: 2024, Volume and Issue: 27(6), P. 109908 - 109908
Published: May 7, 2024
Accurate detection of pathogens, particularly distinguishing between Gram-positive and Gram-negative bacteria, could improve disease treatment. Host gene expression can capture the immune system's response to infections caused by various pathogens. Here, we present a deep neural network model, bvnGPS2, which incorporates an attention mechanism based on a large-scale integrated host transcriptome dataset to precisely identify bacterial as well as viral infections. We performed an analysis of 4,949 blood samples across 40 cohorts from 10 countries using our previously designed omics data integration method, iPAGE, to select discriminant gene pairs and train bvnGPS2. The performance of the model was evaluated on six independent cohorts comprising 374 samples. Overall, the model shows robust capability to accurately identify specific infections, paving the way for precise medicine strategies in infection treatment and potentially also for identifying subtypes of other diseases.
Language: English
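The abstract above describes selecting discriminant gene pairs from host expression data. As a rough illustration of the general idea of pairwise expression features (not the authors' iPAGE method, whose details are not given here), the sketch below turns one sample's expression values into binary features recording which gene of each pair is more highly expressed. The gene names and values are hypothetical. Because each feature depends only on the ordering of two genes within a sample, it is unchanged by any monotone rescaling applied to that sample.

```python
from itertools import combinations


def pairwise_features(expression, genes):
    """Map each gene pair (a, b) to 1 if a is expressed above b in this sample, else 0.

    `expression` is a dict of gene name -> expression value for ONE sample.
    This is an illustrative encoding, not the published iPAGE procedure.
    """
    features = {}
    for a, b in combinations(genes, 2):
        features[(a, b)] = 1 if expression[a] > expression[b] else 0
    return features


# Hypothetical sample with made-up gene names and values.
sample = {"GENE_A": 8.2, "GENE_B": 5.1, "GENE_C": 6.7}
feats = pairwise_features(sample, ["GENE_A", "GENE_B", "GENE_C"])
print(feats)
```

A classifier could then be trained on these binary features instead of raw expression values; which pairs are kept as "discriminant" would be decided by a separate selection step that this sketch does not attempt.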
Deciphering 3'UTR Mediated Gene Regulation Using Interpretable Deep Representation Learning
Advanced Science, Journal Year: 2024, Volume and Issue: 11(39)
Published: Aug. 19, 2024
Abstract
The 3' untranslated regions (3'UTRs) of messenger RNAs contain many important cis‐regulatory elements that are under functional and evolutionary constraints. It is hypothesized that these constraints are similar to grammars and syntaxes in human languages and can be modeled by advanced natural language techniques such as Transformers, which have been very effective in modeling complex protein sequences and structures. Here 3UTRBERT is described, which implements an attention‐based model, i.e., Bidirectional Encoder Representations from Transformers (BERT). 3UTRBERT is pre‐trained on aggregated 3'UTR sequences of mRNAs in a task‐agnostic manner; the model is then fine‐tuned for specific downstream tasks such as identifying RBP binding sites and m6A RNA modification sites, and predicting RNA sub‐cellular localizations. Benchmark results show that 3UTRBERT generally outperformed other contemporary methods in each of these tasks. More importantly, the self‐attention mechanism within 3UTRBERT allows direct visualization of the semantic relationship between sequence elements and effectively identifies regions with regulatory potential. 3UTRBERT is expected to serve as a foundational tool to analyze various sequence labeling tasks within the 3'UTR field, thus enhancing the decipherability of post‐transcriptional regulatory mechanisms.
Language: English
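The abstract above treats 3'UTR sequences like sentences for a BERT-style model. One common way to turn a nucleotide sequence into "words" for such a model is to split it into overlapping k-mers. The abstract does not state the tokenization used, so the function name, the choice of k = 3, and the example sequence below are all assumptions made for illustration.

```python
def kmer_tokens(sequence, k=3):
    """Split a nucleotide sequence into overlapping k-mers (stride 1).

    Assumed tokenization for illustration only; the abstract does not
    specify how sequences are tokenized.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    # A sequence of length n yields n - k + 1 overlapping tokens.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


tokens = kmer_tokens("AUGGCUA", k=3)
print(tokens)
```

Each token would then be mapped to an integer ID from a fixed vocabulary before being fed to the model; that lookup step is omitted here.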
CFPLncLoc: A multi-label lncRNA subcellular localization prediction based on Chaos game representation and centralized feature pyramid
Sheng Wang, Zu‐Guo Yu, Han Guosheng, et al.
International Journal of Biological Macromolecules, Journal Year: 2025, Volume and Issue: 297, P. 139519 - 139519
Published: Jan. 5, 2025
Language: English
An ensemble deep learning framework for multi-class LncRNA subcellular localization with innovative encoding strategy
Wenxing Hu, Yan Yue, Ruomei Yan, et al.
BMC Biology, Journal Year: 2025, Volume and Issue: 23(1)
Published: Feb. 21, 2025
Long non-coding RNAs (lncRNAs) play pivotal roles in various cellular processes, and elucidating their subcellular localization can offer crucial insights into their functional significance. Accurate prediction of lncRNA subcellular localization is of paramount importance. Despite numerous computational methods developed for this purpose, existing approaches still encounter challenges stemming from the complexity of data representation and the difficulty of capturing nucleotide distribution information within sequences. In this study, we propose a novel deep learning-based model, termed MGBLncLoc, which incorporates a unique multi-class encoding technique known as generalized encoding based on the Distribution Density of Multi-Class Nucleotide Groups (MCD-ND). This approach enables a more precise reflection of nucleotide distributions, distinguishing between constant and discriminative regions within sequences, thereby enhancing prediction performance. Additionally, our deep learning model integrates advanced neural network modules, including Multi-Dconv Head Transposed Attention, Gated-Dconv Feed-forward Network, Convolutional Neural Network, and Bidirectional Gated Recurrent Unit, to comprehensively exploit the sequence features of lncRNA. Comparative analysis against commonly used sequence feature encoding methods and existing prediction models validates the effectiveness of MGBLncLoc, demonstrating superior performance. This research offers effective solutions for predicting lncRNA subcellular localization, providing valuable support for related biological investigations.
Language: English
TransBind allows precise detection of DNA-binding proteins and residues using language models and deep learning
Communications Biology, Journal Year: 2025, Volume and Issue: 8(1)
Published: April 5, 2025
Identifying DNA-binding proteins and their binding residues is critical for understanding diverse biological processes, but conventional experimental approaches are slow and costly. Existing machine learning methods, while faster, often lack accuracy and struggle with data imbalance, relying heavily on evolutionary profiles like PSSMs and HMMs derived from multiple sequence alignments (MSAs). These dependencies make them unsuitable for orphan proteins or those that evolve rapidly. To address these challenges, we introduce TransBind, an alignment-free deep learning framework that predicts DNA-binding proteins and residues directly from a single primary sequence, eliminating the need for MSAs. By leveraging features from pre-trained protein language models, TransBind effectively handles the issue of data imbalance and achieves superior performance. Extensive evaluations using datasets and case studies demonstrate that TransBind significantly outperforms state-of-the-art methods in terms of both accuracy and computational efficiency. TransBind is available as a web server at https://trans-bind-web-server-frontend.vercel.app/.
Language: English
PAGE-based transfer learning from single-cell to bulk sequencing enhances model generalization for sepsis diagnosis
Nana Jin, Chuanchuan Nan, Wanyang Li, et al.
Briefings in Bioinformatics, Journal Year: 2024, Volume and Issue: 26(1)
Published: Nov. 22, 2024
Abstract
Sepsis, caused by infections, sparks a dangerous bodily response. The transcriptional expression patterns of host responses aid in the diagnosis of sepsis, but the challenge lies in their limited generalization capabilities. To facilitate sepsis diagnosis, we present an updated version of single-cell Pair-wise Analysis of Gene Expression (scPAGE) using a transfer learning method, scPAGE2, dedicated to data fusion between single-cell and bulk transcriptome. Compared to scPAGE, the upgrade to scPAGE2 featured ameliorated Differentially Expressed Gene Pairs (DEPs) for pretraining a model on single-cell transcriptome data and retrained it on bulk transcriptome data to construct a diagnostic model, which effectively transferred cell-layer information from single-cell to bulk transcriptome. Seven datasets across three platforms and fluorescence-activated cell sorting (FACS) were used for performance validation. The model involved four DEPs, showing robust performance across next-generation sequencing and microarray platforms, surpassing state-of-the-art models with an average AUROC of 0.947 and an average AUPRC of 0.987. scRNA-seq analysis reveals higher proportions of JAM3-PIK3AP1 monocytes and decreased ARG1-CCR7 B and T cells. Elevated IRF6-HP monocytes were confirmed in both an independent cohort and FACS. Both superior performance and in vitro validation emphasize that scPAGE2 is effective in the construction of a diagnostic model. We additionally applied scPAGE2 to acute myeloid leukemia and demonstrated its classification performance. Overall, we provided a strategy to improve model generalizability that can be adapted to a broad range of clinical prediction scenarios.
Language: English
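The abstract above reports performance as AUROC. For readers unfamiliar with the metric, the sketch below computes it from its rank-based definition: the fraction of (positive, negative) sample pairs in which the positive sample receives the higher score, with ties counted as half. The labels and scores are invented for illustration and have no connection to the study's data.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Equivalent to the normalized Mann-Whitney U statistic. Quadratic in the
    number of samples, which is fine for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = case, 0 = control) and classifier scores.
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
value = auroc(labels, scores)
print(round(value, 3))
```

Here five of the six positive-negative pairs are ordered correctly, so the result is 5/6. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.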
mRNA-CLA: An interpretable deep learning approach for predicting mRNA subcellular localization
Yi‐Fan Chen, Zhenya Du, Xuanbai Ren, et al.
Methods, Journal Year: 2024, Volume and Issue: 227, P. 17 - 26
Published: May 3, 2024
Language: English
RNALocate v3.0: Advancing the Repository of RNA Subcellular Localization with Dynamic Analysis and Prediction
Le Wu, Luqi Wang, Shijie Hu, et al.
Nucleic Acids Research, Journal Year: 2024, Volume and Issue: 53(D1), P. D284 - D292
Published: Oct. 15, 2024
Abstract
Subcellular localization of RNA is a crucial mechanism for regulating diverse biological processes within cells. Dynamic RNA subcellular localizations are essential for maintaining cellular homeostasis; however, their distribution and changes during development and differentiation remain largely unexplored. To elucidate the dynamic patterns of RNA distribution within cells, we have upgraded RNALocate to version 3.0, a repository for RNA-subcellular localization (http://www.rnalocate.org/ or http://www.rna-society.org/rnalocate/). RNALocate v3.0 incorporates and analyzes RNA subcellular localization sequencing data from over 850 samples, with a specific focus on the dynamic changes in subcellular localization under various conditions. The species coverage has also been expanded to encompass mammals, non-mammals, plants and microbes. Additionally, we provide an integrated prediction algorithm for the subcellular localization of seven RNA types across eleven subcellular compartments, utilizing convolutional neural networks (CNNs) and transformer models. Overall, RNALocate v3.0 contains a total of 1 844 013 RNA-localization entries covering 26 RNA types, 242 species and 177 subcellular localizations. It serves as a comprehensive and readily accessible resource for RNA-subcellular localization, facilitating the elucidation of cellular function and disease pathogenesis.
Language: English
Deep Learning for Elucidating Modifications to RNA—Status and Challenges Ahead
Genes, Journal Year: 2024, Volume and Issue: 15(5), P. 629 - 629
Published: May 15, 2024
RNA-binding proteins and chemical modifications to RNA play vital roles in the co- and post-transcriptional regulation of genes. In order to fully decipher their biological roles, it is an essential task to catalogue their precise target locations along with their preferred contexts and sequence-based determinants. Recently, deep learning approaches have significantly advanced this field. These methods can predict the presence or absence of modification at specific genomic regions based on diverse features, particularly sequence and secondary structure, allowing us to decipher the highly non-linear sequence patterns and structures that underlie site preferences. This article provides an overview of how deep learning is being applied to this area, with a particular focus on the problem of mRNA-RBP binding, while also considering other types of chemical modification to RNA. It discusses how different types of model can handle sequence-based and/or secondary-structure-based inputs, the process of model training, including the choice of negative regions and separating sets for testing and training, and offers recommendations for developing biologically relevant models. Finally, it highlights four key areas that are crucial for advancing the field.
Language: English