bioRxiv (Cold Spring Harbor Laboratory),
Journal year: 2023, Issue: unknown
Published: Dec. 8, 2023
Abstract
High-throughput proteomics approaches have revolutionised the identification of RNA-binding proteins (RBPome) and RNA-binding sequences (RBDome) across organisms. Yet the extent of noise, including false-positives, associated with these methodologies is difficult to quantify, as experimental approaches for validating the results are generally low throughput. To address this, we introduce pyRBDome, a pipeline for enhancing RNA-binding proteome data in silico. It aligns the experimental results with RNA-binding site (RBS) predictions from distinct machine learning tools and integrates high-resolution structural data when available. Its statistical evaluation of RBDome data enables quick identification of likely genuine RNA-binders in experimental datasets. Furthermore, by leveraging the pyRBDome results, we have enhanced the sensitivity and specificity of RBS detection through training new ensemble machine learning models. pyRBDome analysis of a human RBDome dataset, compared with known structural data, revealed that while UV cross-linked amino acids were more likely to contain predicted RBSs, they infrequently bind RNA in high-resolution structures. This discrepancy underscores the limitations of structural data as benchmarks, positioning pyRBDome as a valuable alternative for increasing confidence in RBDome datasets.
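The abstract above reports gains in the sensitivity and specificity of RBS detection. As a reminder of what those two metrics measure, here is a minimal sketch for binary per-residue labels. The function name and the toy labels are illustrative assumptions and are not taken from pyRBDome.

```python
def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) for binary per-residue labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)          # binding, predicted binding
    tn = sum(1 for t, p in pairs if not t and not p)  # non-binding, predicted non-binding
    fp = sum(1 for t, p in pairs if not t and p)      # non-binding, predicted binding
    fn = sum(1 for t, p in pairs if t and not p)      # binding, predicted non-binding
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data: 1 = residue binds RNA, 0 = it does not.
observed = [1, 1, 0, 0, 1, 0, 0, 0]
predicted = [1, 0, 0, 1, 1, 0, 0, 0]
sens, spec = sensitivity_specificity(observed, predicted)
```

Here two of the three binding residues are recovered and four of the five non-binding residues are correctly rejected.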
Nucleic Acids Research,
Journal year: 2023, Issue: 52(2), pp. e10 - e10
Published: Dec. 4, 2023
Current predictors of DNA-binding residues (DBRs) from protein sequences belong to two distinct groups: those trained on binding annotations extracted from structured protein-DNA complexes (structure-trained) vs. those trained on intrinsically disordered proteins (disorder-trained). We complete the first empirical analysis of predictive performance across structure- and disorder-annotated proteins for a representative collection of ten predictors. The majority of structure-trained tools perform well on structure-annotated proteins while doing relatively poorly on disorder-annotated proteins, and vice versa. Several methods make accurate predictions for structure-annotated or disorder-annotated proteins, but none performs highly accurately for both annotation types. Moreover, most predictors make excessive cross-predictions, where residues that interact with non-DNA ligand types are predicted as DBRs. Motivated by these results, we design, validate and deploy an innovative meta-model, hybridDBRpred, that uses a deep transformer network to combine predictions generated by the three best current predictors. HybridDBRpred provides accurate predictions and low levels of cross-predictions across the two annotation types, and is statistically more accurate than each of the ten tools and than baseline meta-predictors that rely on averaging and logistic regression. We deploy hybridDBRpred as a convenient web server at http://biomine.cs.vcu.edu/servers/hybridDBRpred/ and provide the corresponding source code at https://github.com/jianzhang-xynu/hybridDBRpred.
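The abstract above compares hybridDBRpred against baseline meta-predictors that rely on averaging. The sketch below shows what such an averaging baseline could look like. It is not the transformer-based hybridDBRpred model, and the function names, scores and the 0.5 threshold are all illustrative assumptions.

```python
def average_meta_predictor(score_lists):
    """Average per-residue binding scores from several predictors."""
    if not score_lists:
        raise ValueError("at least one predictor is required")
    length = len(score_lists[0])
    if any(len(scores) != length for scores in score_lists):
        raise ValueError("all predictors must score the same residues")
    # zip(*...) walks the predictors residue by residue.
    return [sum(column) / len(score_lists) for column in zip(*score_lists)]


def call_binding_residues(scores, threshold=0.5):
    """Return indices of residues whose consensus score reaches the threshold."""
    return [index for index, score in enumerate(scores) if score >= threshold]


# Hypothetical propensity scores for a four-residue protein from three tools.
predictor_a = [0.9, 0.2, 0.6, 0.1]
predictor_b = [0.7, 0.4, 0.3, 0.1]
predictor_c = [0.8, 0.3, 0.9, 0.4]

consensus = average_meta_predictor([predictor_a, predictor_b, predictor_c])
binding = call_binding_residues(consensus)
```

Averaging weights every input tool equally, which is one reason a learned combiner can outperform it when the tools differ in reliability.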
Drug Discovery Today,
Journal year: 2025, Issue: unknown, pp. 104362 - 104362
Published: April 1, 2025
Artificial intelligence (AI) and machine learning (ML) have revolutionized pharmaceutical research, particularly in protein and nucleic acid studies. This review summarizes the current status of AI and ML applications in the pharmaceutical sector, focusing on innovative tools, web servers, and databases. This paper highlights how these technologies address key challenges in drug development, including high costs, lengthy timelines, and the complexity of biological systems. Furthermore, the potential of AI in personalized medicine, cancer drug response prediction, and biomarker identification is discussed. The integration of AI and ML in pharmaceutical research promises to accelerate drug discovery, reduce costs, and ultimately lead to more effective therapeutic strategies.
Life Science Alliance,
Journal year: 2024, Issue: 7(10), pp. e202402787 - e202402787
Published: July 30, 2024
High-throughput proteomics approaches have revolutionised the identification of RNA-binding proteins (RBPome) and RNA-binding sequences (RBDome) across organisms. Yet, the extent of noise, including false positives, associated with these methodologies is difficult to quantify, as experimental approaches for validating the results are generally low throughput. To address this, we introduce pyRBDome, a pipeline for enhancing RNA-binding proteome data in silico. It aligns the experimental results with RNA-binding site (RBS) predictions from distinct machine-learning tools and integrates high-resolution structural data when available. Its statistical evaluation of RBDome data enables quick identification of likely genuine RNA-binders in experimental datasets. Furthermore, by leveraging the pyRBDome results, we have enhanced the sensitivity and specificity of RBS detection through training new ensemble machine-learning models. pyRBDome analysis of a human RBDome dataset, compared with known structural data, revealed that although UV–cross-linked amino acids were more likely to contain predicted RBSs, they infrequently bind RNA in high-resolution structures. This discrepancy underscores the limitations of structural data as benchmarks, positioning pyRBDome as a valuable alternative for increasing confidence in RBDome datasets.
IEEE Journal of Biomedical and Health Informatics,
Journal year: 2023, Issue: 27(9), pp. 4569 - 4578
Published: July 3, 2023
Protein complexes play an essential role in living cells. Detecting protein complexes is crucial to understand protein functions and treat complex diseases. Due to the high time and resource consumption of experimental approaches, many computational approaches have been proposed to detect protein complexes. However, most of them are only based on protein-protein interaction (PPI) networks, which heavily suffer from the noise in PPI networks. Therefore, we propose a novel core-attachment method, named CACO, to detect human protein complexes by integrating the functional information from other species via ortholog relations. First, CACO constructs a cross-species ortholog relation matrix and transfers GO terms from other species as a reference to evaluate the confidence of PPIs. Then, a PPI filter strategy is adopted to clean the PPI network, and thus a weighted clean PPI network is constructed. Finally, a new effective core-attachment algorithm is proposed to detect protein complexes from the weighted PPI network. Compared to thirteen other state-of-the-art methods, CACO outperforms all of them in terms of F-measure and Composite Score, showing that it is effective in detecting protein complexes.
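The filtering step described above (score each PPI using GO terms, then drop low-confidence edges to obtain a weighted network) can be illustrated with a small sketch. The abstract does not state how CACO computes confidence, so the Jaccard overlap of GO-term sets, the 0.2 cutoff, the function names and the toy annotations below are all assumptions made for illustration only.

```python
def jaccard(terms_a, terms_b):
    """Jaccard similarity of two sets of GO term identifiers."""
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0


def weight_and_filter_ppis(edges, go_terms, min_confidence=0.2):
    """Weight each interaction by GO-term overlap and drop weak edges.

    Assumption: confidence is the Jaccard similarity of the two proteins'
    GO-term sets; this stands in for whatever measure CACO actually uses.
    """
    weighted = {}
    for protein_u, protein_v in edges:
        confidence = jaccard(go_terms.get(protein_u, set()),
                             go_terms.get(protein_v, set()))
        if confidence >= min_confidence:
            weighted[(protein_u, protein_v)] = confidence
    return weighted


# Hypothetical toy annotations (protein names are placeholders).
go_terms = {
    "P1": {"GO:0006412", "GO:0003735"},
    "P2": {"GO:0006412", "GO:0003735", "GO:0005840"},
    "P3": {"GO:0006281"},
}
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3")]
clean_network = weight_and_filter_ppis(edges, go_terms)
```

Only the P1-P2 edge survives here, because P3 shares no GO terms with either partner.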
Identification of protein-protein and protein-nucleic acid binding sites provides insights into biological processes related to protein functions and technical guidance for disease diagnosis and drug design. However, accurate predictions by computational approaches remain highly challenging due to the limited knowledge of residue binding patterns. The binding pattern of a residue should be characterized by the spatial distribution of its neighboring residues combined with their physicochemical information interaction, which cannot yet be achieved by previous methods. Here, we design GraphRBF, a hierarchical geometric deep learning model to learn residue binding patterns from big data. To achieve this, GraphRBF describes physicochemical information interactions by designing an enhanced graph neural network and characterizes residue spatial distributions by introducing a prioritized radial basis function neural network. After training and testing, GraphRBF shows great improvements over existing state-of-the-art methods and strong interpretability of its learned representations. When applied to the SARS-CoV-2 omicron spike protein, it successfully identifies known epitopes of the protein. Moreover, GraphRBF predicts multiple potential binding regions for new nanobodies or even new drugs with strong evidence. A user-friendly online server is freely available at http://liulab.top/GraphRBF/server.
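The abstract above characterizes residue spatial distributions with a radial basis function network. As background, the sketch below shows a generic Gaussian radial basis expansion of a single inter-residue distance. It is not GraphRBF's prioritized RBF layer, and the centers, the `gamma` width and the function name are illustrative assumptions.

```python
import math


def rbf_expand(distance, centers, gamma=1.0):
    """Encode a scalar distance as Gaussian responses around fixed centers."""
    return [math.exp(-gamma * (distance - center) ** 2) for center in centers]


# Hypothetical centers (in angstroms) spanning a local neighborhood.
centers = [0.0, 2.0, 4.0, 6.0, 8.0]

# A neighbor 4 angstroms away activates the middle basis function most strongly.
features = rbf_expand(4.0, centers, gamma=0.5)
```

Each basis function responds most strongly to distances near its center, so the vector gives a smooth, fixed-length description of how far away a neighboring residue is.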
Briefings in Bioinformatics,
Journal year: 2024, Issue: 25(3)
Published: March 27, 2024
Abstract
Proteins interact with diverse ligands to perform a large number of biological functions, such as gene expression and signal transduction. Accurate identification of these protein–ligand interactions is crucial to the understanding of molecular mechanisms and the development of new drugs. However, traditional biological experiments are time-consuming and expensive. With the development of high-throughput technologies, an increasing amount of protein data is available. In the past decades, many computational methods have been developed to predict protein–ligand interactions. Here, we review a comprehensive set of over 160 protein–ligand interaction predictors, which cover protein–protein, protein–nucleic acid, protein–peptide and protein–other ligand (nucleotide, heme, ion) interactions. We have carried out a comprehensive analysis of the above four types of predictors from several significant perspectives, including their inputs, feature profiles, models, availability, etc. The current methods primarily rely on protein sequences, especially utilizing evolutionary information. The significant improvement in predictions is attributed to deep learning methods. Additionally, sequence-based pretrained models and structure-based approaches are emerging as new trends.