Information,
Journal Year:
2024,
Volume and Issue:
15(3), P. 163 - 163
Published: March 13, 2024
Accurate prediction of subcellular localization of viral proteins is crucial for understanding their functions and developing effective antiviral drugs. However, this task poses a significant challenge, especially when relying on expensive and time-consuming classical biological experiments. In this study, we introduced a computational model called E-MuLA, based on a deep learning network that combines multiple local attention modules to enhance feature extraction from protein sequences. The superior performance of the E-MuLA has been demonstrated through extensive comparisons with LSTM, CNN, AdaBoost, decision trees, KNN, and other state-of-the-art methods. It is noteworthy that the E-MuLA achieved an accuracy of 94.87%, specificity of 98.81%, and sensitivity of 84.18%, indicating that E-MuLA has the potential to become an effective tool for predicting virus subcellular localization.
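The accuracy, specificity, and sensitivity figures quoted in this abstract are standard confusion-matrix ratios. As a minimal sketch (the function name and the counts in the usage lines are illustrative assumptions, not values from the E-MuLA paper), they can be computed as follows:

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return {
        # Fraction of all predictions that are correct.
        "accuracy": (tp + tn) / total,
        # True-positive rate: share of actual positives that are recovered.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # True-negative rate: share of actual negatives that are rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Illustrative counts only (hypothetical, not taken from the paper).
metrics = classification_metrics(tp=80, tn=90, fp=10, fn=20)
print(metrics)
```

A specificity well above the sensitivity, as in the reported figures, means the model rejects negatives more reliably than it recovers positives.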
Frontiers in Medicine,
Journal Year:
2023,
Volume and Issue:
10
Published: Oct. 31, 2023
Hemagglutinin (HA) is responsible for facilitating viral entry and infection by promoting the fusion between the host membrane and the virus. Given its significance in the process of influenza virus infestation, HA has garnered attention as a target for drug and vaccine development. Thus, accurately identifying HA is crucial for the development of targeted drugs. However, the identification of HA using in-silico methods is still lacking. This study aims to design a computational model to identify HA.
Molecular Therapy,
Journal Year:
2022,
Volume and Issue:
30(8), P. 2856 - 2867
Published: May 6, 2022
As one of the most prevalent post-transcriptional epigenetic modifications, N5-methylcytosine (m5C) plays an essential role in various cellular processes and disease pathogenesis. Therefore, it is important to accurately identify m5C modifications in order to gain a deeper understanding of cellular processes and other possible functional mechanisms. Although a few computational methods have been proposed, their respective models have been developed using small training datasets. Hence, their practical application is quite limited in genome-wide detection. To overcome the existing limitations, we propose Deepm5C, a bioinformatics method for identifying RNA m5C sites throughout the human genome. To develop Deepm5C, we constructed a novel benchmarking dataset and investigated a mixture of three conventional feature-encoding algorithms and a feature derived from word-embedding approaches. Afterward, four variants of deep-learning classifiers and four commonly used conventional classifiers were employed and trained with the four encodings, ultimately obtaining 32 baseline models. A stacking strategy is effectively utilized by integrating the predicted output of the optimal baseline models and trained with a one-dimensional (1D) convolutional neural network. As a result, the Deepm5C predictor achieved excellent performance during cross-validation, with a Matthews correlation coefficient and accuracy of 0.697 and 0.855, respectively. The corresponding metrics during the independent test were 0.691 and 0.852, respectively. Overall, Deepm5C achieved a more accurate and stable performance than the baseline models and significantly outperformed the existing predictors, demonstrating the effectiveness of our proposed hybrid framework. Furthermore, Deepm5C is expected to assist community-wide efforts in identifying putative m5Cs and to formulate the novel testable biological hypothesis.
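The Matthews correlation coefficient reported in this abstract summarizes all four confusion-matrix cells in a single value between -1 and 1. A minimal sketch of the standard formula follows; the function name and the example counts are illustrative assumptions, not code or data from the Deepm5C paper:

```python
import math


def mcc_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


# Illustrative counts only (hypothetical, not taken from the paper).
print(mcc_from_counts(tp=45, tn=40, fp=10, fn=5))
```

Unlike accuracy, this coefficient stays near zero for a classifier that simply predicts the majority class, which is why it is often reported alongside accuracy on imbalanced data.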
Briefings in Bioinformatics,
Journal Year:
2023,
Volume and Issue:
25(1)
Published: Nov. 22, 2023
Abstract
2’-O-methylation (2OM) is the most common post-transcriptional modification of RNA. It plays a crucial role in RNA splicing, RNA stability and innate immunity. Despite advances in high-throughput detection, the chemical stability of 2OM makes it difficult to detect and map in messenger RNA. Therefore, bioinformatics tools have been developed using machine learning (ML) algorithms to identify 2OM sites. These tools have made significant progress, but their performances remain unsatisfactory and need further improvement. In this study, we introduced H2Opred, a novel hybrid deep learning (HDL) model for accurately identifying 2OM sites in human RNA. Notably, this is the first application of HDL in developing four nucleotide-specific models [adenine (A2OM), cytosine (C2OM), guanine (G2OM) and uracil (U2OM)] as well as a generic model (N2OM). H2Opred incorporated both stacked 1D convolutional neural network (1D-CNN) blocks and stacked attention-based bidirectional gated recurrent unit (Bi-GRU-Att) blocks. The 1D-CNN blocks learned effective feature representations from 14 conventional descriptors, while the Bi-GRU-Att blocks learned feature representations from five natural language processing-based embeddings extracted from RNA sequences. H2Opred integrated these feature representations to make the final prediction. Rigorous cross-validation analysis demonstrated that H2Opred consistently outperforms conventional ML-based single-feature models on different datasets. Moreover, H2Opred demonstrated remarkable performance on both training and testing datasets, significantly outperforming the existing predictor and other models. To enhance accessibility and usability, we have deployed a user-friendly web server accessible at https://balalab-skku.org/H2Opred/. This platform will serve as an invaluable tool for accurately predicting 2OM sites within human RNA, thereby facilitating broader applications in relevant research endeavors.
BMC Biology,
Journal Year:
2024,
Volume and Issue:
22(1)
Published: April 19, 2024
Abstract
Background
The blood–brain barrier serves as a critical interface between the bloodstream and brain tissue, mainly composed of pericytes, neurons, endothelial cells, and tightly connected basal membranes. It plays a pivotal role in safeguarding the brain from harmful substances, thus protecting the integrity of the nervous system and preserving overall brain homeostasis. However, this remarkable selective transmission also poses a formidable challenge in the realm of central nervous system diseases treatment, hindering the delivery of large-molecule drugs into the brain. In response to this challenge, many researchers have devoted themselves to developing drug delivery systems capable of breaching the blood–brain barrier. Among these, blood–brain barrier penetrating peptides have emerged as promising candidates. These peptides have the advantages of high biosafety, ease of synthesis, and exceptional penetration efficiency, making them an effective drug delivery solution. While previous studies have developed a few prediction models for blood–brain barrier penetrating peptides, their performance has often been hampered by the issue of limited positive data.
Results
In this study, we present Augur, a novel prediction model using borderline-SMOTE-based data augmentation and machine learning. We extract highly interpretable physicochemical properties of blood–brain barrier penetrating peptides while solving the issues of small sample size and imbalance of positive and negative samples. Experimental results demonstrate the superior prediction performance of Augur, with an AUC value of 0.932 on the training set and 0.931 on the independent test set.
Conclusions
This newly developed Augur demonstrates superior performance in predicting blood–brain barrier penetrating peptides, offering valuable insights for drug development targeting neurological disorders. This breakthrough may enhance the efficiency of peptide-based drug discovery and pave the way for innovative treatment strategies for central nervous system diseases.
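The borderline-SMOTE augmentation named in this abstract oversamples only those minority-class samples that lie close to the class boundary. The following is a simplified sketch of the general technique in pure Python, not Augur's implementation; the function names, the parameters `m`, `k`, and `n_new`, and the toy points are all assumptions made for illustration:

```python
import math
import random


def _nearest(idx, points, k):
    """Indices of the k points closest to points[idx], excluding idx itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def borderline_smote(minority, majority, m=5, k=3, n_new=1, seed=0):
    """Generate synthetic minority samples near the class boundary.

    minority, majority: lists of equal-length numeric tuples.
    m: neighbours (from both classes) used to decide if a sample is borderline.
    k: minority neighbours available for interpolation.
    n_new: synthetic samples generated per borderline sample.
    """
    rng = random.Random(seed)
    everything = list(minority) + list(majority)  # minority indices come first
    n_min = len(minority)
    synthetic = []
    for i, p in enumerate(minority):
        # Count how many of the m nearest neighbours are majority samples.
        n_major = sum(j >= n_min for j in _nearest(i, everything, m))
        # Borderline ("danger") zone: at least half, but not all, are majority.
        if not (m / 2 <= n_major < m):
            continue
        neighbours = _nearest(i, minority, k)
        if not neighbours:
            continue
        for _ in range(n_new):
            q = minority[rng.choice(neighbours)]
            gap = rng.random()  # interpolation factor in [0, 1)
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(p, q)))
    return synthetic


# Toy one-dimensional layout (hypothetical): only (2.0, 0.0) sits on the border.
minority = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
majority = [(2.4, 0.0), (3.1, 0.0), (3.5, 0.0), (4.0, 0.0)]
new_points = borderline_smote(minority, majority, m=3, k=2, n_new=2)
print(new_points)
```

Samples whose neighbours are all majority are treated as noise and skipped, and samples surrounded mostly by their own class are considered safe, so synthetic points are concentrated where the classifier is most likely to err.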
Molecular Therapy — Nucleic Acids,
Journal Year:
2024,
Volume and Issue:
35(2), P. 102192 - 102192
Published: April 24, 2024
RNA N4-acetylcytidine (ac4C) is a highly conserved RNA modification that plays a crucial role in controlling mRNA stability, processing, and translation. Consequently, accurate identification of ac4C sites across the genome is critical for understanding gene expression regulation mechanisms. In this study, we have developed ac4C-AFL, a bioinformatics tool that precisely identifies ac4C sites from primary RNA sequences. In ac4C-AFL, we identified the optimal sequence length for model building and implemented an adaptive feature representation strategy capable of extracting the most representative features from RNA. To identify the most relevant features, we proposed a novel ensemble feature importance scoring strategy to rank features effectively. We then used this information to conduct a sequential forward search, which individually determines the optimal feature set from the 16 sequence-derived feature descriptors. Utilizing these optimal feature descriptors, we constructed 176 baseline models using 11 popular classifiers. The most efficient baseline models were identified using a two-step feature selection approach, whose predicted scores were integrated and trained with the appropriate classifier to develop the final prediction model. Our rigorous cross-validations and independent tests demonstrate that ac4C-AFL surpasses contemporary tools in predicting ac4C sites. Moreover, we have developed a publicly accessible web server at https://balalab-skku.org/ac4C-AFL/.
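The ranking-then-search step described in this abstract can be illustrated with a ranked sequential forward search: features are added in importance order and the best-scoring prefix is kept. This is a minimal sketch under stated assumptions, not the ac4C-AFL code; the function names, the feature labels, and the toy scoring function are hypothetical, and a real run would score each subset by cross-validated model performance:

```python
from typing import Callable, List, Sequence, Tuple


def sequential_forward_search(
    ranked_features: Sequence[str],
    score: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Evaluate the top-1, top-2, ..., top-n ranked features; keep the best prefix."""
    best_subset: List[str] = []
    best_score = float("-inf")
    current: List[str] = []
    for feature in ranked_features:
        current.append(feature)
        value = score(current)
        if value > best_score:  # strict comparison: ties keep the smaller subset
            best_subset, best_score = list(current), value
    return best_subset, best_score


# Toy scoring function (hypothetical): rewards informative features and
# penalises subset size.
INFORMATIVE = {"f1", "f3"}


def toy_score(subset: List[str]) -> float:
    return len(INFORMATIVE.intersection(subset)) - 0.1 * len(subset)


subset, value = sequential_forward_search(["f1", "f3", "f2", "f4"], toy_score)
print(subset, round(value, 2))
```

Because the search only walks the ranked list once, its cost grows linearly with the number of features, which keeps it practical when the procedure has to be repeated for each descriptor.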
ACS Omega,
Journal Year:
2024,
Volume and Issue:
unknown
Published: Feb. 8, 2024
In biological organisms, metal ion-binding proteins participate in numerous metabolic activities and are closely associated with various diseases. To accurately predict whether a protein binds to metal ions and the type of metal ion-binding protein, this study proposed a classifier named MIBPred. The classifier incorporated advanced Word2Vec technology from the field of natural language processing to extract semantic features of the protein sequence and combined them with position-specific score matrix (PSSM) features. Furthermore, an ensemble learning model was employed for the classification task. In the model, we independently trained XGBoost, LightGBM, and CatBoost algorithms and integrated the output results through an SVM voting mechanism. This innovative combination has led to a significant breakthrough in the predictive performance of our model. As a result, we achieved accuracies of 95.13% and 85.19%, respectively, in predicting metal ion-binding proteins and their types. Our research not only confirms the effectiveness of Word2Vec technology in extracting semantic information from protein sequences but also highlights the outstanding performance of the MIBPred classifier in the problem of predicting metal ion-binding protein types. This study provides a reliable tool and method for the in-depth exploration of the structure and function of metal ion-binding proteins.
Journal of Hematology & Oncology,
Journal Year:
2025,
Volume and Issue:
18(1)
Published: Jan. 29, 2025
N7-methylguanosine (m7G) is an important RNA modification involved in epigenetic regulation that is commonly observed in both prokaryotic and eukaryotic organisms. Their influence on the synthesis and processing of messenger RNA, ribosomal RNA, and transfer RNA allows m7G modifications to affect diverse cellular, physiological, and pathological processes. m7G modifications are pivotal in human diseases, particularly cancer progression. On one hand, m7G modification-associated molecules modulate tumour progression and malignant biological characteristics, including sustained proliferation signalling, resistance to cell death, activation of invasion and metastasis, reprogramming of energy metabolism, genome instability, and immune evasion. This suggests that they may be novel therapeutic targets for cancer treatment. On the other hand, the aberrant expression of m7G modification-associated molecules is linked to clinicopathological staging, lymph node metastasis, and unfavourable prognoses in patients with cancer, indicating their potential as tumour biomarkers. This review consolidates the discovery, identification, detection methodologies, and functional roles of m7G modification, analysing the mechanisms by which m7G modification-associated molecules contribute to tumour development, and exploring their clinical applications in cancer diagnostics and therapy, thereby providing innovative strategies for tumour identification and targeted therapy.
Briefings in Bioinformatics,
Journal Year:
2023,
Volume and Issue:
25(1)
Published: Nov. 22, 2023
Abstract
The worldwide appearance of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has generated significant concern and posed a considerable challenge to global health. Phosphorylation is a common post-translational modification that affects many vital cellular functions and is closely associated with SARS-CoV-2 infection. Precise identification of phosphorylation sites could provide more in-depth insight into the processes underlying SARS-CoV-2 infection and help alleviate the continuing COVID-19 crisis. Currently, available computational tools for predicting these sites lack accuracy and effectiveness. In this study, we designed an innovative meta-learning model, Meta-Learning for Serine/Threonine Phosphorylation (MeL-STPhos), to precisely identify protein phosphorylation sites. We initially performed a comprehensive assessment of 29 unique sequence-derived features, establishing prediction models for each using 14 renowned machine learning methods, ranging from traditional classifiers to advanced deep learning algorithms. We then selected the most effective model for each feature by integrating the predicted values. Rigorous feature selection strategies were employed to identify the optimal base models and classifier(s) for each cell-specific dataset. To the best of our knowledge, this is the first study to report two cell-specific models and a generic model for phosphorylation site prediction by utilizing an extensive range of sequence-derived features and machine learning algorithms. Extensive cross-validation and independent testing revealed that MeL-STPhos surpasses existing state-of-the-art tools for phosphorylation site prediction. We also developed a publicly accessible platform at https://balalab-skku.org/MeL-STPhos. We believe that MeL-STPhos will serve as a valuable tool for accelerating the discovery of serine/threonine phosphorylation sites and elucidating their role in post-translational regulation.
iScience,
Journal Year:
2022,
Volume and Issue:
25(9), P. 104883 - 104883
Published: Aug. 5, 2022
Discovery of potential drugs requires rapid and precise identification of drug targets. Although traditional experimental methodologies can accurately identify drug targets, they are time-consuming and inappropriate for high-throughput screening. Computational approaches based on machine learning (ML) algorithms can expedite the prediction of druggable proteins; however, the performance of existing computational methods remains unsatisfactory. This study proposes a computational tool, SPIDER, to enhance the accurate prediction of druggable proteins. SPIDER employs various feature descriptors pertaining to several aspects, including physicochemical properties, compositional information, and composition-transition-distribution information, coupled with well-known ML algorithms to facilitate the construction of the final meta-predictor. The results showed that SPIDER enabled more accurate and robust prediction of druggable proteins than baseline models and current existing methods in terms of the independent test dataset. An online web server was established and made freely available online.