Information,
Journal Year:
2024,
Volume and Issue:
15(3), P. 163 - 163
Published: March 13, 2024
Accurate prediction of subcellular localization of viral proteins is crucial for understanding their functions and developing effective antiviral drugs. However, this task poses a significant challenge, especially when relying on expensive and time-consuming classical biological experiments. In this study, we introduced a computational model called E-MuLA, based on a deep learning network that combines multiple local attention modules to enhance feature extraction from protein sequences. The superior performance of the E-MuLA has been demonstrated through extensive comparisons with LSTM, CNN, AdaBoost, decision trees, KNN, and other state-of-the-art methods. It is noteworthy that the E-MuLA achieved an accuracy of 94.87%, specificity of 98.81%, and sensitivity of 84.18%, indicating that E-MuLA has the potential to become an effective tool for predicting virus subcellular localization.
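The three figures reported above (accuracy, specificity, sensitivity) follow from confusion-matrix counts. The sketch below shows the standard definitions for a single class treated as positive versus the rest; the function name `binary_metrics` and the example counts are illustrative assumptions, not values or code from the E-MuLA paper, which may average per-class scores differently.

```python
def binary_metrics(tp, tn, fp, fn):
    """Return accuracy, sensitivity, and specificity from confusion counts.

    tp/tn/fp/fn are true positives, true negatives, false positives,
    and false negatives for one class treated as "positive".
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return {
        # Fraction of all samples classified correctly.
        "accuracy": (tp + tn) / total,
        # Fraction of true positives recovered (recall).
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Fraction of true negatives correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = binary_metrics(tp=8, tn=9, fp=1, fn=2)
```

With imbalanced classes, a high specificity can coexist with a noticeably lower sensitivity, which is the pattern seen in the reported numbers.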
International Journal of Biological Macromolecules,
Journal Year:
2022,
Volume and Issue:
229, P. 529 - 538
Published: Dec. 31, 2022
The cell surface proteins of gram-positive bacteria are involved in many important biological functions, including the infection of host cells. Owing to their virulent nature, these proteins are also considered strong candidates for potential drug or vaccine targets. Among the various cell surface proteins of gram-positive bacteria, LPXTG-like proteins form a major class. These proteins have a highly conserved C-terminal cell wall sorting signal, which consists of an LPXTG sequence motif, a hydrophobic domain, and a positively charged tail. These proteins are targeted to the cell envelope by a sortase enzyme via transpeptidation. A variety of LPXTG-like proteins have been experimentally characterized; however, the number of LPXTG-like proteins in public databases has increased owing to extensive bacterial genome sequencing without proper annotation. In the absence of experimental characterization, identifying and annotating these sequences is extremely challenging. Therefore, in this study, we developed the first machine learning-based predictor called GPApred, which can identify LPXTG-like proteins from their primary sequences. Using a newly constructed benchmark dataset, we explored different classifiers and five feature encodings and their hybrids. Optimal features were derived using the recursive feature elimination method, and these features were then trained using a support vector machine algorithm. The performance of different models was evaluated using independent datasets, and the final model (GPApred) was selected based on consistency during cross-validation and independent assessment. GPApred can be an effective tool for predicting LPXTG-like sequences and can be further employed for functional characterization or drug targeting.
Availability: https://procarb.org/gpapred/.
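As a point of contrast with the machine-learning predictor described above, the LPXTG sequence motif itself can be located with a simple pattern scan. The sketch below is only a naive baseline under stated assumptions: it treats X as any single residue, ignores the hydrophobic domain and the positively charged tail, and is not the GPApred method. The function name `find_lpxtg` and the example sequence are hypothetical.

```python
import re

# "LPXTG" with X standing for any single residue.
LPXTG_PATTERN = re.compile(r"LP.TG")


def find_lpxtg(sequence):
    """Return (0-based start, matched motif) pairs for LPXTG-like hits."""
    sequence = sequence.upper()
    return [(m.start(), m.group()) for m in LPXTG_PATTERN.finditer(sequence)]


# Hypothetical toy sequence, for illustration only.
hits = find_lpxtg("MKKLPETGAAA")
```

A motif scan like this cannot separate genuine sortase substrates from chance matches, which is one reason a trained classifier over richer sequence features is attractive.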
Frontiers in Microbiology,
Journal Year:
2023,
Volume and Issue:
14
Published: April 13, 2023
Promoters are genomic regions upstream of genes that are bound by RNA polymerase to start gene transcription. Because the promoter is the most critical element of gene expression, the recognition of promoters is crucial to understanding the regulation of gene expression. This study aimed to develop a machine learning-based model to predict promoters in Agrobacterium tumefaciens (A. tumefaciens) strain C58. In the model, promoter sequences were encoded by three different kinds of feature descriptors, namely, accumulated nucleotide frequency, k-mer nucleotide composition, and binary encodings. The obtained features were optimized by using correlation and the mRMR-based algorithm. These optimized features were inputted into a random forest (RF) classifier to discriminate promoter sequences from non-promoter sequences. The examination of 10-fold cross-validation showed that the proposed model could yield an overall accuracy of 0.837. This study will provide help for the study of promoters in the A. tumefaciens C58 strain.
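Two of the feature descriptors named above, k-mer composition and binary encoding, can be sketched in a few lines. This is a minimal illustration under common definitions (normalized k-mer frequencies over the alphabet ACGT, and a 4-bit one-hot code per base); the function names and the handling of ambiguous bases are assumptions, and the paper's exact encoding may differ.

```python
from itertools import product


def kmer_composition(seq, k=2):
    """Normalized frequency of every DNA k-mer, in lexicographic ACGT order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(max(len(seq) - k + 1, 0)):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases
            counts[window] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


def binary_encoding(seq):
    """One-hot encode each base as 4 bits (A, C, G, T); unknown bases map to zeros."""
    table = {"A": [1, 0, 0, 0], "C": [0, 1, 0, 0],
             "G": [0, 0, 1, 0], "T": [0, 0, 0, 1]}
    encoded = []
    for base in seq.upper():
        encoded.extend(table.get(base, [0, 0, 0, 0]))
    return encoded


# Toy inputs, for illustration only.
features = kmer_composition("ACGTAC", k=2) + binary_encoding("ACGTAC")
```

k-mer composition yields a fixed-length vector (4^k values) regardless of sequence length, while binary encoding preserves position and so requires sequences of equal length.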
Computers in Biology and Medicine,
Journal Year:
2023,
Volume and Issue:
155, P. 106436 - 106436
Published: Feb. 15, 2023
Protein folding is a complex physicochemical process whereby a polymer of amino acids samples numerous conformations in its unfolded state before settling on an essentially unique native three-dimensional (3D) structure. To understand this process, several theoretical studies have used a set of 3D structures, identified different structural parameters, and analyzed their relationships using the natural logarithmic protein folding rate (ln(kf)). Unfortunately, these structural parameters are specific to a small set of proteins that are not capable of accurately predicting ln(kf) for both two-state (TS) and non-two-state (NTS) proteins. To overcome the limitations of the statistical approach, a few machine learning (ML)-based models have been proposed using limited training data. However, none of these methods can explain plausible folding mechanisms. In this study, we evaluated the predictive capabilities of ten different ML algorithms using eight different structural parameters and five different network centrality measures based on newly constructed datasets. In comparison to the other nine regressors, support vector machine was found to be the most appropriate for predicting ln(kf) with mean absolute differences of 1.856, 1.55, and 1.745 for the TS, NTS, and combined datasets, respectively. Furthermore, combining structural parameters and network centrality measures improves the prediction performance compared to individual parameters, indicating that multiple factors are involved in the folding process.
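The mean absolute difference quoted above is the average of |observed - predicted| over all proteins in a dataset, here in units of ln(kf). A minimal sketch of that calculation follows; the function name and the sample values are illustrative assumptions, not data from the study.

```python
def mean_absolute_difference(observed, predicted):
    """Average absolute gap between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one value is required")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical ln(kf) values, for illustration only.
mad = mean_absolute_difference([1.0, 2.0, 4.0], [2.0, 2.0, 2.0])
```

Because ln(kf) is a logarithm, a mean absolute difference of about 1.5 to 1.9 corresponds to predicted folding rates that are typically off by a factor of roughly e^1.5 to e^1.9.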
Computational and Structural Biotechnology Journal,
Journal Year:
2023,
Volume and Issue:
23, P. 129 - 139
Published: Dec. 1, 2023
RNA N7-methylguanosine (m7G) is a crucial chemical modification of RNA molecules, whose principal duty is to maintain RNA function and protein translation. Studying and predicting RNA m7G sites aid in comprehending the biological function of RNA and the development of new drug therapy regimens. In the present scenario, the efficacy of techniques, specifically deep learning and machine learning, stands out in the prediction of RNA m7G sites, leading to improved accuracy and identification efficiency. In this study, we propose a model leveraging the transformer framework that integrates natural language processing and deep learning to predict m7G sites, called TMSC-m7G. In TMSC-m7G, a combination of multi-sense-scaled token embedding and fixed-position embedding is used to replace traditional word embedding for the extraction of contextual information from sequences. Moreover, a convolutional layer is added in the encoder to make up for the shortage of local information acquisition in the transformer. The model's robustness and generalization are validated through 10-fold cross-validation and an independent dataset test. Results demonstrate outstanding performance in comparison to the most advanced models available. Among them, the Accuracy of TMSC-m7G reaches 98.70% and 92.92% on the benchmark dataset and independent dataset, respectively. To facilitate the popularization and use of the model, we have developed an intuitive online tool, which is easily accessible for free at http://39.105.212.81/.