Briefings in Bioinformatics, Journal Year: 2024, Volume and Issue: 26(1), Published: Nov. 22, 2024
Abstract
Clathrin proteins, key elements of the vesicle coat, play a crucial role in various cellular processes, including neural function, signal transduction, and endocytosis. Disruptions in clathrin protein functions have been associated with a wide range of diseases, such as Alzheimer’s, neurodegeneration, viral infection, and cancer. Therefore, correctly identifying clathrin protein functions is critical to unraveling the mechanisms of these fatal diseases and to designing drug targets. This paper presents a novel computational method, named TargetCLP, to precisely identify clathrin proteins. TargetCLP leverages four single-view feature representation methods: two transformed feature sets (PSSM-CLBP and RECM-CLBP), one qualitative characteristics feature, and one deep-learning-based embedding using ESM. The single-view features are integrated based on their weights using differential evolution, and the BTG feature selection algorithm is utilized to generate a more optimal, reduced feature subset. The model is trained using various classifiers, among which the proposed SnBiLSTM achieved remarkable performance. Experimental and comparative results on both training and independent datasets show that the proposed TargetCLP offers significant improvements in terms of prediction accuracy and generalization to unseen data, furthering advancements in the research field.

Journal of Chemical Information and Modeling, Journal Year: 2025, Volume and Issue: unknown, Published: April 21, 2025
Neuropeptides are key signaling molecules that regulate fundamental physiological processes ranging from metabolism to cognitive function. However, accurate identification is a major challenge due to sequence heterogeneity, obscured functional motifs and limited experimentally validated data. Accurate identification of neuropeptides is critical for advancing neurological disease therapeutics and peptide-based drug design. Existing neuropeptide identification methods rely on manual features combined with traditional machine learning methods, which have difficulty capturing the deep patterns of sequences. To address these limitations, we propose NeuroPred-AIMP (adaptive integrated multimodal predictor), an interpretable model that synergizes the global semantic representation of a protein language model (ESM) and the multiscale structural features of a temporal convolutional network (TCN). The model introduces an adaptive feature fusion mechanism and residual enhancement to dynamically recalibrate feature contributions and achieve robust integration of evolutionary and local sequence information. The experimental results demonstrated that the proposed model showed excellent comprehensive performance on the independent test set, with an accuracy of 92.3% and an AUROC of 0.974. The model also showed a good balance in its ability to identify positive and negative samples, with a sensitivity of 92.6% and a specificity of 92.1%, a difference of about 0.5 percentage points. This result confirms the effectiveness of the multimodal fusion strategy in the task of neuropeptide recognition.
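The balance claim above rests on simple confusion-matrix arithmetic, sketched below. The counts are hypothetical (1,000 positives and 1,000 negatives) and were chosen only so that sensitivity and specificity match the reported 92.6% and 92.1%; they are not the paper's data, and the function names are illustrative.

```python
def sensitivity(tp, fn):
    """True positive rate: fraction of positives that were recovered."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: fraction of negatives that were rejected."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical confusion-matrix counts (assumption, not from the paper).
tp, fn = 926, 74   # 1,000 positive samples
tn, fp = 921, 79   # 1,000 negative samples

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
gap = sens - spec  # how far apart the two class-wise rates are

print(f"sensitivity={sens:.3f} specificity={spec:.3f} gap={gap:.3f}")
print(f"accuracy={accuracy(tp, tn, fp, fn):.4f}")
```

A small gap between the two rates indicates that the classifier is not favoring one class, which is the property the abstract highlights.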

IET Systems Biology, Journal Year: 2025, Volume and Issue: unknown, Published: April 22, 2025
Abstract
Single‐cell RNA sequencing (scRNA‐seq) allows researchers to study cellular heterogeneity in individual cells. In single‐cell transcriptomics analysis, identifying the cell type of cells is a key task. At present, single‐cell datasets often face the challenges of high dimensionality, large numbers of samples, high sparsity and sample imbalance, which strain traditional methods of cell type recognition. The authors propose a deep residual generation model based on semi‐supervised learning (scRSSL) to address these challenges. ScRSSL introduces residual networks into generative models. The authors take advantage of its semi‐supervised learning to solve the problem of sample imbalance. During the training of the model, the authors use a residual neural network to accomplish the inference of cell types so that local features of the data can be extracted. Because of the semi‐supervised learning approach, it can automatically and accurately predict cell types in datasets, even with only a small number of labels. Experimentally, the authors’ method has been shown to perform better than other methods.

BMC Biology, Journal Year: 2025, Volume and Issue: 23(1), Published: April 23, 2025
Abstract
Background
Numerous studies have shown that circRNA can act as a miRNA sponge, competitively binding to miRNAs, thereby regulating gene expression and disease progression. Due to the high cost and time-consuming nature of traditional wet lab experiments, analyzing circRNA-miRNA associations is often inefficient and labor-intensive. Although some computational models have been developed to identify these associations, they fail to capture the deep collaborative features between circRNA-miRNA interactions and do not guide the training of feature extraction networks based on these high-order relationships, leading to poor prediction performance.
Results
To address these issues, we propose a novel deep graph collaboration learning method for circRNA-miRNA interaction, called DGCLCMI. First, it uses word2vec to encode sequences into word embeddings. Next, we present a joint model that combines an improved neural graph collaborative filtering method with feature extraction network optimization. Deep interaction information is embedded as informative features within the sequence representations for prediction. Comprehensive experiments on three well-established datasets across seven metrics demonstrate that our algorithm significantly outperforms previous models, achieving an average AUC of 0.960. In addition, a case study reveals that 18 out of 20 predicted unknown CMI data points are accurate.
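Before word2vec can be applied, a nucleotide sequence has to be split into "words". A common convention, assumed here because the abstract does not state it, is to use overlapping k-mers. The sketch below shows only that tokenization step; the function name, `k=3`, and the example sequence are illustrative choices, not the DGCLCMI settings.

```python
def kmer_tokens(sequence, k=3, stride=1):
    """Split a sequence into overlapping k-mers ("words") for word2vec-style models."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]


# Hypothetical RNA fragment (assumption, not taken from any dataset).
tokens = kmer_tokens("AUGGCU", k=3)
print(tokens)  # ['AUG', 'UGG', 'GGC', 'GCU']

# A vocabulary maps each distinct k-mer to an integer index.
vocabulary = {kmer: index for index, kmer in enumerate(sorted(set(tokens)))}
print(vocabulary)
```

The resulting token lists, one per sequence, are the kind of "sentences" a word2vec implementation expects as its training corpus.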
Conclusions
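DGCLCMI improves circRNA-miRNA association representation by capturing deep collaborative information, achieving superior performance compared to prior methods. It facilitates the discovery of unknown circRNA-miRNA associations and sheds light on their roles in physiological processes.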

Briefings in Bioinformatics, Journal Year: 2025, Volume and Issue: 26(2), Published: March 1, 2025
Abstract
In post-translational modification, covalent bonds on lysine and attached chemical groups significantly change proteins’ physical properties. They shape protein structures, enhance function and stability, and are vital for physiological processes, affecting health and disease through mechanisms like gene expression, signal transduction, protein degradation, and cell metabolism. Although lysine (K) modification sites are considered among the most common types of post-translational modifications in proteins, research on K-PTMs has largely overlooked the synergistic effects between different modifications and lacked techniques to address the problem of sample imbalance. Based on this, the Extreme Point Deviation Compensated Clustering (EPDCC) Undersampling algorithm was proposed in this study and combined with Cross-Scale Convolutional Neural Networks (CSCNNs) to develop a novel computational tool, MlyPredCSED, for simultaneously predicting multiple lysine modification sites. MlyPredCSED employs Multi-Label Position-Specific Triad Amino Acid Propensity and the physicochemical properties of amino acids to enhance the richness of sequence information. To address the challenge of sample imbalance, the innovative EPDCC Undersampling technique was introduced to adjust the majority class samples. The model’s training and testing phase relies on the advanced CSCNN framework. In cross-validation and testing, MlyPredCSED outperformed existing models, especially in complex categories. This study not only provides an efficient method for the identification of K-PTMs but also demonstrates its value in biological research and drug development. To facilitate use by researchers, we have specifically developed an accessible and free web tool: http://www.mlypredcsed.com.
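To illustrate what "adjusting the majority class samples" means in practice, the sketch below uses plain random undersampling. This is a deliberately simpler stand-in, not the EPDCC algorithm, whose clustering and deviation-compensation steps are not described in the abstract. It also assumes single-label data, whereas MlyPredCSED is multi-label. All names and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def undersample_majority(samples, labels, seed=0):
    """Randomly drop samples from larger classes until every class
    has as many samples as the smallest class."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    target = min(len(group) for group in by_label.values())
    kept_samples, kept_labels = [], []
    for label, group in by_label.items():
        # Keep the smallest class whole; subsample every larger class.
        chosen = group if len(group) == target else rng.sample(group, target)
        kept_samples.extend(chosen)
        kept_labels.extend([label] * target)
    return kept_samples, kept_labels


# Toy imbalanced data (assumption): 2 modified sites vs 10 unmodified sites.
samples = [f"site_{i}" for i in range(12)]
labels = [1, 1] + [0] * 10
balanced_samples, balanced_labels = undersample_majority(samples, labels)
print(balanced_labels.count(1), balanced_labels.count(0))  # 2 2
```

A clustering-based method such as EPDCC would choose which majority samples to keep more selectively than this uniform random draw.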

BMC Biology, Journal Year: 2025, Volume and Issue: 23(1), Published: April 6, 2025
Lactylation is a newly discovered type of post-translational modification, primarily occurring on lysine (K) residues of both histones and non-histones to exert diverse effects on target proteins. Research has shown that lysine lactylation (Kla) modification is ubiquitous in different cells and participates in the determination of cell function and fate, as well as the initiation and progression of various diseases. Precise identification of Kla sites is fundamental for elucidating their biological functions and uncovering their application potential. Here, we proposed a novel human Kla site predictor (named PBertKla) through curating a reliable benchmark dataset with proper sample length and sequence identity threshold to train a protein large language model with optimal hyperparameters. Extensive experimental results consistently demonstrated that our model possessed robust prediction ability, achieving an AUC (area under the receiver operating characteristic curve) value of over 0.880 on the independent validation data. Feature visualization analysis further validated the effectiveness of feature learning and representation from sequences. Moreover, we benchmarked PBertKla against other cutting-edge models on an independent testing dataset from different sources, highlighting its superiority and transferability. All results indicated that PBertKla excelled as an automatic predictor of human Kla sites, and it would advance the investigation of lysine modifications and their significance in health and disease.
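The "sample length" mentioned above usually refers to the size of the sequence window extracted around each candidate lysine. A minimal sketch of that extraction step follows, assuming a symmetric window padded with a placeholder residue at the sequence ends. The flank size, the `X` padding character, and the example sequence are hypothetical and are not the PBertKla settings.

```python
def lysine_windows(sequence, flank=3, pad="X"):
    """Return (1-based position, window) for every lysine (K) in the sequence.

    Each window has length 2 * flank + 1 with the lysine at its center;
    positions beyond the sequence ends are filled with the pad character.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for index, residue in enumerate(sequence):
        if residue == "K":
            # index in `sequence` corresponds to index + flank in `padded`,
            # so the window starting at `index` is centered on the lysine.
            windows.append((index + 1, padded[index:index + 2 * flank + 1]))
    return windows


# Hypothetical peptide (assumption, not from the benchmark dataset).
for position, window in lysine_windows("MKTAYK", flank=2):
    print(position, window)
# 2 XMKTA
# 6 AYKXX
```

Each window, labeled as modified or unmodified, would then serve as one training sample for a sequence model.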