SHARK‐capture identifies functional motifs in intrinsically disordered protein regions
Protein Science,
Journal year: 2025, Issue: 34(4)
Published: March 18, 2025
Abstract
Increasing insights into how sequence motifs in intrinsically disordered regions (IDRs) provide functions underscore the need for systematic motif detection. Contrary to structured regions, where motifs can be readily identified from sequence alignments, the rapid evolution of IDRs limits the usage of alignment‐based tools in reliably detecting motifs within. Here, we developed SHARK‐capture, an alignment‐free motif detection tool designed for difficult‐to‐align regions. SHARK‐capture innovates on word‐based methods by flexibly incorporating amino acid physicochemistry to assess motif similarity without requiring rigid definitions of equivalency groups. SHARK‐capture offers consistently strong performance in a motif benchmark, with superior residue‐level performance. SHARK‐capture identified known functional motifs across orthologs of the microtubule‐associated zinc finger protein BuGZ. We also identified a short motif in the IDR of S. cerevisiae RNA helicase Ded1p, which we experimentally verified to be capable of promoting ATPase activity. Our improved performance allows us to systematically calculate 10,889 motifs for 2695 yeast IDRs and provide it as a resource. SHARK‐capture offers the most precise yet identification of conserved regions in disordered regions and is freely available as a Python package (https://pypi.org/project/bio-shark/) and at https://git.mpi-cbg.de/tothpetroczylab/shark.
Language: English
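The abstract above describes comparing short sequence words (k-mers) by amino acid physicochemistry rather than by fixed equivalency groups. The following is a minimal sketch of that general idea only, not the SHARK‐capture algorithm or the bio-shark API. It assumes a single property (Kyte-Doolittle hydropathy) as the similarity measure, and all function names are hypothetical.

```python
# Toy illustration: rank k-mers of a reference sequence by how well they are
# matched, in physicochemical terms, in a set of other sequences.
# Assumption: similarity is based on Kyte-Doolittle hydropathy only.

KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
SPAN = max(KD.values()) - min(KD.values())  # 9.0 for this scale


def residue_similarity(a, b):
    """Graded similarity in [0, 1]; 1 means identical hydropathy."""
    return 1.0 - abs(KD[a] - KD[b]) / SPAN


def kmers(seq, k):
    """All overlapping words of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_similarity(x, y):
    """Mean per-position similarity of two equal-length k-mers."""
    return sum(residue_similarity(a, b) for a, b in zip(x, y)) / len(x)


def best_match(kmer, seq):
    """Score of the most similar k-mer anywhere in seq (no alignment needed)."""
    return max(kmer_similarity(kmer, other) for other in kmers(seq, len(kmer)))


def rank_candidate_motifs(reference, others, k):
    """Return (score, start, kmer) tuples, best-supported k-mers first."""
    scored = []
    for start, kmer in enumerate(kmers(reference, k)):
        score = sum(best_match(kmer, s) for s in others) / len(others)
        scored.append((score, start, kmer))
    return sorted(scored, key=lambda t: (-t[0], t[1]))


if __name__ == "__main__":
    ranked = rank_candidate_motifs(
        "GSGSLIVKGSGS", ["PPPPLIVKPPPP", "QQLIVKQQQQQQ"], k=4
    )
    print(ranked[0])
```

Because the score is graded, a k-mer can still score well against a word in which a residue has been swapped for a physicochemically similar one, which a strict identity or fixed-group comparison would miss.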
SHARK enables sensitive detection of evolutionary homologs and functional analogs in unalignable and disordered sequences
Proceedings of the National Academy of Sciences,
Journal year: 2024, Issue: 121(42)
Published: Oct. 9, 2024
Intrinsically disordered regions (IDRs) are structurally flexible protein segments with regulatory functions in multiple contexts, such as in the assembly of biomolecular condensates. Since IDRs undergo more rapid evolution than ordered regions, identifying homology of such poorly conserved regions remains challenging for state-of-the-art alignment-based methods that rely on position-specific conservation of residues. Thus, systematic functional annotation and evolutionary analysis of IDRs have been limited, despite them comprising ~21% of proteins. To accurately assess homology between unalignable sequences, we developed an alignment-free sequence comparison algorithm, SHARK (Similarity/Homology Assessment by Relating K-mers). We trained SHARK-dive, a machine learning homology classifier, which achieved superior performance to standard alignment-based approaches in assessing homology in unalignable sequences. Furthermore, it correctly identified dissimilar but functionally analogous IDRs in IDR-replacement experiments reported in the literature, whereas alignment-based tools were incapable of detecting such functional relationships. SHARK-dive not only predicts functionally similar IDRs at a proteome-wide scale but also identifies cryptic sequence properties and motifs that drive remote homology and analogy, thereby providing interpretable and experimentally verifiable hypotheses of the sequence determinants that underlie such relationships. SHARK-dive acts as an alternative to alignment to facilitate systematic analysis and functional annotation of the unalignable protein universe.
Language: English
Deep learning tools predict variants in disordered regions with lower sensitivity
BMC Genomics,
Journal year: 2025, Issue: 26(1)
Published: April 12, 2025
The recent AI breakthrough of AlphaFold2 has revolutionized 3D protein structural modeling, proving crucial for protein design and variant effects prediction. However, intrinsically disordered regions, known for their lack of well-defined structure and lower sequence conservation, often yield low-confidence models. The latest Variant Effect Predictor (VEP), AlphaMissense, leverages AlphaFold2 models, achieving over 90% sensitivity and specificity in predicting variant effects. However, the effectiveness of these tools for variants in disordered regions, which account for 30% of the human proteome, remains unclear. In this study, we found that predicting pathogenicity for variants in disordered regions is less accurate than in ordered regions, particularly for mutations at the first N-Methionine site. Investigations into the efficacy of variant effect predictors on intrinsically disordered regions (IDRs) indicated that mutations in IDRs are predicted with lower sensitivity and that the gap between sensitivity and specificity is largest in disordered regions, especially for AlphaMissense and VARITY. The prevalence of IDRs within the human proteome, coupled with the increasing repertoire of biological functions they are known to perform, necessitated an investigation into the efficacy of state-of-the-art VEPs on such regions. This analysis revealed their consistently reduced sensitivity and differing prediction performance profile to ordered regions, indicating that new IDR-specific features and paradigms are needed to accurately classify disease mutations within those regions.
Language: English
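The abstract above compares predictors by sensitivity and specificity. As a reminder of how those two quantities are computed from labelled variants, here is a minimal sketch. The function name and the toy labels are hypothetical and are not data from the study.

```python
# Sensitivity = TP / (TP + FN): fraction of pathogenic variants called pathogenic.
# Specificity = TN / (TN + FP): fraction of benign variants called benign.

def sensitivity_specificity(is_pathogenic, predicted_pathogenic):
    """Compute (sensitivity, specificity) from two parallel lists of booleans."""
    pairs = list(zip(is_pathogenic, predicted_pathogenic))
    tp = sum(1 for truth, pred in pairs if truth and pred)
    fn = sum(1 for truth, pred in pairs if truth and not pred)
    tn = sum(1 for truth, pred in pairs if not truth and not pred)
    fp = sum(1 for truth, pred in pairs if not truth and pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one pathogenic and one benign variant")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical toy labels, for illustration only.
    truth = [True, True, True, False, False]
    calls = [True, False, False, False, True]
    print(sensitivity_specificity(truth, calls))
```

Computing the pair separately for variants in ordered and in disordered regions is what makes a gap between the two region types visible.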
Machine learning methods to study sequence–ensemble–function relationships in disordered proteins
Current Opinion in Structural Biology,
Journal year: 2025, Issue: 92, p. 103028
Published: March 12, 2025
Language: English
The evolution and exploration of intrinsically disordered and phase-separated protein states
Elsevier eBooks,
Journal year: 2024, Issue: unknown, pp. 353 - 379
Published: Nov. 22, 2024
Language: English
PairK: Pairwise k-mer alignment for quantifying protein motif conservation in disordered regions
bioRxiv (Cold Spring Harbor Laboratory),
Journal year: 2024, Issue: unknown
Published: July 24, 2024
Abstract
Protein-protein interactions are often mediated by a modular peptide recognition domain binding to a short linear motif (SLiM) in the disordered region of another protein. The ability to predict domain-SLiM interactions would allow researchers to map protein interaction networks, predict the effects of perturbations to those networks, and develop biologically meaningful hypotheses. Unfortunately, sequence database searches for SLiMs generally yield mostly biologically irrelevant motif matches or false positives. To improve the prediction of novel SLiM interactions, researchers employ filters to discriminate between biologically relevant and improbable motif matches. One promising criterion for identifying biologically relevant SLiMs is the sequence conservation of the motif, exploiting the fact that functional motifs are more likely to be conserved than spurious motif matches. However, the difficulty of aligning disordered regions has significantly hampered the utility of this approach. We present PairK (pairwise k-mer alignment), an MSA-free method to quantify motif conservation in disordered regions. PairK outperforms both standard MSA-based conservation scores and a modern LLM-based conservation score predictor on the task of identifying biologically important motif instances. PairK can quantify conservation over wider phylogenetic distances than MSAs, indicating that SLiMs may be more conserved than is implied by MSA-based metrics. PairK is available as open-source code at https://github.com/jacksonh1/pairk.
Language: English
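The PairK abstract above describes quantifying motif conservation without a multiple sequence alignment by aligning k-mers pairwise. The sketch below illustrates that general idea under strong simplifying assumptions: it scores k-mers by residue identity and reports per-position agreement with the query. It is not the PairK implementation or its API, and every name in it is hypothetical.

```python
# Toy MSA-free conservation: for one query k-mer, pick the best-matching k-mer
# from each ortholog's disordered region, then score each query position by
# the fraction of orthologs that carry the same residue.
# Assumption: identity scoring; a real tool would use a substitution matrix.

def kmers(seq, k):
    """All overlapping words of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def identity_score(a, b):
    """Number of positions at which two equal-length k-mers agree."""
    return sum(x == y for x, y in zip(a, b))


def best_kmer(query_kmer, idr):
    """Gapless best match of the query k-mer within one ortholog region."""
    candidates = kmers(idr, len(query_kmer))
    return max(candidates, key=lambda c: identity_score(query_kmer, c))


def pseudo_alignment(query_kmer, ortholog_idrs):
    """One best-matching k-mer per ortholog; regions shorter than k are skipped."""
    k = len(query_kmer)
    return [best_kmer(query_kmer, idr) for idr in ortholog_idrs if len(idr) >= k]


def column_conservation(query_kmer, aligned):
    """Per-position fraction of matched k-mers sharing the query residue."""
    if not aligned:
        raise ValueError("no ortholog region was long enough to match")
    n = len(aligned)
    return [sum(row[i] == res for row in aligned) / n
            for i, res in enumerate(query_kmer)]


if __name__ == "__main__":
    orthologs = ["AAAPPLPAAA", "GGPALPGG", "SSSSPPVPSS"]
    aligned = pseudo_alignment("PPLP", orthologs)
    print(aligned)
    print(column_conservation("PPLP", aligned))
```

Because each ortholog is matched independently, the motif can sit at a different offset in every sequence, which is the situation in which column-based MSA scores tend to break down.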