PairK: Pairwise k-mer alignment for quantifying protein motif conservation in disordered regions
Jackson C. Halpin, Amy E. Keating

bioRxiv (Cold Spring Harbor Laboratory), Year: 2024, Volume: unknown

Published: July 24, 2024

Abstract: Protein-protein interactions are often mediated by a modular peptide recognition domain binding to a short linear motif (SLiM) in the disordered region of another protein. The ability to predict domain-SLiM interactions would allow researchers to map protein interaction networks, predict the effects of perturbations to those networks, and develop biologically meaningful hypotheses. Unfortunately, sequence database searches for SLiMs generally yield mostly biologically irrelevant motif matches or false positives. To improve the prediction of novel SLiM interactions, researchers employ filters to discriminate between biologically relevant and improbable motif matches. One promising criterion for identifying biologically relevant SLiMs is the sequence conservation of the motif, exploiting the fact that functional motifs are more likely to be conserved than spurious motif matches. However, the difficulty of aligning disordered regions has significantly hampered the utility of this approach. We present PairK (pairwise k-mer alignment), an MSA-free method to quantify motif conservation in disordered regions. PairK outperforms both standard MSA-based conservation scores and a modern LLM-based conservation score predictor on the task of identifying biologically important motif instances. PairK can quantify conservation over wider phylogenetic distances than MSAs, indicating that SLiMs may be more conserved than is implied by MSA-based metrics. PairK is available as open-source code at https://github.com/jacksonh1/pairk .
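The abstract names the technique (pairwise k-mer alignment without an MSA) but does not spell out the algorithm. The sketch below is only a toy illustration of the general idea, not the PairK implementation or its API: for one query k-mer it scans every k-mer of each ortholog sequence and keeps the best-scoring one. The function names, the example sequences, and the plain identity score are all assumptions made for illustration; a real tool would more plausibly score k-mers with a substitution matrix or another similarity measure.

```python
def kmers(seq, k):
    """Return every overlapping k-mer in seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def identity_score(a, b):
    """Count positions where two equal-length k-mers share a residue.

    Assumption: identity is used here only to keep the sketch simple.
    """
    return sum(x == y for x, y in zip(a, b))


def best_kmer_match(query_kmer, target_seq):
    """Find the target k-mer that scores highest against query_kmer.

    Returns (score, start_index, kmer); ties keep the earliest position.
    Returns None when the target is shorter than the query k-mer.
    """
    best = None
    for i, candidate in enumerate(kmers(target_seq, len(query_kmer))):
        score = identity_score(query_kmer, candidate)
        if best is None or score > best[0]:
            best = (score, i, candidate)
    return best


def pairwise_kmer_matches(query_kmer, ortholog_idrs):
    """Match one query k-mer against each ortholog sequence independently.

    ortholog_idrs maps a (hypothetical) ortholog name to its sequence.
    No multiple sequence alignment is built at any point.
    """
    return {name: best_kmer_match(query_kmer, seq)
            for name, seq in ortholog_idrs.items()}


if __name__ == "__main__":
    # Hypothetical sequences, invented for this example.
    orthologs = {"species_a": "AAPPLPGG", "species_b": "GGPALPSS"}
    for name, match in pairwise_kmer_matches("PPLP", orthologs).items():
        print(name, match)
```

Because each ortholog is searched on its own, the best match can sit at a different position in every sequence, which is the property that lets a pairwise approach sidestep the difficulty of aligning disordered regions.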

Language: English

SHARK-capture identifies functional motifs in intrinsically disordered protein regions
Chi Fung Willis Chow, Swantje Lenz, Maxim Scheremetjew, et al.

Protein Science, Year: 2025, Volume: 34(4)

Published: March 18, 2025

Abstract: Increasing insights into how sequence motifs in intrinsically disordered regions (IDRs) provide functions underscore the need for systematic motif detection. Contrary to structured regions, where motifs can be readily identified from sequence alignments, the rapid evolution of IDRs limits the usage of alignment-based tools in reliably detecting motifs within. Here, we developed SHARK-capture, an alignment-free motif detection tool designed for difficult-to-align regions. SHARK-capture innovates on word-based methods by flexibly incorporating amino acid physicochemistry to assess motif similarity without requiring rigid definitions of equivalency groups. SHARK-capture offers consistently strong performance in a functional motif detection benchmark, with superior residue-level performance. SHARK-capture identified known functional motifs across orthologs of the microtubule-associated zinc finger protein BuGZ. We also identified a short motif in the IDR of S. cerevisiae RNA helicase Ded1p, which we experimentally verified to be capable of promoting ATPase activity. Our improved performance allows us to systematically calculate 10,889 motifs for 2695 yeast IDRs and provide it as a resource. SHARK-capture offers the most precise tool yet for the systematic identification of conserved regions in IDRs and is freely available as a Python package ( https://pypi.org/project/bio-shark/ ) and on https://git.mpi-cbg.de/tothpetroczylab/shark .

Language: English

Cited by: 1

SHARK enables sensitive detection of evolutionary homologs and functional analogs in unalignable and disordered sequences
Chi Fung Willis Chow, Soumyadeep Ghosh, Anna Hadarovich, et al.

Proceedings of the National Academy of Sciences, Year: 2024, Volume: 121(42)

Published: October 9, 2024

Intrinsically disordered regions (IDRs) are structurally flexible protein segments with regulatory functions in multiple contexts, such as in the assembly of biomolecular condensates. Since IDRs undergo more rapid evolution than ordered regions, identifying homology of such poorly conserved regions remains challenging for state-of-the-art alignment-based methods that rely on position-specific conservation of residues. Thus, systematic functional annotation and evolutionary analysis of IDRs have been limited, despite them comprising ~21% of proteins. To accurately assess homology between unalignable sequences, we developed an alignment-free sequence comparison algorithm, SHARK (Similarity/Homology Assessment by Relating K-mers). We trained SHARK-dive, a machine learning homology classifier, which achieved superior performance to standard alignment-based approaches in assessing homology in unalignable sequences. Furthermore, it correctly identified dissimilar but functionally analogous IDRs in IDR-replacement experiments reported in the literature, whereas alignment-based tools were incapable of detecting such functional relationships. SHARK-dive not only predicts functionally similar IDRs at a proteome-wide scale but also identifies cryptic sequence properties and motifs that drive remote homology and analogy, thereby providing interpretable and experimentally verifiable hypotheses of the sequence determinants that underlie such relationships. SHARK-dive acts as an alternative to alignment to facilitate systematic analysis and functional annotation of the unalignable protein universe.

Language: English

Cited by: 6

Deep learning tools predict variants in disordered regions with lower sensitivity
Federica Luppino, Swantje Lenz, Chi Fung Willis Chow, et al.

BMC Genomics, Year: 2025, Volume: 26(1)

Published: April 12, 2025

The recent AI breakthrough of AlphaFold2 has revolutionized 3D protein structural modeling, proving crucial for protein design and variant effects prediction. However, intrinsically disordered regions, known for their lack of well-defined structure and lower sequence conservation, often yield low-confidence models. The latest Variant Effect Predictor (VEP), AlphaMissense, leverages AlphaFold2 models, achieving over 90% sensitivity and specificity in predicting variant effects. However, the effectiveness of such tools for variants in disordered regions, which account for 30% of the human proteome, remains unclear. In this study, we found that predicting pathogenicity for variants in disordered regions is less accurate than in ordered regions, particularly for mutations at the first N-Methionine site. Investigations into the efficacy of variant effect predictors on intrinsically disordered regions (IDRs) indicated that mutations in IDRs are predicted with lower sensitivity and that the gap between sensitivity and specificity is largest in disordered regions, especially for AlphaMissense and VARITY. The prevalence of IDRs within the human proteome, coupled with the increasing repertoire of biological functions they are known to perform, necessitated an investigation into the efficacy of state-of-the-art VEPs on such regions. This analysis revealed consistently reduced sensitivity and a differing prediction performance profile relative to ordered regions, indicating that new IDR-specific features and paradigms are needed to accurately classify disease mutations within those regions.

Language: English

Cited by: 0

Machine learning methods to study sequence–ensemble–function relationships in disordered proteins
Sören von Bülow, Giulio Tesei, Kresten Lindorff-Larsen, et al.

Current Opinion in Structural Biology, Year: 2025, Volume: 92, Pages: 103028-103028

Published: March 12, 2025

Language: English

Cited by: 0

The evolution and exploration of intrinsically disordered and phase-separated protein states
Chi Fung Willis Chow, Ágnes Tóth-Petróczy

Elsevier eBooks, Year: 2024, Volume: unknown, Pages: 353-379

Published: November 22, 2024

Language: English

Cited by: 1
