Sliding Window INteraction Grammar (SWING): a generalized interaction language model for peptide and protein interactions DOI Creative Commons
Alisa A. Omelchenko, Jane C. Siwek, Prabal Chhibbar

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2024, Number: unknown

Published: May 4, 2024

Abstract The explosion of sequence data has allowed the rapid growth of protein language models (pLMs). pLMs have now been employed in many frameworks, including variant-effect and peptide-specificity prediction. Traditionally, for protein-protein or peptide-protein interactions (PPIs), the corresponding sequences are either co-embedded followed by post-hoc integration or concatenated prior to embedding. Interestingly, no method utilizes a language representation of the interaction itself. We developed an interaction LM (iLM), which uses a novel language to represent interactions between protein/peptide sequences. Sliding Window Interaction Grammar (SWING) leverages differences in amino acid properties to generate an interaction vocabulary. This vocabulary is the input into an LM followed by a supervised prediction step where the LM's representations are used as features. SWING was first applied to predicting peptide:MHC (pMHC) interactions. SWING was not only successful at generating Class I and Class II models that are comparable to state-of-the-art approaches, but the unique Mixed Class model was also successful at jointly predicting both classes. Further, a SWING model trained only on Class I alleles was predictive for Class II, a complex prediction task not attempted by any existing approach. For de novo data, using only Class I or Class II data, SWING also accurately predicted Class II pMHC interactions in murine models of SLE (MRL/lpr model) and T1D (NOD model), which were validated experimentally. To further evaluate SWING's generalizability, we tested its ability to predict the disruption of specific protein-protein interactions by missense mutations. Although modern methods like AlphaMissense and ESM1b can predict interfaces and variant effects/pathogenicity per mutation, they are unable to predict interaction-specific disruptions. SWING was successful at accurately predicting the impact of both Mendelian mutations and population variants on PPIs. This is the first generalizable approach that can accurately predict interaction-specific disruptions by missense mutations with only sequence information. Overall, SWING is a first-in-class zero-shot iLM that learns…

Language: English
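
The interaction vocabulary described in the SWING abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the Kyte-Doolittle hydropathy scale, the function names `interaction_encoding` and `to_words`, the alignment scheme, and the k-mer length are all assumptions chosen for illustration. The sketch slides a peptide along a protein, records the rounded absolute property difference for each aligned residue pair, and then cuts the resulting digit string into overlapping "words" that a language model could consume.

```python
# Kyte-Doolittle hydropathy values, used here as a stand-in property scale
# (an assumption; the abstract only says "differences in amino acid properties").
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def interaction_encoding(peptide, protein):
    """Slide the peptide along the protein (hypothetical scheme).

    At every offset, each peptide residue is paired with the protein residue
    under it, and the rounded absolute hydropathy difference is recorded.
    On this scale the differences fall in 0..9, so each code is one digit.
    """
    codes = []
    width = len(peptide)
    for start in range(len(protein) - width + 1):
        window = protein[start:start + width]
        for pep_res, prot_res in zip(peptide, window):
            diff = abs(HYDROPATHY[pep_res] - HYDROPATHY[prot_res])
            codes.append(round(diff))
    return codes


def to_words(codes, k=3):
    """Cut the code sequence into overlapping k-mers ("words")."""
    return ["".join(str(c) for c in codes[i:i + k])
            for i in range(len(codes) - k + 1)]


if __name__ == "__main__":
    codes = interaction_encoding("AI", "AIR")
    print(codes)
    print(to_words(codes, k=2))
```

In a full pipeline, the words would be embedded by a language model and the embeddings passed to a supervised classifier, as the abstract outlines; those steps are omitted here because the abstract does not specify them.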

MutaBind2: Predicting the Impacts of Single and Multiple Mutations on Protein-Protein Interactions DOI Creative Commons
Ning Zhang, Yuting Chen, Haoyu Lu

et al.

iScience, Year: 2020, Number: 23(3), pp. 100939 - 100939

Published: Feb. 27, 2020

Missense mutations may affect proteostasis by destabilizing or over-stabilizing protein complexes and changing the pathway flux. Predicting the effects of stabilizing mutations on protein-protein interactions is notoriously difficult because existing experimental sets are skewed toward mutations reducing protein-protein binding affinity, and many computational methods fail to correctly evaluate their effects. To address this issue, we developed a method, MutaBind2, which estimates the impacts of single as well as multiple mutations on protein-protein interactions. MutaBind2 employs only seven features, and the most important of them describe interactions of proteins with the solvent, evolutionary conservation of the site, and thermodynamic stability of the complex and each monomer. This approach shows a distinct improvement, especially in evaluating the effects of mutations increasing binding affinity. MutaBind2 can be used for finding disease driver mutations, designing stable protein complexes, and discovering new protein-protein interaction inhibitors.

Language: English

Cited by

172

Mass spectrometry‐based protein–protein interaction networks for the study of human diseases DOI Creative Commons
Alicia Richards, Manon Eckhardt, Nevan J. Krogan

et al.

Molecular Systems Biology, Year: 2021, Number: 17(1)

Published: Jan. 1, 2021

Review, 12 January 2021, Open Access. Mass spectrometry-based protein–protein interaction networks for the study of human diseases. Alicia L Richards (orcid.org/0000-0002-4869-2945), Manon Eckhardt (orcid.org/0000-0001-8143-6129), Nevan J Krogan (corresponding author, orcid.org/0000-0003-4902-337X, Tel: +1 415 476 2980). Affiliations of all three authors: 1 Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, CA, USA; 2 J. David Gladstone Institutes; 3 Department of Cellular and Molecular Pharmacology. Molecular Systems Biology (2021) 17: e8792, https://doi.org/10.15252/msb.20188792

Abstract A better understanding of the molecular mechanisms underlying disease is key for expediting the development of novel therapeutic interventions. Disease mechanisms are often mediated by interactions between proteins. Insights into the physical rewiring of protein–protein interactions in response to mutations, pathological conditions, or pathogen infection can advance our understanding of disease etiology, progression, and pathogenesis and can lead to the identification of potential druggable targets. Advances in quantitative mass spectrometry (MS)-based approaches have allowed unbiased mapping of these disease-mediated changes in protein–protein interactions on a global scale. Here, we review MS techniques that have been instrumental for the identification of protein–protein interactions at a system-level, and we discuss the challenges associated with these methodologies as well as novel MS advancements that aim to address these challenges. An overview of examples from diverse disease contexts illustrates the potential of MS-based protein–protein interaction mapping approaches for revealing disease mechanisms, pinpointing new therapeutic targets, and eventually moving toward personalized applications.

Introduction Identifying the principal molecular basis of disease is crucial for successful prevention, diagnosis, and treatment. In the past two decades, scientists have placed a lot of hope in large genomic studies for deciphering disease mechanisms.
Nevertheless, despite wealth information gathered, mechanism most remains unknown. This be explained least part fact many complex do not follow classical genotype phenotype model. They may result multiple genetic changes, epigenetic modifications, pathogen. The fallacy expecting simple explain phenotypes has demonstrated especially case cancer, where distinct collection mutations exclusive given cancer type (Junttila de Sauvage, 2013; Leiserson et al, 2015). Additionally, single gene different diseases, corresponding proteins having several functions cellular (Nadeau, 2001). Consequently, extracting useful diagnostic prognostic genetics alone difficult. Considering context disrupted processes help overcome challenge. biology approaches, which provide comprehensive picture biological process quantifying all observable components their relationships, well-suited understand influence network interconnected pathways. Proteins networks. Often, individual perform any isolation but accomplish task through direct other As such, studying (PPI) become powerful tool identifying functional consequences variation. approach, disease-related mapped vital PPIs processes. Comparison states wild-type reference map—either introduction carrying exogenous expression proteins—promises reveal how change during (Krogan 2015; Willsey 2018). directly responsible adaptation changes. Because connectivity proteins, impact mutation restricted specific product. Instead, it affects entire accordingly activity whole subset Instead focusing genes loci implicated disease, PPI-based analyses parts pathway connections changed state, thus offering an alternative identify mutation's function. Interacting visualized using network-based nodes representing "bait" interest PPI study. Nodes connected edges interacting identified Affinity Purification Spectrometry (AP-MS), proximity labeling, Cross-Linking (XL-MS), types experiments. 
performed both diseased state non-diseased WT states, variations regulation monitored. perturbations networks, including complete loss interactions, partial gain (Fig 1). suggests small network, such particular gene, cause significant across system. Changes partners protein, either progression following infection, might contribute potentially linking phenotype. Applying approach clinical advantages. finding protein biochemical its also play role same processes, providing mechanistic explanations implications beyond protein. Figure 1. systems-level converting pathway-level dataGenetic variants, occur rarely individuals used Comparisons introduced aid determining significance mutations. Similarly, pathogenic determine host pathways hijacked over course infection. Download figure PowerPoint current research disease. Throughout, will highlight field, advances some them. For detailed examination tools relying detection, refer reader reviews (e.g., Snider Beltran 2017). methods Liquid chromatography-MS (LC-MS) sensitive, accurate, selective method quantify (Richards Aebersold Mann, 2016). One major benefits nature proteomics. contrast PPIs, yeast-2-hybrid (Y2H), maps physical, binary predetermined set (Walhout Vidal, general workflow utilizing discovery develop outlined Box 1 illustrated Fig 2. Below, summarize variety that, when combined MS, allow proteome-level analysis systems. Overview techniques(A) Workflow bottom-up Preparing proteomic samples LC-MS/MS requires extraction, proteolysis, and, optionally, peptide-level fractionation. Online LC separation peptide mixtures introduces analytes spectrometer precursor fragment ion analysis. Tandem spectra matched theoretical generated silico garner sequences inference. (B) Label-free quantitation. Following digestion, each sample, equal amount peptides separately loaded column. Relative quantitation comparing extracted peak intensity runs dataset. (C) SILAC. 
During cell culture, "light" "heavy" versions amino acids metabolically incorporated samples. sample preparation, lysates mixed total ratios digested peptides. Intensities chromatograms MS1 scan relative abundances (D) Isobaric labeling. Each peptides, labeled unique isobaric label, ratios. MS/MS analysis, tag yields (E) Targeted MS. SRM, individually monitored quantified. first isolated, characteristic fragments Only masses selected user starts digesting mixture defined cleavage sites trypsin), separated liquid chromatography mass-to-charge (m/z) measured spectrometer. standard tandem experiments, sequence determined collecting second spectrum after induced fragmentation. Taken together, m/z data full then computationally search databases organism original 2A). To candidate interactors studies, "scored" accuracy interaction. oftentimes done combining parameters reproducibility, specificity, abundance detected scoring algorithms exists purpose, MiST, CompPASS, SAINT (Sowa 2009; Choi 2011; Teo 2014, 2016; Morris 2014; Verschueren methodology algorithm differs—for example, incorporates quality controls prey probability bait true positive, while CompPASS utilizes ultimately focus abundance, uniqueness, reproducibility distinguish contaminant background (Christianson 2011). output programs table filtered, scored imported visualization Cytoscape (Shannon 2003). addition computational assessing specificity appropriate controls, conditions 2B–E). allows unlimited number 2B). However, there limitations one them being comparison purposes, identical amounts should injected column When possible, normalization required. reduce bias, compared analyzed acquisition batch Randomization run order avoid systematic errors. Metabolic labeling Stable Isotope Labeling Amino Acids Cell Culture (SILAC) (TMT) labels multiplex increasing experimental throughput. 
SILAC stable heavy level 2C; Ong 2002; Szklarczyk 2019), tagging utilize NHS-activated molecules label free amines chemical tags vitro digestion 2D). All rely inclusion additional control added, so origin respective interactor traced (Ong Thompson 2003; 2014). Together, timepoints discriminate non-specific (Wiese 2007; Virreira Winter targeted strategies, parallel reaction monitoring (PRM) multiple/selective (MRM/SRM), validate greater consistency, sensitivity, (Lange 2008; Gallien 2012; Peterson 2012). Briefly, target assay development. These signature ions precise final experiment 2E). Among numerous contaminants copurified together interest. Therefore, necessary analyze way separates artifacts. done, part, careful design suitable controls. Importantly, unrelated tag, alone, need included (Jäger 2011b). GFP It unlikely form presumably false positives due epitope affinity capture (Morris contaminations. accessed via CRAPome database (Mellacheruvu 2013), public repository negative data, filtered out Contamination carryover overexpressed residual subsequent experiments actually present interactor. Strict wash steps required alleviate problem. purification (AP-MS) AP-MS 3A) tagging, short (for FLAG-, TAP-, Strep-Tag, c-myc (Chang, 2006)) fused interest—either construct under gene's endogenous promoter editing technologies like CRISPR-Cas9. resulting probe interacting, "prey" eliminating antibodies interest, would lower throughput immunoprecipitation (IP) easily purified matrix recognizing epitope. After washing eliminate interactors, 3. networks(A) General AP-MS. Bait endogenously tagged expressed cells, followed lysis LC-MS/MS. processing (BOX), Identification proximal promiscuous ligase cells. biotin, within fusion protein's radius subsequently lysed captured matrix. Direct cross-linked XL-MS. cross-linking reagent, cells digested, enriched cross-linker. LC-MS/MS, interpretation build high-throughput enabled 1,000s complexes large-scale models healthy states. 
largest assembly BioPlex database, has, date, compiled 56,533 10,961 HEK293T (Huttlin 2015, Publicly available sets these, hu.MAP 2.0 (Drew 2017; preprint: Drew 2020), represent important resources biomedical efforts spurred multitude discoveries further below. limitation milder than those typically employed Membrane hard problems extraction (Sastry Pankow Weaker transient prone steps. (TAP) affixes separate (Rigaut 1999), endure harsher His-tag) increase recovery rate lost regular (Puig comes disadvantage laborious preparation purification, artifacts Irrespective employed, remain issues, requiring selection Another lysis-induced mixing compartments normally interact, positive identifications. Possible solutions deconvolute effects compartment currently explored discussed section New Methodology. possible introducing N- C-terminus disrupt normal function, making advantageous test termini. note does readily differentiate indirect interactors. On hand, offers advantages earlier strategies (e.g. Y2H), high sensitivity quantification time (non-binary). detecting post-translational modifications (PTMs) (Matsuura 2008). generation, label-free value comparative whether Proximity represents complementary strategy traditional (Han case, expressing enzyme 3B). molecule substrate, covalent 10–20 nm range, capturing surrounding environment, lysis, denatured solubilized, enrichment biotinylated commonly streptavidin binding, strong binding biotin streptavidin, permits efficient AP-MS, allowing weak methodologies. procedure includes use detergents intact purification. Various established. BioID BirA, rendering promiscuous. 
BirA catalyzes transformation reactive form, resultant cloud reacts primary vicinity, biotinylation (Roux Subcellular include nuclear envelope (Kim 2016b), centrosome (Antonicka nucleus (preprint: Go cytoplasm (Redwine 2017), Golgi apparatus (Liu 2018), ER (Hoffman endosome, lysosome, mitochondrial cell–cell junctions (Fredriksson 2015), flagella (Kelly efficiency limited 2018; 2019). Due slow kinetics, 18–24 h produce sufficient material off-target background, somewhat restricts amenable BioID. timescale, generation static maps. BioID, BioID2, was developed Aquifex aeolicus. significantly smaller decreases disruption improved targeting localization subcellular 2016a). still 16 improve speed, Branon al (2018) directed evolution resulted faster-acting enzymatic variations: TurboID 15 miniTurbo 13 deletion N-terminal domain. enzymes comparable ten minutes. class arose peroxidases, catalyzing redox reactions. Horseradish peroxidase (HRP) best-studied suffers poor reducing environments (Trinkle-Mulcahy, Engineered ascorbic acid (APEX) drawback, genetically (Rhee Hung timed H2O2, APEX oxidizes phenol derivatives biotin-phenoxyl radicals covalently react electron rich acids, kinetics minutes (Martell rapid capabilities offer speed make investigate dynamically changing interactions. environments, retains cytosol peroxide criticized harmful effect prevents living organisms. Newer iterations seek toxicity issues times. recently introduced, contact-specific SplitID divides separate, inactive (Cho 2020). recombine close proximity, suited organelle contact sites, organelle, subsequently, C-terminal split separated, joined promote Experimental carefully considered before undertaking experiment. With techniques, neighboring throughout colocalize period, simply diffusion region, difficult really reside immediate environment (Lobingier without attached expected presence arise natural 2018) attach enrichment. 
Similar insertion C- terminus alter Prior generating enzyme-expressing line, C-termini tested ensure no (Sears possibility non-labeled fall outside therefore detected. N-terminus advantageous. Cross-linking (XL-MS) Although complex, members contact. XL-MS fill gap 3C). provides structural proximat

Language: English

Cited by

158

A comprehensive SARS-CoV-2–human protein–protein interactome reveals COVID-19 pathobiology and potential host therapeutic targets DOI Open Access
Yadi Zhou, Yuan Liu, Shagun Gupta

et al.

Nature Biotechnology, Year: 2022, Number: 41(1), pp. 128 - 139

Published: Oct. 10, 2022

Language: English

Cited by

128

Converging mechanism of UM171 and KBTBD4 neomorphic cancer mutations DOI Creative Commons
Xiaowen Xie, Olivia Zhang, Megan J. R. Yeo

et al.

Nature, Year: 2025, Number: unknown

Published: Feb. 12, 2025

Language: English

Cited by

7

Decoding the functional impact of the cancer genome through protein–protein interactions DOI
Haian Fu, Xiulei Mo, Andrei A. Ivanov

et al.

Nature Reviews Cancer, Year: 2025, Number: unknown

Published: Jan. 14, 2025

Language: English

Cited by

2

SAAMBE-3D: Predicting Effect of Mutations on Protein–Protein Interactions DOI Open Access
Swagata Pahari, Gen Li, Adithya Krishna Murthy

et al.

International Journal of Molecular Sciences, Year: 2020, Number: 21(7), pp. 2563 - 2563

Published: April 7, 2020

Maintaining wild-type protein–protein interactions is essential for the normal function of the cell, and any mutation that alters their characteristics can cause disease. Therefore, the ability to correctly and quickly predict the effect of amino acid mutations is crucial for understanding disease effects and for carrying out genome-wide studies. Here, we report a new development of the SAAMBE method, SAAMBE-3D, which is a machine learning-based approach that yields accurate predictions and is extremely fast. It achieves a Pearson correlation coefficient ranging from 0.78 to 0.82, depending on the training protocol, in a benchmarking five-fold validation test against the SKEMPI v2.0 database, and it outperforms currently existing algorithms on various blind tests. Furthermore, optimized and tested via five-fold cross-validation on the Cornell University dataset, SAAMBE-3D achieves an AUC of 1.0 and 0.96 on the homo- and hetero-dimer interaction datasets. Another important feature of SAAMBE-3D is that it is very fast: it takes less than a fraction of a second to complete a prediction. SAAMBE-3D is available as a web server as well as stand-alone code, the latter being another important feature, allowing other researchers to directly download the code and run it on their local computer. Taken together, SAAMBE-3D is an accurate and fast software package applicable to genome-wide studies to assess the effect of amino acid mutations on protein–protein interactions. The webserver and the stand-alone codes (SAAMBE-3D for predicting the change of binding free energy and SAAMBE-3D-DN for predicting if the mutation is disruptive or non-disruptive) are available.

Language: English

Cited by

95

Genetic basis of mitochondrial diseases DOI
Mirjana Gušić, Holger Prokisch

FEBS Letters, Year: 2021, Number: 595(8), pp. 1132 - 1158

Published: March 3, 2021

Mitochondrial disorders are monogenic disorders characterized by a defect in oxidative phosphorylation and caused by pathogenic variants in one of over 340 different genes. The implementation of whole-exome sequencing has led to a revolution in their diagnosis, duplicated the number of associated disease genes, and significantly increased the diagnosed fraction. However, the genetic etiology of a substantial fraction of patients exhibiting mitochondrial disorders remains unknown, highlighting limitations in variant detection and interpretation, which calls for improved computational and DNA sequencing methods, as well as the addition of OMICS tools. More intriguingly, this also suggests that some pathogenic variants lie outside of the protein-coding genes and that mechanisms beyond Mendelian inheritance and the mtDNA are of relevance. This review covers the current status of the genetic basis of mitochondrial diseases, discusses current challenges and perspectives, and explores the contribution of factors beyond the protein-coding regions and monogenic inheritance in the expansion of the genetic spectrum of disease.

Language: English

Cited by

61

Pervasive mislocalization of pathogenic coding variants underlying human disorders DOI
Jessica Lacoste, Marzieh Haghighi, Shahan Haider

et al.

Cell, Year: 2024, Number: 187(23), pp. 6725 - 6741.e13

Published: Sep. 30, 2024

Language: English

Cited by

11

Review of Computational Methods and Database Sources for Predicting the Effects of Coding Frameshift Small Insertion and Deletion Variations DOI Creative Commons
Fang Ge, Muhammad Arif, Zihao Yan

et al.

ACS Omega, Year: 2024, Number: unknown

Published: Jan. 3, 2024

Genetic variations (including substitutions, insertions, and deletions) exert a profound influence on DNA sequences. These variations are systematically classified as synonymous, nonsynonymous, and nonsense, each manifesting distinct effects on proteins. The implementation of high-throughput sequencing has significantly augmented our comprehension of the intricate interplay between gene variations and protein structure and function, as well as their ramifications in the context of diseases. Frameshift variations, particularly small insertions and deletions (indels), disrupt protein coding and are instrumental in disease pathogenesis. This review presents a succinct overview of computational methods, databases, current challenges, and future directions in predicting the consequences of coding frameshift small indel variations. We analyzed the predictive efficacy, reliability, and utilization of computational methods and the variant account, reliability, and utilization of databases. In addition, we compared the prediction methodologies on GOF/LOF pathogenic variation data. In addressing the challenges pertaining to prediction accuracy and cross-species generalizability, nascent technologies such as AI and deep learning harbor immense potential to enhance predictive capabilities. The importance of interdisciplinary research and collaboration cannot be overstated for devising effective diagnosis, treatment, and prevention strategies concerning diseases associated with

Language: English

Cited by

6

Low concentration Tetrabromobisphenol A (TBBPA) elevating overall metabolism by inducing activation of the Ras signaling pathway DOI
Lirong Lu, Junjie Hu, Guiying Li

et al.

Journal of Hazardous Materials, Year: 2021, Number: 416, pp. 125797 - 125797

Published: April 1, 2021

Language: English

Cited by

39