From gene to structure: Unraveling genomic dark matter in Ca. Accumulibacter DOI Creative Commons

Xiaojing Xie, Xuhan Deng, Liping Chen, et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: May 17, 2024

Abstract: Candidatus Accumulibacter, a unique and pivotal genus of polyphosphate-accumulating organisms (PAOs) prevalent in wastewater treatment plants, plays mainstay roles in the global phosphorus cycle. However, efforts toward a complete understanding of their genetic and metabolic characteristics are largely hindered by major limitations of existing sequence-based annotation methods, leaving more than half of their protein-encoding genes unannotated. To address this challenge, we developed a comprehensive approach integrating pangenome analysis, gene-based protein structure and function prediction, and metatranscriptomic analysis, extending beyond the constraints of sequence-centric methodologies. The application to Ca. Accumulibacter allowed the establishment of a pan-Ca. Accumulibacter proteome database, providing references for >200,000 proteins. Benchmarking on 28 Ca. Accumulibacter genomes showed increases in average annotation coverage from 51% to 83%. Genetic and metabolic characteristics that had eluded exploration via conventional methods were unraveled. For instance, the identification of a previously unknown phosphofructokinase gene suggests that all Ca. Accumulibacter encoded the Embden-Meyerhof-Parnas pathway. A previously defined homolog of the phosphate-specific transport system accessory protein (PhoU) was actually an inorganic phosphate transport (Pit) accessory protein, regulating Pit instead of the high-affinity phosphate transport system (Pst), key to the emergence of the polyphosphate-accumulating trait of Ca. Accumulibacter. Additional lineage members were found encoding denitrification pathways. This study offers a readily usable and transferable tool for establishing high-coverage annotation reference databases for diverse cultured and uncultured bacteria, facilitating the exploration of genomic dark matter in the bacterial domain. Synopsis: An integrated and advanced approach for unraveling genomic dark matter in Ca. Accumulibacter, applicable to diverse bacteria for customized database establishment.

Language: English

Using structure prediction of negative sense RNA virus nucleoproteins to assess evolutionary relationships DOI Creative Commons
Kimberly R. Sabsay, Aartjan J.W. te Velthuis

Virus Evolution, Journal Year: 2024, Volume and Issue: 10(1)

Published: Jan. 1, 2024

Negative sense RNA viruses (NSV) include some of the most detrimental human pathogens, including the influenza, Ebola, and measles viruses. NSV genomes consist of one or multiple single-stranded RNA molecules that are encapsidated into one or more ribonucleoprotein (RNP) complexes. These RNPs consist of viral RNA, a viral RNA polymerase, and many copies of the viral nucleoprotein (NP). Current evolutionary relationships within the NSV phylum are based on the alignment of conserved RNA-dependent RNA polymerase (RdRp) domain amino acid sequences. However, the RdRp domain-based phylogeny does not address whether NP, the other core protein in the NSV genome, evolved along the same trajectory or whether several RdRp-NP pairs evolved through convergent evolution in the segmented and non-segmented NSV genome architectures. Addressing how NP and the RdRp domain evolved may help us better understand NSV diversity. Since NP sequences are too short to infer robust phylogenetic relationships, we here used experimentally obtained and AlphaFold 2.0-predicted NP structures to probe whether evolutionary relationships can be estimated using NPs. Following flexible structure alignments of the modeled structures, we find that the structural homology of the NSV NPs reveals phylogenetic clusters that are consistent with RdRp-based clustering. In addition, we were able to assign viruses for which RdRp sequences are currently missing to phylogenetic clusters based on the available NP sequence. Both our RdRp-based and NP-based relationships deviate from the current classification

Language: English

Citations

1

From Gene to Structure: Unraveling Genomic Dark Matter in Ca. Accumulibacter DOI

Xiaojing Xie, Xuhan Deng, Liping Chen, et al.

Environmental Science & Technology, Journal Year: 2024, Volume and Issue: unknown

Published: Dec. 19, 2024

Citations

1

How enzyme‐centered approaches are advancing research on cyclic oligo‐nucleotides DOI Creative Commons
Simon J. Wenzl, Carina C. de Oliveira Mann

FEBS Letters, Journal Year: 2024, Volume and Issue: 598(8), P. 839 - 863

Published: March 7, 2024

Cyclic nucleotides are the most diversified category of second messengers and are found in all organisms, modulating diverse pathways. While cAMP and cGMP have been studied for over 50 years, cyclic di‐nucleotide signaling in eukaryotes emerged only recently with the anti‐viral molecule 2´3´cGAMP. Recent breakthrough discoveries have revealed not only the astonishing chemical diversity but also the surprisingly deep‐rooted evolutionary origins of cyclic oligo‐nucleotide signaling pathways and the structural conservation of the proteins involved in their synthesis and signaling. Here we discuss how enzyme‐centered approaches have paved the way for the identification of several cyclic nucleotide signals, focusing on the advantages and challenges associated with deciphering the activation mechanisms of such enzymes.

Language: English

Citations

1

Large scale analysis of predicted protein structures links model features to in vivo behaviour DOI Creative Commons
Michael J. Stam, Diego A. Oyarzún, Nadanai Laohakunakorn, et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: April 14, 2024

Abstract: Rapid advancements in protein structure prediction methods have ushered in a new era of abundant and accurate structural data, providing opportunities to analyse proteins at a scale that has not been possible before. Here we show that features derived solely from predicted protein structures can be used to understand in vivo protein behaviour using data-driven methods. We found that these features were predictive of in vivo protein production for a set of designed antibodies, enabling identification of high-quality designs. Following on from this result, we calculated these features for a diverse set of ≈500,000 predicted structures, and our analysis showed systematic variation between different organisms, to such an extent that the tree of life could be recapitulated from these data. Given the high degree of functional constraint around the chemistry of proteins, this result is surprising, and could have important implications for the design and engineering of novel proteins.

Language: English

Citations

0
