Annotation-free prediction of microbial dioxygen utilization DOI Creative Commons
Avi I. Flamholz, Joshua E. Goldford, Philippa A. Richter, et al.

mSystems, Year: 2024, Issue: unknown

Published: Sep. 4, 2024

ABSTRACT Aerobes require dioxygen (O2) to grow; anaerobes do not. However, nearly all microbes—aerobes, anaerobes, and facultative organisms alike—express enzymes whose substrates include O2, if only for detoxification. This presents a challenge when trying to assess which organisms are aerobic from genomic data alone. This challenge can be overcome by noting that O2 utilization has wide-ranging effects on microbes: aerobes typically have larger genomes encoding distinctive O2-utilizing enzymes, for example. These effects permit high-quality prediction of O2 utilization from annotated genome sequences, with several models displaying ≈80% accuracy on a ternary classification task for which blind guessing is only 33% accurate. Since genome annotation is compute-intensive and relies on many assumptions, we asked if annotation-free methods also perform well. We discovered that simple and efficient models based entirely on genomic sequence content—e.g., triplets of amino acids—perform as well as intensive annotation-based classifiers, enabling rapid processing of genomes. We further show that amino acid trimers are useful because they encode information about protein composition and phylogeny. To showcase the utility of rapid prediction, we estimated the prevalence of aerobes and anaerobes in diverse natural environments cataloged in the Earth Microbiome Project. Focusing on a well-studied O2 gradient in the Black Sea, we found quantitative correspondence between local chemistry (O2:sulfide concentration ratio) and the composition of microbial communities. We, therefore, suggest that statistical methods like ours might be used to estimate, or “sense,” pivotal features of the chemical environment using DNA sequencing data.

IMPORTANCE We now have access to sequence data from a wide variety of natural environments. These data document a bewildering diversity of microbes, many known only from their genomes. Physiology—an organism’s capacity to engage metabolically with its environment—may provide a more useful lens than taxonomy for understanding microbial communities. As an example of this broader principle, we developed algorithms that accurately predict microbial dioxygen utilization directly from genome sequences without annotating genes, e.g., by considering only the amino acids in protein sequences. Annotation-free algorithms enable rapid characterization of natural samples, highlighting quantitative correspondence between sequences and local O2 levels in a data set from the Black Sea. This example suggests that DNA sequencing might be repurposed as a multi-pronged chemical sensor, estimating concentrations of O2 and other key facets of complex natural settings.
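
The annotation-free features mentioned in the abstract (triplets of amino acids) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `trimer_frequencies` and `feature_vector` are hypothetical, the input is assumed to be a list of protein sequences already extracted from a genome, and the classifier that would consume these vectors is left unspecified.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def trimer_frequencies(proteins):
    """Relative frequencies of overlapping amino acid trimers.

    Hypothetical helper: `proteins` is an iterable of protein sequences.
    Trimers containing non-standard letters (e.g. 'X') are skipped.
    """
    counts = Counter()
    for seq in proteins:
        seq = seq.upper()
        for i in range(len(seq) - 2):
            trimer = seq[i:i + 3]
            if all(aa in AMINO_ACIDS for aa in trimer):
                counts[trimer] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {trimer: n / total for trimer, n in counts.items()}


def feature_vector(proteins):
    """Fixed-length vector (20**3 = 8000 entries) in a stable trimer order."""
    freqs = trimer_frequencies(proteins)
    return [freqs.get("".join(t), 0.0) for t in product(AMINO_ACIDS, repeat=3)]
```

A vector of this kind, computed once per genome, could serve as input to any standard three-class classifier (aerobe, anaerobe, facultative); which model to use is an assumption left open here.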

Language: English

Cited by: 0