ClinVar and HGMD genomic variant classification accuracy has improved over time, as measured by implied disease burden
Andrew G. Sharo, Yangyun Zou, Aashish N. Adhikari et al.

Genome Medicine, Year: 2023, Volume: 15(1)

Published: July 13, 2023

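Abstract. Background: Curated databases of genetic variants assist clinicians and researchers in interpreting genetic variation. Yet, these databases contain some misclassified variants. It is unclear whether variant misclassification is abating as these databases rapidly grow and implement new guidelines. Methods: Using archives of ClinVar and HGMD, we investigated how variant misclassification has changed over 6 years, across different ancestry groups. We considered inborn errors of metabolism (IEMs) screened in newborns as a model system because these disorders are often highly penetrant with neonatal phenotypes. We used samples from the 1000 Genomes Project (1KGP) to identify individuals with genotypes that were classified by the databases as pathogenic. Due to the rarity of IEMs, nearly all such classified pathogenic genotypes indicate likely variant misclassification in ClinVar or HGMD. Results: While the false-positive rates of both ClinVar and HGMD have improved over time, HGMD variants currently imply two orders of magnitude more affected individuals in 1KGP than ClinVar variants. We observed that African ancestry individuals have a significantly increased chance of being incorrectly indicated to be affected by a screened IEM when HGMD variants are used. However, this bias affecting genomes of African ancestry was no longer significant once common variants were removed in accordance with recent variant classification guidelines. We discovered that ClinVar variants classified as Pathogenic or Likely Pathogenic are reclassified sixfold more often than DM or DM? variants in HGMD, which has likely resulted in ClinVar’s lower false-positive rate. Conclusions: Considering misclassified variants that have since been reclassified reveals our increasing understanding of rare genetic variation. We found that variant classification guidelines and allele frequency databases comprising genetically diverse samples are important factors in reclassification. We also discovered that ClinVar variants common in European and South Asian individuals were more likely to be reclassified to a lower confidence category, perhaps due to an increased chance of these variants being classified by multiple submitters. We discuss features for variant classification databases that would support their continued improvement.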

Language: English

Inferring the molecular and phenotypic impact of amino acid variants with MutPred2
Vikas Pejaver, Jorge Urresti, Jose Lugo-Martinez et al.

Nature Communications, Year: 2020, Volume: 11(1)

Published: Nov. 20, 2020

Abstract. Identifying pathogenic variants and underlying functional alterations is challenging. To this end, we introduce MutPred2, a tool that improves the prioritization of pathogenic amino acid substitutions over existing methods, generates molecular mechanisms potentially causative of disease, and returns interpretable pathogenicity score distributions on individual genomes. Whilst its prioritization performance is state-of-the-art, a distinguishing feature of MutPred2 is the probabilistic modeling of variant impact on specific aspects of protein structure and function that can serve to guide experimental studies of phenotype-altering variants. We demonstrate the utility of MutPred2 in the identification of the structural and functional mutational signatures relevant to Mendelian disorders and the prioritization of de novo mutations associated with complex neurodevelopmental disorders. We then experimentally validate the functional impact of several variants identified in patients with such disorders. We argue that mechanism-driven studies of human inherited disease have the potential to significantly accelerate the discovery of clinically actionable variants.

Language: English

Cited by: 596

MutationTaster2021
Robin Steinhaus, Sebastian Proft, Markus Schuelke et al.

Nucleic Acids Research, Year: 2021, Volume: 49(W1), pp. W446-W451

Published: April 1, 2021

Here we present an update to MutationTaster, our DNA variant effect prediction tool. The new version uses a different prediction model and attains higher accuracy than its predecessor, especially for rare benign variants. In addition, we have integrated many sources of data that only became available after the last release (such as gnomAD and ExAC pLI scores) and changed the splice site prediction model. To more easily assess the relevance of detected known disease mutations to the clinical phenotype of the patient, MutationTaster now provides information on the diseases they cause. Further changes represent a major overhaul of the interfaces to increase user-friendliness, whilst many changes under the hood have been designed to accelerate the processing of uploaded VCF files. We also offer an API for the rapid automated query of smaller numbers of variants from within other software. MutationTaster2021 integrates our disease mutation search engine, MutationDistiller, to prioritise variants from VCF files using the patient's clinical phenotype. The novel version is available at https://www.genecascade.org/MutationTaster2021/. This website is free and open to all users and there is no login requirement.

Language: English

Cited by: 210

The Nucleotide Transformer: Building and Evaluating Robust Foundation Models for Human Genomics

Hugo Dalla-Torre, Liam Gonzalez, Javier Mendoza‐Revilla et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2023, Volume: unknown

Published: Jan. 15, 2023

Closing the gap between measurable genetic information and observable traits is a longstanding challenge in genomics. Yet, the prediction of molecular phenotypes from DNA sequences alone remains limited and inaccurate, often driven by the scarcity of annotated data and the inability to transfer learnings between prediction tasks. Here, we present an extensive study of foundation models pre-trained on DNA sequences, named the Nucleotide Transformer, ranging from 50M up to 2.5B parameters and integrating information from 3,202 diverse human genomes, as well as 850 genomes selected across diverse phyla, including both model and non-model organisms. These transformer models yield transferable, context-specific representations of nucleotide sequences, which allow for accurate molecular phenotype prediction even in low-data settings. We show that the developed models can be fine-tuned at low cost, even in a low-data regime, to solve a variety of genomics applications. Despite no supervision, the transformer models learned to focus attention on key genomic elements, including those that regulate gene expression, such as enhancers. Lastly, we demonstrate that utilizing model representations can improve the prioritization of functional genetic variants. The training and application of foundational models in genomics explored in this study provide a widely applicable stepping stone to bridge the gap of accurate molecular phenotype prediction from DNA sequence. Code and weights are available at https://github.com/instadeepai/nucleotide-transformer in Jax and https://huggingface.co/InstaDeepAI in Pytorch. Example notebooks to apply these models to any downstream task are available at https://huggingface.co/docs/transformers/notebooks#pytorch-bio.

Language: English

Cited by: 103

MetaRNN: differentiating rare pathogenic and rare benign missense SNVs and InDels using deep learning
Chang Li, Degui Zhi, Kai Wang et al.

Genome Medicine, Year: 2022, Volume: 14(1)

Published: Oct. 8, 2022

Multiple computational approaches have been developed to improve our understanding of genetic variants. However, their ability to identify rare pathogenic variants from rare benign ones is still lacking. Using context annotations and deep learning methods, we present pathogenicity prediction models, MetaRNN and MetaRNN-indel, to help identify and prioritize rare nonsynonymous single nucleotide variants (nsSNVs) and non-frameshift insertion/deletions (nfINDELs). We use independent test sets to demonstrate that these new models outperform state-of-the-art competitors and achieve a more interpretable score distribution. Importantly, prediction scores from both models are comparable, enabling easy adoption of integrated genotype-phenotype association analysis methods. All pre-computed nsSNV scores are available at http://www.liulab.science/MetaRNN. The stand-alone program is also available at https://github.com/Chang-Li2019/MetaRNN

Language: English

Cited by: 100

The landscape of tolerated genetic variation in humans and primates
Hong Gao, Tobias Hamp, Jeffrey M. Ede et al.

Science, Year: 2023, Volume: 380(6648)

Published: June 1, 2023

Personalized genome sequencing has revealed millions of genetic differences between individuals, but our understanding of their clinical relevance remains largely incomplete. To systematically decipher the effects of human genetic variants, we obtained whole-genome sequencing data for 809 individuals from 233 primate species and identified 4.3 million common protein-altering variants with orthologs in humans. We show that these variants can be inferred to have nondeleterious effects in humans based on their presence at high allele frequencies in other primate populations. We use this resource to classify 6% of all possible human protein-altering variants as likely benign and impute the pathogenicity of the remaining 94% of variants with deep learning, achieving state-of-the-art accuracy for diagnosing pathogenic variants in patients with genetic diseases.

Language: English

Cited by: 99

Phylogenomic analyses provide insights into primate evolution
Yong Shao, Long Zhou, Li Fang et al.

Science, Year: 2023, Volume: 380(6648), pp. 913-924

Published: June 1, 2023

Comparative analysis of primate genomes within a phylogenetic context is essential for understanding the evolution of human genetic architecture and primate diversity. We present such a study of 50 primate species spanning 38 genera and 14 families, including 27 genomes first reported here, with many from previously less well represented groups, the New World monkeys and the Strepsirrhini. Our analyses reveal heterogeneous rates of genomic rearrangement and gene evolution across primate lineages. Thousands of genes under positive selection in different lineages play roles in the nervous, skeletal, and digestive systems and may have contributed to primate innovations and adaptations. Our study reveals that many key genomic innovations occurred in the Simiiformes ancestral node and may have had an impact on the adaptive radiation of the Simiiformes and human evolution.

Language: English

Cited by: 89

Accelerated identification of disease-causing variants with ultra-rapid nanopore genome sequencing
Sneha D. Goenka, John E. Gorzynski, Kishwar Shafin et al.

Nature Biotechnology, Year: 2022, Volume: 40(7), pp. 1035-1041

Published: March 28, 2022

Abstract. Whole-genome sequencing (WGS) can identify variants that cause genetic disease, but the time required for sequencing and analysis has been a barrier to its use in acutely ill patients. In the present study, we develop an approach for ultra-rapid nanopore WGS that combines an optimized sample preparation protocol, distributing sequencing over 48 flow cells, near real-time base calling and alignment, accelerated variant calling and fast variant filtration for efficient manual review. Application to two example clinical cases identified a candidate variant in <8 h from sample preparation to variant identification. We show that this framework provides accurate variant calls and efficient prioritization, and accelerates diagnostic clinical genome sequencing twofold compared with previous approaches.

Language: English

Cited by: 73

The molecular structure of IFT-A and IFT-B in anterograde intraflagellar transport trains
Samuel E. Lacey, Helen Foster, Gaia Pigino et al.

Nature Structural & Molecular Biology, Year: 2023, Volume: 30(5), pp. 584-593

Published: Jan. 2, 2023

Anterograde intraflagellar transport (IFT) trains are essential for cilia assembly and maintenance. These trains are formed of 22 IFT-A and IFT-B proteins that link structural and signaling cargos to microtubule motors for import into cilia. It remains unknown how the IFT-A/-B proteins are arranged into complexes and how these complexes polymerize into functional trains. Here we use in situ cryo-electron tomography of Chlamydomonas reinhardtii cilia and AlphaFold2 protein structure predictions to generate a molecular model of the entire anterograde train. We show how the conformations of both IFT-A and IFT-B are dependent on lateral interactions with neighboring repeats, suggesting that polymerization is required to cooperatively stabilize the complexes. Following three-dimensional classification, we reveal how IFT-B extends two flexible tethers to maintain a connection with IFT-A that can withstand the mechanical stresses present in actively beating cilia. Overall, our findings provide a framework for understanding the fundamental processes that govern cilia assembly.

Language: English

Cited by: 72

CAGI, the Critical Assessment of Genome Interpretation, establishes progress and prospects for computational genetic variant interpretation methods
Shantanu Jain, Constantina Bakolitsa, Steven E. Brenner et al.

Genome Biology, Year: 2024, Volume: 25(1)

Published: Feb. 22, 2024

Abstract. Background: The Critical Assessment of Genome Interpretation (CAGI) aims to advance the state-of-the-art for computational prediction of genetic variant impact, particularly where relevant to disease. The five complete editions of the CAGI community experiment comprised 50 challenges, in which participants made blind predictions of phenotypes from genetic data, and these were evaluated by independent assessors. Results: Performance was particularly strong for clinical pathogenic variants, including some difficult-to-diagnose cases, and extends to interpretation of cancer-related variants. Missense variant interpretation methods were able to estimate biochemical effects with increasing accuracy. Assessment of methods for regulatory variants and complex trait disease risk was less definitive and indicates performance potentially suitable for auxiliary use in the clinic. Conclusions: Results show that while current methods are imperfect, they have major utility for research and clinical applications. Emerging methods and increasingly large, robust datasets for training and assessment promise further progress ahead.

Language: English

Cited by: 37

Nucleotide Transformer: building and evaluating robust foundation models for human genomics

Hugo Dalla-Torre, Liam Gonzalez, Javier Mendoza‐Revilla et al.

Nature Methods, Year: 2024, Volume: unknown

Published: Nov. 28, 2024

The prediction of molecular phenotypes from DNA sequences remains a longstanding challenge in genomics, often driven by limited annotated data and the inability to transfer learnings between tasks. Here, we present an extensive study of foundation models pre-trained on DNA sequences, named Nucleotide Transformer, ranging from 50 million up to 2.5 billion parameters and integrating information from 3,202 human genomes and 850 genomes from diverse species. These transformer models yield context-specific representations of nucleotide sequences, which allow for accurate predictions even in low-data settings. We show that the developed models can be fine-tuned at low cost to solve a variety of genomics applications. Despite no supervision, the models learned to focus attention on key genomic elements and can be used to improve the prioritization of genetic variants. The training and application of foundational models in genomics provides a widely applicable approach for accurate molecular phenotype prediction from DNA sequence. Nucleotide Transformer is a series of genomics foundation models of different parameter sizes and training datasets that can be applied to various downstream tasks by fine-tuning.

Language: English

Cited by: 36