From GPUs to AI and quantum: three waves of acceleration in bioinformatics
Bertil Schmidt, Andreas Hildebrandt

Drug Discovery Today, Journal Year: 2024, Volume and Issue: 29(6), P. 103990

Published: April 23, 2024

The enormous growth in the amount of data generated by the life sciences is continuously shifting the field from model-driven science towards data-driven science. The need for efficient processing has led to the adoption of massively parallel accelerators such as graphics processing units (GPUs). Consequently, the development of bioinformatics methods nowadays often heavily depends on the effective use of these powerful technologies. Furthermore, progress in computational techniques and architectures continues to be highly dynamic, involving novel deep neural network models and artificial intelligence (AI) accelerators, and potentially quantum processing units in the future. These are expected to be disruptive for the life sciences as a whole and for drug discovery in particular. Here, we identify three waves of acceleration and their applications in a bioinformatics context: (i) GPU computing, (ii) AI and (iii) next-generation quantum computers.

Language: English

Telomere-to-telomere assembly of diploid chromosomes with Verkko
Mikko Rautiainen, Sergey Nurk, Brian P. Walenz

et al.

Nature Biotechnology, Journal Year: 2023, Volume and Issue: 41(10), P. 1474 - 1482

Published: Feb. 16, 2023

Language: English

Citations

220

Pangenome graph construction from genome alignments with Minigraph-Cactus
Glenn Hickey, Jean Monlong, Jana Ebler

et al.

Nature Biotechnology, Journal Year: 2023, Volume and Issue: 42(4), P. 663 - 673

Published: May 10, 2023

Language: English

Citations

133

Applications of transformer-based language models in bioinformatics: a survey
Shuang Zhang, Rui Fan, Yuti Liu

et al.

Bioinformatics Advances, Journal Year: 2023, Volume and Issue: 3(1)

Published: Jan. 1, 2023

Summary: The transformer-based language models, including vanilla transformer, BERT and GPT-3, have achieved revolutionary breakthroughs in the field of natural language processing (NLP). Since there are inherent similarities between various biological sequences and natural languages, the remarkable interpretability and adaptability of these models have prompted a new wave of their application in bioinformatics research. To provide a timely and comprehensive review, we introduce key developments of transformer-based language models by describing the detailed structure of transformers and summarize their contribution to a wide range of bioinformatics research from basic sequence analysis to drug discovery. While transformer-based applications in bioinformatics are diverse and multifaceted, we identify and discuss the common challenges, including heterogeneity of training data, computational expense and model interpretability, and opportunities in the context of bioinformatics research. We hope that the broader community of NLP researchers, bioinformaticians and biologists will be brought together to foster future research and development in transformer-based language models, and inspire novel bioinformatics applications that are unattainable by traditional methods. Supplementary information: Supplementary data are available at Bioinformatics Advances online.

Language: English

Citations

95

High-throughput RNA isoform sequencing using programmed cDNA concatenation
Aziz Al’Khafaji, Jonathan T. Smith, Kiran Garimella

et al.

Nature Biotechnology, Journal Year: 2023, Volume and Issue: 42(4), P. 582 - 586

Published: June 8, 2023

Language: English

Citations

86

Variant calling and benchmarking in an era of complete human genome sequences
Nathan D. Olson, Justin Wagner, Nathan Dwarshuis

et al.

Nature Reviews Genetics, Journal Year: 2023, Volume and Issue: 24(7), P. 464 - 483

Published: April 14, 2023

Language: English

Citations

81

Genomics in the long-read sequencing era

Erwin L. van Dijk, Delphine Naquin, Kévin Gorrichon

et al.

Trends in Genetics, Journal Year: 2023, Volume and Issue: 39(9), P. 649 - 671

Published: May 23, 2023

Language: English

Citations

74

The variation and evolution of complete human centromeres
Glennis A. Logsdon, Allison N. Rozanski, Fedor Ryabov

et al.

Nature, Journal Year: 2024, Volume and Issue: 629(8010), P. 136 - 145

Published: April 3, 2024

Human centromeres have been traditionally very difficult to sequence and assemble owing to their repetitive nature and large size [1]. As a result, patterns of human centromeric variation and models for their evolution and function remain incomplete, despite centromeres being among the most rapidly mutating regions [2,3]. Here, using long-read sequencing, we completely sequenced and assembled all centromeres from a second human genome and compared it to the finished reference genome [4,5]. We find that the two sets of centromeres show at least a 4.1-fold increase in single-nucleotide variation when compared with their unique flanks and vary up to 3-fold in size. Moreover, we find that 45.8% of centromeric sequence cannot be reliably aligned using standard methods owing to the emergence of new α-satellite higher-order repeats (HORs). DNA methylation and CENP-A chromatin immunoprecipitation experiments show that 26% of the centromeres differ in their kinetochore position by >500 kb. To understand evolutionary change, we selected six chromosomes and sequenced and assembled 31 orthologous centromeres from the common chimpanzee, orangutan and macaque genomes. Comparative analyses reveal a nearly complete turnover of α-satellite HORs, with characteristic idiosyncratic changes in α-satellite HORs for each species. Phylogenetic reconstruction of human haplotypes supports limited to no recombination between the short (p) and long (q) arms across centromeres and reveals that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA.

Language: English

Citations

71

Battery safety: Machine learning-based prognostics
Jingyuan Zhao, Xuning Feng, Quanquan Pang

et al.

Progress in Energy and Combustion Science, Journal Year: 2024, Volume and Issue: 102, P. 101142

Published: Jan. 19, 2024

Lithium-ion batteries play a pivotal role in a wide range of applications, from electronic devices to large-scale electrified transportation systems and grid-scale energy storage. Nevertheless, they are vulnerable to both progressive aging and unexpected failures, which can result in catastrophic events such as explosions or fires. Given their expanding global presence, the safety of these batteries and potential hazards from serious malfunctions are now major public concerns. Over the past decade, scholars and industry experts are intensively exploring methods to monitor battery safety, spanning from materials to cell, pack and system levels and across various spectral, spatial, and temporal scopes. In this Review, we start by summarizing the mechanisms and nature of battery failures. Following this, we explore the intricacies in predicting battery system evolution and delve into the specialized knowledge essential for data-driven, machine learning models. We offer an exhaustive review spotlighting the latest strides in battery fault diagnosis and failure prognosis via an array of machine learning approaches. Our discussion encompasses: (1) supervised and reinforcement learning integrated with battery models, apt for predicting faults/failures and probing into failure causes and safety protocols at the cell level; (2) unsupervised, semi-supervised, and self-supervised learning, advantageous for harnessing vast data sets from battery modules/packs; (3) few-shot learning tailored for gleaning insights from scarce examples, alongside physics-informed machine learning to bolster model generalization and optimize training in data-scarce settings. We conclude by casting light on the prospective horizons of comprehensive, real-world battery prognostics and management.

Language: English

Citations

69

Applications of long-read sequencing to Mendelian genetics
Francesco Mastrorosa, Danny E. Miller, Evan E. Eichler

et al.

Genome Medicine, Journal Year: 2023, Volume and Issue: 15(1)

Published: June 14, 2023

Advances in clinical genetic testing, including the introduction of exome sequencing, have uncovered the molecular etiology for many rare and previously unsolved genetic disorders, yet more than half of individuals with a suspected genetic disorder remain unsolved after complete clinical evaluation. A precise genetic diagnosis may guide clinical treatment plans, allow families to make informed care decisions, and permit individuals to participate in N-of-1 trials; thus, there is high interest in developing new tools and techniques to increase the solve rate. Long-read sequencing (LRS) is a promising technology for both increasing the solve rate and decreasing the amount of time required to make a precise genetic diagnosis. Here, we summarize current LRS technologies, give examples of how they have been used to evaluate complex genetic variation and identify missing variants, and discuss future clinical applications of LRS. As costs continue to decrease, LRS will find additional utility in the clinical space, fundamentally changing how pathological variants are discovered and eventually acting as a single-data source that can be interrogated multiple times for clinical service.

Language: English

Citations

58

Approaching complete genomes, transcriptomes and epi-omes with accurate long-read sequencing
Sam Kovaka, Shujun Ou, Katharine M. Jenike

et al.

Nature Methods, Journal Year: 2023, Volume and Issue: 20(1), P. 12 - 16

Published: Jan. 1, 2023

Language: English

Citations

51