Do pseudogenes pose a problem for metabarcoding marine animal communities?
Jessica A. Schultz, Paul D. N. Hebert

Authorea (Authorea), Journal Year: 2021, Volume and Issue: unknown

Published: Sept. 27, 2021

Because DNA metabarcoding typically employs sequence diversity among mitochondrial amplicons to estimate species composition, nuclear pseudogenes (NUMTs) can inflate diversity. This study quantifies the incidence and attributes of NUMTs derived from the 658 bp barcode region of cytochrome c oxidase I (COI) in 156 marine animal genomes. The number of NUMTs meeting four length criteria (>150 bp, >300 bp, >450 bp, >600 bp) was determined, and they were examined to ascertain if they could be recognized by their possession of indels or stop codons. In total, 389 NUMTs <100 were detected, with an average of 2.49 per species (range = 0–50) and a mean length of 336 ± 208 bp. Among NUMTs lacking diagnostic features, 52.5% were ≤300 bp, 63.9% were ≤450 bp, and 76.2% were ≤600 bp. Studies examining 150 bp amplicons could inflate the OTU count 1.57x compared to the true count and increase perceived intraspecific variation at COI 1.19x (when variants with >2% divergence are recognized as different OTUs). There was a weak positive correlation between genome size and NUMT count, but no variation among phyla, trophic groups, or life history traits. While bioinformatic advances will improve NUMT detection, the best defense involves targeting long amplicons and developing reference databases that include both mitochondrial sequences and their NUMT derivatives.
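The abstract describes recognizing NUMTs by their indels or stop codons. A minimal Python sketch of such a screen follows. It is not the authors' pipeline: the function names, the assumption that query and reference are already aligned and trimmed to their overlap, and the choice of reading frame are all illustrative. The stop-codon sets follow NCBI translation tables 2 (vertebrate mitochondrial) and 5 (invertebrate mitochondrial).

```python
# Stop codons under two NCBI mitochondrial genetic codes.
VERTEBRATE_MITO_STOPS = frozenset({"TAA", "TAG", "AGA", "AGG"})  # table 2
INVERTEBRATE_MITO_STOPS = frozenset({"TAA", "TAG"})              # table 5


def has_indel(aligned_query: str, aligned_ref: str) -> bool:
    """True if the pairwise alignment contains any gap character.

    Assumes both strings are trimmed to the overlapping region, so every
    '-' reflects an insertion or deletion rather than a length difference.
    """
    return "-" in aligned_query or "-" in aligned_ref


def has_stop_codon(seq: str, stops: frozenset, frame: int = 0) -> bool:
    """True if any complete codon in the given reading frame is a stop."""
    seq = seq.upper().replace("-", "")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return any(codon in stops for codon in codons)


def lacks_diagnostic_features(aligned_query: str, aligned_ref: str,
                              stops: frozenset, frame: int = 0) -> bool:
    """True if the query shows neither an indel nor an in-frame stop codon."""
    return (not has_indel(aligned_query, aligned_ref)
            and not has_stop_codon(aligned_query, stops, frame))
```

A sequence for which `lacks_diagnostic_features` returns `True` would pass this screen, which is the class of NUMT the abstract identifies as hard to filter. The genetic code has to match the taxon: `AGA` is a stop codon under table 2 but codes for serine under table 5.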

Language: English

Benchmarking the discrimination power of commonly used markers and amplicons in marine fish (e)DNA (meta)barcoding
João T. Fontes, Kazutaka Katoh, Rui Pena Pires, et al.

Metabarcoding and Metagenomics, Journal Year: 2024, Volume and Issue: 8

Published: Oct. 4, 2024

Environmental DNA (eDNA) metabarcoding is revolutionising the study of aquatic ecosystems, enabling high-throughput analysis of biodiversity with minimal disturbance. Despite its potential to support fisheries management, species identification and downstream reliability are hindered by the lack of standardisation in fragment choice. This study compares the discrimination power of three markers used in marine fish (e)DNA (meta)barcoding: 12S rRNA, 16S rRNA and cytochrome oxidase subunit I (COI), as well as two amplicons for each. We analysed sequences from NCBI GenBank for 10 orders of Actinopterygii (ray-finned fishes), including mitochondrial genomes. We assessed discrimination power by determining the percentage of monophyletic species in Neighbour-Joining trees and calculating congeneric divergences in two datasets: one of genomic regions extracted from mitochondrial genomes (771 species) and another of independent sequences for each region (3,879 species). Amongst the genomes’ dataset, the COI amplicons Folmer and Leray-Lobo had the highest discriminatory power, with 89.2% and 87.0% of monophyletic species, respectively, while Teleo had the lowest at 71.6%. Conversely, using the independent sequences for these amplicons, the percentages were 64.8% and 63.5%, respectively, while Ac16S reached 83.0%. Species discrimination is influenced by the marker’s evolutionary rate, amplicon length, target order and the quality of reference sequence data. We recommend considering these differences in amplicon selection, especially for species-level identifications. We advise a standard multi-marker approach under certain scenarios, particularly when the presence of closely related species is expected.

Language: English

Citations

0

VLF: An R package for the analysis of very low frequency variants in DNA sequences
Jarrett D. Phillips, Taryn Athey, Paul D. McNicholas, et al.

Biodiversity Data Journal, Journal Year: 2023, Volume and Issue: 11

Published: Jan. 26, 2023

Here, we introduce VLF, an R package to determine the distribution of very low frequency variants (VLFs) in nucleotide and amino acid sequences for the analysis of errors in DNA sequence records. The package allows users to assess VLFs in aligned and trimmed protein-coding sequences by automatically calculating the frequency of nucleotides or amino acids at each sequence position and outputting those that occur under a user-specified frequency (default of p = 0.001). These results can then be used to explore fundamental population genetic and phylogeographic patterns, mechanisms and processes at the microevolutionary level, such as sequence conservation. Our package extends earlier work pertaining to an implementation of VLF analysis in Microsoft Excel, which was found to be both computationally slow and error prone. We compare those results to our own herein. Results between the two implementations are highly consistent for a large DNA barcode dataset of bird species. Differences in results are readily explained by both manual human error and inadequate Linnean taxonomy (specifically, species synonymy). VLF is also applied to a subset of avian barcodes to assess the extent of biological artifacts at the species level for Canada goose (Branta canadensis), as well as within a dataset of DNA barcodes for fishes of forensic and regulatory importance. The novelty of VLF and its benefit over the previous implementation include its high level of automation, speed, scalability and ease-of-use, all desirable characteristics which will be extremely valuable as more sequence data are rapidly accumulated in popular reference databases, such as BOLD and GenBank.
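The core calculation described here (per-position residue frequencies, with residues below a frequency cutoff reported) can be sketched in a few lines of Python. This illustrates the idea only and is not the VLF package's API, which is written in R. The function name, the 0-based positions, and the handling of gaps and ambiguity codes as ordinary characters are assumptions made for the example.

```python
from collections import Counter


def low_frequency_variants(aligned_seqs, threshold=0.001):
    """Return (position, residue, frequency) for residues rarer than threshold.

    aligned_seqs: equal-length aligned sequences (nucleotide or amino acid).
    threshold:    residues with frequency strictly below this are reported;
                  0.001 mirrors the default quoted in the abstract.
    Positions are 0-based. Gaps and ambiguity codes are counted like any
    other character, which is a simplification.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")

    total = len(aligned_seqs)
    hits = []
    for pos in range(length):
        counts = Counter(seq[pos].upper() for seq in aligned_seqs)
        for residue, count in sorted(counts.items()):
            freq = count / total
            if freq < threshold:
                hits.append((pos, residue, freq))
    return hits
```

With the default cutoff of 0.001, a residue seen once is only reported when the alignment holds more than 1,000 sequences, so a cutoff this low is informative only for large barcode datasets of the kind the abstract mentions.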

Language: English

Citations

1

An assessment of South African small mammal barcode sequence libraries: Implications for future carnivore diet analyses by DNA

Vimbai I. Siziba, Sandi Willows‐Munro

African Journal of Ecology, Journal Year: 2023, Volume and Issue: 62(1)

Published: Nov. 27, 2023

DNA metabarcoding requires reference libraries that link sequences to species. The mitochondrial gene regions cytochrome c oxidase I (COI), 12S ribosomal RNA (12S rRNA), 16S ribosomal RNA (16S rRNA), cytochrome b (cyt b) and the hypervariable control region (D‐loop) are routinely used in metabarcoding studies to measure genetic diversity in animal species. This study aimed to review the state of reference libraries for small South African mammals, as small mammals constitute a large portion of medium carnivore diet. Analyses of records revealed 193 small mammal species in South Africa, of which only 141 have sequences available for one or more of the mitochondrial gene regions examined. Cyt b had the highest coverage, with 59.1% of species represented in reference libraries. COI has 33.7%, rRNA 23.8%, D‐loop 17.6%, lowest coverage 15%. This study supports the use of multiple gene regions when performing scat DNA metabarcoding, particularly when wanting to determine the small mammal component of carnivore diet. Additionally, it emphasises the need to build comprehensive reference libraries linking sequences to taxonomically identified species.

Language: English

Citations

1

Sheaf Cohomology of Rectangular-Matrix Chains to Develop Deep-Machine-Learning Multiple Sequencing
Orchidea Maria Lecian

Deleted Journal, Journal Year: 2024, Volume and Issue: 1(1), P. 55 - 71

Published: Dec. 16, 2024

The sheaf cohomology techniques are newly used to include Morse simplicial complexes in a rectangular-matrix chain, whose singular values are compatible with those of a square matrix, which can be used for multiple sequencing. The equivalence of the simplices with the corresponding graph is proven, as well as that of the filtration of the probability space. The new protocol eliminates the problem of stochastic stability of deep Markov models. The new paradigm is implemented to develop the deep-machine-learning construction of models for multiple sequencing, starting from a profile model, which is analytically written. Applications are found as an amino-acid sequencing model. As a result, the nucleotide-dependence of the positions on the alignments is fully modelized. The metrics of the manifolds are discussed. The instance of the application of the Jukes–Cantor model is successfully controlled as a nucleotide-substitution model.

Language: English

Citations

0
