
The EMBO Journal, Journal Year: 2020, Volume and Issue: 39(6)
Published: Feb. 24, 2020
Review, 24 February 2020, Open Access

A guide to naming human non-coding RNA genes

Ruth L Seal (corresponding author; orcid.org/0000-0002-7545-6817; Tel: +44 (0)1223 494 446), Ling-Ling Chen, Sam Griffiths-Jones, Todd M Lowe, Michael B Mathews, Dawn O'Reilly, Andrew J Pierce, Peter F Stadler, Igor Ulitsky (orcid.org/0000-0003-0555-6561), Sandra Wolin, Elspeth Bruford

Author affiliations:
Ruth L Seal and Elspeth Bruford: Department of Haematology, University of Cambridge School of Clinical Medicine, Cambridge, UK; European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
Ling-Ling Chen: Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
Sam Griffiths-Jones: Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
Todd M Lowe: Department of Biomolecular Engineering, University of California, Santa Cruz, CA, USA
Michael B Mathews: Rutgers New Jersey Medical School, Newark, NJ, USA
Dawn O'Reilly: Computational Biology and Integrative Genomics Lab, MRC/CRUK Oxford Institute, University of Oxford, Oxford, UK
Andrew J Pierce: Translational Oncology R&D, AstraZeneca
Peter F Stadler: Bioinformatics Group and Interdisciplinary Center for Bioinformatics, Leipzig, Germany; Max Planck Institute for Mathematics in the Sciences; Institute for Theoretical Chemistry, Vienna, Austria; Facultad de Ciencias, Universidad Nacional de Colombia, Sede Bogotá, Colombia; Santa Fe Institute, Santa Fe
Igor Ulitsky: Weizmann Institute of Science, Rehovot, Israel
Sandra Wolin: National Cancer Institute, National Institutes of Health, Frederick, MD, USA

The EMBO Journal (2020) 39: e103777
https://doi.org/10.15252/embj.2019103777

Abstract

Research on non-coding RNA (ncRNA) is a rapidly expanding field.
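Providing an official gene symbol and name for each ncRNA gene brings order to otherwise potential chaos, as it allows unambiguous communication about each gene. The HUGO Gene Nomenclature Committee (HGNC, www.genenames.org) is the only group with the authority to approve symbols for human genes. The HGNC works with specialist advisors for different classes of ncRNA to ensure that ncRNA nomenclature is accurate and informative, where possible. Here, we review each major class of ncRNA that is currently annotated in the human genome and describe how each class is assigned a standardised nomenclature.

Introduction

The HUGO Gene Nomenclature Committee (HGNC), under the auspices of the Human Genome Organisation (HUGO), is the only worldwide group that assigns symbols and names to human genes (Braschi et al, 2019). A unique symbol for every gene is essential to enable unambiguous scientific communication, and approved symbols should be used ubiquitously in research papers, conference talks and posters, and biomedical databases. The HGNC endeavours to approve symbols for all classes of genes that are supported by gene annotation projects, and began working on ncRNA nomenclature in the mid-1980s with the approval of the initial symbols for mitochondrial transfer RNA (tRNA) genes. Since then, we have worked closely with experts in the ncRNA field to develop symbols for many different kinds of ncRNA genes.

The number of genes that the HGNC has named per ncRNA class is shown in Fig 1, and ranges from over 4,500 long non-coding RNA (lncRNA) genes and over 1,900 microRNA genes, to just four genes each in the vault and Y RNA classes. Every gene symbol has a Symbol Report on our website, www.genenames.org, which displays the gene symbol, gene name and chromosomal location, and also includes links to key resources such as Ensembl (Zerbino et al, 2018), NCBI Gene (O'Leary et al, 2016) and GeneCards (Stelzer et al, 2016). We collaborate directly with these biomedical databases and, importantly, they always use our approved symbols as their primary gene symbols. Due to the relative completeness of the HGNC ncRNA gene set, our data have been chosen as the canonical human ncRNA dataset by the RNAcentral database (The RNAcentral Consortium, 2019), an ncRNA sequence resource. For microRNAs, we work with the specialist resource miRBase (Kozomara et al) and for tRNAs, we work with GtRNAdb (Chan & Lowe), and we display links to these resources in the relevant Symbol Report. Where available, for lncRNAs we provide links to LNCipedia (Volders et al), a lncRNA resource (Box 1).

Figure 1. Number of genes named per type of ncRNA. A full list of locus types, along with the numbers of genes in each category, can be found at our Statistics and Downloads webpage (https://www.genenames.org/download/statistics-and-files/).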
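Box 1. Useful ncRNA resources (resource, URL, description)

RNAcentral (https://rnacentral.org/): Centralised resource for ncRNA sequences, collated from expert member databases, model organism databases and sequence accession resources.
miRBase (http://www.mirbase.org/): Searchable database of microRNA sequences and annotations. Also hosts the microRNA registry, where researchers submit prospective new microRNAs.
GtRNAdb (http://gtrnadb.ucsc.edu/): The genomic tRNA database; contains tRNA genes predicted by the tRNAscan-SE program for many species.
snoRNABase (https://www-snorna.biotoul.fr/): Database of human snoRNA genes; useful but no longer being updated.
LNCipedia (https://lncipedia.org/): Database of human lncRNA sequences and manually curated lncRNA articles.
Ensembl (http://www.ensembl.org/): Genome browser for vertebrate genomes; hosts the GENCODE gene models for human and mouse.
NCBI Gene (https://www.ncbi.nlm.nih.gov/gene/): Integrated gene-related information for many genomes. Includes RefSeq manual annotation of ncRNA genes.
HGNC (www.genenames.org): Hosts gene group pages for ncRNA types; the URLs are listed in Table 1.

Table 1. HGNC gene group pages for ncRNA types. These follow a hierarchical structure that can be browsed starting from the highest-level page, labelled "Non-coding RNAs".

Non-coding RNAs (https://www.genenames.org/data/genegroup/#!/group/475): Overview of the HGNC ncRNA project. Can be used as a starting point to browse through all ncRNAs.
MicroRNAs (https://www.genenames.org/data/genegroup/#!/group/476): Starting page for microRNAs, split into families; microRNAs not in a defined family are listed first.
MicroRNA host genes (https://www.genenames.org/data/genegroup/#!/group/1690): Split into non-coding and protein coding subgroups.
Transfer RNAs (https://www.genenames.org/data/genegroup/#!/group/478): Split into "Mitochondrially encoded tRNAs" and "Cytoplasmic tRNAs" (this contains the subsets "tRNA pseudogenes" and "Low confidence cytoplasmic transfer RNAs").
Small nuclear RNAs (https://www.genenames.org/data/genegroup/#!/group/1819): Lists small nuclear RNAs; has a variant snRNA subgroup.
Small nucleolar RNAs (https://www.genenames.org/data/genegroup/#!/group/844): Split into the snoRNA subgroups "Small Cajal body-specific RNAs", "Small nucleolar RNAs, C/D box" and "Small nucleolar RNAs, H/ACA box".
Small nucleolar RNA host genes (https://www.genenames.org/data/genegroup/#!/group/1838): Split into non-coding and protein coding subgroups.
Ribosomal RNAs (https://www.genenames.org/data/genegroup/#!/group/848): Lists ribosomal RNAs, split into further subtypes of rRNAs.
Vault RNAs (https://www.genenames.org/data/genegroup/#!/group/852): Full list of vault RNAs.
Y RNAs (https://www.genenames.org/data/genegroup/#!/group/853): Full list of Y RNAs.
Small NF90 (ILF3) associated RNAs (https://www.genenames.org/data/genegroup/#!/group/1624): Full list of SNAR genes.
Long non-coding RNAs (https://www.genenames.org/data/genegroup/#!/group/788): Divided into the following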
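subgroups: Long intergenic non-protein coding RNAs; Overlapping transcripts; Intronic transcripts; Antisense RNAs; Divergent transcripts; lncRNAs with non-systematic symbols; lncRNAs with FAM root symbols.

The aim of this paper is to provide an overview of how each class of ncRNA gene is named, as well as of the resources available for them. Each section was written in collaboration with specialist advisors for that ncRNA class, based at institutions including Manchester, Santa Cruz, Leipzig, the NIH and Shanghai. We finish by outlining recommendations for naming circular RNAs and intronic transcripts.

MicroRNAs

MicroRNAs are transcripts of ~22 nucleotides that mediate post-transcriptional regulation via direct binding to messenger RNA (mRNA) molecules. In animal cells, microRNA (miRNA) genes are usually transcribed into primary transcripts (pri-miRNAs), which are processed by the Drosha microprocessor complex into precursor hairpin stem-loop sequences (pre-miRNAs). The hairpins are exported from the nucleus to the cytoplasm, where they are cleaved by the Dicer enzyme to produce a ~22 nt duplex. One strand of the duplex associates with an Argonaute (AGO) protein to form a ribonucleoprotein complex (miRNP) that binds to sites in mRNAs that are complementary to the miRNA sequence, usually in the 3′ untranslated region (UTR). The Ago-miRNP then recruits other proteins, typically resulting in either degradation or translational repression of the mRNA [for review, see (Bartel, 2018)]. The mRNAs of approximately 60% of human protein coding genes are bound by miRNAs (Friedman et al, 2009), so miRNAs regulate diverse biological processes across tissue types and stages of life. As such, miRNAs have been implicated in many diseases, including rheumatoid arthritis (Guggino et al), deafness (Mencía et al), stroke (Panagal et al), psoriasis (Yan et al), cirrhosis (Fernández-Ramos et al, 2018) and several forms of cancer (Kwok et al, 2017).

The term "microRNA" was chosen to reflect the size of the active molecule and was agreed upon by three groups that published microRNA studies, including work in Caenorhabditis elegans, in the same journal issue in 2001 (Lagos-Quintana et al, 2001; Lau et al, 2001; Lee & Ambros, 2001). Once the number of known microRNAs started to expand, researchers came together to publish guidelines for microRNA annotation (Ambros et al, 2003), and the microRNA Registry was founded to ensure that new microRNAs were not mistakenly given the same name (Griffiths-Jones, 2004). The Registry evolved into the dedicated online database miRBase, which has continued to be responsible for providing identifiers for microRNAs and for acting as a registry for publications. Researchers submit new stem-loop and mature miRNA sequences to miRBase, and the assigned identifiers are made publicly available after manuscript acceptance.

The miRBase format is "mir-#" for stem-loops and "miR-#" for mature miRNAs, where # is a sequential number that reflects the order of submission to the database. The HGNC approves gene symbols in the format MIR#; for example, MIR17 represents the miRNA gene, mir-17 the stem-loop and miR-17 the mature miRNA. However, the complete extent of the primary transcript is often not known, so the genomic entity represented by an HGNC entry is frequently the length of the stem-loop precursor miRNA, rather than that of the full primary transcript.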
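The relationship between the three identifier forms described above (MIR# for the gene, mir-# for the stem-loop, miR-# for the mature miRNA) can be sketched in a few lines of Python. This is only an illustrative sketch, not an HGNC or miRBase tool: the function name `mirna_identifiers` and the regular expression are assumptions made for this example, and only plain `MIR` plus number symbols are handled. Suffixed symbols and host genes such as MIR17HG are deliberately rejected rather than guessed at.

```python
import re

# Hypothetical helper for illustration only; not part of any HGNC or miRBase API.
# Matches plain gene symbols of the form MIR<number>, e.g. MIR17.
_PLAIN_MIR_SYMBOL = re.compile(r"MIR(\d+)")


def mirna_identifiers(gene_symbol):
    """Return the gene, stem-loop and mature identifiers for a plain MIR# symbol."""
    match = _PLAIN_MIR_SYMBOL.fullmatch(gene_symbol)
    if match is None:
        # Anything that is not exactly MIR<number> (e.g. MIR17HG) is rejected.
        raise ValueError(f"not a plain MIR# gene symbol: {gene_symbol!r}")
    number = match.group(1)
    return {
        "gene": gene_symbol,            # HGNC gene symbol, e.g. MIR17
        "stem_loop": f"mir-{number}",   # stem-loop identifier, lower-case "r"
        "mature": f"miR-{number}",      # mature miRNA identifier, upper-case "R"
    }


if __name__ == "__main__":
    print(mirna_identifiers("MIR17"))
```

The capitalisation is the only thing that distinguishes the stem-loop from the mature form, which is why the sketch builds each string explicitly instead of changing the case of the input.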
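Where multiple genes encode identical mature miRNAs, each gene identifier has a hyphenated numerical suffix; e.g., MIR1-1 and MIR1-2 are distinct loci that encode identical mature miRNAs. Where paralogous miRNAs differ by one or two nucleotides, they are given a letter suffix, e.g. MIR10A and MIR10B. The HGNC does not accept any direct requests for new miRNA symbols; these must go through miRBase (please see http://www.mirbase.org/registry.shtml).

Figure 2. Information that the HGNC provides on miRNA nomenclature. Highlighted here: there is a link to the "MIR17" gene group page; a link out to the report in miRBase; and, where possible, links to the mouse ortholog in MGI and the rat ortholog in RGD. MIR17 is part of a cluster hosted within an intron of MIR17HG (miR-17-92a-1 cluster host gene). The figure shows the gene; the stem-loop structure; and the mature microRNA, which interacts with AGO to form the AGO/miRNA silencing complex.

In accordance with miRBase, each miRNA is named as a separate gene even though miRNAs are sometimes processed from the transcripts of other genes, including those encoding proteins, and therefore might not be considered separate genes in the strictest sense. miRNAs are often found in the introns, and less often in the exons, of host genes (Fig 2). A listing of miRNA host genes is available (Table 1), and their naming conventions are discussed below.

Recently, a few ideas have been published to "improve" miRNA nomenclature, in particular by correcting names to show evolutionary relationships (e.g. Desvignes et al, 2015; Fromm et al; Budak et al). We, and our miRNA advisors, understand the desire to perfect nomenclature systems once more data becomes available. At the same time, experience has taught us that revised nomenclature is rarely fully adopted and may cause considerable confusion in the community. It is more appropriate to find ways to represent relationships between miRNAs and maintain stable symbols. The HGNC has recently created miRNA gene groups based on miRBase families and publications. For example, the "MicroRNA MIR1/206 family" has the members MIR1-1, MIR1-2 and MIR206. miR-206 is already used in over 600 publications, so it would be unhelpful to try to alter the symbol. MIR206 is now shown on a gene group page, and the corresponding miRBase Family page MIPF0000038 lists orthologous miRNAs in other species. Where possible, Symbol Reports on genenames.org link to mouse and rat orthologs, at Mouse Genome Informatics (http://www.informatics.jax.org/) and the Rat Genome Database (https://rgd.mcw.edu/), respectively.

Transfer RNAs

Transfer RNAs were first characterised over 60 years ago (Hoagland et al, 1958). The term "transfer" RNA (Smith et al, 1959) reflects their function of transferring amino acids from the cytosol of the cell to the ribosome, where they are bonded into a peptide chain according to the mRNA sequence being translated. Typical tRNAs vary in length from 73 to 93 nucleotides (Rich & RajBhandary, 1976) and have a distinctive cloverleaf secondary structure that folds into an L-shaped tertiary structure (Kim et al, 1973). The 3′ end has a CCA acceptor site for the amino acid (Hou, 2010), and the anticodon loop contains a three-nucleotide anticodon. The anticodon pairs precisely with the first two nucleotides of the codon by Watson-Crick base pairing, while the third nucleotide allows "wobble" pairing, so that one tRNA can recognise more than one codon. Post-transcriptional modifications at the wobble position influence codon recognition (Agris et al, 2018). tRNAs share characteristics that make it possible to predict them from genomic sequence.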
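The genomic tRNA database (GtRNAdb) contains tRNA gene sets for thousands of species across Eukaryota, Archaea and Bacteria, including a set of 429 high confidence tRNA genes for the most current human reference genome, GRCh38. The predictions are made using the tRNAscan-SE analysis pipeline (Lowe & Chan, 2016), which uses probabilistic "covariance models" to determine the functional identity (i.e. isotype and anticodon) of each putative tRNA. Putative tRNAs then undergo comparison with isotype-specific covariance models to give confirmation of the classification.

The GtRNAdb ID format is tRNA-[three letter amino acid code]-[anticodon]-[GtRNAdb identifier], e.g. tRNA-Ala-AGC-1-1. (Note that the "GtRNAdb identifier" is actually made up of two numbers: the first is the "transcript ID" and the second is the "locus ID". Multiple loci producing the same transcript share a transcript ID but have different locus numbers; e.g. Ala-AGC-1-1 and Ala-AGC-1-2 encode identical transcripts, whereas Ala-AGC-2-1 and Ala-AGC-3-1 encode different transcripts.) The HGNC symbol is a slightly condensed equivalent, TR[one letter amino acid code]-[anticodon][GtRNAdb identifier], e.g. TRA-AGC1-1 (Fig 3). tRNAscan-SE also predicts tRNA pseudogenes and low confidence candidate tRNA genes, which include atypical features and/or may not be capable of functioning in translation. To browse these sets, see the cytosolic transfer RNA gene group pages on genenames.org.

Figure 3. An example explaining what each part of a tRNA gene symbol represents.

Mitochondrial tRNA genes (Anderson et al, 1981) encode tRNAs with both canonical and non-canonical structures that function in translation by ribosomes within mitochondria. While pathological mutations in cytosolic tRNA genes have yet to be discovered, a variety of mutations in mitochondrial tRNA genes cause well-studied diseases such as MELAS (mitochondrial encephalomyopathy, lactic acidosis and stroke-like episodes) and MERRF (myoclonic epilepsy with ragged red fibres) (Suzuki & Nagao, 2011; Abbott et al, 2014). Mitochondrial tRNA gene symbols follow MitoMap (Lott et al, 2013); the format is "MT-T + one letter amino acid code", e.g. MT-TA for alanine. Most amino acids are decoded by a single mitochondrial tRNA, but leucine and serine each have two genes, which are numbered to distinguish the individual loci: MT-TL1, MT-TL2, MT-TS1 and MT-TS2.

Small nuclear RNAs

Small nuclear RNAs (snRNAs) are abundant RNAs of around 150 nucleotides (Matera et al, 2007). "Small nuclear" reflects their size and cellular location, while the "U" stems from the historical term "U-RNA", derived from early observations of their high uridine content (Hodnett & Busch, 1968). U-RNAs were numbered according to their apparent abundance when discovered (Chen & Moore, 2015). Some were subsequently reclassified as small nucleolar RNAs (snoRNAs), resulting in the following numbering for snRNAs: U1, U2, U4, U5, U6, U7, U11 and U12. Most snRNAs are involved in splicing introns from pre-mRNA, as part of either the major or the minor spliceosome. The major spliceosome comprises the U1, U2, U4, U5 and U6 snRNPs, plus many non-snRNP proteins, and performs the splicing of U2-type introns. The U1 and U2 snRNPs assemble on the pre-mRNA and are then joined by the preassembled U4/U6.U5 tri-snRNP.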
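The two tRNA symbol formats given earlier in this section, TR[one letter amino acid code]-[anticodon][GtRNAdb identifier] for cytosolic tRNA genes and "MT-T + one letter amino acid code" for mitochondrial tRNA genes, lend themselves to a short worked example. The Python sketch below is illustrative only and is not HGNC or GtRNAdb code: the function names, the regular expression and the restriction to the 20 standard amino acids are assumptions made here, and any isotype outside that table is rejected rather than guessed.

```python
import re

# Standard three-letter to one-letter amino acid codes (20 canonical amino acids only).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# tRNA-[three letter code]-[anticodon]-[transcript ID]-[locus ID]
_GTRNADB_ID = re.compile(r"tRNA-([A-Za-z]+)-([ACGT]{3})-(\d+)-(\d+)")


def _one_letter(isotype):
    """Look up the one-letter code, rejecting isotypes outside the table."""
    try:
        return THREE_TO_ONE[isotype]
    except KeyError:
        raise ValueError(f"isotype not handled by this sketch: {isotype!r}")


def gtrnadb_to_hgnc(gtrnadb_id):
    """Condense a GtRNAdb ID such as tRNA-Ala-AGC-1-1 into TRA-AGC1-1."""
    match = _GTRNADB_ID.fullmatch(gtrnadb_id)
    if match is None:
        raise ValueError(f"unrecognised GtRNAdb ID: {gtrnadb_id!r}")
    isotype, anticodon, transcript_id, locus_id = match.groups()
    return f"TR{_one_letter(isotype)}-{anticodon}{transcript_id}-{locus_id}"


def mito_trna_symbol(isotype, locus_number=None):
    """Build a mitochondrial tRNA symbol, e.g. MT-TA, or MT-TL1 with a locus number."""
    suffix = "" if locus_number is None else str(locus_number)
    return f"MT-T{_one_letter(isotype)}{suffix}"


if __name__ == "__main__":
    print(gtrnadb_to_hgnc("tRNA-Ala-AGC-1-1"))
    print(mito_trna_symbol("Ala"))
    print(mito_trna_symbol("Leu", 1))
```

Keeping the transcript ID and locus ID as separate capture groups mirrors the note in the text that the "GtRNAdb identifier" is really two numbers, so loci that share a transcript ID remain easy to group.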
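This is followed by a series of rearrangements that lead to formation of the U2/U6 catalytic core for the splicing reaction (Anokhina et al, 2013), and finally to release of the spliced mRNA and disassembly of the spliceosome. The minor spliceosome splices U12-type introns, which make up < 0.5% of human introns (Turunen et al, 2013). The minor spliceosome, in contrast to the major spliceosome, consists of the U11, U12, U4atac, U5 and U6atac snRNPs; U11, U12, U4atac and U6atac are functional analogs of the U1, U2, U4 and U6 snRNAs. Minor spliceosome snRNAs fold into structures similar to those of the major spliceosome snRNAs, but have limited sequence similarity to them (Will & Lührmann, 2005). The "atac" in U4atac and U6atac refers to the AT/AC splice sites (Tarn & Steitz, 1996). Instead of functioning in splicing, the U7 snRNA is involved in the processing of histone pre-mRNAs, binding to the histone downstream element and recruiting processing factors, some of which are shared with other pathways (Strub et al, 1984; Marz et al). snRNAs are transcribed by RNA polymerase II, with the exception of those transcribed by RNA polymerase III (Singh & Reddy, 1989; Younis et al).

All snRNA gene symbols have the root "RNU", for "RNA, U# small nuclear". In GRCh38 there are four annotated U1-encoding genes, RNU1-1, RNU1-2, RNU1-3 and RNU1-4, although individuals may have around 30 copies that are tandemly repeated (Lund & Dahlberg, 1984). There is a single annotated U2 gene (RNU2-1), which resides in a 6 kb unit organised as a tandem array of 10–20 copies (Van Arsdell & Weiner). There are single genes for U7 (RNU7-1), U11 (RNU11), U12 (RNU12), U4atac (RNU4ATAC) and U6atac (RNU6ATAC). The U4 and U6 genes include RNU4-1 and RNU6-2, and there are five U5 genes, named according to the literature (Sontheimer & Steitz, 1992): RNU5A-1, RNU5B-1, RNU5D-1, RNU5E-1 and RNU5F-1. There are over 1,000 divergent snRNA-like sequences (Vazquez-Arango & O'Reilly), which are presumed to be unexpressed pseudogenes. In the case of the U1 family, variants present at 1q21.1 are expressed, proce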
Language: English