Abstract
Background
RNA-seq is a fundamental technique in genomics, yet reference bias, where transcripts derived from non-reference alleles are quantified less accurately, can undermine the accuracy of quantification and thus conclusions made downstream. Reference bias in RNA-seq analysis has yet to be explored in complex polyploid genomes despite evidence that they are often a complex mosaic of wild relative introgressions, which introduce blocks of highly divergent genes.
Results
Here we use hexaploid wheat as a model complex polyploid, using both simulated and experimental data to show that RNA-seq alignment in wheat suffers from widespread reference bias which is largely driven by divergent introgressed genes. This leads to underestimation of gene expression and incorrect assessment of homoeologue expression balance. By incorporating gene models from ten wheat genome assemblies into a pantranscriptome reference, we present a novel method to reduce reference bias, which can be readily scaled to capture more variation as new genome and transcriptome data becomes available.
Conclusions
This study shows that the presence of introgressions can lead to reference bias in RNA-seq analysis. Caution should be exercised by researchers using non-sample reference genomes for RNA-seq alignment, and novel methods, such as the one presented here, should be considered.
Nucleic Acids Research,
Journal year: 2023,
Issue: 51(W1), pp. W207-W212
Published: May 5, 2023
g:Profiler is a reliable and up-to-date functional enrichment analysis tool that supports various evidence types, identifier types and organisms. The toolset integrates many databases, including Gene Ontology, KEGG and TRANSFAC, to provide a comprehensive and in-depth analysis of gene lists. It also provides interactive and intuitive user interfaces and supports ordered queries and custom statistical backgrounds, among other settings. g:Profiler provides multiple programmatic interfaces to access its functionality. These can be easily integrated into custom workflows and external tools, making them valuable resources for researchers who want to develop their own solutions. g:Profiler has been available since 2007 and is used to analyse millions of queries. Research reproducibility and transparency are achieved by maintaining working versions of all past database releases since 2015. g:Profiler supports 849 species, including vertebrates, plants, fungi, insects and parasites, and can analyse any organism through user-uploaded custom annotation files. In this update article, we introduce a novel filtering method highlighting Gene Ontology driver terms, accompanied by new graph visualizations providing a broader context for significant Gene Ontology terms. As a leading enrichment analysis and gene list interoperability service, g:Profiler offers a valuable resource for genetics, biology and medical researchers. It is freely accessible at https://biit.cs.ut.ee/gprofiler.
International Journal of Systematic and Evolutionary Microbiology,
Journal year: 2024,
Issue: 74(3)
Published: March 21, 2024
The field of microbial taxonomy is dynamic, aiming to provide a stable and contemporary classification system for prokaryotes. Traditionally, reliance on phenotypic characteristics limited the comprehensive understanding of microbial diversity and evolution. The introduction of molecular techniques, particularly DNA sequencing and genomics, has transformed our perception of prokaryotic diversity. In the past two decades, advancements in genome sequencing have transitioned from traditional methods to a genome-based taxonomic framework, not only to define species, but also higher taxonomic ranks. As technology and databases rapidly expand, maintaining updated standards is crucial. This work seeks to revise the 2018 guidelines for applying genome sequencing data in microbial taxonomy, adapting minimal standards and recommendations to reflect the technological progress during this period.
Nature Microbiology,
Journal year: 2023,
Issue: 8(1), pp. 174-187
Published: Jan. 5, 2023
Elucidating the similarity and diversity of pathogen effectors is critical to understand their evolution across fungal phytopathogens. However, rapid divergence that diminishes sequence similarities between putatively homologous effectors has largely concealed the roots of effector evolution. Here we modelled the structures of 26,653 secreted proteins from 14 agriculturally important fungal phytopathogens, six non-pathogenic fungi and one oomycete with AlphaFold 2. With 18,000 successfully predicted folds, we performed structure-guided comparative analyses on two aspects of effector evolution: uniquely expanded sequence-unrelated structurally similar (SUSS) effector families and common folds present across the fungal species. Extreme expansion of lineage-specific SUSS effector families was found only in several obligate biotrophs, Blumeria graminis and Puccinia graminis. The highly expanded effector families were the source of conserved sequence motifs, such as the Y/F/WxC motif. We identified new classes of SUSS effector families that include known virulence factors, such as AvrSr35, AvrSr50 and Tin2. Structural comparisons revealed that the expanded structural folds further diversify through domain duplications and fusion with disordered stretches. Putatively sub- and neo-functionalized SUSS effectors could reconverge on regulation, expanding the functional pools of effectors in the pathogen infection cycle. We also found evidence that many effector families could have originated from ancestral folds conserved in fungi. Collectively, our study highlights diverse effector evolution mechanisms and supports divergent evolution as a major force in driving SUSS effector evolution from ancestral proteins.
Nucleic Acids Research,
Journal year: 2022,
Issue: 50(W1), pp. W670-W676
Published: April 20, 2022
RSAT (Regulatory Sequence Analysis Tools) enables the detection and the analysis of cis-regulatory elements in genomic sequences. This software suite performs (i) de novo motif discovery (including from genome-wide datasets like ChIP-seq/ATAC-seq), (ii) genomic sequence scanning with known motifs, (iii) motif analysis (quality assessment, comparisons and clustering), (iv) analysis of regulatory variations and (v) comparative genomics. RSAT comprises 50 tools. Six public Web servers (including a teaching server) are offered to meet the needs of different biological communities. RSAT philosophy and originality are: (i) multi-modal access depending on the user needs, through web forms, command-line for local installation and programmatic web services, and (ii) support for virtually any genome (animals, bacteria, plants, totalling over 10 000 genomes directly accessible). Since the 2018 NAR Web Software Issue, we have developed a large REST API, extended the support for additional genomes and external motif collections, enhanced some tools and Web forms, and developed a novel tool that builds or refines gene regulatory networks using motif scanning (network-interactions). The RSAT website provides extensive documentation, tutorials and published protocols. RSAT code is under open-source license and now hosted on GitHub. RSAT is available at http://www.rsat.eu/.
Nature Biotechnology,
Journal year: 2024,
Issue: unknown
Published: Feb. 21, 2024
Abstract
In the era of biodiversity genomics, it is crucial to ensure that annotations of protein-coding gene repertoires are accurate. State-of-the-art tools to assess genome annotations measure the completeness of a gene repertoire but are blind to other errors, such as gene overprediction or contamination. We introduce OMArk, a software package that relies on fast, alignment-free sequence comparisons between a query proteome and precomputed gene families across the tree of life. OMArk assesses not only the completeness but also the consistency of the gene repertoire as a whole relative to closely related species and reports likely contamination events. Analysis of 1,805 UniProt Eukaryotic Reference Proteomes with OMArk demonstrated strong evidence of contamination in 73 proteomes and identified error propagation in avian gene annotation resulting from the use of a fragmented zebra finch proteome as a reference. This study illustrates the importance of comparing and prioritizing proteomes based on their quality measures.
Nucleic Acids Research,
Journal year: 2023,
Issue: 52(D1), pp. D1569-D1578
Published: Oct. 28, 2023
Abstract
PlantPAN 4.0 (http://PlantPAN.itps.ncku.edu.tw/) is an integrative resource for constructing transcriptional regulatory networks for diverse plant species. In this release, the gene annotation and promoter sequences were expanded to cover 115 species. PlantPAN 4.0 can help users characterize the evolutionary differences and similarities among cis-regulatory elements; furthermore, this system can now help in identification of conserved non-coding sequences among homologous genes. The updated transcription factor binding site repository contains 3428 nonredundant matrices for 18305 transcription factors; this expansion helps in exploration of combinational and nucleotide variants of cis-regulatory elements in conserved non-coding sequences. Additionally, the genomic landscapes of regulatory factors were manually updated, and ChIP-seq data sets derived from a single-cell green alga (Chlamydomonas reinhardtii) were added. Furthermore, the statistical review and graphical analysis components were improved to offer intelligible information through ChIP-seq data analysis. These improvements included easy-to-read experimental condition clusters, searchable gene-centered interfaces for the identification of promoter regions’ binding preferences by considering experimental condition clusters and peak visualization for all regulatory factors, and the 20 most significantly enriched gene ontology functions for regulatory factors. Thus, PlantPAN 4.0 can effectively reconstruct gene regulatory networks and help compare genomic cis-regulatory elements across plant species and experiments.
Nucleic Acids Research,
Journal year: 2024,
Issue: 53(D1), pp. D444-D456
Published: Nov. 20, 2024
Abstract
InterPro (https://www.ebi.ac.uk/interpro) is a freely accessible resource for the classification of protein sequences into families. It integrates predictive models, known as signatures, from multiple member databases to classify sequences into families and predict the presence of domains and significant sites. The InterPro database provides annotations for over 200 million sequences, ensuring extensive coverage of UniProtKB, the standard repository of protein sequences, and includes mappings to several other major resources, such as Gene Ontology (GO), Protein Data Bank in Europe (PDBe) and the AlphaFold Protein Structure Database. In this publication, we report on the status of InterPro (version 101.0), detailing new developments in the database, associated web interface and software. Notable updates include the increased integration of structures predicted by AlphaFold and the enhanced description of protein families using artificial intelligence. Over the past two years, more than 5000 new InterPro entries have been created. The InterPro website now offers access to 85 000 protein families and domains from its member databases and serves as a long-term archive for retired databases. InterPro data, software and tools are freely available.
Nucleic Acids Research,
Journal year: 2021,
Issue: 50(D1), pp. D11-D19
Published: Nov. 23, 2021
Abstract
The European Bioinformatics Institute (EMBL-EBI) maintains a comprehensive range of freely available and up-to-date molecular data resources, which includes over 40 resources covering every major data type in the life sciences. This year's service update for EMBL-EBI includes new resources, PGS Catalog and AlphaFold DB, and updates on existing resources, including the COVID-19 Data Platform, trRosetta and RoseTTAfold models introduced in Pfam and InterPro, and the launch of Genome Integrations with Function and Sequence by UniProt and Ensembl. Furthermore, we highlight projects through which EMBL-EBI has contributed to the development of community-driven data standards and guidelines, including the Recommended Metadata for Biological Images (REMBI), and the BioModels Reproducibility Scorecard. Training is one of EMBL-EBI’s core missions and a key component of the provision of bioinformatics services to users: this year's update includes many of the improvements that have been developed to EMBL-EBI’s online training offering.