Genome Biology and Evolution,
Year: 2024, Issue: 16(12)
Published: Dec. 1, 2024
Abstract
Proteins that emerge de novo from noncoding DNA could negatively or positively influence cellular physiology in the sense of providing a possible adaptive advantage. Here, we employ two approaches to study such effects in a human cell line by expressing random sequences and mouse de novo genes that lack homologs in the human genome. We show that both approaches lead to differential growth of the cell clones dependent on the sequences they express. For the random sequences, 53% of the clones decreased in frequency, and about 8% increased in frequency in a joint growth experiment. Of the 14 mouse de novo genes tested in a similar experiment, 10 decreased, and 3 increased in frequency. When individually analysed, each mouse de novo gene triggers a unique transcriptomic response in the human cells, indicating mostly specific rather than generalized effects. Structural analysis of the de novo gene open reading frames (ORFs) reveals a range of intrinsic disorder scores and/or foldability into alpha-helices and beta sheets, but these do not correlate with their effects on the growth of the cells. Our results indicate that de novo evolved ORFs could easily become integrated into regulatory pathways, since most interact with components of these pathways and could therefore become directly subject to positive selection if the general conditions allow this.
Nature Communications,
Year: 2024, Issue: 15(1)
Published: Jan. 27, 2024
Abstract
Recent studies reveal that de novo gene origination from previously non-genic sequences is a common mechanism for gene innovation. These young genes provide an opportunity to study the structural and functional origins of proteins. Here, we combine high-quality base-level whole-genome alignments and computational structural modeling to study the origination, evolution, and protein structures of lineage-specific de novo genes. We identify 555 de novo gene candidates in D. melanogaster that originated within the Drosophilinae lineage. Sequence composition, evolutionary rates, and expression patterns indicate possible gradual functional or adaptive shifts with their gene ages. Surprisingly, we find little overall protein structural changes in candidates in the Drosophilinae lineage. We identify several candidates with potentially well-folded protein structures. Ancestral sequence reconstruction analysis reveals that most potentially well-folded candidates are often born well-folded. Single-cell RNA-seq analysis in testis shows that although most de novo gene candidates are enriched in spermatocytes, several young candidates are biased towards the early spermatogenesis stage, indicating potentially important but less emphasized roles of early germline cells in de novo gene origination in testis. This study provides a systematic overview of the origin, evolution, and protein structural changes of Drosophilinae-specific de novo genes.
Genome Biology and Evolution,
Year: 2024, Issue: 16(6)
Published: May 16, 2024
Abstract
Recent genome-wide studies in rice have established that de novo genes, evolving from noncoding sequences, enhance protein diversity through a stepwise process. However, the pattern and rate of their evolution in protein structure over time remain unclear. Here, we addressed these issues within a surprisingly short evolutionary timescale (<1 million years for 97% of Oryza de novo genes) with comparative approaches to gene duplicates. We found that de novo genes evolve faster than gene duplicates in the intrinsically disordered regions (such as random coils), secondary structure elements (such as α helix and β strand), hydrophobicity, and molecular recognition features. In de novo proteins, specifically, we observed an 8% to 14% decay in random coils and intrinsically disordered region lengths and a 2.3% to 6.5% increase in structured elements, hydrophobicity, and molecular recognition features, per million years on average. These patterns of structural evolution align with changes in amino acid composition over time as well. We also revealed higher positive charges but smaller molecular weights for de novo proteins than for duplicates. Tertiary structure predictions showed that most de novo proteins, though not typically well folded on their own, readily form low-energy and compact complexes with other proteins, facilitated by extensive residue contacts and conformational flexibility, suggesting a faster-binding scenario in de novo proteins to promote interaction. These analyses illuminate a rapid evolution of protein structure in de novo genes in rice genomes, originating from noncoding sequences, highlighting their quick transformation into active, protein complex-forming components within a remarkably short evolutionary timeframe.
bioRxiv (Cold Spring Harbor Laboratory),
Year: 2024, Issue: unknown
Published: Feb. 18, 2024
Microproteins encoded by small open reading frames (smORFs) comprise the "dark matter" of proteomes. Although functional microproteins were identified in diverse organisms from all three domains of life, bacterial smORFs remain poorly characterized. In this comprehensive study of intergenic smORFs (ismORFs, 15-70 codons) in 5,668 bacterial genomes of the family Enterobacteriaceae, we identified 67,297 clusters of ismORFs subject to purifying selection. The ismORFs mainly code for hydrophobic, potentially transmembrane, unstructured, or minimally structured microproteins. Using AlphaFold Multimer, we predicted interactions of some of the predicted microproteins encoded by transcribed ismORFs with proteins encoded by neighboring genes, revealing the potential of microproteins to regulate the activity of various proteins, particularly, under stress. We compiled a catalog of predicted microprotein families with different levels of evidence from synteny analysis, structure prediction, and transcription and translation data. This study offers a resource for investigation of biological functions of microproteins.
Genome Biology and Evolution,
Year: 2024, Issue: 16(4)
Published: April 1, 2024
De novo genes emerge from previously noncoding stretches of the genome. Their encoded de novo proteins are generally expected to be similar to random sequences and, accordingly, with no stable tertiary fold and high predicted disorder. However, structural properties of de novo proteins and whether they differ during the stages of emergence and fixation have not been studied in depth and rely heavily on predictions. Here we generated a library of short human putative de novo proteins of varying lengths and ages and sorted the candidates according to their compactness and disorder propensity. Using Förster resonance energy transfer combined with Fluorescence-activated cell sorting, we were able to screen the library for most compact protein structures, as well as most elongated and flexible structures. We find that compact de novo proteins are on average slightly shorter and contain lower predicted disorder than less compact ones. The predicted structures for most and least compact de novo proteins correspond to expectations in that they contain more secondary structure content or higher disorder content, respectively. Our experiments indicate that older de novo proteins have higher compactness and structural propensity compared to young ones. We discuss possible evolutionary scenarios and their implications underlying the age-dependencies of compactness and structural content of putative de novo proteins.
bioRxiv (Cold Spring Harbor Laboratory),
Year: 2024, Issue: unknown
Published: Jan. 24, 2024
Abstract
De novo genes emerge from previously non-coding stretches of the genome. Their encoded de novo proteins are generally expected to be similar to random sequences and, accordingly, with no stable tertiary fold and high predicted disorder. However, structural properties of de novo proteins and whether they differ during the stages of emergence and fixation have not been studied in depth and rely heavily on predictions. Here we generated a library of short human putative de novo proteins of varying lengths and ages and sorted the candidates according to their compactness and disorder propensity. Using Förster resonance energy transfer (FRET) combined with Fluorescence-activated cell sorting (FACS) we were able to screen the library for most compact protein structures, as well as most elongated and flexible structures. Compact de novo proteins are on average slightly shorter and contain lower predicted disorder than less compact ones. The predicted structures for most and least compact de novo proteins correspond to expectations in that they contain more secondary structure content or higher disorder content, respectively. Our experiments indicate that older de novo proteins have higher compactness and structural propensity compared to young ones. We discuss possible evolutionary scenarios and their implications underlying the age-dependencies of compactness and structural content of putative de novo proteins.
Genome Biology and Evolution,
Year: 2024, Issue: 16(7)
Published: June 16, 2024
Abstract
For protein coding genes to emerge de novo from non-genic DNA, the DNA sequence must gain an open reading frame (ORF) and the ability to be transcribed. The newborn de novo gene can further evolve to accumulate changes in its sequence. Consequently, it can also elongate or shrink with time. Existing literature shows that older de novo genes have longer ORFs, but it is not clear if they elongated with time or remained of the same length since their inception. To address this question we developed a mathematical model of ORF elongation as a Markov-jump process, and show that ORFs tend to keep their length in short evolutionary timescales. We also show that if change occurs it is likely to be a truncation. Our genomics and transcriptomics data analyses of seven Drosophila melanogaster populations are also in agreement with the model’s prediction. We conclude that selection could facilitate ORF length extension, which may explain why longer ORFs were observed in old de novo genes in studies analysing longer evolutionary time scales. Alternatively, shorter ORFs may be purged because they may be less likely to yield functional proteins.