bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown
Published: Jan. 21, 2024
Here, we describe the "Obelisks," a previously unrecognised class of viroid-like elements that we first identified in human gut metatranscriptomic data. "Obelisks" share several properties: (i) apparently circular RNA ~1kb genome assemblies, (ii) predicted rod-like secondary structures encompassing the entire genome, and (iii) open reading frames coding for a novel protein superfamily, which we call the "Oblins". We find that Obelisks form their own distinct phylogenetic group with no detectable sequence or structural similarity to known biological agents. Further, Obelisks are prevalent in tested human microbiome metatranscriptomes, with representatives detected in ~7% of analysed stool metatranscriptomes (29/440) and in ~50% of analysed oral metatranscriptomes (17/32). Obelisk compositions appear to differ between the anatomic sites and are capable of persisting in individuals, with continued presence over >300 days observed in one case. Large scale searches identified 29,959 Obelisks (clustered at 90% nucleotide identity), with examples from all seven continents and in diverse ecological niches. From this search, a subset of Obelisks are identified to code for Obelisk-specific variants of the hammerhead type-III self-cleaving ribozyme. Lastly, we identified one case of a bacterial species (Streptococcus sanguinis) in which a subset of defined laboratory strains harboured a specific Obelisk RNA population. As such, Obelisks comprise a class of diverse RNAs that have colonised, and gone unnoticed in, human, and global microbiomes.
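The prevalence figures quoted in this abstract can be checked directly from the reported counts. The short sketch below only redoes that arithmetic; the dictionary layout and the function name `prevalence_percent` are illustrative choices, not anything taken from the paper.

```python
# Counts reported in the abstract: (positive metatranscriptomes, total analysed).
reported = {
    "stool": (29, 440),
    "oral": (17, 32),
}


def prevalence_percent(positive: int, total: int) -> float:
    """Return the share of positive samples as a percentage."""
    return 100.0 * positive / total


prevalence = {site: prevalence_percent(*counts) for site, counts in reported.items()}

for site, pct in prevalence.items():
    print(f"{site}: {pct:.1f}%")
```

The fractions work out to about 6.6% for stool and 53.1% for oral samples, consistent with the rounded "~7%" and "~50%" given in the abstract.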
Nature Biotechnology, Journal Year: 2024, Volume and Issue: unknown
Published: April 23, 2024
In recent years, generative protein sequence models have been developed to sample novel sequences. However, predicting whether generated proteins will fold and function remains challenging. We evaluate a set of 20 diverse computational metrics to assess the quality of enzyme sequences produced by three contrasting generative models: ancestral sequence reconstruction, a generative adversarial network and a protein language model. Focusing on two enzyme families, we expressed and purified over 500 natural and generated sequences with 70-90% identity to the most similar natural sequences to benchmark computational metrics for predicting in vitro enzyme activity. Over three rounds of experiments, we developed a computational filter that improved the rate of experimental success by 50-150%. The proposed metrics and models will drive protein engineering research by serving as a benchmark for generative protein sequence models and helping to select active variants for experimental testing.
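To make the "70-90% identity" figure concrete, here is a minimal sketch of how percent identity between two already-aligned sequences might be computed. The function name, the gap-handling convention, and the two toy sequences are all assumptions for illustration; the abstract does not say which alignment tool or identity definition the authors used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared columns in which both sequences carry the same residue.

    Assumes the inputs are already aligned (equal length, '-' marks a gap).
    Columns with a gap in either sequence are left out of the denominator;
    other tools may use a different convention.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical toy sequences, not taken from the paper.
natural = "MKTAYIAKQR"
generated = "MKTAYLAKQK"
identity = percent_identity(natural, generated)
print(f"{identity:.1f}% identity")
```

In this toy case the two sequences differ at two of ten positions, so the generated sequence sits at 80% identity to its natural counterpart, inside the range the abstract describes.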
Bioinformatics, Journal Year: 2024, Volume and Issue: 40(4)
Published: March 18, 2024
Abstract
Motivation
Reliable prediction of protein thermostability from its sequence is valuable for both academic and industrial research. This prediction problem can be tackled using machine learning and by taking advantage of the recent blossoming of deep learning methods for sequence analysis. These methods can facilitate training on more data and, possibly, enable the development of more versatile thermostability predictors for multiple ranges of temperatures.
Results
We applied the principle of transfer learning to predict protein thermostability using embeddings generated by protein language models (pLMs) from an input protein sequence. We used large pLMs that were pre-trained on hundreds of millions of known sequences. The embeddings from such models allowed us to efficiently train and validate a high-performing prediction method using over one million sequences that we collected from organisms with annotated growth temperatures. Our method, TemStaPro (Temperatures of Stability for Proteins), was used to predict thermostability of CRISPR-Cas Class II effector proteins (C2EPs). Predictions indicated sharp differences among groups of C2EPs in terms of thermostability and were largely in tune with previously published and our newly obtained experimental data.
Availability and implementation
TemStaPro software and the related data are freely available from https://github.com/ievapudz/TemStaPro and https://doi.org/10.5281/zenodo.7743637.
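The transfer-learning idea described in this abstract (keep a pre-trained model's embeddings fixed and train only a small predictor on top of them) can be sketched in a few lines. Everything below is a toy under stated assumptions: the "embeddings" are synthetic random vectors standing in for pLM output, the classifier is plain logistic regression, and names such as `fake_embedding` and `DIM` are hypothetical. It does not reproduce TemStaPro's actual architecture or data, which the abstract does not detail.

```python
import math
import random

random.seed(0)
DIM = 8  # hypothetical embedding size; real pLM embeddings are far larger


def fake_embedding(thermostable: bool) -> list:
    """Synthetic stand-in for a frozen pLM embedding (illustration only)."""
    shift = 1.0 if thermostable else -1.0
    return [random.gauss(shift, 1.0) for _ in range(DIM)]


# Synthetic labels: 1 = stable above some temperature threshold, 0 = not.
data = [(fake_embedding(bool(label)), label) for label in [0, 1] * 100]
random.shuffle(data)
train, test = data[:150], data[150:]


def predict_proba(weights, bias, x):
    """Logistic-regression probability that x belongs to class 1."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


# Train only the small classifier; the embeddings themselves are never updated.
weights, bias = [0.0] * DIM, 0.0
learning_rate = 0.1
for _ in range(50):
    for x, y in train:
        error = predict_proba(weights, bias, x) - y  # log-loss gradient w.r.t. z
        weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
        bias -= learning_rate * error

correct = sum((predict_proba(weights, bias, x) >= 0.5) == bool(y) for x, y in test)
accuracy = correct / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The design point is that the expensive part (the language model) is reused as-is, so only a small number of parameters need to be fitted to the thermostability labels. A predictor covering several temperature ranges could, as one option, train one such classifier per threshold.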
Microbial Genomics, Journal Year: 2024, Volume and Issue: 10(5)
Published: May 8, 2024
Improvements in the accuracy and availability of long-read sequencing mean that complete bacterial genomes are now routinely reconstructed using hybrid (i.e. short- and long-reads) assembly approaches. Complete genomes allow a deeper understanding of bacterial evolution and genomic variation beyond single nucleotide variants. They are also crucial for identifying plasmids, which often carry medically significant antimicrobial resistance genes. However, small plasmids are often missed or misassembled by long-read assembly algorithms. Here, we present Hybracter, which allows for the fast, automatic and scalable recovery of near-perfect complete bacterial genomes using a long-read first assembly approach. Hybracter can be run either as a hybrid assembler or as a long-read only assembler. We compared Hybracter to existing automated hybrid and long-read only assembly tools using a diverse panel of samples of varying levels of long-read accuracy with manually curated ground truth reference genomes. We demonstrate that Hybracter as a hybrid assembler is more accurate and faster than the existing gold standard automated hybrid assembler Unicycler. We also show that Hybracter with long-reads only is the most accurate long-read only assembler and is comparable to hybrid methods in accurately recovering small plasmids.