Genome Biology,
Journal Year: 2023,
Volume and Issue: 24(1)
Published: March 27, 2023
Abstract
Background
The largest sequence-based models of transcription control to date are obtained by predicting genome-wide gene regulatory assays across the human genome. This setting is fundamentally correlative, as those models are exposed during training solely to the sequence variation between human genes that arose through evolution, questioning the extent to which those models capture genuine causal signals.
Results
Here we confront predictions of state-of-the-art models of transcription regulation against data from two large-scale observational studies and five deep perturbation assays. The most advanced of these sequence-based models, Enformer, by and large, captures causal determinants of human promoters. However, models fail to capture the causal effects of enhancers on expression, notably in medium to long distances and particularly for highly expressed promoters. More generally, the predicted impact of distal elements on gene expression predictions is small and the ability to correctly integrate long-range information is significantly more limited than the receptive fields of the models suggest. This is likely caused by the escalating class imbalance between actual and candidate regulatory elements as distance increases.
Conclusions
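Our results suggest that sequence-based models have advanced to the point that in silico study of promoter regions and promoter variants can provide meaningful insights, and we provide practical guidance on how to use them. Moreover, we foresee that it will require significantly more and particularly new kinds of data to train models accurately accounting for distal elements.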
Nucleic Acids Research,
Journal Year: 2021,
Volume and Issue: 50(D1), P. D988 - D995
Published: Oct. 19, 2021
Ensembl (https://www.ensembl.org) is unique in its flexible infrastructure for access to genomic data and annotation. It has been designed to efficiently deliver annotation at scale for all eukaryotic life, and it also provides deep comprehensive annotation for key species. Genomes representing a greater diversity of species are increasingly being sequenced. In response, we have focussed our recent efforts on expediting the annotation of new assemblies. Here, we report the release of the greatest annual number of newly annotated genomes in the history of Ensembl via our dedicated Ensembl Rapid Release platform (http://rapid.ensembl.org). We have also developed a new method to generate comparative analyses at scale for these assemblies and, for the first time, we have annotated non-vertebrate eukaryotes. Meanwhile, we continually improve, extend and update the annotation for our high-value reference vertebrate genomes and report the details here. We have a range of specific software tools for specific tasks, such as the Ensembl Variant Effect Predictor (VEP) and the newly developed interface for the Variant Recoder. All Ensembl data, software and tools are freely available for download and accessible programmatically.
Nature Methods,
Journal Year: 2023,
Volume and Issue: 20(9), P. 1355 - 1367
Published: July 13, 2023
Abstract
Joint profiling of chromatin accessibility and gene expression in individual cells provides an opportunity to decipher enhancer-driven gene regulatory networks (GRNs). Here we present a method for the inference of enhancer-driven GRNs, called SCENIC+. SCENIC+ predicts genomic enhancers along with candidate upstream transcription factors (TFs) and links these enhancers to candidate target genes. To improve both recall and precision of TF identification, we curated and clustered a motif collection of more than 30,000 motifs. We benchmarked SCENIC+ on diverse datasets from different species, including human peripheral blood mononuclear cells, ENCODE cell lines, melanoma cell states and Drosophila retinal development. Next, we exploit SCENIC+ predictions to study conserved TFs, enhancers and GRNs between human and mouse cell types in the cerebral cortex. Finally, we use SCENIC+ to study the dynamics of gene regulation along differentiation trajectories and the effect of TF perturbations on cell state. SCENIC+ is available at scenicplus.readthedocs.io.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),
Journal Year: 2022,
Volume and Issue: unknown, P. 16123 - 16134
Published: June 1, 2022
Vision Transformers (ViTs) and their multi-scale and hierarchical variations have been successful at capturing image representations, but their use has been generally studied for low-resolution images (e.g. 256 × 256, 384 × 384). For gigapixel whole-slide imaging (WSI) in computational pathology, WSIs can be as large as 150000 × 150000 pixels at 20× magnification and exhibit a hierarchical structure of visual tokens across varying resolutions: from 16 × 16 images capturing individual cells, to 4096 × 4096 images characterizing interactions within the tissue microenvironment. We introduce a new ViT architecture called the Hierarchical Image Pyramid Transformer (HIPT), which leverages the natural hierarchical structure inherent in WSIs using two levels of self-supervised learning to learn high-resolution image representations. HIPT is pretrained across 33 cancer types using 10,678 gigapixel WSIs, 408,218 4096 × 4096 images, and 104M 256 × 256 images. We benchmark HIPT representations on 9 slide-level tasks, and demonstrate that: 1) HIPT with hierarchical pretraining outperforms current state-of-the-art methods for cancer subtyping and survival prediction, 2) self-supervised ViTs are able to model important inductive biases about the hierarchical structure of phenotypes in the tumor microenvironment.
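The token hierarchy described in this abstract can be checked with a little arithmetic. The sketch below is only an illustration, not the authors' code: it assumes non-overlapping square tiling and an intermediate 256 × 256 level between the 16 × 16 and 4096 × 4096 scales, and the function names are hypothetical.

```python
def tokens_per_region(region_px: int, patch_px: int) -> int:
    """Number of non-overlapping patch_px x patch_px tiles in a square region.

    Assumption for illustration: tiles do not overlap and divide the region evenly.
    """
    if region_px % patch_px != 0:
        raise ValueError("region size must be a multiple of patch size")
    per_side = region_px // patch_px
    return per_side * per_side


def hierarchy_token_counts(sizes=(16, 256, 4096)):
    """Token count at each level of the pyramid, keyed by (patch, region) size."""
    return {
        (small, large): tokens_per_region(large, small)
        for small, large in zip(sizes, sizes[1:])
    }


counts = hierarchy_token_counts()
# Each 256 x 256 image holds 256 tokens of 16 x 16 pixels, and each
# 4096 x 4096 region holds 256 images of 256 x 256 pixels.
print(counts)

# Consistency check against the reported pretraining set sizes:
# 408,218 regions of 4096 x 4096, each yielding 256 images of 256 x 256.
n_small_images = 408_218 * counts[(256, 4096)]
print(n_small_images)  # 104503808, i.e. roughly 104M
```

Under these assumptions both levels of the hierarchy see sequences of 256 tokens, and the product 408,218 × 256 agrees with the roughly 104M images quoted in the abstract.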
Nucleic Acids Research,
Journal Year: 2023,
Volume and Issue: 52(D1), P. D891 - D899
Published: Nov. 11, 2023
Abstract
Ensembl (https://www.ensembl.org) is a freely available genomic resource that has produced high-quality annotations, tools, and services for vertebrates and model organisms for more than two decades. In recent years, there has been a dramatic shift in the genomic landscape, with a large increase in the number and phylogenetic breadth of high-quality reference genomes, alongside major advances in the pan-genome representations of higher species. In order to support these efforts and accelerate downstream research, Ensembl continues to focus on scaling for the rapid annotation of new genome assemblies, developing new methods for comparative analysis, and expanding the depth and quality of our genome annotations. This year we have continued our expansion to support global biodiversity research, doubling the number of annotated genomes we support on our Rapid Release site to over 1700, driven by our close collaboration with biodiversity projects such as Darwin Tree of Life. We have also strengthened support for key agricultural species, including the first regulatory builds for farmed animals, and have updated key tools and resources that support the global scientific community, notably the Ensembl Variant Effect Predictor. Ensembl data, software, and tools are freely available.