Genetics, Journal Year: 2013, Volume and Issue: 193(3), P. 877 - 896
Published: Jan. 11, 2013
Cloning by somatic cell nuclear transfer is an important technology, but remains limited due to poor rates of success. Identifying genes supporting clone development would enhance our understanding of basic embryology, improve applications of the technology, support greater understanding of establishing pluripotent stem cells, and provide new insight into clinically important determinants of oocyte quality. For the first time, a systems genetics approach was taken to discover genes contributing to the ability of an oocyte to support early cloned embryo development. This identified a primary locus on mouse chromosome 17 and potential loci on chromosomes 1 and 4. A combination of transcriptome profiling data, expression correlation analysis, and functional and network analyses yielded a short list of likely candidate genes in two categories. The major category, including the genes with the strongest genetic associations with the traits (Epb4.1l3 and Dlgap1), encodes proteins associated with the subcortical cytoskeleton and other cytoskeletal elements such as the spindle. The second category encodes chromatin and transcription regulators (Runx1t1, Smchd1, and Chd7). Smchd1 promotes X inactivation, whereas Chd7 regulates expression of pluripotency genes. Runx1t1 has not been associated with these processes, but acts as a transcriptional repressor. The finding that cytoskeleton-associated proteins may be key determinants of early clone development highlights roles for cytoplasmic components of the oocyte in reprogramming. The transcriptional regulators may contribute to the overall process as downstream effectors.
BMC Bioinformatics, Journal Year: 2017, Volume and Issue: 18(1)
Published: Jan. 3, 2017
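Feature selection, aiming to identify a subset of features among a possibly large set of features that are relevant for predicting a response, is an important preprocessing step in machine learning. In gene expression studies this is not a trivial task for several reasons, including the potential temporal character of the data. However, most feature selection approaches developed for microarray data cannot handle multivariate temporal data without previous data flattening, which results in loss of temporal information. We propose a temporal minimum redundancy - maximum relevance (TMRMR) feature selection approach, which is able to handle multivariate temporal data without previous data flattening. In the proposed approach we compute the relevance of a gene by averaging F-statistic values calculated across individual time steps, and we compute redundancy between genes by using a dynamical time warping approach. The proposed method is evaluated on three temporal gene expression datasets from human viral challenge studies. The obtained results show that the proposed method outperforms alternatives widely used in gene expression studies. In particular, the proposed method achieved improvement in accuracy in 34 out of 54 experiments, while the other methods outperformed it in no more than 4 experiments. We developed a filter-based feature selection method for temporal gene expression data based on maximum relevance and minimum redundancy criteria. The proposed method incorporates temporal information by combining relevance, which is calculated as an average F-statistic value across different time steps, with redundancy, which is calculated by employing a dynamical time warping approach. As evident in our experiments, incorporating the temporal information into the feature selection process leads to selection of more discriminative features.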
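The two ingredients named in this abstract can be illustrated with a minimal sketch. It assumes a one-way ANOVA F-statistic for relevance and a textbook dynamic time warping (DTW) distance with absolute-difference cost for redundancy. The function names and the data layout are hypothetical, not the authors' code. How the published method turns the DTW distance into a redundancy term and combines it with relevance is not stated in the abstract, so that step is left out.

```python
def _mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)


def f_statistic(values, labels):
    """One-way ANOVA F-statistic of one gene at one time step."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = _mean(values)
    ss_between = sum(len(g) * (_mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - _mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        # Degenerate case: no within-class variance.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def temporal_relevance(series_per_subject, labels):
    """Average the per-time-step F-statistic of one gene over all time steps.

    series_per_subject[i][t] is the expression of the gene for subject i
    at time step t (assumed layout).
    """
    n_steps = len(series_per_subject[0])
    return _mean(
        f_statistic([s[t] for s in series_per_subject], labels)
        for t in range(n_steps)
    )


def dtw_distance(a, b):
    """Classic DTW distance between two time series (absolute-difference cost)."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]
```

A small DTW distance between two genes' time courses suggests they carry similar temporal information, which is the intuition behind treating them as redundant.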
Bioinformatics, Journal Year: 2010, Volume and Issue: 27(1), P. 1 - 8
Published: Oct. 29, 2010
Abstract
Motivation: Genome-wide association studies (GWAS) involving half a million or more single nucleotide polymorphisms (SNPs) allow genetic dissection of complex diseases in a holistic manner. The common practice of analyzing one SNP at a time does not fully realize the potential of GWAS to identify multiple causal variants and to predict risk of disease. Existing methods for joint analysis of GWAS data tend to miss causal SNPs that are marginally uncorrelated with disease and have high false discovery rates (FDRs).
Results: We introduce GWASelect, a statistically powerful and computationally efficient variable selection method designed to tackle the unique challenges of GWAS data. This method searches iteratively over the potential SNPs conditional on previously selected SNPs and is thus capable of capturing causal SNPs that are marginally correlated with disease as well as those that are marginally uncorrelated with disease. A special resampling mechanism is built into the method to reduce false positive findings. Simulation studies demonstrate that GWASelect performs well under a wide spectrum of linkage disequilibrium patterns and can be substantially more powerful than existing methods while having a lower FDR. In addition, the regression models based on GWASelect tend to yield more accurate prediction of disease risk than existing methods. The advantages of GWASelect are illustrated with the Wellcome Trust Case-Control Consortium (WTCCC) data.
Availability: The software implementing GWASelect is available at http://www.bios.unc.edu/~lin. Access to WTCCC data: http://www.wtccc.org.uk/
Contact: [email protected]
Supplementary information: Supplementary data are available at Bioinformatics Online.
Frontiers in Genetics, Journal Year: 2015, Volume and Issue: 6
Published: Sept. 10, 2015
During the past decade, findings of genome-wide association studies (GWAS) improved our knowledge and understanding of disease genetics. To date, thousands of SNPs have been associated to diseases and other complex traits. Statistical analysis typically looks for association between a phenotype and a SNP taken individually via single-locus tests. However, geneticists admit this is an oversimplified approach to tackle the complexity of underlying biological mechanisms. Interaction between SNPs, namely epistasis, must be considered. Unfortunately, epistasis detection gives rise to analytic challenges since analyzing every SNP combination is at present impractical at a genome-wide scale. In this review, we will present the main strategies recently proposed to detect epistatic interactions, along with their operating principle. Some of these methods are exhaustive, such as multifactor dimensionality reduction, likelihood ratio-based tests or receiver operating characteristic curve analysis; some are non-exhaustive, such as machine learning techniques (random forests, Bayesian networks) or combinatorial optimization approaches (ant colony optimization, computational evolution system).
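To give a sense of why exhaustive epistasis search is described as impractical at genome-wide scale, the following sketch counts the SNP combinations an exhaustive method would have to test. The panel size of 500,000 SNPs is an assumption chosen for illustration (this abstract does not state a number), and the helper name is hypothetical.

```python
from math import comb


def n_snp_combinations(n_snps, order):
    """Number of distinct SNP sets of the given interaction order."""
    return comb(n_snps, order)


N_SNPS = 500_000  # assumed panel size, for illustration only

pairs = n_snp_combinations(N_SNPS, 2)
triples = n_snp_combinations(N_SNPS, 3)
print(f"pairwise tests:    {pairs:.3e}")
print(f"three-way tests:   {triples:.3e}")
```

Under this assumption the pairwise search alone already requires on the order of 10^11 tests, and each additional interaction order multiplies the count by roughly the panel size, which is the motivation for the non-exhaustive strategies listed in the review.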
Applied and Environmental Microbiology, Journal Year: 2017, Volume and Issue: 84(1)
Published: Oct. 27, 2017
ABSTRACT
We present a metagenomic study of Lake Baikal (East Siberia). Two samples obtained from the water column under the ice cover (5 and 20 m deep) in March 2016 have been deep sequenced and the reads assembled to generate metagenome-assembled genomes (MAGs) that are representative of the microbes living in this special environment. Compared with freshwater bodies studied around the world, Lake Baikal had an unusually high fraction of Verrucomicrobia. Other groups, such as Actinobacteria and Proteobacteria, were in proportions similar to those found in other lakes. The genomes (and probably cells) tended to be small, presumably reflecting the extremely oligotrophic and cold prevalent conditions. Baikal microbes are novel lineages recruiting very little from other water bodies and are distantly related to other freshwater microbes. Despite their novelty, they showed the closest relationship to genomes discovered by similar approaches from other freshwater lakes and reservoirs. Some of them were particularly similar to MAGs from the Baltic Sea, which, although it is brackish, connected to the ocean, and much more eutrophic, has similar climatological conditions. Many of the MAGs contained rhodopsin genes, indicating that, in spite of the decreased light penetration allowed by the thick ice/snow cover, photoheterotrophy could be widespread in the water column, either because enough light penetrates or because the microbes are already adapted to the summer ice-less conditions. We have found a freshwater SAR11 subtype I/II representative showing striking synteny with Pelagibacter ubique strains, as well as a phage infecting the bacterium Polynucleobacter.
IMPORTANCE
Despite the increasing number of studies on different freshwater bodies, there is still a missing component of the oligotrophic and cold lakes suffering from long seasonal frozen cycles. Here, we describe microbial assemblies that appear in the upper water column of Lake Baikal, the largest and deepest freshwater body on Earth. This lake is frozen from January to May, which generates conditions that include an inverted temperature gradient (colder up), decrease in light penetration due to ice, and, especially, snow cover, and oligotrophic conditions more similar to the open ocean and high-altitude lakes than to other freshwater or brackish systems. As expected, most reconstructed genomes are novel lineages distantly related to others from other environments, like the Baltic Sea. Among them, there was a broad set of streamlined microbes with small genomes/intergenic spacers, including a new nonmarine Pelagibacter-like (subtype I/II) genome.
BMC Genomics, Journal Year: 2017, Volume and Issue: 18(1)
Published: March 29, 2017
Lactococcus lactis is among the most widely studied lactic acid bacterial species due to its long history of safe use and economic importance to the dairy industry, where it is exploited as a starter culture in cheese production. In the current study, we report on the complete sequencing of 16 L. lactis subsp. lactis and L. lactis subsp. cremoris genomes. The chromosomal features of these 16 L. lactis strains in conjunction with 14 completely sequenced, publicly available lactococcal chromosomes were assessed with particular emphasis on discerning the L. lactis subspecies division, evolution and niche adaptation. The deduced pan-genome was found to be closed, indicating that the representative data sets employed for this analysis are sufficient to fully describe the genetic diversity of the taxon. Niche adaptation appears to play a significant role in governing the genetic content of each L. lactis subspecies, while (differential) genome decay and redundancy are also highlighted.
Microbiome, Journal Year: 2017, Volume and Issue: 5(1)
Published: May 4, 2017
Fecal microbiota transplantation (FMT) is an effective treatment for recurrent Clostridium difficile infection and shows promise for treating other medical conditions associated with intestinal dysbioses. However, we lack a sufficient understanding of which microbial populations successfully colonize the recipient gut, and the widely used approaches to study the microbial ecology of FMT experiments fail to provide enough resolution to identify populations that are likely responsible for FMT-derived benefits. We used shotgun metagenomics together with assembly and binning strategies to reconstruct metagenome-assembled genomes (MAGs) from fecal samples of a single FMT donor. We then used metagenomic mapping to track the occurrence and distribution patterns of donor MAGs in two FMT recipients. Our analyses revealed that 22% of the 92 highly complete bacterial MAGs that we identified in the donor successfully colonized and remained abundant in the two recipients for at least 8 weeks. Most MAGs with a high colonization rate belonged to the order Bacteroidales. The vast majority of those that lacked evidence of colonization belonged to the order Clostridiales, and colonization success was negatively correlated with the number of genes related to sporulation. Our analysis of 151 publicly available gut metagenomes showed that the donor MAGs that colonized both recipients were prevalent, and the ones that colonized neither were rare across the participants of the Human Microbiome Project. Although our dataset showed a link between taxonomy and the colonization ability of a given MAG, we also identified MAGs that belong to the same taxon with different colonization properties, highlighting the importance of an appropriate level of resolution to explore the functional basis of colonization and to identify targets for cultivation, hypothesis generation, and testing in model systems. The analytical strategy adopted in our study can provide genomic insights into bacterial populations that may be critical to the efficacy of FMT due to their success in gut colonization and metabolic properties, and guide cultivation efforts to investigate mechanistic underpinnings of this procedure beyond associations.
BioData Mining, Journal Year: 2013, Volume and Issue: 6(1)
Published: March 21, 2013
We review the applicability of Bayesian networks (BNs) for discovering relations between genes, environment, and disease. By translating probabilistic dependencies among variables into graphical models and vice versa, BNs provide a comprehensible and modular framework for representing complex systems. We first describe the Bayesian network approach and its applicability to understanding the genetic and environmental basis of disease. We then describe a variety of algorithms for learning the structure of a network from observational data. Because of their relevance to real-world applications, the topics of missing data and causal interpretation are emphasized. The BN approach is then exemplified through application to data from a population-based study of bladder cancer in New Hampshire, USA. For didactical purposes, we intentionally keep this example simple. When applied to complete data records, we find only minor differences in the performance and results of different algorithms. Subsequent incorporation of partial records through the EM algorithm gives us greater power to detect relations. Allowing for network structures that depart from a strict causal interpretation also enhances our ability to discover complex associations including gene-gene (epistasis) and gene-environment interactions. While BNs are already powerful tools for the genetic dissection of disease and generation of prognostic models, there remain some conceptual and computational challenges. These include the proper handling of continuous variables and unmeasured factors, the explicit incorporation of prior knowledge, and the evaluation and communication of the robustness of substantive conclusions to alternative assumptions and data manifestations.
BMC Bioinformatics, Journal Year: 2011, Volume and Issue: 12(1)
Published: March 31, 2011
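Gene-gene epistatic interactions likely play an important role in the genetic basis of many common diseases. Recently, machine-learning and data mining methods have been developed for learning epistatic relationships from data. A well-known combinatorial method that has been successfully applied for detecting epistasis is Multifactor Dimensionality Reduction (MDR). Jiang et al. created a combinatorial epistasis learning method called BNMBL to learn Bayesian network (BN) epistatic models. They compared BNMBL to MDR using simulated data sets. Each of these data sets was generated from a model that associates two SNPs with a disease and includes 18 unrelated SNPs. For each data set, BNMBL and MDR were used to score all 2-SNP models, and BNMBL learned significantly more correct models. In real data sets, we ordinarily do not know the number of SNPs that influence phenotype. BNMBL may not perform as well if we also scored models containing more than two SNPs. Furthermore, a number of other BN scoring criteria have been developed. They may detect epistatic interactions even better than BNMBL. Although BNs are a promising tool for learning epistatic relationships from data, we cannot confidently use them in this domain until we determine which scoring criteria work best or even well when we try learning the correct model without knowledge of the number of SNPs in that model. We evaluated the performance of 22 BN scoring criteria using 28,000 simulated data sets and a real Alzheimer's GWAS data set. Our results were surprising in that the Bayesian scoring criterion with large values of a hyperparameter called α performed best. This score performed well at recall of the hardest-to-detect models, substantiating previous results. We conclude that representing epistatic interactions using BN models and scoring them using a BN scoring criterion holds promise for identifying epistatic genetic variants in data. In particular, the Bayesian scoring criterion with large values of a hyperparameter α appears more promising than a number of alternatives.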
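As a rough illustration of what scoring a SNP model with a Bayesian criterion involves, here is a minimal sketch. It assumes the Bayesian Dirichlet equivalent uniform (BDeu) score, a standard criterion with an equivalent-sample-size hyperparameter α. The abstract as listed here does not name the criterion, so this choice, the function name, and the toy data are assumptions rather than the authors' implementation. The number of parent configurations is taken from the genotype states observed in the data.

```python
from collections import Counter
from math import lgamma


def bdeu_score(parent_columns, disease, alpha=1.0):
    """Log BDeu score of the model 'disease depends on the given SNPs'.

    parent_columns: list of equal-length genotype lists (may be empty).
    disease: list of class labels, one per subject.
    alpha: equivalent sample size hyperparameter.
    """
    states = sorted(set(disease))
    r = len(states)
    # q = number of joint parent configurations (from observed states).
    q = 1
    for col in parent_columns:
        q *= len(set(col))
    configs = list(zip(*parent_columns)) if parent_columns else [()] * len(disease)
    n_j = Counter(configs)
    n_jk = Counter(zip(configs, disease))
    score = 0.0
    # Unobserved configurations contribute exactly zero, so they are skipped.
    for j, count_j in n_j.items():
        score += lgamma(alpha / q) - lgamma(alpha / q + count_j)
        for k in states:
            score += lgamma(alpha / (q * r) + n_jk[(j, k)]) - lgamma(alpha / (q * r))
    return score


# Toy data: disease status is the XOR of two SNPs, so neither SNP is
# informative on its own but the pair determines the phenotype.
snp_a = [0, 0, 1, 1] * 10
snp_b = [0, 1, 0, 1] * 10
disease = [a ^ b for a, b in zip(snp_a, snp_b)]

print("2-SNP model:", bdeu_score([snp_a, snp_b], disease))
print("1-SNP model:", bdeu_score([snp_a], disease))
print("no-SNP model:", bdeu_score([], disease))
```

On this toy data the 2-SNP model scores highest, which is the kind of purely interactive effect that single-locus tests miss. Scoring every 2-SNP model in a real data set would repeat this calculation once per SNP pair.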