BMC Bioinformatics, Journal Year: 2019, Volume and Issue: 20(1). Published: Aug. 28, 2019
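Mining epistatic loci that affect specific phenotypic traits is an important research issue in the field of biology. A Bayesian network (BN) is a graphical model which can express the relationship between genetic loci and phenotype. Until now, it has been widely used for epistasis mining in many research works. However, this method has two disadvantages: low learning efficiency and a tendency to fall into a local optimum. The genetic algorithm excels at rapid global search and at avoiding falling into a local optimum. It is scalable and easy to integrate with other algorithms. This work proposes an epistasis mining approach based on a genetic tabu algorithm and a Bayesian network (Epi-GTBN). It uses the genetic algorithm in the heuristic search strategy of the Bayesian network. The individual structure can be evolved through the genetic operations of selection, crossover and mutation. This can help to find the optimal network structure, and then further to mine the epistasis loci effectively. In order to enhance the diversity of the population and obtain a more effective global optimal solution, we use the tabu search strategy in the crossover and mutation operations of the genetic algorithm. This can help to accelerate the convergence of the algorithm. We compared Epi-GTBN with other recent algorithms using both simulated and real datasets. The experimental results demonstrate that our method has much better epistasis detection accuracy without affecting the efficiency for different datasets. The presented methodology (Epi-GTBN) is an effective method for epistasis detection, and it can be seen as an interesting addition to the arsenal used in complex traits analyses.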

IEEE/ACM Transactions on Computational Biology and Bioinformatics, Journal Year: 2013, Volume and Issue: 10(2), P. 361 - 371. Published: March 1, 2013
Genetic association is a challenging task for the identification and characterization of genes that increase the susceptibility to common complex multifactorial diseases. To fully execute genetic studies of complex diseases, modern geneticists face the challenge of detecting interactions between loci. A genetic algorithm (GA) is developed to detect the association of genotype frequencies of cancer cases and noncancer cases based on statistical analysis. An improved genetic algorithm (IGA) is proposed to improve the reliability of the GA method for high-dimensional SNP-SNP interactions. The strategy offers the top five results to the random population process, in which they guide the GA toward a significant search course. The IGA increases the likelihood of quickly detecting the maximum ratio difference between cancer cases and noncancer cases. The study systematically evaluates the joint effect of 23 SNP combinations of six steroid hormone metabolism and signaling-related genes involved in breast carcinogenesis pathways, with IGA successfully detecting significant ratio differences between breast cancer cases and noncancer cases. The possible breast cancer risks were subsequently analyzed by odds-ratio (OR) and risk-ratio analysis. The estimated OR of the best SNP barcode is significantly higher than 1 (between 1.15 and 7.01) for specific combinations of two to 13 SNPs. Analysis results support that the IGA provides higher ratio difference values than the GA between breast cancer cases and noncancer cases over 3-SNP to 13-SNP interactions. A more specific SNP-SNP interaction profile for the risk of breast cancer is also provided.
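As a rough illustration of the odds-ratio analysis mentioned in this abstract, the following minimal Python sketch computes an OR from a 2x2 table of cases and noncancer controls that do or do not carry a given SNP barcode. The function name and the counts are hypothetical and are not taken from the cited study; the paper's own statistical procedure may differ.

```python
def odds_ratio(case_with, case_without, control_with, control_without):
    """Odds ratio for a 2x2 table: barcode carriers vs. non-carriers.

    All four arguments are counts. The names are illustrative assumptions,
    not identifiers from the cited paper.
    """
    if case_without == 0 or control_with == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    # Odds of carrying the barcode among cases, divided by the same odds among controls.
    return (case_with * control_without) / (case_without * control_with)


# Hypothetical counts for illustration only.
example_or = odds_ratio(case_with=30, case_without=70,
                        control_with=10, control_without=90)
print(f"OR = {example_or:.3f}")
```

An OR above 1 would suggest the barcode is more common among cases, while an OR below 1 would suggest a protective association.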

IEEE/ACM Transactions on Computational Biology and Bioinformatics, Journal Year: 2016, Volume and Issue: 15(2), P. 599 - 612. Published: Dec. 2, 2016
In this era of genome-wide association studies (GWAS), the quest for understanding the genetic architecture of complex diseases is rapidly increasing more than ever before. The development of high throughput genotyping and next generation sequencing technologies enables epidemiological analysis of large scale data. These advances have led to the identification of a number of single nucleotide polymorphisms (SNPs) responsible for disease susceptibility. The interactions between SNPs associated with complex diseases are increasingly being explored in the current literature. These interaction studies are mathematically challenging and computationally complex. These challenges have been addressed by a number of data mining and machine learning approaches. This paper reviews the methods and the related software packages to detect the SNP interactions that contribute to diseases. The issues that need to be considered when developing these models are addressed in this review. The paper also reviews the achievements in data simulation to evaluate the performance of these models. Further, it discusses the future of SNP interaction analysis.

2017 International Joint Conference on Neural Networks (IJCNN), Journal Year: 2017, Volume and Issue: unknown, P. 2743 - 2750. Published: May 1, 2017
This paper presents a novel approach based on the analysis of genetic variants from publicly available genetic profiles and the manually curated database, the National Human Genome Research Institute Catalog. Using data science techniques, genetic variants are identified in the collected participant profiles and then indexed as risk variants in the National Human Genome Research Institute Catalog. Indexed genetic variants, or Single Nucleotide Polymorphisms, are used as inputs in various machine learning algorithms for the prediction of obesity. Body mass index status of participants is divided into two classes, Normal Class and Risk Class. Dimensionality reduction tasks are performed to generate a set of principal variables - 13 SNPs - for the application of various machine learning methods. The models are evaluated using receiver operator characteristic curves and the area under the curve. Machine learning techniques including gradient boosting, generalized linear model, classification and regression trees, k-nearest neighbours, support vector machines, random forest and multilayer perceptron neural network are comparatively assessed in terms of their ability to identify the most important factors among the initial 6622 variables describing genetic variants, age and gender, and to classify a subject into one of the body mass index related classes defined in this study. Our simulation results indicated that the highest area under the curve value generated was 90.5%.

Genome Medicine, Journal Year: 2009, Volume and Issue: 1(2), P. 28 - 28. Published: Jan. 1, 2009
Despite the recent success of genome-wide association studies (GWASs) in identifying loci consistently associated with coronary artery disease (CAD), a large proportion of the genetic components of CAD and its metabolic risk factors, including plasma lipids, type 2 diabetes and body mass index, remain unattributed. Gene-gene and gene-environment interactions might produce a meaningful improvement in quantification of the genetic determinants of CAD. Testing for gene-gene and gene-environment interactions is thus a new frontier for large-scale GWASs of CAD. There are several anecdotal examples of monogenic susceptibility to CAD in which the phenotype was worsened by an adverse environment. In addition, small-scale candidate gene association studies with functional hypotheses have identified gene-environment interactions. For future evaluation of gene-gene and gene-environment interactions to achieve the same success as the single gene associations reported in recent GWASs, it will be important to pre-specify agreed standards of study design and statistical power, environmental exposure measurement, phenomic characterization and analytical strategies. Here we discuss these issues, particularly in relation to the investigation and potential clinical utility of gene-gene and gene-environment interactions in CAD.

PLoS ONE, Journal Year: 2012, Volume and Issue: 7(5), P. e37018 - e37018. Published: May 18, 2012
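Possible single nucleotide polymorphism (SNP) interactions in breast cancer are usually not investigated in genome-wide association studies. Previously, we proposed a particle swarm optimization (PSO) method to compute these kinds of SNP interactions. However, this PSO does not guarantee finding the best result in every run, especially when high-dimensional data is investigated for SNP-SNP interactions. In this study, we propose an IPSO algorithm to improve the reliability of the PSO for the identification of the best protective SNP barcodes (SNP combinations and genotypes with the maximum difference between cases and controls) associated with breast cancer. SNP barcodes containing different numbers of SNPs were computed. The top five barcode results are retained for computing the next SNP barcode with a one-SNP-increase for each processing step. Based on the simulated data for 23 SNPs of six steroid hormone metabolism and signalling-related genes, the performance of our proposed IPSO algorithm is evaluated. Among 23 SNPs, 13 SNPs displayed significant odds ratio (OR) values (1.268 to 0.848; p<0.05) for breast cancer. Based on the IPSO algorithm, the joint effect in terms of SNP barcodes with two to seven SNPs shows significantly decreasing OR values (0.84 to 0.57; p<0.05 to 0.001). Using the PSO algorithm, two to four SNPs show significantly decreasing OR values (0.84 to 0.77; p<0.05 to 0.001). Based on the results of 20 simulations, medians of the maximum differences for each SNP barcode generated by IPSO are higher than by PSO. The interquartile ranges of the boxplot, as well as the upper and lower hinges for each n-SNP barcode (n = 3∼10), are narrower in IPSO than in PSO, suggesting that IPSO is highly reliable for SNP barcode identification. Overall, the proposed IPSO algorithm is robust and can provide exact identification of the best protective SNP barcodes for breast cancer.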

International Journal of Proteomics, Journal Year: 2014, Volume and Issue: 2014, P. 1 - 22. Published: Dec. 11, 2014
In the past, there was a massive growth of knowledge of unknown proteins with the advancement of high throughput microarray technologies. Protein function prediction is the most challenging problem in bioinformatics. In the past, homology based approaches were used to predict protein function, but they failed when a new protein was different from the previous one. Therefore, to alleviate the problems associated with the traditional homology based approaches, numerous computational intelligence techniques have been proposed in the recent past. This paper presents a state-of-the-art comprehensive review of various computational intelligence techniques for protein function predictions using sequence, structure, protein-protein interaction network, and gene expression data, used in wide areas of applications such as prediction of DNA and RNA binding sites, subcellular localization, enzyme functions, signal peptides, catalytic residues, nuclear/G-protein coupled receptors, membrane proteins, and pathway analysis from gene expression datasets. This paper also summarizes the results obtained by many researchers to solve these problems by using computational intelligence techniques with appropriate datasets to improve the prediction performance. The summary shows that ensemble classifiers and integration of multiple heterogeneous data are useful for protein function prediction.

Annals of Translational Medicine, Journal Year: 2019, Volume and Issue: 7(24), P. 813 - 813. Published: Dec. 1, 2019
Identified genetic variants from genome wide association studies frequently show only modest effects on the disease risk, leading to the "missing heritability" problem. An avenue to account for a part of this "missingness" is to evaluate gene-gene interactions (epistasis), thereby elucidating their effect on complex diseases. This can potentially help with identifying gene functions, pathways, and drug targets. However, the exhaustive evaluation of all possible genetic interactions among millions of single nucleotide polymorphisms (SNPs) raises several issues, otherwise known as the "curse of dimensionality". The dimensionality involved in the epistatic analysis of such exponentially growing SNPs diminishes the usefulness of traditional, parametric statistical methods. The immense popularity of multifactor dimensionality reduction (MDR), a non-parametric method proposed in 2001 that classifies multi-dimensional genotypes into one-dimensional binary approaches, led to the emergence of a fast-growing collection of methods that were based on the MDR approach. Moreover, machine-learning (ML) methods such as random forests and neural networks (NNs), deep-learning (DL) and hybrid approaches have also been applied profusely, in recent years, to tackle the dimensionality issue associated with whole genome gene-gene interaction studies. However, exhaustive searching in MDR based approaches, or variable selection in ML methods, still poses the risk of missing out on relevant SNPs. Furthermore, interpretability issues are a major hindrance for DL methods. To minimize the loss of information, Python based tools such as PySpark can potentially take advantage of distributed computing resources in the cloud, to bring back smaller subsets of data for further local analysis. Parallel computing can be a powerful resource that stands to fight this "curse". PySpark supports all standard Python libraries and C extensions, thus making it convenient to write codes that deliver dramatic improvements in processing speed for extraordinarily large sets of data.
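To give a sense of the scale behind the "curse of dimensionality" described in this abstract, the short Python sketch below counts how many distinct SNP combinations an exhaustive search would have to evaluate at each interaction order. The helper name and the figure of one million SNPs are illustrative assumptions, not values from the cited article.

```python
from math import comb


def n_interactions(n_snps: int, order: int) -> int:
    """Number of distinct SNP subsets of size `order` among `n_snps` SNPs."""
    return comb(n_snps, order)


# Illustrative panel size; real studies vary widely.
N_SNPS = 1_000_000

for order in (2, 3, 4):
    # Each subset is one candidate interaction an exhaustive search must test.
    print(f"order {order}: {n_interactions(N_SNPS, order):,} candidate combinations")
```

Even pairwise testing at this panel size already involves roughly 5 x 10^11 candidate pairs, which is why filtering, heuristic search, or distributed computation is typically considered.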

Genome Biology, Journal Year: 2024, Volume and Issue: 25(1). Published: Nov. 19, 2024
Epistasis refers to changes in the effect on phenotype of a unit of genetic information, such as a single nucleotide polymorphism or a gene, dependent on the context of other genetic units. Such interactions are both biologically plausible and good candidates to explain observations which are not fully explained by an additive heritability model. However, the search for epistasis has so far largely failed to recover this missing heritability. We identify key challenges and propose that future works need to leverage idealized systems, known biology and even previously identified epistatic interactions, in order to guide the search for new interactions.