The identification of human proteins that are amenable to pharmacologic modulation without significant off-target effects remains an important unsolved challenge. Computational methods have been devised to identify features which distinguish between “druggable” and “undruggable” proteins, finding protein sequence, tissue and cellular localization, biological role, and position in the protein-protein interaction network to all be discriminant factors. However, many prior efforts to automate the assessment of protein druggability suffer from low performance or poor interpretability. We developed a neural network-based machine learning model capable of generating sub-scores based on each of four distinct categories, combining them to form an overall druggability score. The model achieves excellent performance in separating drugged and undrugged proteins in the human proteome, with an area under the receiver operating characteristic curve (AUC) of 0.95. Our use of multiple sub-scores allows the assessment of potential protein targets of interest based on distinct contributors to druggability, leading to a more interpretable and holistic model to identify novel targets.
Journal of Clinical Medicine, Journal Year: 2022, Volume and Issue: 11(21), P. 6305 - 6305
Published: Oct. 26, 2022
Objectives: To develop a machine learning (ML)-based framework using red blood cell (RBC) parameters for the prediction of α+-thalassemia trait (α+-thal trait) and to compare the diagnostic performance with a conventional method using a single RBC parameter or a combination of RBC parameters. Methods: A retrospective study was conducted on possible couples at risk of a fetus with hemoglobin H (Hb H) disease. Subjects with molecularly confirmed normal status (not thalassemia), α+-thal trait, and two-allele α-thalassemia mutation were included. Clinical parameters (age and gender) and RBC parameters (Hb, Hct, MCV, MCH, MCHC, RDW, and RBC count) obtained from their antenatal thalassemia screen were retrieved and analyzed using the ML-based and the conventional method. The diagnostic performance was evaluated. Results: In total, 594 cases (female/male: 330/264, mean age: 29.7 ± 6.6 years) were included in the analysis. There were 229 normal controls, 160 cases in the α+-thal trait category, and 205 cases in the two-allele α-thalassemia mutation category, respectively. The ML-derived model improved the diagnostic performance, giving a sensitivity of 80% and a specificity of 81%. The experimental results indicated that DeepThal achieved better performance compared with other ML-based methods in terms of the independent test dataset, with an accuracy of 80.77%, a sensitivity of 70.59%, and a Matthews correlation coefficient (MCC) of 0.608. Of all RBC parameters, MCH < 28.95 pg as a single RBC parameter had the highest performance in predicting α+-thal trait, with an AUC of 0.857 and a 95% CI of 0.816−0.899. The combination model derived from binary logistic regression analysis exhibited an AUC of 0.868 and a 95% CI of 0.830−0.906, giving a sensitivity of 80.1% and a specificity of 75.1%. Conclusions: The RBC parameters dataset is sufficient to demonstrate that an ML-based framework is capable of accurately predicting α+-thal trait. It is anticipated that DeepThal will be a useful tool for the scientific community for the large-scale prediction of α+-thal trait.
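As a worked illustration of the metrics quoted in this abstract, the sketch below computes accuracy, sensitivity, specificity, and MCC from a 2x2 confusion matrix. The function name `binary_metrics` is hypothetical, and the counts are an assumption chosen purely for illustration: the abstract does not report a confusion matrix. If the 70.59% figure is read as a sensitivity, these counts happen to reproduce the reported accuracy (80.77%), sensitivity (70.59%), and MCC (0.608), but they should not be taken as the study's actual test-set composition.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return accuracy, sensitivity, specificity and MCC for a 2x2 confusion matrix."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts (an assumption, not taken from the paper).
metrics = binary_metrics(tp=24, fp=5, tn=39, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when the classes are not evenly balanced.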
Journal of Cheminformatics, Journal Year: 2023, Volume and Issue: 15(1)
Published: July 19, 2023
BMC Bioinformatics, Journal Year: 2023, Volume and Issue: 24(1)
Published: July 28, 2023
Abstract
Background
The identification of tumor T cell antigens (TTCAs) is crucial for providing insights into their functional mechanisms and utilizing their potential in anticancer vaccines development. In this context, TTCAs are highly promising. Meanwhile, experimental technologies for discovering and characterizing new TTCAs are expensive and time-consuming. Although many machine learning (ML)-based models have been proposed for identifying new TTCAs, there is still a need to develop a robust model that can achieve higher rates of accuracy and precision.
Results
In this study, we propose a stacking ensemble learning-based framework, termed StackTTCA, for accurate and large-scale identification of TTCAs. Firstly, we constructed 156 different baseline models by using 12 different feature encoding schemes and 13 popular ML algorithms. Secondly, these baseline models were trained and employed to create a new probabilistic feature vector. Finally, the optimal probabilistic feature vector was determined based on the feature selection strategy and then used for the construction of our stacked model. Comparative benchmarking experiments indicated that StackTTCA clearly outperformed several ML classifiers and the existing methods in terms of the independent test, with an accuracy of 0.932 and a Matthew's correlation coefficient of 0.866.
Conclusions
In summary, the proposed stacking ensemble learning-based framework could help to precisely and rapidly identify true TTCAs for follow-up experimental verification. In addition, we developed an online web server (http://2pmlab.camt.cmu.ac.th/StackTTCA) to maximize user convenience for high-throughput screening of novel TTCAs.
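The three steps described in this abstract (a grid of baseline models, a probabilistic feature vector, and a selected subset feeding a stacked model) can be sketched roughly as follows. This is a minimal illustration of the data flow under stated assumptions, not the authors' implementation: the encoding and algorithm names are placeholders (the abstract gives only the counts, 12 and 13), the baseline models are toy callables, the feature selection is a simple class-mean-difference ranking, and the meta-model is a plain average where a real stacked model would train a meta-classifier on the selected probabilities.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

# Placeholder names: the abstract states only that there are 12 encodings and 13 algorithms.
ENCODINGS = [f"encoding_{i}" for i in range(1, 13)]
ALGORITHMS = [f"algorithm_{j}" for j in range(1, 14)]

Model = Callable[[str], float]  # maps a peptide sequence to a predicted probability


def baseline_grid() -> List[Tuple[str, str]]:
    """Step 1: one baseline model per (encoding, algorithm) pair."""
    return list(product(ENCODINGS, ALGORITHMS))


def probabilistic_vector(models: Sequence[Model], sequence: str) -> List[float]:
    """Step 2: collect each baseline model's predicted probability into one vector."""
    return [model(sequence) for model in models]


def select_columns(vectors: List[List[float]], labels: List[int], k: int) -> List[int]:
    """Step 3a (assumed strategy): keep the k columns whose mean differs most between classes."""
    pos = [v for v, y in zip(vectors, labels) if y == 1]
    neg = [v for v, y in zip(vectors, labels) if y == 0]

    def gap(col: int) -> float:
        mean_pos = sum(v[col] for v in pos) / len(pos)
        mean_neg = sum(v[col] for v in neg) / len(neg)
        return abs(mean_pos - mean_neg)

    return sorted(range(len(vectors[0])), key=gap, reverse=True)[:k]


def stacked_predict(vector: List[float], columns: List[int]) -> float:
    """Step 3b (stand-in meta-model): average the selected probabilities."""
    return sum(vector[c] for c in columns) / len(columns)


def toy_model(index: int) -> Model:
    """Hypothetical stand-in for a trained baseline model.

    Even-indexed models score the fraction of 'K' residues; odd-indexed ones
    always return 0.5 and so carry no information.
    """
    def predict(sequence: str) -> float:
        if index % 2 == 0:
            return sequence.count("K") / len(sequence)
        return 0.5
    return predict


grid = baseline_grid()
models = [toy_model(i) for i in range(len(grid))]

# Toy training data (invented for illustration only): label 1 = positive class.
train = [("KKKA", 1), ("KKAK", 1), ("AAAA", 0), ("AAAK", 0)]
vectors = [probabilistic_vector(models, seq) for seq, _ in train]
labels = [label for _, label in train]

columns = select_columns(vectors, labels, k=10)
score = stacked_predict(probabilistic_vector(models, "KKKK"), columns)
print(len(grid), columns, score)
```

In this toy setup the selection step keeps only the informative (even-indexed) columns, which is the point of filtering the probabilistic vector before fitting the final model.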
PLoS ONE, Journal Year: 2023, Volume and Issue: 18(8), P. e0290538 - e0290538
Published: Aug. 25, 2023
Hepatitis C virus (HCV) infection is a concerning health issue that causes chronic liver diseases. Despite many successful therapeutic outcomes, no effective HCV vaccines are currently available. Focusing on T cell activity, the primary effector for HCV clearance, T cell epitopes of HCV (TCE-HCV) are considered promising elements to accelerate HCV vaccine efficacy. Thus, accurate and rapid identification of TCE-HCVs is recommended to obtain more efficient therapy for chronic HCV infection. In this study, a novel sequence-based stacked approach, termed TROLLOPE, is proposed to accurately identify TCE-HCVs from sequence information. Specifically, we employed 12 different sequence-based feature descriptors from heterogeneous perspectives, such as physicochemical properties, composition-transition-distribution information and composition information. These descriptors were used in cooperation with 12 popular machine learning (ML) algorithms to create 144 base-classifiers. To maximize the utility of these base-classifiers, we used a feature selection strategy to determine a collection of potential base-classifiers and integrated them to develop the meta-classifier. Comprehensive experiments based on both cross-validation and independent tests demonstrated the superior predictive performance of TROLLOPE compared with conventional ML classifiers, with cross-validation and independent test accuracies of 0.745 and 0.747, respectively. Finally, a user-friendly online web server of TROLLOPE (http://pmlabqsar.pythonanywhere.com/TROLLOPE) has been developed to serve research efforts in the large-scale identification of potential TCE-HCVs for follow-up experimental verification.
OMICS A Journal of Integrative Biology, Journal Year: 2024, Volume and Issue: 28(3), P. 148 - 161
Published: March 1, 2024
Breast cancer is the leading cause of cancer-related deaths among women globally. Breast cancer metastasis is a complex and still inadequately understood process and a key dimension of mortality attendant to breast cancer. This study reports dysregulated genes across metastatic stages and tissues, shedding light on their molecular interplay in disease pathogenesis and new possibilities for drug discovery. Comprehensive analyses of gene expression data from primary tumor, circulating tumor cells, and distant metastatic sites in the brain, lung, liver, and bone were conducted. Genes dysregulated in multiple tissues were identified as metastatic cascade genes, and are further classified based on functional associations with metastasis-related mechanisms. Their interactions with HUB genes in interactome networks were scrutinized, followed by pathway enrichment analysis. Validation of potential targets included assessments of survival, druggability, prognostic marker status, secretome annotation, protein expression, and cell type association. Results displayed critical metastatic cascade genes and those specific to metastatic sites, revealing the involvement of collagen degradation and the assembly of collagen fibrils and other multimeric structure pathways in driving metastasis. Notably, pivotal metastatic cascade genes FABP4, CXCL12, APOD, and IGF1 emerged with high metastatic potential, linked to significant druggability and survival scores, establishing them as potential therapeutic targets. The significance of this research lies in its potential to uncover novel biomarkers for early detection, therapeutic targets, and a deeper understanding of the molecular mechanisms underpinning metastatic breast cancer, with an eye to precision/personalized medicine.
PubMed, Journal Year: 2023, Volume and Issue: 22, P. 915 - 927
Published: Jan. 1, 2023
Efficiently and precisely identifying drug targets is crucial for developing and discovering potential medications. While conventional experimental approaches can accurately pinpoint these targets, they suffer from time constraints and are not easily adaptable to high-throughput processes. On the other hand, computational approaches, particularly those utilizing machine learning (ML), offer an efficient means to accelerate the prediction of druggable proteins based solely on their primary sequences. Recently, several state-of-the-art computational methods have been developed for predicting and analyzing druggable proteins. These computational methods showed high diversity in terms of benchmark datasets, feature extraction schemes, ML algorithms, evaluation strategies and webserver/software usability. Thus, our objective is to reexamine these computational approaches and conduct a comprehensive assessment of their strengths and weaknesses across multiple aspects. In this study, we deliver the first comprehensive survey regarding the state-of-the-art computational approaches for in silico prediction of druggable proteins. First, we provided information regarding the existing benchmark datasets and the types of ML methods employed. Second, we investigated the effectiveness of these computational methods in druggable protein identification for each benchmark dataset. Third, we summarized the important features used in this field and the existing webserver/software. Finally, we addressed the present constraints of the existing methods and offer valuable guidance to the scientific community in designing and developing novel prediction models. We anticipate that this comprehensive review will provide crucial information for the development of more accurate and efficient druggable protein predictors.
Crime Prevention and Community Safety, Journal Year: 2024, Volume and Issue: 26(4), P. 440 - 489
Published: Nov. 20, 2024
This research addresses the potential for tackling crime volumes and improving crime analytics through new enhancement strategies. The use of machine learning and deep learning solutions is increasing in crime prediction, as in many other fields. This study aims to strengthen proactive approaches in criminology by evaluating the effectiveness of the stacking-based ensemble learning (S-BEL) model, which can enhance overall performance by combining the strengths of various machine learning algorithms to improve crime prediction and facilitate crime prevention. The study analyzes six studies leveraging the S-BEL model, along with 28 articles on seven [...] utilizing [...] models, and 56 general crime prediction studies. The findings highlight that the S-BEL model stands out as a prominent technique in crime prediction, providing valuable insights for law enforcement.
Targets, Journal Year: 2024, Volume and Issue: 2(4), P. 446 - 469
Published: Dec. 4, 2024
Alzheimer's disease is a neurodegenerative disease that continues to have a rising number of cases. While extensive research has been conducted in the last few decades, only a few drugs have been approved by the FDA for treatment, and even fewer aim to be curative rather than manage symptoms. There remains an urgent need for understanding disease pathogenesis, as well as identifying new targets for further drug discovery. Alzheimer's disease (AD) is known to stem from a build-up of amyloid beta (Aβ) plaques as well as tangles of tau proteins. Furthermore, inflammation in the brain is known to arise from the degeneration of tissue and the build-up of insoluble material. Therefore, there is a potential link between the pathology of AD and inflammation in the brain, especially as the disease progresses to later stages where neuronal death and degeneration levels are higher. Proteins that are relevant to both brain inflammation and AD thus make ideal potential targets for therapeutics; however, the proteins need to be evaluated to determine which targets would be ideal for potential therapeutic treatments, or 'druggable'. Druggability analysis was conducted using two structure-based methods (i.e., Drug-Like Density analysis and SiteMap), as well as a sequence-based approach, SPIDER. The most druggable targets were then evaluated using single-nuclei sequencing data for their clinical relevance to AD. For each of the top five targets, small molecule docking was used to evaluate which FDA-approved drugs were able to bind with the chosen proteins. The top targets chosen included DRD2 (inhibits adenylyl cyclase activity), C9 (binds with C5B8 to form the membrane attack complex), C4b (binds with C2a to form C3 convertase), C5AR1 (GPCR that binds C5a), and GABA-A-R (involved in inhibiting neurotransmission). Each target had multiple potential inhibitors from the FDA-approved drug list with decent binding affinities. Among these inhibitors, two were found for more than one protein target. They are C15H14N2O2 and v316 (Paracetamol), used to treat pain/inflammation originally from cataracts and to relieve headaches/fever, respectively. These results provide the groundwork for further experimental investigation or clinical trials.