The identification of human proteins that are amenable to pharmacologic modulation without significant off-target effects remains an important unsolved challenge. Computational methods have been devised to identify features which distinguish between “druggable” and “undruggable” proteins, finding protein sequence, tissue and cellular localization, biological role, and position in the protein-protein interaction network to all be discriminant factors. However, many prior efforts to automate the assessment of druggability suffer from low performance or poor interpretability. We developed a neural network-based machine learning model capable of generating sub-scores based on each of four distinct categories, combining them to form an overall score. The model achieves excellent performance in separating the drugged and undrugged proteome, with an area under the receiver operating characteristic curve (AUC) of 0.95. Our use of multiple sub-scores allows potential targets of interest to be assessed by the contributors to druggability, leading to a more interpretable and holistic assessment of novel targets.
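To make the reported metric concrete: the AUC can be read as the probability that a randomly chosen drugged protein receives a higher score than a randomly chosen undrugged one. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = drugged protein, 0 = undrugged protein.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.92, 0.80, 0.75, 0.40, 0.35, 0.10]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of proteins, so it is suited to illustration only; a rank-based computation gives the same value more efficiently on a full proteome.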
bioRxiv (Cold Spring Harbor Laboratory), 2024. Published: Nov. 8, 2024
Abstract
Drug design and development are central to clinical research, yet ninety percent of drugs fail to reach the clinic, often due to inappropriate selection of drug targets. Conventional methods for target identification lack precision and sensitivity. While various computational tools have been developed to predict the druggability of proteins, they focus on limited subsets of the human proteome or rely solely on amino acid properties. To address the challenge of class imbalance between proteins with and without approved drugs, we propose a novel Partitioning Method. We evaluated the druggability potential of 20,273 reviewed human proteins, of which 2,636 have approved drugs. Our comprehensive analysis of 183 features, encompassing biophysical and sequence-derived properties, achieved a median AUC of 0.86 in target predictions. We utilize SHAP (Shapley Additive Explanations) scores to identify key predictors and interpret their contribution to druggability. We also evaluated 688 investigational proteins from DrugBank (https://go.drugbank.com/) using our tool, DrugProtAI (https://drugprotai.pythonanywhere.com/). The tool offers druggability predictions and access to 2M+ publications on drug targets and their effects, aiding drug development. We believe that insights into the key predictors of druggability will significantly advance drug target identification and propel the field forward.
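The abstract does not describe how the Partitioning Method works. One common way to handle this kind of imbalance, shown here purely as an assumption and not as the authors' procedure, is to split the majority class (proteins without approved drugs) into partitions roughly the size of the minority class, so that each partition can be paired with the full minority class to form a balanced training set. The helper name `partition_majority` and the integer placeholder IDs are hypothetical; only the two counts come from the abstract.

```python
import random


def partition_majority(majority_ids, minority_size, seed=0):
    """Shuffle the majority class and split it into near-equal partitions,
    each roughly the size of the minority class (illustrative assumption)."""
    if minority_size <= 0:
        raise ValueError("minority_size must be positive")
    ids = list(majority_ids)
    random.Random(seed).shuffle(ids)
    n_parts = max(1, round(len(ids) / minority_size))
    # Striding spreads the remainder evenly: partition sizes differ by at most one.
    return [ids[i::n_parts] for i in range(n_parts)]


# Counts from the abstract: 20,273 reviewed proteins, 2,636 with approved drugs.
n_total, n_drugged = 20_273, 2_636
undrugged = range(n_total - n_drugged)  # placeholder integer IDs, not real accessions
parts = partition_majority(undrugged, n_drugged)
print(len(parts), [len(p) for p in parts])
```

Under this assumption, one model would be trained per partition together with all drugged proteins, and the per-model predictions could then be aggregated, for example by averaging.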