British Journal of Clinical Pharmacology,
Journal Year:
2024,
Volume and Issue:
90(6), P. 1514 - 1524
Published: March 20, 2024
Health food products (HFPs) are foods and food products related to maintaining and promoting health. HFPs may sometimes cause unforeseen adverse health effects by interacting with drugs. Considering the importance of information on interactions between HFPs and drugs, this study aimed to establish a workflow to extract Drug-HFP Interactions (DHIs) from open resources.
Discover Oncology,
Journal Year:
2025,
Volume and Issue:
16(1)
Published: March 17, 2025
This study proposes an advanced machine learning (ML) framework for breast cancer diagnostics by integrating transcriptomic profiling with optimized feature selection and classification techniques. A dataset of 1759 samples (987 patients, 772 healthy controls) was analyzed using Recursive Feature Elimination, Boruta, and ElasticNet for feature selection. Dimensionality reduction techniques, including Non-Negative Matrix Factorization (NMF), Autoencoders, and transformer-based embeddings (BioBERT, DNABERT), were applied to enhance model interpretability. Classifiers such as XGBoost, LightGBM, ensemble voting, Multi-Layer Perceptron, and Stacking were trained using grid search with cross-validation. Model evaluation was conducted using accuracy, AUC, MCC, Kappa Score, ROC, and PR curves, with external validation performed on an independent dataset of 175 samples. XGBoost and LightGBM achieved the highest test accuracies (0.91 and 0.90) and AUC values (up to 0.92), particularly with NMF and BioBERT. The ensemble Voting method exhibited the best accuracy (0.92), confirming its robustness. Transformer-based techniques significantly improved model performance compared to conventional approaches like PCA and Decision Trees. The proposed ML framework enhances diagnostic accuracy and interpretability, demonstrating strong generalizability on an independent dataset. These findings highlight its potential for precision oncology and personalized diagnostics.
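As an illustration of two of the evaluation metrics named in this abstract, the sketch below computes the Matthews correlation coefficient (MCC) and Cohen's kappa from the four cells of a binary confusion matrix. It uses the standard textbook definitions only; the function names and the example counts are hypothetical and are not taken from the study.

```python
import math


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    total = tp + fp + fn + tn
    observed = (tp + tn) / total
    # Chance agreement from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example_mcc = mcc(tp=90, fp=10, fn=10, tn=90)
example_kappa = cohen_kappa(tp=90, fp=10, fn=10, tn=90)
```

Both metrics account for all four cells of the confusion matrix, which is why they are often reported alongside plain accuracy when class sizes differ.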
Nature Communications,
Journal Year:
2025,
Volume and Issue:
16(1)
Published: March 20, 2025
Abstract
Accurate prediction of enzyme kinetic parameters is crucial for enzyme exploration and modification. Existing models face the problem of either low accuracy or poor generalization ability due to overfitting. In this work, we first developed unbiased datasets to evaluate the actual performance of these methods and proposed a deep learning model, CataPro, based on pre-trained models and molecular fingerprints to predict turnover number (kcat), Michaelis constant (Km), and catalytic efficiency (kcat/Km). Compared with previous baseline models, CataPro demonstrates clearly enhanced accuracy and generalization ability on the unbiased datasets. In a representational enzyme mining project, by combining CataPro with traditional methods, we identified an enzyme (SsCSO) with 19.53 times increased activity compared with the initial enzyme (CSO2) and then successfully engineered it to improve its activity by 3.34 times. This reveals the high potential of CataPro as an effective tool for future enzyme discovery and modification.
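To make the three kinetic quantities concrete, the sketch below relates them through the standard Michaelis-Menten rate law, v = kcat [E] [S] / (Km + [S]). This is the textbook relationship, not CataPro's prediction model; the function names and numeric values are hypothetical.

```python
def michaelis_menten_rate(kcat, km, enzyme_conc, substrate_conc):
    """Initial reaction rate under the Michaelis-Menten rate law."""
    return kcat * enzyme_conc * substrate_conc / (km + substrate_conc)


def catalytic_efficiency(kcat, km):
    """Catalytic efficiency, defined as kcat / Km."""
    return kcat / km


# Hypothetical values in arbitrary consistent units.
efficiency = catalytic_efficiency(kcat=10.0, km=2.0)
# When substrate concentration equals Km, the rate is half of kcat * [E].
half_max_rate = michaelis_menten_rate(
    kcat=10.0, km=2.0, enzyme_conc=1.0, substrate_conc=2.0
)
```

A higher kcat/Km means the enzyme converts substrate faster at low substrate concentrations, which is why it is a common single-number summary when comparing enzyme variants.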
PLoS neglected tropical diseases,
Journal Year:
2025,
Volume and Issue:
19(4), P. e0012985 - e0012985
Published: April 29, 2025
Background
The identification of B-cell epitopes (BCEs) is fundamental to advancing epitope-based vaccine design, therapeutic antibody development, and diagnostics, such as in neglected tropical diseases caused by parasitic pathogens. However, the structural complexity of parasite antigens and the high cost of experimental validation present certain challenges. Advances in Artificial Intelligence (AI)-driven protein engineering, particularly through machine learning and deep learning, offer efficient solutions to enhance prediction accuracy and reduce costs.
Methodology/Principal findings
Here, we present deepBCE-Parasite, a Transformer-based model designed to predict linear BCEs from peptide sequences. By leveraging a state-of-the-art self-attention mechanism, the model achieved remarkable predictive performance, achieving an accuracy of approximately 81% and an AUC of 0.90 in both 10-fold cross-validation and independent testing. Comparative analyses against 12 handcrafted features and four conventional machine learning algorithms (GNB, SVM, RF, and LGBM) highlighted the superior predictive power of the model. As a case study, deepBCE-Parasite predicted eight BCEs from the leucine aminopeptidase (LAP) protein in Fasciola hepatica proteomic data. Dot-blot immunoassays confirmed the specific binding of seven synthetic peptides to positive sera, validating their IgG reactivity and demonstrating the model’s efficacy in BCE prediction.
Conclusions/Significance
deepBCE-Parasite demonstrates excellent performance in predicting BCEs across diverse parasitic pathogens, offering a valuable tool for advancing the design of epitope-based vaccines, antibodies, and diagnostic applications in parasitology.
bioRxiv (Cold Spring Harbor Laboratory),
Journal Year:
2023,
Volume and Issue:
unknown
Published: Oct. 21, 2023
Abstract
Large Language Models (LLMs) have garnered significant recognition in the life sciences for their capacity to comprehend and utilize knowledge. The contemporary expectation in diverse industries extends beyond employing LLMs merely as chatbots; instead, there is a growing emphasis on harnessing their potential as adept analysts proficient in dissecting intricate issues within these sectors. The realm of bioinformatics is no exception to this trend. In this paper, we introduce Bioinfo-Bench, a novel yet straightforward benchmark framework suite crafted to assess the academic knowledge and data mining capabilities of foundational models in bioinformatics. Bioinfo-Bench systematically gathered data from three distinct perspectives: knowledge acquisition, knowledge analysis, and knowledge application, facilitating a comprehensive examination of LLMs. Our evaluation encompassed prominent models: ChatGPT, Llama, and Galactica. The findings revealed that these LLMs excel in knowledge acquisition, drawing heavily upon their training data for retention. However, their proficiency in addressing practical professional queries and conducting nuanced knowledge inference remains constrained. Given these insights, we are poised to delve deeper into this domain, engaging in further extensive research and discourse. It is pertinent to note that project Bioinfo-Bench is currently in progress, and all associated materials will be made publicly accessible.
International Journal of Molecular Sciences,
Journal Year:
2024,
Volume and Issue:
25(22), P. 12233 - 12233
Published: Nov. 14, 2024
The complexities inherent in drug development are multi-faceted and often hamper accuracy, speed and efficiency, thereby limiting success. This review explores how recent developments in machine learning (ML) are significantly impacting target-based drug discovery, particularly in small-molecule approaches. The Simplified Molecular Input Line Entry System (SMILES), which translates a chemical compound's three-dimensional structure into a string of symbols, is now widely used in drug design, mining, and repurposing. Utilizing ML and natural language processing techniques, SMILES has revolutionized lead identification, high-throughput screening and virtual screening. ML models enhance the accuracy of predicting binding affinity and selectivity, reducing the need for extensive experimental screening. Additionally, deep learning, with its strengths in analyzing spatial and sequential data through convolutional neural networks (CNNs) and recurrent neural networks (RNNs), shows promise in virtual screening, target identification, and de novo drug design. Fragment-based approaches also benefit from ML algorithms and techniques like generative adversarial networks (GANs), which predict fragment properties and binding affinities, aiding in hit selection and design optimization. Structure-based drug design, which relies on high-resolution protein structures, leverages ML models for accurate predictions of binding interactions. While challenges such as interpretability and data quality remain, ML's transformative impact accelerates target-based drug discovery, increasing efficiency and innovation. Its potential to deliver new and improved treatments for various diseases is significant.
Agronomy,
Journal Year:
2024,
Volume and Issue:
14(12), P. 2756 - 2756
Published: Nov. 21, 2024
Genomic selection serves as an effective way for crop genetic breeding, capable of significantly shortening the breeding cycle and improving the accuracy of breeding. Phenotype prediction can help identify genetic variants associated with specific phenotypes. This provides a data-driven selection criterion for genomic selection, making the selection process more efficient and targeted. Deep learning has become an important tool for phenotype prediction due to its abilities in automatic feature learning, nonlinear modeling, and high-dimensional data processing. Current deep learning models have improvements in various aspects, such as predictive performance and computation time, but they still have limitations in capturing the complex relationships between genotype and phenotype, indicating that there is still room for improvement in the accuracy of phenotype prediction. This study innovatively proposes a new method called DeepAT, which mainly includes an input layer, a data feature extraction layer, a feature relationship capture layer, and an output layer. This method can predict wheat yield based on genotype data and has innovations in the following four aspects: (1) The data feature extraction layer of DeepAT can extract representative feature vectors from high-dimensional SNP data. By introducing the ReLU activation function, it enhances the model’s ability to express nonlinear features and accelerates the model’s convergence speed; (2) DeepAT can handle high-dimensional and complex genotype data while retaining as much useful information as possible; (3) The feature relationship capture layer of DeepAT effectively captures the complex relationships between features from low-dimensional features through the self-attention mechanism; (4) Compared to traditional RNN structures, the model training process is more efficient and stable. Using a public wheat dataset from AGT, comparative experiments with three machine learning and six deep learning methods found that DeepAT exhibited better predictive performance than other methods, achieving a prediction accuracy of 99.98%, a mean squared error (MSE) of only 28.93 tonnes, and a Pearson correlation coefficient close to 1, with yield predicted values closely matching observed values. This method provides a new perspective for deep learning-assisted phenotype prediction and has great potential in smart breeding.
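The two regression metrics reported in this abstract, mean squared error and the Pearson correlation coefficient, can be sketched as follows. These are the standard definitions written in plain Python; the function names and the toy values are hypothetical and unrelated to the wheat dataset.

```python
import math


def mean_squared_error(observed, predicted):
    """Average of squared differences between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Toy values: predictions are exactly twice the observations.
observed = [1.0, 2.0, 3.0]
predicted = [2.0, 4.0, 6.0]
toy_mse = mean_squared_error(observed, predicted)
toy_r = pearson_r(observed, predicted)
```

The toy example shows why the two metrics are usually reported together: the correlation is perfect because the predictions are linearly related to the observations, yet the squared error is far from zero.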
Current Bioinformatics,
Journal Year:
2024,
Volume and Issue:
19(9), P. 810 - 824
Published: Feb. 2, 2024
Introduction: More recent self-supervised deep language models, such as Bidirectional Encoder Representations from Transformers (BERT), have performed the best on some language tasks by contextualizing word embeddings for a better dynamic representation. Their protein-specific versions, such as ProtBERT, generated dynamic protein sequence embeddings, which resulted in better performance for several bioinformatics tasks. Besides, a number of different protein post-translational modifications are prominent in cellular tasks such as development and differentiation. The current biological experiments can detect these modifications, but within a longer duration and with a significant cost.
Methods: In this paper, to comprehend the accompanying biological processes concisely and more rapidly, we propose DEEPPTM to predict protein post-translational modification (PTM) sites from protein sequences more efficiently. Different than the current methods, DEEPPTM enhances the modification prediction performance by integrating specialized ProtBERT-based protein embeddings with attention-based vision transformers (ViT), and reveals the associations between different modification types and protein sequence content. Additionally, it can infer several different modifications over different species.
Results: Human and mouse ROC AUCs for predicting Succinylation modifications were 0.793 and 0.661 respectively, once 10-fold cross-validation is applied. Similarly, we have obtained 0.776, 0.764, and 0.734 ROC AUC scores on inferring ubiquitination, crotonylation, and glycation sites, respectively. According to detailed computational experiments, DEEPPTM lessens the time spent in laboratory experiments while outperforming the competing methods as well as baselines on inferring all 4 modification sites. In our case, attention-based deep learning methods such as vision transformers look more favorable to learn from ProtBERT features than more traditional deep learning and machine learning techniques.
Conclusion: Additionally, the protein-specific ProtBERT model is more effective than the original BERT embeddings for PTM prediction tasks. Our code and datasets can be found at https://github.com/seferlab/deepptm.
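For readers unfamiliar with the ROC AUC scores quoted in the Results, the sketch below computes the metric from its pairwise definition: the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one, with ties counted as half. This is a generic illustration, not code from the DEEPPTM repository; the function name, labels, and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    # Each positive/negative pair counts 1 if ranked correctly, 0.5 if tied.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = modified site, 0 = unmodified) and model scores.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations; rank-based implementations are preferable for large datasets.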