Scientific Reports,
Journal Year:
2024,
Volume and Issue:
14(1)
Published: Nov. 7, 2024
Viral oncoproteins play crucial roles in transforming normal cells into cancer cells, representing a significant factor in the etiology of various cancers. Traditionally, identifying these oncoproteins is both time-consuming and costly. With advancements in computational biology, bioinformatics tools based on machine learning have emerged as effective methods for predicting biological activities. Here, for the first time, we propose an innovative approach that combines Generative Adversarial Networks (GANs) with supervised learning to enhance the accuracy and generalizability of viral oncoprotein prediction. Our methodology evaluated multiple models, including Random Forest, Multilayer Perceptron, Light Gradient Boosting Machine, eXtreme Gradient Boosting, and Support Vector Machine. In ten-fold cross-validation on our training dataset, the GAN-enhanced Random Forest model demonstrated superior performance across accuracy (0.976), F1 score, precision (0.977), sensitivity, and AUC (1.0). During independent testing, this model achieved 0.982 [...]. These results establish our new tool, VirOncoTarget, which is accessible via a web application. We anticipate that VirOncoTarget will be a valuable resource for researchers, enabling rapid and reliable prediction of viral oncoproteins and advancing the understanding of their role in cancer biology.
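The evaluation protocol described in this abstract (ten-fold cross-validation scored by accuracy, precision, sensitivity, and F1) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `k_fold_indices` and `binary_metrics` are hypothetical, labels are assumed to be binary with 1 marking an oncoprotein, and the folds are contiguous rather than shuffled or stratified.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for k contiguous folds.

    Assumption: samples are already in random order. A real pipeline would
    usually shuffle and stratify by class before splitting.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, sensitivity (recall), and F1 for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + sensitivity
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": 2 * precision * sensitivity / denom if denom else 0.0,
    }
```

In use, a model would be fitted on each `train` index list and scored on the matching `test` list, with the per-fold metric dictionaries averaged at the end.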
Scientific Reports,
Journal Year:
2025,
Volume and Issue:
15(1)
Published: Jan. 2, 2025
Worldwide, cancer remains a significant health concern due to its high mortality rates. Despite numerous traditional therapies and wet-laboratory methods for treating cancer-affected cells, these approaches often face limitations, including costs and substantial side effects. Recently, the selectivity of peptides has garnered attention from scientists due to their reliable targeted actions and minimal adverse effects. Furthermore, building on the outcomes of existing computational models, we propose a highly effective model, namely pACP-HybDeep, for the accurate prediction of anticancer peptides. In this model, the training peptides are numerically encoded using an attention-based ProtBERT-BFD encoder to extract semantic features, along with CTDT-based structural information. A k-nearest neighbor-based binary tree growth (BTG) algorithm is employed to select the optimal feature set from the multi-perspective vector. The selected feature vector is subsequently trained using a CNN + RNN-based deep learning model. Our proposed model demonstrated an accuracy of 95.33% and an AUC of 0.97. To validate the generalization capabilities of our model, it was tested on the independent datasets Ind-S1, Ind-S2, and Ind-S3, where it achieved accuracies of 94.92%, 92.26%, and 91.16%, respectively. The efficacy and reliability of the model on the test datasets establish it as a valuable tool for researchers in academia and pharmaceutical drug design.
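The CTDT encoding mentioned in this abstract refers to the transition part of the composition-transition-distribution descriptors. The sketch below shows the idea for a single physicochemical property. It is an assumption-laden illustration rather than the pACP-HybDeep feature extractor: the three-class hydrophobicity grouping and the function name `ctd_transition` are chosen here for demonstration, and a full CTDT vector would repeat the calculation over several properties.

```python
from typing import Dict, Mapping

# Assumed three-class hydrophobicity grouping, used here for illustration only.
HYDROPHOBICITY_GROUPS: Dict[str, str] = {
    "polar": "RKEDQN",
    "neutral": "GASTPHY",
    "hydrophobic": "CLVIMFW",
}


def ctd_transition(
    sequence: str, groups: Mapping[str, str] = HYDROPHOBICITY_GROUPS
) -> Dict[str, float]:
    """Return the frequency of adjacent residues that switch between class pairs.

    Residues outside the grouping (for example 'X') are skipped, which makes
    their neighbours adjacent; that is a simplification of this sketch.
    """
    residue_to_group = {aa: name for name, members in groups.items() for aa in members}
    classes = [residue_to_group[aa] for aa in sequence.upper() if aa in residue_to_group]

    names = list(groups)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    counts = {pair: 0 for pair in pairs}

    for left, right in zip(classes, classes[1:]):
        if left != right:
            # Transitions are unordered: polar->neutral counts the same as neutral->polar.
            key = (left, right) if (left, right) in counts else (right, left)
            counts[key] += 1

    total = max(len(classes) - 1, 1)  # number of adjacent residue pairs
    return {f"{a}-{b}": counts[(a, b)] / total for a, b in pairs}
```

For example, `ctd_transition("RGCR")` sees the class sequence polar, neutral, hydrophobic, polar, so each of the three class pairs accounts for one of the three adjacent positions.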
PLoS ONE,
Journal Year:
2025,
Volume and Issue:
20(2), P. e0317396 - e0317396
Published: Feb. 10, 2025
In recent years, the challenge of imbalanced data has become increasingly prominent in machine learning, affecting the performance of classification algorithms. This study proposes a novel data-level oversampling method called Cluster-Based Reduced Noise SMOTE (CRN-SMOTE) to address this issue. CRN-SMOTE combines SMOTE for oversampling minority classes with a cluster-based noise reduction technique. In this approach, it is crucial that samples from each category form one or two clusters, a feature that conventional noise reduction methods do not achieve. The proposed method is evaluated on four imbalanced datasets (ILPD, QSAR, Blood, and Maternal Health Risk) using five metrics: Cohen's kappa, Matthews correlation coefficient (MCC), F1-score, precision, and recall. Results demonstrate that CRN-SMOTE consistently outperformed the state-of-the-art Reduced Noise SMOTE (RN-SMOTE), SMOTE-Tomek Link, and SMOTE-ENN methods across all datasets, with particularly notable improvements observed on the QSAR and Maternal Health Risk datasets, indicating its effectiveness in enhancing classification performance. Overall, the experimental findings indicate that CRN-SMOTE outperformed RN-SMOTE in 100% of cases, achieving average improvements of 6.6% in kappa, 4.01% in MCC, and 1.87%, 1.7%, and 2.05% in F1-score, precision, and recall, respectively, with SMOTE's number of neighbors set to 5.
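The SMOTE step that CRN-SMOTE builds on interpolates between a minority sample and one of its k nearest minority neighbours. The sketch below covers only that plain SMOTE interpolation with k = 5. It does not implement the clustering or noise-reduction stages of CRN-SMOTE, and the function name `smote` and its parameters are illustrative.

```python
import math
import random
from typing import List, Sequence


def smote(
    minority: Sequence[Sequence[float]], n_new: int, k: int = 5, seed: int = 0
) -> List[List[float]]:
    """Generate n_new synthetic minority samples by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic: List[List[float]] = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the chosen sample.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic
```

Because every synthetic point lies on a segment between two existing minority samples, noisy or isolated minority points can seed new samples in majority territory, which is the kind of problem a noise-reduction stage is meant to limit.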
Journal of Chemical Information and Modeling,
Journal Year:
2024,
Volume and Issue:
unknown
Published: Dec. 3, 2024
Inflammation is a biological response to harmful stimuli, playing a crucial role in facilitating tissue repair by eradicating pathogenic microorganisms. However, when inflammation becomes chronic, it leads to numerous serious disorders, particularly autoimmune diseases. Anti-inflammatory peptides (AIPs) have emerged as promising therapeutic agents due to their high specificity, potency, and low toxicity. However, identifying AIPs using traditional in vivo methods is time-consuming and expensive. Recent advancements in computational-based intelligent models have offered a cost-effective alternative for various inflammatory diseases, owing to the peptides' selectivity toward targeted cells with low side effects. In this paper, we propose a novel computational model, namely DeepAIPs-Pred, for the accurate prediction of AIP sequences. The training samples are represented using LBP-PSSM- and LBP-SMR-based evolutionary image transformation methods. Additionally, to capture contextual semantic features, we employed an attention-based ProtBERT-BFD embedding and QLC for structural features. Furthermore, differential evolution (DE)-based weighted feature integration is utilized to produce a multiview feature vector. SMOTE-Tomek Links is introduced to address the class imbalance problem, and a two-layer feature selection technique is proposed to reduce the feature vector and select the optimal features. Finally, self-normalized bidirectional temporal convolutional networks (SnBiTCN) are trained on the optimal features, achieving a significant predictive accuracy of 94.92% and an AUC of 0.97. The generalization of our model is validated on two independent datasets, demonstrating accuracy improvements of approximately 2% and 10% over existing state-of-the-art models on Ind-I and Ind-II, respectively. The efficacy and reliability of DeepAIPs-Pred highlight its potential as a valuable tool for drug development and research in academia.
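The Tomek-link cleaning that follows SMOTE in a SMOTE-Tomek Links scheme can be illustrated with a brute-force search. A Tomek link is a pair of samples from different classes that are each other's nearest neighbour. The sketch below is a generic illustration, not the DeepAIPs-Pred code: the function names are hypothetical, and removing both members of each link is one of several possible policies (another drops only the majority-class member).

```python
import math
from typing import List, Sequence, Tuple


def tomek_links(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j), i < j, that are mutual nearest neighbours with different labels."""
    def nearest(i: int) -> int:
        return min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )

    nn = [nearest(i) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn) if i < j and nn[j] == i and y[i] != y[j]]


def remove_tomek_links(
    X: Sequence[Sequence[float]], y: Sequence[int]
) -> Tuple[List[Sequence[float]], List[int]]:
    """Drop both samples of every Tomek link (assumed policy for this sketch)."""
    drop = {idx for pair in tomek_links(X, y) for idx in pair}
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]
```

The search is quadratic in the number of samples, which is acceptable for a demonstration but would normally be replaced by a spatial index on larger datasets.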
PLoS ONE,
Journal Year:
2025,
Volume and Issue:
20(1), P. e0317843 - e0317843
Published: Jan. 30, 2025
Detecting low birth weight is crucial for early identification of at-risk pregnancies, which are associated with significant neonatal and maternal morbidity and mortality risks. This study presents an efficient and interpretable framework for unsupervised detection of low, very low, and extremely low birth weights. While traditional approaches to managing class imbalance require labeled data, our study explores the use of unsupervised learning to detect anomalies indicative of low birth weight scenarios. This method is particularly valuable in contexts where labeled data are scarce or labels for the anomaly class are not available, allowing for preliminary insights that can inform further data labeling and more focused supervised learning efforts. We employed fourteen different anomaly detection algorithms and evaluated their performance using the Area Under the Receiver Operating Characteristics (AUCROC) and Area Under the Precision-Recall Curve (AUCPR) metrics. Our experiments demonstrated that One Class Support Vector Machine (OCSVM) and Empirical-Cumulative-distribution-based Outlier Detection (ECOD) effectively identified anomalies across birth weight categories. The OCSVM attained an AUCROC of 0.72 and an AUCPR of 0.0253 for LBW detection, while the ECOD model showed a competitive [...] of 0.045 for [...] cases. Additionally, a novel feature perturbation technique was introduced to enhance the interpretability of the anomaly detection models by providing insights into the relative importance of various prenatal features. The proposed interpretation methodology was validated by clinician experts and reveals promise for early intervention strategies and improved care.
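The two evaluation metrics named in this abstract can be computed directly from anomaly scores. The sketch below is a generic illustration and not the study's evaluation code: `auc_roc` uses the pairwise-ranking formulation of AUCROC, and `average_precision` is one common estimator of AUCPR. Both function names are hypothetical, and label 1 is assumed to mark the anomalous (low birth weight) class.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values measured at the rank of each positive sample.

    Tied scores are ranked in input order, a simplification of this sketch.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("no positive samples")
    return total / hits
```

A detector that ranks at random is expected to score about 0.5 on AUCROC but only about the positive-class prevalence on AUCPR, so AUCPR values for rare anomalies are best read against that prevalence baseline rather than against 0.5.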
PLoS ONE,
Journal Year:
2025,
Volume and Issue:
20(3), P. e0318491 - e0318491
Published: March 11, 2025
To enhance the accuracy and response speed of the risk early warning system, this study develops a novel system that combines the Fuzzy C-Means (FCM) clustering algorithm and the Random Forest (RF) model. Firstly, based on operational risk theory, market risk, research and development risk, financial risk, and human resource risk are selected as primary indicators for enterprise risk assessment. Secondly, the Criteria Importance Through Intercriteria Correlation (CRITIC) weight method is employed to determine the importance of these indicators, thereby enhancing the model's prediction ability and stability. Following this, the FCM algorithm is utilized for pre-processing the sample data to improve the efficiency of classification. Finally, an improved RF model is constructed by optimizing the parameters of the algorithm. The data come mainly from RESSET/DB, which covers the issuance, trading, and rating of fixed-income products such as government and corporate bonds, and provides basic information, net value, position, and performance of funds. The experimental results show that the model achieves an F1 score of 87.26%, [...] 87.95%, an Area Under the Curve (AUC) of 91.20%, a precision of 89.29%, and a recall of 87.48%. They are respectively 6.45%, 4.45%, 5.09%, 4.81%, and 3.83% higher than those of the traditional model. In this study, the system is successfully constructed, and the models' ability to handle complex data is significantly improved.
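The CRITIC weighting step described in this abstract scores each indicator by its contrast (standard deviation) multiplied by its conflict with the other indicators (one minus the pairwise correlation, summed). The sketch below follows that textbook formulation under stated assumptions: all indicators are treated as benefit-type and min-max normalized, the population standard deviation is used, and the function name `critic_weights` is hypothetical. It is not the paper's implementation.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence


def critic_weights(rows: Sequence[Sequence[float]]) -> List[float]:
    """Return one CRITIC weight per indicator (column); the weights sum to 1."""
    columns = [list(col) for col in zip(*rows)]

    # Min-max normalize each indicator so that scales are comparable.
    normalized: List[List[float]] = []
    for col in columns:
        lo, hi = min(col), max(col)
        if hi == lo:
            raise ValueError("a constant indicator carries no information")
        normalized.append([(v - lo) / (hi - lo) for v in col])

    def pearson(a: Sequence[float], b: Sequence[float]) -> float:
        ma, mb = mean(a), mean(b)
        cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
        return cov / math.sqrt(
            sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)
        )

    # Information content: contrast times total conflict with the other indicators.
    info = [
        pstdev(normalized[j])
        * sum(1.0 - pearson(normalized[j], other) for other in normalized)
        for j in range(len(normalized))
    ]
    total = sum(info)
    return [c / total for c in info]
```

Two perfectly correlated indicators receive equal weights, while an indicator that moves against the others receives a larger weight because it contributes information the others do not.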