Development and validation of a clinically applicable diagnostic model for invasive pulmonary aspergillosis in patients with structural lung diseases
Abstract
Background
Patients with structural lung disease are prone to develop lower respiratory tract infections, especially those caused by Aspergillus, due to irreversible damage to the lung parenchyma and interstitium. Early diagnosis of invasive Aspergillus infection is difficult, and delayed treatment is associated with a high risk of mortality. Therefore, in this study, we established a diagnostic prediction model for patients with structural lung disease, with the aim of providing a foundation for early detection.
Methods
We conducted a retrospective cohort study analyzing inpatients with structural lung diseases admitted to Beijing Chest Hospital between January 1, 2020, and December 31, 2024. Data were randomly divided into training (70%) and validation (30%) sets using stratified random sampling to maintain proportional representation of key demographics.
For variable selection, we performed univariate analysis to identify potential predictors of invasive pulmonary aspergillosis (IPA) in patients with structural lung diseases. Variables achieving significance at P < 0.1 were retained for further analysis. Subsequently, we applied LASSO regression with 10-fold cross-validation to determine feature importance weights. Based on the combined criteria of significance (P < 0.05) and odds ratio magnitude, the top five candidate predictors were selected for inclusion in a stepwise multivariate logistic regression model.
The final model was visualized through a nomogram incorporating the selected factors. Model performance was comprehensively evaluated using: Discrimination: receiver operating characteristic (ROC) curve analysis and area under the curve (AUC); Calibration: Hosmer-Lemeshow goodness-of-fit test; Clinical Utility: decision curve analysis (DCA) and clinical impact curve (CIC); Diagnostic Metrics: sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV).
To enhance generalizability, six machine learning algorithms, including Naive Bayes (NB), Decision Tree (DT), K-Nearest Neighbors (KNN), Random Forest (RF), Support Vector Machine (SVM), and XGBoost, were employed for comparative validation. Ensemble learning techniques were implemented to optimize performance across the different algorithms.
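The 70/30 stratified split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `stratified_split`, the fixed seed, and the choice to stratify on the outcome label alone (the abstract mentions key demographics, which are not listed) are all assumptions. The class sizes 84 and 120 are the cohort counts reported in the abstract.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_frac=0.7, seed=0):
    """Split indices so that each class contributes ~train_frac to training.

    Hypothetical helper: stratifies on a single label only.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    train_idx, valid_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)  # random assignment within each stratum
        n_train = round(train_frac * len(idx))
        train_idx.extend(idx[:n_train])
        valid_idx.extend(idx[n_train:])
    return sorted(train_idx), sorted(valid_idx)


# 1 = invasive Aspergillus infection, 0 = no infection (counts from the abstract).
labels = [1] * 84 + [0] * 120
train_idx, valid_idx = stratified_split(labels)
train_cases = sum(labels[i] for i in train_idx)
valid_cases = sum(labels[i] for i in valid_idx)
print(len(train_idx), len(valid_idx), train_cases, valid_cases)
```

Splitting within each class keeps the case proportion nearly identical in both sets, which matters for a cohort of this size because a purely random split could leave the validation set with too few cases for stable sensitivity estimates.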
Results
A total of 204 eligible patients were included (84 with IAI and 120 without IAI). After variable selection via LASSO regression, multiple logistic regression analysis was performed, and the following four independent factors were ultimately identified: coexisting diabetes, radiological cavitary manifestations, blood IgG antibody, and BALF-mNGS. The AUC of the model was 0.88 (95% CI 0.82–0.94), and a visual nomogram was created. At the optimal cutoff (0.431), the sensitivity and specificity in the evaluated set reached 0.81 (95% CI 0.68–0.93) and 0.92 (95% CI 0.81–1.00), respectively, with a positive predictive value (PPV) of 0.94 (95% CI 0.85–1.00), demonstrating good performance.
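The model validated using six machine learning classifiers showed stable performance (AUC): XGBoost, 0.977 (95% CI 0.960–0.994); GNB, 0.890 (95% CI 0.841–0.939); decision tree, 0.987 (95% CI 0.976–0.998); SVM, 0.884 (95% CI 0.828–0.939); KNN, 0.909 (95% CI 0.860–0.946); and random forest, 0.979 (95% CI 0.963–0.996).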
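For readers unfamiliar with how the reported metrics are derived, the sketch below computes sensitivity, specificity, PPV, NPV, and a rank-based AUC from predicted probabilities. It is an illustration under stated assumptions: the function names are hypothetical, and the labels and probabilities are made-up toy values, not study data. Only the cutoff of 0.431 comes from the abstract.

```python
def _ratio(num, den):
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def diagnostic_metrics(y_true, y_prob, cutoff):
    """Confusion-matrix metrics, calling a case positive when prob >= cutoff."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= cutoff else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def auc(y_true, y_prob):
    """AUC as the probability that a random case outranks a random non-case.

    Ties count as half a win (Mann-Whitney formulation).
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return _ratio(wins, len(pos) * len(neg))


# Toy data for illustration only (NOT from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.91, 0.72, 0.55, 0.30, 0.62, 0.40, 0.35, 0.20, 0.15, 0.05]

metrics = diagnostic_metrics(y_true, y_prob, cutoff=0.431)
model_auc = auc(y_true, y_prob)
print(metrics, model_auc)
```

The AUC does not depend on the cutoff, whereas the other four metrics change with it, which is why the abstract reports them at a single optimal threshold.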
Conclusions
The multimodal diagnostic model that integrates clinical, imaging, and microbiological data, after being verified by machine learning classification methods, can effectively…
Research Square (Research Square), Journal Year: 2025, Volume and Issue: unknown
Published: May 19, 2025
Language: English