bioRxiv (Cold Spring Harbor Laboratory),
Year: 2023, Issue: unknown
Published: Aug. 29, 2023
ABSTRACT
We present an instrument-independent benchmarking procedure and software (LFQ_bout) for validation and comparative evaluation of the performance of LC-MS/MS and data processing workflows in bottom-up proteomics. It enables back-to-back comparison of common and emerging workflows, e.g. diaPASEF or ScanningSWATH, and evaluates the impact of arbitrary, inadequately documented settings or black-box data processing algorithms. The procedure enhances overall quantitative accuracy while enabling the detection of major error types.
Molecular & Cellular Proteomics,
Year: 2024, Issue: 23(2), p. 100712
Published: Jan. 4, 2024
Data-independent acquisition (DIA) mass spectrometry (MS) has emerged as a powerful technology for high-throughput, accurate, and reproducible quantitative proteomics. This review provides a comprehensive overview of recent advances in both the experimental and computational methods for DIA proteomics, from data acquisition schemes to analysis strategies and software tools. DIA acquisition schemes are categorized based on the design of precursor isolation windows, highlighting wide-window, overlapping-window, narrow-window, scanning quadrupole-based, and parallel accumulation-serial fragmentation (PASEF)-enhanced DIA methods. For DIA data analysis, major strategies are classified into spectrum reconstruction, sequence-based search, library-based search, de novo sequencing, and sequencing-independent approaches. A wide array of software tools implementing these strategies are reviewed, with details on their overall workflows and scoring approaches at different steps. The generation and optimization of spectral libraries, which are critical resources for DIA analysis, are also discussed. Publicly available benchmark datasets covering global proteomics and phosphoproteomics are summarized to facilitate performance evaluation of various software tools and analysis workflows. Continued advances and synergistic developments of versatile components in DIA workflows are expected to further enhance the power of DIA-based proteomics.
Nature Communications,
Year: 2024, Issue: 15(1)
Published: May 9, 2024
Abstract
Identification of differentially expressed proteins in a proteomics workflow typically encompasses five key steps: raw data quantification, expression matrix construction, matrix normalization, missing value imputation (MVI), and differential expression analysis. The plethora of options in each step makes it challenging to identify optimal workflows that maximize the identification of differentially expressed proteins. To identify optimal workflows and their common properties, we conduct an extensive study involving 34,576 combinatoric experiments on 24 gold standard spike-in datasets. Applying frequent pattern mining techniques to top-ranked workflows, we uncover high-performing rules that demonstrate that optimality has conserved properties. Via machine learning, we confirm that optimal workflows are indeed predictable, with average cross-validation F1 scores and Matthews correlation coefficients surpassing 0.84. We introduce an ensemble inference to integrate results from individual top-performing workflows for expanding differential proteome coverage and to resolve inconsistencies. Ensemble inference provides gains in pAUC (up to 4.61%) and G-mean (up to 11.14%) and facilitates effective aggregation of information across varied quantification approaches such as topN, directLFQ, MaxLFQ intensities, and spectral counts. However, further development and evaluation are needed to establish acceptable frameworks for conducting ensemble inference on multiple proteomics workflows.
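The F1 score, Matthews correlation coefficient, and G-mean cited in the abstract above can all be derived from confusion-matrix counts. The following is a minimal Python sketch, not the authors' code: the function name `classification_metrics`, the example counts, and the definition of G-mean as the geometric mean of sensitivity and specificity are assumptions made for illustration.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return F1, MCC and G-mean for binary calls (hypothetical helper)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Assumed definition: geometric mean of sensitivity and specificity.
    g_mean = math.sqrt(recall * specificity)
    return {"f1": f1, "mcc": mcc, "g_mean": g_mean}


# Made-up counts: 100 truly changed proteins among 1,000 quantified.
metrics = classification_metrics(tp=80, fp=10, tn=890, fn=20)
print(metrics)
```

MCC and G-mean both take true negatives into account, which is why they are often reported next to F1 when the changed proteins are a small minority of those tested.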
bioRxiv (Cold Spring Harbor Laboratory),
Year: 2024, Issue: unknown
Published: June 2, 2024
Abstract
Mass spectrometry (MS)-based proteomics continues to evolve rapidly, opening more and more application areas. The scale of data generated on novel instrumentation and acquisition strategies poses a challenge to bioinformatic analysis. Search engines need to make optimal use of the data for biological discoveries while remaining statistically rigorous, transparent and performant. Here we present alphaDIA, a modular open-source search framework for data independent acquisition (DIA) proteomics. We developed a feature-free identification algorithm particularly suited for detecting patterns in data produced by sensitive time-of-flight instruments. It naturally adapts to novel, more efficient scan modes that are not yet accessible to previous algorithms. Rigorous benchmarking demonstrates competitive identification and quantification performance. While supporting empirical spectral libraries, we propose a new search strategy named end-to-end transfer learning using fully predicted libraries. This entails continuously optimizing a deep neural network for predicting machine and experiment specific properties, enabling the generic DIA analysis of any post-translational modification (PTM). AlphaDIA provides a high performance and accessible framework running locally or in the cloud, opening DIA analysis to the community.
Journal of Proteome Research,
Year: 2024, Issue: unknown
Published: Sep. 9, 2024
A thorough evaluation of the quality, reproducibility, and variability of bottom-up proteomics data is necessary at every stage of a workflow, from planning to analysis. We share vignettes applying adaptable quality control (QC) measures to assess sample preparation, system function, and quantitative analysis. System suitability samples are repeatedly measured longitudinally with targeted methods, and we share examples where they are used on three instrument platforms to identify severe system failures and track function over months to years. Internal QCs incorporated at the protein and peptide levels allow our team to assess sample preparation issues and to differentiate system failures from sample-specific issues. External QC samples prepared alongside our experimental samples are used to verify the consistency and quantitative potential of our results during batch correction and normalization before assessing biological phenotypes. We combine these controls with rapid analysis (Skyline), longitudinal QC metrics (AutoQC), and server-based data deposition (PanoramaWeb). We propose that this integrated approach to QC is a useful starting point for groups to facilitate rapid quality control assessment to ensure that valuable instrument time is used to collect the best quality data possible. Data are available on Panorama Public and ProteomeXchange under the identifier PXD051318.
bioRxiv (Cold Spring Harbor Laboratory),
Year: 2024, Issue: unknown
Published: April 13, 2024
A thorough evaluation of the quality, reproducibility, and variability of bottom-up proteomics data is necessary at every stage of a workflow from planning to analysis. We share vignettes applying adaptable quality control (QC) measures to assess sample preparation, system function, and quantitative analysis. System suitability samples are repeatedly measured longitudinally with targeted methods, and we share examples where they are used on three instrument platforms to identify severe system failures and track function over months to years. Internal QCs incorporated at the protein and peptide-level allow our team to assess sample preparation issues and to differentiate system failures from sample-specific issues. External QC samples prepared alongside our experimental samples are used to verify the consistency and quantitative potential of our results during batch correction and normalization before assessing biological phenotypes. We combine these controls with rapid analysis (Skyline), longitudinal QC metrics (AutoQC), and server-based data deposition (PanoramaWeb). We propose that this integrated approach to QC is a useful starting point for groups to facilitate rapid quality control assessment to ensure that valuable instrument time is used to collect the best quality data possible. Data are available on Panorama Public and ProteomeXchange under the identifier PXD051318.
Nature Communications,
Year: 2025, Issue: 16(1)
Published: Jan. 26, 2025
Abstract
Urinary proteomics is emerging as a potent tool for detecting sensitive and non-invasive biomarkers. At present, the comparability of urinary proteomics data across diverse liquid chromatography-mass spectrometry (LC-MS) platforms remains an area that requires investigation. In this study, we conduct a comprehensive evaluation of the urinary proteome across multiple LC-MS platforms. To systematically analyze and assess the quality of large-scale urinary proteomics data, we develop a comprehensive quality control (QC) system named MSCohort, which extracts 81 metrics for individual experiment and whole cohort quality evaluation. Additionally, we present a standard operating procedure (SOP) for high-throughput urinary proteome analysis based on the MSCohort QC system. Our study involves 20 LC-MS platforms and reveals that, when combined with a comprehensive QC system and a unified SOP, the data generated by the data-independent acquisition (DIA) workflow in urine QC samples exhibit high robustness, sensitivity, and reproducibility across multiple LC-MS platforms. Furthermore, we apply this SOP to hybrid benchmarking samples and clinical colorectal cancer (CRC) urinary proteome samples, including 527 experiments. Across three different LC-MS platforms, the analyses report high quantitative reproducibility and consistent disease patterns. This work lays the groundwork for large-scale clinical urinary proteomics studies spanning multiple platforms, paving the way for precision medicine research.
Fluctuating salinity is symptomatic of climate change challenging aquatic species. The melting of polar ice, rising sea levels, coastal surface and groundwater salinization, and increased evaporation in arid habitats alter salinity worldwide. Moreover, the frequency and intensity of extreme weather events such as rainstorms and floods increase, causing rapid shifts in brackish and coastal habitat salinity. Such salinity alterations disrupt homeostasis and ultimately diminish the fitness of aquatic organisms by interfering with metabolism, reproduction, immunity, and other critical aspects of physiology. Proteins are central to these physiological mechanisms. They represent the molecular building blocks of phenotypes that govern organismal responses to environmental challenges. Environmental cues regulate proteins in a concerted fashion, necessitating holistic analyses of proteomes for comprehending salinity stress responses. Proteomics approaches reveal molecular causes of population declines and enable holistic bioindication geared toward timely interventions to prevent local extinctions. Proteomics analyses of salinity effects on aquatic organisms have been performed since the mid-1990s, propelled by the invention of two-dimensional protein gels, soft ionization techniques for mass spectrometry (MS), and nano-liquid chromatography in the 1970s and 1980s. This review summarizes the current knowledge on salinity regulation of proteomes from aquatic organisms, including key methodological advances over the past decades.
Journal of Proteome Research,
Year: 2025, Issue: unknown
Published: Feb. 13, 2025
Rapid advances in depth and throughput of untargeted mass-spectrometry-based proteomic technologies enable large-scale cohort proteogenomic analyses. As such, the data infrastructure and search engines required to process the data must also scale. This challenge is amplified in search engines that rely on library-free match between runs (MBR) search, which enables enhanced depth-per-sample and data completeness. However, to date, no MBR-based search could scale to process cohorts of thousands or more individuals. Here, we present a strategy to deploy search engines in a distributed cloud environment without source code modification, thereby enhancing resource scalability and throughput. Additionally, we present an algorithm, Scalable MBR, that replicates the MBR procedure of the popular DIA-NN software for scalability to thousands of samples. We demonstrate that Scalable MBR can search thousands of MS raw files in a few hours compared to days required for the original DIA-NN MBR procedure and that the results are almost indistinguishable from those of DIA-NN native MBR. We additionally show that empirical spectra generated by Scalable MBR better approximate DIA-NN native MBR compared to semiempirical alternatives such as ID-RT-IM MBR, preserving user choice to use empirical libraries in large cohort analysis. The Scalable MBR method has been tested to scale to over 15,000 injections and is available in the Proteograph Analysis Suite.
bioRxiv (Cold Spring Harbor Laboratory),
Year: 2025, Issue: unknown
Published: March 11, 2025
Abstract
Quantitative readout is essential in proteomics, yet current bioinformatics methods lack a framework to handle the inherent multi-level nature of proteomics data (fragments, MS1 isotopes, charge states, modifications, peptides and genes). We present AlphaQuant, which introduces tree-based quantification. This approach organizes quantitative data into a hierarchical tree across levels. It allows differential analyses at the fragment level, recovering up to 50-fold more regulated proteins compared to a state-of-the-art approach. Using gradient boosting on tree features, we address the largely unsolved challenge of scoring quantification accuracy, as opposed to precision. Our method clusters peptides with similar quantitative behavior, providing a new approach to the protein grouping problem and enabling identification of regulated proteoforms directly from bottom-up data. Combined with deep learning classification, we infer phosphopeptides from proteome data alone, validating our findings with EGFR stimulation data. We then describe proteoform diversity across mouse tissues, revealing distinct patterns of post translational modifications and alternative splicing.
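The idea of organizing quantitative data into a hierarchical tree can be pictured with a small data structure. The sketch below is not AlphaQuant's API or algorithm: the class `QuantNode`, the helpers `aggregate` and `leaf`, the three-level layout (gene, peptide, fragment), the example values, and the use of a median to summarize child log2 fold changes are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from statistics import median
from typing import List, Optional


@dataclass
class QuantNode:
    """One node of a hypothetical quantification tree (illustrative only)."""
    name: str
    level: str                       # e.g. "gene", "peptide", "fragment"
    log2fc: Optional[float] = None   # measured on leaves, derived on parents
    children: List["QuantNode"] = field(default_factory=list)


def aggregate(node: QuantNode) -> float:
    """Fill in log2 fold changes bottom-up; the median is an assumed summary."""
    if not node.children:
        if node.log2fc is None:
            raise ValueError(f"leaf {node.name!r} has no measured value")
        return node.log2fc
    node.log2fc = median(aggregate(child) for child in node.children)
    return node.log2fc


def leaf(name: str, value: float) -> QuantNode:
    """Convenience constructor for a fragment-level leaf (hypothetical)."""
    return QuantNode(name, "fragment", value)


# Made-up example: one gene, two peptides, five fragment-level fold changes.
gene = QuantNode("GENE1", "gene", children=[
    QuantNode("PEPTIDE_A", "peptide",
              children=[leaf("y4", 1.0), leaf("y5", 1.2), leaf("b3", 0.8)]),
    QuantNode("PEPTIDE_B", "peptide",
              children=[leaf("y6", 0.9), leaf("y7", 1.1)]),
])
print(aggregate(gene))
```

Keeping the lower levels in the tree rather than collapsing them early means that inconsistent children, such as a peptide whose fragments disagree with its siblings, remain visible to later steps.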