bioRxiv (Cold Spring Harbor Laboratory), Year: 2025, Issue: unknown
Published: May 20, 2025
Mass spectrometry-based metaproteomics, the identification and quantification of thousands of proteins expressed by complex microbial communities, has become pivotal for unraveling functional interactions within microbiomes. However, metaproteomics data analysis encounters many challenges, including the search of tandem mass spectra against a protein sequence database using proteomics database search algorithms. We used a ground-truth dataset to assess a spectral library searching method against established database searching approaches. Mass spectrometry data collected by data-dependent acquisition (DDA-MS) was analyzed using database searching approaches (MaxQuant and FragPipe), as well as Scribe with Prosit predicted spectral libraries. We used FASTA databases that included protein sequences from the species present in the dataset, along with background protein sequences, to estimate error rates and effects on detection, peptide-spectral match quality, and quantification. Using the Scribe search engine resulted in more proteins detected at a 1% false discovery rate (FDR) compared to MaxQuant or FragPipe, while FragPipe detected more peptides verified by PepQuery. Scribe was able to detect more low-abundance proteins in the microbiome dataset and was more accurate in quantifying the microbial community composition. This research provides insights and guidance for researchers aiming to optimize results in their analysis of metaproteomics DDA-MS data.
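
The idea of adding background sequences to the FASTA database can be illustrated with a minimal sketch. It assumes that any accepted identification mapping only to a background sequence is a false positive; the `Identification` type, the `source` labels, and the toy peptides are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identification:
    peptide: str
    source: str  # "sample" (species known to be present) or "background"


def background_hit_fraction(identifications):
    """Fraction of accepted identifications that map to background sequences."""
    if not identifications:
        return 0.0
    background = sum(1 for ident in identifications if ident.source == "background")
    return background / len(identifications)


# Toy list of identifications accepted at some score threshold (made-up data).
accepted = [
    Identification("PEPTIDEA", "sample"),
    Identification("PEPTIDEB", "sample"),
    Identification("PEPTIDEC", "background"),
    Identification("PEPTIDED", "sample"),
]
fraction = background_hit_fraction(accepted)
print(f"background fraction: {fraction:.2f}")
```

This fraction only counts errors that happen to land on background sequences; incorrect matches to sequences from species that are present go unnoticed, so it should be read as a rough indication rather than a complete error estimate.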

Nature Communications, Year: 2024, Issue: 15(1)
Published: March 15, 2024
Circular RNAs (circRNAs) are covalently closed non-coding RNAs lacking the 5' cap and the poly-A tail. Nevertheless, it has been demonstrated that certain circRNAs can undergo active translation. Therefore, circRNAs aberrantly expressed in human cancers could be an unexplored source of tumor-specific antigens, potentially mediating anti-tumor T cell responses. This study presents an immunopeptidomics workflow with a specific focus on generating a circRNA-specific protein fasta reference. The main goal of this workflow is to streamline the process of identifying and validating human leukocyte antigen (HLA) bound peptides originating from circRNAs. We increase the analytical stringency of our workflow by retaining peptides identified independently by two mass spectrometry search engines and/or by applying a group-specific FDR for canonical-derived and circRNA-derived peptides. A subset of circRNA-derived peptides specifically encoded by the region spanning the back-splice junction (BSJ) is validated with targeted MS and with direct Sanger sequencing of the respective transcripts. Our workflow identifies 54 unique BSJ-spanning circRNA-derived peptides in the immunopeptidome of melanoma and lung cancer samples. Our approach enlarges the catalog of proteins that can be explored for immunotherapy.
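
The two stringency measures named in this abstract, a group-specific FDR and agreement between two search engines, can be sketched as follows. This is a generic target-decoy illustration under stated assumptions, not the workflow's actual implementation: the `PSM` tuple, the function names, the simple decoys/targets estimator, and the toy scores are all hypothetical.

```python
from collections import namedtuple

PSM = namedtuple("PSM", ["peptide", "score", "is_decoy", "group"])


def filter_at_fdr(psms, threshold=0.01):
    """Keep target PSMs down to the lowest rank where decoys/targets <= threshold."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    keep = 0  # number of top-ranked PSMs that pass
    for i, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= threshold:
            keep = i
    return [p for p in ranked[:keep] if not p.is_decoy]


def group_specific_filter(psms, threshold=0.01):
    """Apply the FDR filter separately within each group (e.g. canonical, circRNA)."""
    groups = {}
    for psm in psms:
        groups.setdefault(psm.group, []).append(psm)
    accepted = []
    for group_psms in groups.values():
        accepted.extend(filter_at_fdr(group_psms, threshold))
    return accepted


def consensus_peptides(engine_a, engine_b):
    """Peptides accepted by both search engines."""
    return {p.peptide for p in engine_a} & {p.peptide for p in engine_b}


# Toy results from a first engine; the loose threshold is for illustration only.
engine_a = [
    PSM("AAA", 10, False, "canonical"),
    PSM("BBB", 9, False, "circRNA"),
    PSM("XXX", 8, True, "circRNA"),
    PSM("CCC", 7, False, "circRNA"),
]
# Toy results from a second engine, assumed to be already filtered.
engine_b = [PSM("BBB", 5, False, "circRNA"), PSM("DDD", 4, False, "canonical")]

a_kept = group_specific_filter(engine_a, threshold=0.4)
print(sorted(p.peptide for p in a_kept))
print(consensus_peptides(a_kept, engine_b))
```

In this toy data the pooled filter would accept the low-scoring circRNA peptide `CCC` because the canonical target dilutes the decoy ratio, while the group-specific filter rejects it. That dilution effect is the usual motivation for estimating the FDR per group when one group is small and error-prone.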

Journal of Proteome Research, Year: 2025, Issue: unknown
Published: Feb. 6, 2025
The high throughput analysis of proteins with mass spectrometry (MS) is highly valuable for understanding human biology, discovering disease biomarkers, identifying therapeutic targets, and exploring pathogen interactions. To achieve these goals, specialized proteomics subfields, including plasma proteomics, immunopeptidomics, and metaproteomics, must tackle specific analytical challenges, such as an increased identification ambiguity compared to routine proteomics experiments. Technical advancements in MS instrumentation can mitigate these issues by acquiring more discerning information at higher sensitivity levels. This is exemplified by the incorporation of ion mobility and parallel accumulation-serial fragmentation (PASEF) technologies in timsTOF instruments. In addition, AI-based bioinformatics solutions can help overcome these issues by integrating more data into the identification workflow. Here, we introduce TIMS2Rescore, a data-driven rescoring workflow optimized for DDA-PASEF data from timsTOF instruments. The platform includes new MS2PIP spectrum prediction models and IM2Deep, a new deep learning-based peptide ion mobility predictor. Furthermore, to fully streamline data throughput, TIMS2Rescore directly accepts Bruker raw data and search results from ProteoScape and many other search engines, including Sage and PEAKS. We showcase TIMS2Rescore performance on plasma proteomics, immunopeptidomics (HLA class I and II), and metaproteomics data sets. TIMS2Rescore is open-source and freely available at https://github.com/compomics/tims2rescore.

Nature Communications, Year: 2024, Issue: 15(1)
Published: May 10, 2024
Abstract
Immunopeptidomics is crucial for immunotherapy and vaccine development. Because the generation of immunopeptides from their parent proteins does not adhere to clear-cut rules, rather than being able to use known digestion patterns, every possible protein subsequence within human leukocyte antigen (HLA) class-specific length restrictions needs to be considered during sequence database searching. This leads to an inflation of the search space and results in lower spectrum annotation rates. Peptide-spectrum match (PSM) rescoring is a powerful enhancement of standard searching that boosts performance. We analyze 302,105 unique synthesized non-tryptic peptides from the ProteomeTools project on a timsTOF-Pro to generate a ground-truth dataset containing 93,227 MS/MS spectra of 74,847 unique peptides, which is used to fine-tune the deep learning-based fragment ion intensity prediction model Prosit. We demonstrate up to 3-fold improvement in the identification of immunopeptides, as well as increased detection of immunopeptides from low input samples.
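
The search-space inflation described above can be made concrete with a small sketch that enumerates candidate peptides for one toy protein. The 8 to 12 residue window for HLA class I, the 5 to 30 residue window for tryptic peptides, the simplified trypsin rule, and the toy sequence are assumptions chosen for illustration; they are not parameters reported in the abstract.

```python
import re


def nonspecific_peptides(protein, min_len, max_len):
    """All distinct contiguous subsequences with length in [min_len, max_len]."""
    return {
        protein[start:start + length]
        for length in range(min_len, max_len + 1)
        for start in range(len(protein) - length + 1)
    }


def tryptic_peptides(protein, min_len, max_len):
    """Fully tryptic peptides without missed cleavages.

    Simplified rule: cleave after K or R unless the next residue is P.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return {p for p in pieces if min_len <= len(p) <= max_len}


# Toy 33-residue sequence, not a protein from the study.
protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"

unspecific = nonspecific_peptides(protein, 8, 12)  # assumed HLA class I lengths
tryptic = tryptic_peptides(protein, 5, 30)         # assumed tryptic length window

print(len(unspecific), "non-specific candidates")
print(len(tryptic), "tryptic candidates")
```

Even for a single short sequence the non-specific enumeration yields far more candidates than the enzyme-constrained one, and the gap widens with protein length and database size. A larger candidate set means more chance matches, which is why rescoring with predicted fragment intensities is helpful in this setting.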

Journal of Proteome Research, Year: 2024, Issue: 23(8), pp. 3200-3207
Published: March 16, 2024
Rescoring of peptide-spectrum matches (PSMs) has emerged as a standard procedure for the analysis of tandem mass spectrometry data. This emphasizes the need for software maintenance and continuous improvement of such algorithms. We introduce MS

bioRxiv (Cold Spring Harbor Laboratory), Year: 2024, Issue: unknown
Published: June 3, 2024
Abstract
A pressing statistical challenge in the field of mass spectrometry proteomics is how to assess whether a given software tool provides accurate error control. Each software tool for searching such data uses its own internally implemented methodology for reporting and controlling the error. Many of these software tools are closed source, with incompletely documented methodology, and the strategies for validating the error are inconsistent across tools. In this work, we identify three different methods for validating false discovery rate (FDR) control in use in the field, one of which is invalid, one of which can only provide a lower bound rather than an upper bound, and one of which is valid but under-powered. The result is that the field has a very poor understanding of how well we are doing with respect to FDR control, particularly for the analysis of data-independent acquisition (DIA) data. We therefore propose a new, more powerful method for evaluating FDR control in this setting, and we then employ that method, along with an existing lower bounding technique, to characterize a variety of popular search tools. We find that the data-dependent acquisition (DDA) tools generally seem to control the FDR at the peptide level, whereas none of the DIA tools consistently controls the FDR at the peptide level across all the datasets we investigated. Furthermore, this problem becomes much worse when the latter tools are evaluated at the protein level. These results may have significant implications for various downstream analyses, since proper FDR control has the potential to reduce noise in discovery lists and thereby boost statistical power.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2024, Issue: unknown
Published: June 3, 2024
Abstract
Recent developments in machine-learning (ML) and deep-learning (DL) have immense potential for applications in proteomics, such as generating spectral libraries, improving peptide identification, and optimizing targeted acquisition modes. Although new ML/DL models for various applications and peptide properties are frequently published, the rate at which these models are adopted by the community is slow, mostly due to technical challenges. We believe that, to make better use of state-of-the-art models, more attention should be spent on making models easy to use and accessible by the community. To facilitate this, we developed Koina, an open-source containerized, decentralized and online-accessible high-performance prediction service that enables ML/DL model usage in any pipeline. Using the widely used FragPipe computational platform as an example, we show how Koina can be easily integrated with existing proteomics software tools and how these integrations improve data analysis.