We have outlined several problems with the state of error rate studies on firearm and toolmark examination. Fundamentally, we do not know what the error rate is for these types of comparisons. This is a failure of the scientific study of toolmarks, rather than of the examiners themselves, but until this is corrected by multiple studies that meet the criteria described in Section 3, error rate studies cannot support the use of firearm and toolmark evidence in criminal proceedings.
Journal of Forensic Sciences, 2022, 68(1), p. 86–100. Published: Oct. 1, 2022.
Abstract
This black box study assessed the performance of forensic firearms examiners in the United States. It involved three different types of firearms and 173 volunteers who performed a total of 8640 comparisons of both bullets and cartridge cases. The overall false-positive error rate was estimated as 0.656% and 0.933% for bullets and cartridge cases, respectively, while the false-negative error rates were estimated as 2.87% and 1.87%, respectively. The majority of errors were made by a limited number of examiners. Because chi-square tests of independence strongly suggest that error probabilities are not the same for each examiner, these maximum-likelihood estimates are based on a beta-binomial probability model and do not depend on an assumption of equal examiner-specific error rates. The corresponding 95% confidence intervals are (0.305%, 1.42%) and (0.548%, 1.57%) for false positives, and (1.89%, 4.26%) and (1.16%, 2.99%) for false negatives. The results of this study are consistent with prior studies, despite its comprehensive design and challenging specimens.
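The beta-binomial fit described in the abstract can be sketched numerically. The following is a minimal illustration, assuming SciPy's `betabinom` distribution and using synthetic per-examiner counts (not the study's data): the model lets the error probability vary across examiners, so the fitted mean rate does not rest on an assumption of equal examiner-specific rates.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import betabinom

# Synthetic per-examiner counts: illustrative only, not the study's data.
# A few examiners account for most errors, producing the overdispersion
# that the beta-binomial model is meant to capture.
errors = np.array([0, 0, 0, 1, 0, 0, 3, 0, 0, 2])
trials = np.array([50, 48, 52, 50, 49, 51, 50, 50, 47, 50])

def neg_log_lik(log_params):
    # Optimize on the log scale to keep alpha and beta positive.
    a, b = np.exp(log_params)
    return -betabinom.logpmf(errors, trials, a, b).sum()

res = minimize(neg_log_lik, x0=np.array([0.0, 3.0]), method="Nelder-Mead")
a_hat, b_hat = np.exp(res.x)
mean_rate = a_hat / (a_hat + b_hat)  # estimated mean examiner error rate
print(f"estimated mean error rate: {mean_rate:.3%}")
```

A confidence interval like those reported above could then be obtained by profiling this likelihood or by a parametric bootstrap over the fitted model.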
Journal of Forensic Sciences, 2024, 69(4), p. 1334–1349. Published: April 29, 2024.
Abstract
Several studies have recently attempted to estimate practitioner accuracy when comparing fired ammunition. But whether this research has included comparisons sufficiently challenging to be dependent upon expertise for accurate conclusions regarding source remains largely unexplored in the literature. Control groups of lay people comprise one means of vetting this question, by assessing whether comparison samples were at least difficult enough to distinguish between experts and novices. This article therefore utilizes such a group, specifically 82 attorneys, as a post hoc control and juxtaposes their performance on a set of cartridge case images from a commonly cited study (Duez et al., J Forensic Sci. 2018;63:1069–1084) with that of the original participant pool of professionals. Despite lacking the kind of formalized training and experience common to the latter, our participants displayed an ability, generally, to distinguish cases fired by the same versus different guns across the 327 comparisons they performed. And while their accuracy rates lagged substantially behind those of professionals on same-source comparisons, their performance on different-source comparisons was essentially indistinguishable from that of trained examiners. This indicates that although the study we vetted may provide useful information about professionals performing same-source identifications, it has little to offer in terms of measuring examiners' ability to distinguish different guns. If similar issues pervade other studies, then there is little reason to rely on the false-positive rates they have generated.
Law, Probability and Risk, 2020, 19(3-4), p. 317–364. Published: Dec. 1, 2020.
Abstract
In the past decade, and in response to recommendations set forth by the National Research Council Committee on Identifying the Needs of the Forensic Sciences Community (2009), scientists have conducted several black-box studies that attempt to estimate the error rates of firearm examiners. Most of these have resulted in vanishingly small error rates, and at least one of them (D. P. Baldwin, S. J. Bajic, M. Morris, D. Zamzow. A Study of False-Positive and False-Negative Error Rates in Cartridge Case Comparisons. Technical report, Ames Laboratory, IA; Fort Belvoir, VA, April 2014) was cited by the President's Council of Advisors on Science and Technology (PCAST) during the Obama administration as an example of a well-designed experiment. What has received little attention, however, is the actual calculation of those error rates — in particular, the effect of inconclusive findings on those estimates. The treatment of inconclusives in the assessment of errors has far-reaching implications for the legal system. Here, we revisit several studies in the area of firearms examination, investigating their results. It is clear that there are stark differences in error rate results between regions with different norms for training and for reporting conclusions. More surprisingly, rates of inconclusive decisions on materials from different sources are notably higher than on same-source materials in some regions. To mitigate the effects of this difference, we propose a unifying approach directly applicable to forensic laboratories and other settings.
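How strongly the treatment of inconclusives drives a reported false-positive rate can be shown with a short calculation. The counts below are hypothetical, not drawn from any of the studies discussed:

```python
# Sketch: three ways to fold inconclusive calls into a false-positive rate.
# All counts are synthetic, for illustration only.
false_pos, eliminations, inconclusives = 4, 900, 96  # different-source trials
total = false_pos + eliminations + inconclusives

rate_dropped = false_pos / (false_pos + eliminations)    # inconclusives excluded
rate_not_error = false_pos / total                       # counted as non-errors
rate_as_error = (false_pos + inconclusives) / total      # counted as errors

print(f"excluded: {rate_dropped:.2%}, non-error: {rate_not_error:.2%}, "
      f"as-error: {rate_as_error:.2%}")
```

With these illustrative counts the first two conventions differ only slightly, while counting inconclusives as errors changes the reported rate by more than an order of magnitude, which is why the accounting convention matters so much to the legal system.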
Journal of Forensic Sciences, 2021, 66(5), p. 1704–1720. Published: May 31, 2021.
Abstract
The forensic science pattern comparison areas, including fingerprints, footwear, and firearms, have been criticized for their subjective nature. While much research has attempted to move these disciplines toward more objective methods, examiners are still coming to conclusions based on their own training and experience. To complement this subjectivity, black box studies are necessary to establish the accuracy of feature-comparison methods. However, when cartridges are fired by a firearm to create cartridge case test sets, there may be significant variability within the resulting impressions. This can result in different participants receiving sets with varying levels of difficulty due to differences in impression quality. Therefore, comparison between participants is not straightforward. To directly compare examiners, a method called double-casting was used to produce plastic reproductions. Double-casts of twenty-one master cartridge cases were created and mailed to examiners. The double-casts ensured that all participants were comparing exhibits with the same level of detail. Participants were tasked with determining if the unknown in each set was fired by the same firearm as the three knowns. Automated comparisons were also performed on each set. The results from the study showed how examiners performed when examining identical evidence. Furthermore, it was shown that automated metrics would be of benefit as a quality control measure to correct any potential errors and strengthen conclusions.
Journal of Forensic Sciences, 2020, 66(2), p. 557–570. Published: Oct. 26, 2020.
Abstract
The digital examination of scanned or measured 3D surface topography is referred to as Virtual Comparison Microscopy (VCM). Within the discipline of firearm and toolmark examination, VCM enables the review and comparison of microscopic toolmarks on fired ammunition components. In the coming years, this technique may supplement and potentially replace the light microscope as the primary instrument used for examination. This paper describes a VCM error rate validation study involving 107 participants. The study included 40 test sets of cartridge cases fired from firearms of a variety of makes, models, and calibers. Participants used commercially available software which allowed for data distribution, specimen visualization, and submission of conclusions. The software also let participants annotate areas of similarity and dissimilarity to support their conclusions. The cohort of 76 qualified United States and Canadian examiners that completed the study had an overall false-positive error rate of 3 errors in 693 comparisons (0.43%) and a false-negative error rate of 0 errors in 491 comparisons (0.0%). The accuracy results were supplemented by participants' provided annotations, which give insight into the causes of errors and the consistency across the independent examinations conducted in the study. The ability to obtain highly accurate conclusions from test fires from a wide range of firearms supports the hypothesis that VCM is a useful tool within the crime laboratory.
Forensic Science International: Synergy, 2020, 2, p. 521–539. Published: Jan. 1, 2020.
This review paper covers the forensic-relevant literature in shoe and tool mark examination from 2016 to 2019, as a part of the 19th Interpol International Forensic Science Managers Symposium. The papers are also available at the Interpol website at: https://www.interpol.int/content/download/14458/file/Interpol%20Review%20Papers%202019.pdf.
Journal of Forensic Sciences, 2024, 69(6), p. 2028–2040. Published: Aug. 22, 2024.
Abstract
Traditionally, firearm and toolmark examiners manually evaluate the similarity of features on two bullets using comparison microscopy. Advances in microscopy have made it possible to collect 3D topographic data, and several automated algorithms have been introduced for comparing bullet striae in these data. In this study, open-source approaches based on cross-correlation, congruent matching profile segments, consecutively matching striations, and a random forest model were evaluated. A statistical characterization was performed on four datasets of consecutively manufactured firearms to provide a challenging scenario. Each approach was applied to all samples in a pairwise fashion, and classification performance was compared. Based on the findings, a Bayesian network was empirically learned and constructed to leverage the strengths of each individual approach, model the relationships between their results, and combine them into a posterior probability for a given comparison. The evaluated approaches performed similarly, and the results showed that the developed network classified 99.6% of comparisons correctly, with the resultant score distributions significantly separated, more so than when each approach was used in isolation.
Journal of Forensic Sciences, 2022, 67(3), p. 899–910. Published: March 7, 2022.
Abstract
Silicone casts are widely used by practitioners in the comparative analysis of forensic items. Fractured surfaces carry unique details that can provide accurate quantitative comparisons of fragments. In this study, a statistical comparison protocol was applied to a set of 3D topological images of fractured surface pairs and their replicas to evaluate the confidence of association between fractured items and their silicone cast replicas. A set of 10 stainless steel samples was fractured from the same metal rod under controlled conditions and replicated using a standard casting technique. Six topographic maps with 50% overlap were acquired for each fracture surface pair. Spectral analyses were utilized to identify the correlation of features at different length scales of the surface topology. We selected two frequency bands over the critical wavelength (greater than two-grain diameters) for comparison. Our model, based on a matrix-variate t-distribution, accounts for the match and non-match population densities. A decision rule identified the probability of matched and unmatched surfaces. The proposed methodology correctly classified the pairs with posterior probabilities exceeding 99.96%. Moreover, the replication technique shows potential for accurately replicating fracture features greater than 20 μm, which far exceeds the relevant feature range on most metallic alloy fractures. This framework establishes a basis, and limits, for the quantitative comparison of fractured articles while providing reliable, mechanics-based conclusions.
Journal of Forensic Sciences, 2024, 69(6), p. 2346–2348. Published: Aug. 26, 2024.
We read Mr. Marshall's commentary with interest, but unfortunately his submission—riddled with ad hominem attacks on the motivations of firearms examination's critics and just shy of reference-less—nigh-uniformly concentrates not on our study design or data, but on bemoaning the notion that examiners' grandiose claims of near infallibility (e.g., the "practical impossibility" of error [1]) should rest on empirical grounds rather than on tradition and good intention. In this way, Marshall has built a soapbox in lieu of a scientific critique, one which, despite the warnings of the old parable, he has raised up on sand rather than stone. Indeed, his preference for weaving a self-serving narrative of victimization over engaging substantively with our findings evinces itself even when considering the attention to detail he must have employed in reading our article.
How else, if not through a merely cursory review, could he criticize us for treating Duez et al. [2] as an "error rate study"—rather than simply a "validation of the technique of virtual comparison microscopy"—given our repeated and explicit discussion of the ways in which the FBI, the DOJ, and one of the study's authors promoted it as proof of a low false-positive error rate in the field of firearms examination [3-6]? Had he thoroughly evaluated our treatment of control groups—including its reference to the long and storied history of controls in science [7], and also to the studies that have utilized novices to contextualize the performance of forensic professionals [3, 8, 9]—would he really characterize our suggestion that validation studies include such groups as "new and unusual"? And why else would he have felt it necessary to point out limitations (the use of attorneys as a group of lay people and of photos as opposed to 3D scans of cartridge cases) that we had already forthrightly acknowledged in the original article [3]? Had he gone further—had he, for example, managed to muster any argument supported by references or data that we might have erred in suggesting these choices actually disadvantaged our participants—he would have found us ready and willing to engage in a robust back and forth and, if warranted, to reconsider our conclusions. That he did not leaves us little to respond to. But because we cannot, at this juncture, ask him to go and try again, a few points warrant brief discussion.
First, recall the logical distinction at issue: a necessary condition is something that must happen if another thing is to happen. Being human is a necessary condition of going to college, since colleges do not admit other animals. A sufficient condition is one such that, if present, the outcome is bound to happen; being human is thus not sufficient, since not all human beings go to college. By contrast, dropping a lighted match into a bucket of gasoline is sufficient for starting a fire, but it is not a necessary condition, for there are many other ways, such as rubbing two sticks together. Notice that a condition may be neither necessary nor sufficient for a fire; usually several conditions must concur to act with specific causal consequences. [11]
Along similar lines, it makes no sense to criticize our article's focus (much less to accuse us of malicious intent) merely because the ability to separate professional from novice does not alone suffice to render a study capable of definitively proving a discipline's validity. Just as no single value or data set will ever (in isolation) resolve the debate, suitable research designs must still ensure that they answer the question of interest. Physicians, after all, have not ignored HIV simply because it is necessary, but not sufficient, for developing the more serious AIDS [14]. Viewed in this light, his characterization of our study as some sinisterly plotted "gotcha" moment shows its true colors: a knee-jerk reaction rather than a well-reasoned critique.
Second, his solitary foray into rebutting our conclusions with findings other than his own—his citation of a recent paper by Growns et al. [15]—relies on gross mischaracterizations of that study's aims and limitations. In short, he portrays it as establishing that examiners are likely to outperform novices when comparing fired bullets and cartridge cases. But the paper, which tested only visual comparison abilities and included not a single professional examiner within its participant pool, has done nothing of the sort. Its authors specifically reject the very inference Marshall draws from their findings, noting that "[a]s participants were untrained novices, it is unclear whether these results generalize to practicing professionals" [15]. Really though, we are grateful for his directing readers to Growns et al.'s excellent work because, in reality, it supports the points made in our article: citing research showing domain expertise effects in related fields while acknowledging that none yet exist in the realm of firearms examination, and emphasizing the larger-than-expected role of a "domain-general ability" that "varies naturally in the general population" rather than emerging from training and experience (a finding that would logically impel further study, not dispense with the need to compare expert and novice performance). In other words, while we remain perplexed by his belief that this citation served his ends—especially given the inconsistency between his view that validity is settled and the authors' statement that research is only beginning to explore expertise in this kind of comparison [15]—we thank him for including it, as it illustrates how to adequately establish the accuracy of practitioners.
Third, Marshall fares no better in mounting a defense of his own effort to validate virtual comparison microscopy [16]. He complains that we quoted the study as saying "[t]here was no intention to select pairs for the elimination sets in an attempt to lead examiners into making a false positive source attribution," without the parenthetical that followed: "(e.g. pairs that strongly carry subclass characteristics)," and that in so doing we failed to recognize the way the latter "changes the meaning" of the quotation. Given the difficulty his coauthors—or, indeed, Marshall—now have with what they originally wrote, we cannot say who misunderstands whose words. The distinction he misses is that the parenthetical uses e.g. (meaning one example among many) rather than i.e. (meaning that is). Thus, in portraying the Knowles study as making no efforts whatsoever to inject challenging pairs into its elimination sets, we have neither misunderstood nor misrepresented the paper. Nor can we now be faulted for declining to assume grammatical usage errors on the part of his coauthors. More to the point, however the parenthetical was intended, our concerns about the study's bias against testing challenging eliminations stand. He provides no support beyond anecdotal conjecture about the rarity of subclass characteristics, which cannot justify ignoring the evaluation of practitioners grappling with such marks, especially when those marks have produced staggering rates of misidentification when used to test examiners [12]. More troublingly still, he calls our assertions about the study's nonexistent challenge "categorically misleading," yet the evidence he cites applied exclusively to identifications. He repeatedly dismisses as inconsequential the different-source comparisons (just 20% of the total study) on the ground that they "were not significant to the final outcome of the research," despite the study listing first (and thus, logically, as primary) the rationale "to reduce the potential for identification response bias." Given the harms to criminal defendants from errors based on individual identifications [17, 18], no laboratory's policies can excuse these shortcomings or satisfy those who have demanded rigorous measurement of misidentification.
Having dispensed with what substantive critique we could divine from his commentary, a final, albeit philosophical, distinction between our views and his warrants drawing. Early in his remarks, he paradoxically accuses critics both of remaining perpetually tethered to the record as it existed in 2016 and of seeking, in his opinion, to "move the goal posts" on what adherents of the discipline must demonstrate to assuage doubts. We frankly cannot understand the former claim given that, as he himself concedes, our understanding has expanded where appropriate rather than merely harkening back to PCAST's criteria. The latter assertion we find truly disquieting. No scientist should bemoan the chance to expand the base of knowledge or to develop new techniques and technologies. Only the most sickly and stunted version of the scientific method would harbor such reverence for the precedent of the past. Down that road lie the dunkers of women seeking to prove witchcraft, the phrenologists measuring bumps on skulls to uphold a racist hegemony, and the backwater physicians still bleeding those burdened by illness, all of them enraged that empiricism would replace tradition. We hold a grander vision of the scientific endeavor and its research, and we relish the opportunity to prove the naysayers wrong or to be proven wrong ourselves, because when life and liberty are impacted by the methods we treasure, the bare minimum is to ground our beliefs in evidence rather than desire. If examiners hope to see their discipline stand again on solid literature and under the law, they must take this more expansive, if formidable, view. They must cease begrudging the goalposts and continue the long, hard grind of reflection and research. In short, they must make the Marshalls of the world a minority.