The false promise of firearms examination validation studies: Lay controls, simplistic comparisons, and the failure to soundly measure misidentification rates
Richard E. Gutierrez,
Emily J. Prokesch
Journal of Forensic Sciences,
Journal Year:
2024,
Volume and Issue:
69(4), P. 1334 - 1349
Published: April 29, 2024
Abstract
Several studies have recently attempted to estimate practitioner accuracy when comparing fired ammunition. But whether this research has included sufficiently challenging comparisons, dependent upon expertise for accurate conclusions regarding source, remains largely unexplored in the literature. Control groups of lay people comprise one means of vetting this question, by assessing whether comparison samples were at least challenging enough to distinguish between experts and novices. This article therefore utilizes such a group, specifically 82 attorneys, as a post hoc control, and juxtaposes their performance on a set of cartridge case images from a commonly cited study (Duez et al., J Forensic Sci. 2018;63:1069–1084) with that of the original participant pool of professionals. Despite lacking the kind of formalized training and experience common to the latter, our participants displayed an ability, generally, to distinguish cartridge cases fired by the same versus different guns across the 327 comparisons they performed. And while their rates lagged substantially behind those of professionals on same-source comparisons, their performance on different-source comparisons was essentially indistinguishable from that of trained examiners. This indicates that, although the study we vetted may provide useful information about professional performance when identifying cartridge cases fired by the same gun, it has little to offer in terms of measuring examiners' ability to distinguish between different guns. If similar issues pervade other studies, then there is reason to question reliance on the false-positive rates they have generated.
Language: English
The Hawthorne effect in studies of firearm and toolmark examiners
Journal of Forensic Sciences,
Journal Year:
2025,
Volume and Issue:
unknown
Published: April 10, 2025
Abstract
The Hawthorne effect refers to the tendency of individuals to behave differently when they know they are being studied. In the forensic science domain, concerns have been raised about the "strategic examiner," where an examiner uses different decision thresholds depending on whether he or she is in a test situation or working an actual case. The blind testing conducted by the Houston Forensic Science Center ("HFSC") in firearms examination presents a unique opportunity to test the hypothesis that the rate of inconclusive calls differs for discovered vs. undiscovered tests in firearm examination. Over 5 years, 529 test item comparisons were filtered into casework at HFSC. The rate of inconclusive calls for discovered items was 56.4%, while for undiscovered items it was 39.3%. Thus, the percentage of inconclusive calls was 43.5% higher among discovered than undiscovered items. This pattern of results held for bullet (83% vs. 59%) and cartridge case (29% vs. 20%) comparisons, and for both same-source and different-source comparisons. These findings corroborate concerns that examiners behave differently when they know they are being tested, and demonstrate the necessity of blind testing if the research goal is to evaluate performance while conducting actual casework.
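The relative difference reported in the abstract can be verified with a quick arithmetic check, using only the two rates quoted above:

```python
# Inconclusive-call rates reported in the abstract:
# 56.4% for discovered test items vs. 39.3% for undiscovered (blind) items.
discovered = 0.564
undiscovered = 0.393

# Relative increase of the discovered rate over the undiscovered rate.
relative_increase = (discovered - undiscovered) / undiscovered
print(f"{relative_increase:.1%}")  # → 43.5%
```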
Language: English
Shifting decision thresholds can undermine the probative value and legal utility of forensic pattern-matching evidence
Proceedings of the National Academy of Sciences,
Journal Year:
2023,
Volume and Issue:
120(41)
Published: Oct. 2, 2023
Forensic pattern analysis requires examiners to compare the patterns of items such as fingerprints or tool marks to assess whether they have a common source. This article uses signal detection theory to model examiners' reported conclusions (e.g., identification, inconclusive, exclusion), focusing on the connection between an examiner's decision threshold and the probative value of the forensic evidence. It uses a Bayesian network to explore how shifts in decision thresholds may affect the rates and ratios of true and false convictions in a hypothetical legal system. It demonstrates that small shifts in decision thresholds, which can arise from contextual bias, can dramatically affect the probative value of pattern-matching evidence and its legal utility.
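The signal-detection idea described above can be illustrated with a minimal sketch. The distributions and threshold values here are hypothetical (not taken from the article); the point is only the mechanism: an "identification" is called when a similarity score exceeds a threshold, and the probative value of that call is the likelihood ratio of an identification given same-source versus different-source items.

```python
import math

def normal_sf(x, mu, sigma):
    """Survival function P(X > x) for a Gaussian similarity score."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2)))

# Hypothetical equal-variance model: same-source pairs score higher on average.
MU_SAME, MU_DIFF, SIGMA = 2.0, 0.0, 1.0

def likelihood_ratio_of_id(threshold):
    """Probative value of an 'identification' made above the threshold:
    P(ID | same source) / P(ID | different source)."""
    hits = normal_sf(threshold, MU_SAME, SIGMA)            # true positives
    false_alarms = normal_sf(threshold, MU_DIFF, SIGMA)    # false positives
    return hits / false_alarms

# Shifting the decision threshold changes the evidence's probative value.
for t in (1.0, 1.5, 2.0):
    print(f"threshold={t:.1f}  LR={likelihood_ratio_of_id(t):.1f}")
```

Under these assumed distributions, a more conservative (higher) threshold yields identifications with a larger likelihood ratio, which is exactly why threshold shifts between testing and casework matter.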
Language: English
Scientific guidelines for evaluating the validity of forensic feature-comparison methods
Proceedings of the National Academy of Sciences,
Journal Year:
2023,
Volume and Issue:
120(41)
Published: Oct. 2, 2023
When it comes to questions of fact in a legal context, particularly questions about measurement, association, and causality, courts should employ the ordinary standards of applied science. Applied sciences generally develop along a path that proceeds from basic scientific discovery about some natural process, to the formation of a theory of how the process works and what causes it to fail, to the development of an invention intended to assess, repair, or improve the process, to the specification of predictions of the instrument's actions and, finally, to empirical validation to determine whether the instrument achieves the intended effect. These elements are salient in, and deeply embedded in, the cultures of medicine and engineering, both of which primarily grew out of basic sciences. However, the inventions that underlie most forensic science disciplines have few roots in basic science, and they do not have sound theories to justify their predicted results or tests to prove that they work as advertised. Inspired by the "Bradford Hill Guidelines," the dominant framework for causal inference in epidemiology, we set forth four guidelines that can be used to establish the validity of forensic feature-comparison methods generally. This is a checklist for establishing a threshold of minimum validity; no magic formula determines when particular methods or hypotheses have passed the necessary threshold. We illustrate these guidelines by considering the discipline of firearm and tool mark examination.
Language: English
Inconclusive conclusions in forensic science: rejoinders to Scurich, Morrison, Sinha and Gutierrez
Hal R. Arkes,
Jonathan J. Koehler
Law Probability and Risk,
Journal Year:
2022,
Volume and Issue:
21(3-4), P. 175 - 177
Published: Dec. 1, 2022
To the Editor,
We thank professors Scurich, Morrison, Sinha and Mr Gutierrez for their thoughtful comments on our article (Arkes and Koehler, 2021). We agree with Scurich (2023) that when an examiner knows he or she is being tested, the results of such a test are highly suspect. If examiners can avoid making errors by deeming a comparison to be inconclusive, and if inconclusives are never deemed indicative of error, then 'strategic' examiners can inflate their accuracy levels by rendering an inconclusive decision on any difficult test. Such a test will not provide an unbiased measure of an examiner's accuracy. But this is not reason enough to change the way accuracy is measured. In support of a different view of the matter, an analogy offered by Kaye et al. (2022) is invoked, in which a student answers 'I don't know' to a true–false question. If the student should know the answer, they say, that answer should be counted as an error. We do not think Kaye's analogy is apt in this context. Teachers are charged with determining whether a student should know the answer. For example, if the topic was covered in the required reading or a lecture, the student should know the answer. In a forensic test, one cannot make that determination. Dror (2020) suggested strategies that might help determine it, but in response (2022) we offered reasons why those strategies are inadequate.
Language: English
Authors' response
Richard E. Gutierrez,
Emily J. Prokesch
Journal of Forensic Sciences,
Journal Year:
2024,
Volume and Issue:
69(6), P. 2346 - 2348
Published: Aug. 26, 2024
We read Mr. Marshall's commentary with interest, but unfortunately his submission—riddled with ad hominem attacks on the motivations of firearms examination's critics and just shy of reference-less—nigh-uniformly concentrates, not on our study design or data, but on bemoaning the notion that examiners' grandiose claims of near infallibility (e.g., the "practical impossibility" of error [1]) should rest on empirical grounds rather than tradition and intention. In this way, Marshall has built a soapbox in lieu of a scientific critique, one, despite the warnings of the old parable, raised up on sand rather than stone.
Indeed, his preference for weaving a self-serving narrative of victimization over engaging substantively with our findings evinces itself even when considering the attention to detail he must have employed in reading our article. How else, than through a cursory review, could he criticize us for treating Duez et al. [2] as an "error rate study"—rather than simply a "validation of the technique of virtual comparison microscopy"—given our repeated and explicit discussion of the ways in which the FBI, the DOJ, and one of the study's authors have promoted it as proof of low false-positive rates in the field of firearms examination [3-6]? Had he thoroughly evaluated our treatment of control groups—including its reference to the long and storied history of such controls [7], and also to studies that utilized novices to contextualize the performance of forensic professionals [3, 8, 9]—would he, really, characterize our suggestion that validation studies include such groups as "new and unusual"? And why else would he have felt it necessary to point out limitations (the use of attorneys as a control group of lay people, and of photos as opposed to 3D scans of cartridge cases) that we had already forthrightly acknowledged in our original article [3]? Had he gone further—for example, had he managed to muster any argument supported by references or data that we might have erred in suggesting these choices actually disadvantaged our participants—he would have found us ready and willing to engage in a robust back and forth, and, if warranted, to reconsider our conclusions. That he did not leaves us little to respond to. But because we cannot, at this juncture, ask him to go and try again, a few points warrant brief discussion.
A necessary condition is something that must happen if another thing is to happen. Being human is a necessary condition of going to college, since colleges do not admit other animals. It is not a sufficient condition in that, if present, the other thing is not bound to happen; so it is with being human, since not all human beings go to college. By contrast, dropping a lighted match into a bucket of gasoline is a sufficient condition of starting a fire, but not a necessary condition, since there are many other ways of starting one, such as rubbing two sticks together. Notice that this category may be empty: a condition may be neither necessary nor sufficient for a fire; usually several conditions concur in an act with specific causal consequences. [11]
Along similar lines, it makes little sense to criticize our article's focus (much less to accuse us of malicious intent) merely because the ability to separate professional from novice does not alone suffice to render a study capable of definitively proving validity. Just as control groups have value even though no single set of results can ever (in isolation) resolve the debate, suitable research designs matter because they determine whether studies answer the question of interest. Physicians, after all, never ignored HIV simply because it is necessary, but not sufficient, for developing the more serious condition of AIDS [14]. Viewed in this light, Marshall's characterization of our study as some sinisterly plotted "gotcha" moment shows its true colors as a knee-jerk reaction to a well-reasoned critique.
Second, his solitary foray into rebutting our conclusions with data of his own—his recent paper with Growns [15]—relies on gross mischaracterizations of that study's aims and limitations. In short, he portrays it as establishing that professionals are likely to outperform novices when comparing fired bullets and cartridge cases. But the paper, which tested only general visual comparison abilities and included but a single professional within its participant pool, has done nothing of the sort. Its authors specifically reject the very inference Marshall draws from their findings, noting that "[a]s participants were untrained novices, it is unclear whether these results generalize to practicing professionals" [15]. Really though, we are grateful to Marshall for directing readers to Growns et al.'s excellent work because, in reality, it supports the points we made in our article: citing research showing the role of domain expertise in other fields, acknowledging that no comparable studies exist in the realm of firearms examination, and emphasizing the larger-than-expected role of a "domain-general ability" that "varies naturally in the general population" rather than emerging from training and experience (a finding that should logically impel further comparison, not dispense with the need to compare professional and novice performance). In other words, while we remain perplexed by Marshall's belief that this citation served his ends—especially given the inconsistency between his view of validity as settled and the authors' statement that "research is only beginning to explore" this kind of comparison [15]—we thank him for including it and for illustrating how much work remains to adequately establish the accuracy of practitioners.
Third, Marshall fares no better in mounting a defense of his own effort to validate virtual comparison microscopy [16]. He quoted that study as saying "[t]here was no intention to select pairs for the elimination sets in an attempt to lead to the making of a false positive source attribution," with a parenthetical that followed: "(e.g. strong carry over of subclass characteristics in the pairs)," and in doing so failed to recognize the way the latter "changes the meaning" of the quotation. The difficulty is that it is his coauthors or, indeed, Marshall—whether he now or originally wrote those words, we cannot say—who misunderstands them. The parenthetical uses e.g. (meaning one example among many) rather than i.e. (meaning that is). Thus, in portraying Knowles as making no efforts whatsoever to inject challenging pairs into the elimination sets, we neither misunderstood nor misrepresented the paper. We can hardly be faulted now for assuming no grammatical usage errors on the part of his coauthors.
More to the point, however, whatever term was used in said parenthetical, our concerns about a bias against testing challenging eliminations stand. Marshall provides no support beyond anecdotal conjecture about the rarity of subclass characteristics, which cannot justify ignoring the evaluation of practitioner performance when grappling with such marks, especially when they have produced staggering rates of misidentification when used to test examiners [12]. More troublingly still, he calls our assertions about this nonexistent challenge "categorically misleading," yet the evidence he cites applied exclusively to identifications. And he repeatedly dismisses as inconsequential the different source comparisons (just 20% of the total in his study), which "were not significant to the final outcome of the research," despite listing first (and thus, logically, as primary) the rationale "to reduce the potential for identification response bias." Given the harms to criminal defendants from convictions based on individual misidentifications [17, 18], no laboratory's policies can excuse such shortcomings or satisfy those who have demanded sound measurement of misidentification.
Having dispensed with what substantive critique we could divine from his commentary, a final, albeit philosophical, distinction is worth drawing between our views and his. Early in his remarks, Marshall paradoxically accuses critics both of remaining perpetually tethered to the record as it existed in 2016 and of a will, in his opinion, to "move the goal posts" on what adherents of the discipline must demonstrate to assuage doubts. We frankly do not understand the former claim given that, as he himself concedes, we expanded our understanding of appropriate validation rather than merely harkening back to PCAST's criteria. The latter assertion, however, we find disquieting. No scientist should bemoan the chance to expand the base of knowledge or to develop new techniques and technologies. Only the most sickly and stunted version of the scientific method would harbor such a preference for the precedent of the past. Down that road lie the dunkers of women seeking to prove witchcraft, the phrenologists measuring bumps on skulls to uphold racist hegemony, and the backwater physicians still bleeding those burdened with illness; all of them enraged that empiricism would replace tradition. We hold a grander vision of the scientific endeavor and its research, and we relish the opportunity to prove naysayers wrong or to be proven wrong ourselves, because when life and liberty are impacted, the methods we treasure must, at a bare minimum, ground beliefs in evidence rather than desire. If practitioners hope to see their discipline stand again on solid footing in the literature and under the law, they should take this more expansive, if formidable, view. They should cease begrudging the goalposts and continue the long, hard grind of reflection and research. That would make the Marshalls of the world a minority.
Language: English
Authors' response
Richard E. Gutierrez,
Emily J. Prokesch
Journal of Forensic Sciences,
Journal Year:
2024,
Volume and Issue:
70(1), P. 405 - 408
Published: Nov. 13, 2024
We thank the commentors for drawing our attention to two typographical errors in our original article—the spelling of Dr. Lilien's name and the inconclusive rate provided in table 2 for the "Duez examiners," which should have been 13% as opposed to 15%—though we emphasize that the latter did not carry over into the other figures or calculations (e.g., confidence intervals) throughout the remainder of the piece. Unfortunately, beyond its contribution to copy editing, their letter amounts to, at best, much ado about nothing, and, at worst, something akin to statistical malpractice. But wherever along that spectrum readers place Weller et al.'s commentary after reviewing this reply, the analysis Weller et al. provide cannot support their strongly held (and potentially financially motivated) belief that they have contradicted the central claims of our article and proven the value of their own work (Duez et al. [1]) even as an evaluation of performance on different source comparisons. Indeed, given that we set out to explore if the post hoc inclusion of a control group could provide insights into whether the comparisons in Duez et al. (especially the different source comparisons) adequately explored the "full range and distribution of types and difficulty normally seen in casework" [2-4], we might well end this response right now merely by noting the commentors' admission that the samples used "look so [distinct that] 'laypersons' are unlikely to misidentify them." But because gimmicks might otherwise prove persuasive to casual readers or those with only the most superficial understanding of hypothesis testing, we feel obliged to offer a more fulsome rebuttal.
To begin, the caveat that they "feel[] no need to publish a detailed line-by-line analysis" does little to compensate for their perfunctory, self-absolving, and internally inconsistent discussion of the appropriate characteristics of a control group, the use of static images, and the binning of examiners and trainees. In our article, we specifically cautioned that we drew our participant pool from "a sample of convenience" that was not necessarily "representative of defense attorneys as a whole," much less of novices writ large [4]. And we conceded that we would have preferred to base our lay comparisons off the same materials provided to participants in the study itself (i.e., the 3D scans of the cartridge cases in CCTS2); indeed, we recommended that future studies endeavor to correct both limitations. Their criticisms ultimately fail to expand credibly upon these concessions.
Initially, they speculate that the obliquely lit images may have advantaged our participants. Not only does this argument disserve firearms examiners—whose expertise, if they possess any, it reduces to mere lighting conditions rather than comparison ability—it also ignores the commentors' own role in forcing us to use less-than-ideal samples: they neither explain their years-long failure to make the CCTS2 scans public by uploading them to the reference database maintained by the National Institute of Standards and Technology [5], nor have they corrected that lack of transparency since our article. Instead, they have chosen a path of scientific entrapment, ensuring that outside researchers cannot attempt to reproduce or reassess their findings without opening themselves up to criticism regarding inexact matching of test items. At bottom, the Hobson's choice they have created cannot coexist with legitimate and open debate.
Much the same is true of how to categorize participants. They note again that "some trainees had experience with toolmark examination unlike the 'lay' attorneys" in our NFC group. However, they do not forthrightly acknowledge their own role (by failing to collect sufficient demographic data) in creating any ambiguities about the level of training enjoyed by each trainee. It strikes us as thoroughly unfair and disingenuous to exploit our collection of (and candor about) such data by contrasting our supposed failures with theirs on these fronts. But putting aside concerns of hypocrisy and self-absolution, their simultaneous treatment of our attorney participants as quasi-experts—only three of whom had ever completed a case involving firearms evidence before the study, and a majority of whom had never even cross-examined an examiner or received training in the field [4]—and of trainees as undeserving of that moniker cannot withstand scrutiny. With all due respect to the Cambridge Dictionary they cite to define "layperson" [6], we prefer to draw not on the expertise of English wordsmiths, but on that of the courts that must grapple with whether to qualify a particular witness as an expert [7]. As we noted, the legal definition of an expert sets a low bar indeed [4, 8]. And while those relatively early in their training programs would likely satisfy such minimal requirements (armed as they would be with days of education specific to, and practice in, toolmark comparisons), attorneys surely would not (on the basis of indirect exposure gained through cross-examination, or our 20 min introduction) [9-11]. Sad would be the day, though, when practitioners could not outcompete outsiders with zero hands-on experience. Wherever lines are drawn when qualifying witnesses, our decision, for our main conclusions, to bin trainees with professionals aligns with best practice guidance for classifying professionals in feature comparison fields, which recommends grouping those in introductory programs with professionals [12]. Unless and until the forensic community imposes mandatory standards recognized by the courts, simply claiming otherwise settles nothing.
Preliminaries aside, the commentors devote the bulk of their letter to using Fisher's exact test to evaluate a cherrypicked portion of our data rather than the results overall, alleging that this selective approach has "rejected [our] published conclusions" by having shown a statistically significant difference between groups. We doubt the wisdom of non-statisticians debating the merits and application of various mathematical methods for testing the relationship between training/experience and performance. We avoided p-values and significance thresholds outright out of concern that their use would contradict best practices: they have provoked controversy across fields, with some journals banning them outright and scholars issuing complex guidelines to guard against misleading and unsupported inferences [13-15]. We continue to believe that—by reporting confidence intervals and focusing on patterns and trends in the data—we engaged in sound practices [13], and that the commentors' decision to forge ahead into this morass alone (or, if with a statistician, in an undisclosed partnership) has made avoiding the accompanying pitfalls, and the reality of a debate among non-experts, impossible.
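For readers unfamiliar with the procedure at issue: Fisher's exact test computes a p-value directly from the hypergeometric distribution of a 2×2 contingency table. The sketch below is purely illustrative; the counts are hypothetical and drawn from neither study, and the implementation is a plain two-sided version, not either party's analysis.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].
    Sums the probabilities of all tables (with the same margins) that are
    no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def p_table(x):
        # Hypergeometric probability of the table whose top-left cell is x.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = p_table(a)
    lo, hi = max(0, col1 - row2), min(col1, row1)
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs + 1e-12)

# Hypothetical counts: errors vs. correct calls in two participant groups.
p = fisher_exact_two_sided(3, 97, 12, 88)
print(f"p = {p:.4f}")
```

Whether such a p-value supports any inference depends, as the discussion above argues, on which comparisons were selected for testing in the first place.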
Before diving into their test, we must admit considerable confusion about what inference they hope to draw from it. Specifically, they repeatedly, and quite forcefully, contend that they have "rejected [our] published conclusions," yet the results they report, not confined just to Duez et al., almost entirely confirm our findings. We expressed concern about a bias away from eliminations [4, 16, 17]. Their analysis (at least as to its results) supports that concern: some 13 orders of magnitude separate the p-values they report when comparing performance on same-source versus different-source samples. We made a point of indicating that professionals' accuracy on same-source comparisons cautioned "against sweeping definitive conclusions" about identification in actual cases, and stated that "our results show separation between participants" on those metrics (performance in terms of sensitivity, false-negative rate, and inconclusive rate). Our conclusion and their result align. We reported that the false-positive rates of the "Duez examiners" were essentially indistinguishable from our participants'. Again, their analysis confirms that conclusion. The differences they observed in terms of specificity align with our treatment of comparable participants (and we went further in explaining the disadvantages faced by the latter, including time constraints and the inability to zoom). The one test showing separation between professional and novice on those metrics, we had already caveated in our conclusions. Their overall results thus confirm ours, and incorporating (much less overrelying on) significance thresholds would arguably run counter to what they reported. How, then, can they find contradiction (rather than widespread correspondence with reality)? Through a form of "p hacking" [14, 15], a claim not rebutted by limiting their analysis to one study's full data (as opposed to our "all" category).
To be clear, the problem is not merely caused by the inappropriately binary logic of deploying discrete thresholds (like the 0.05 relied on by Weller et al.) [14]. Rather, as critics have consistently noted, "researcher degrees of freedom"—that is, the ability to pick and choose what to include and exclude, and "[w]hich conditions should be combined and which ones compared"—allow authors to manipulate analyses to near-guarantee a favored result [15]. The American Statistical Association has chastised such practices, emphasizing that "[c]herry-picking promising findings, also known as data dredging, significance chasing, significance questing, selective inference, and 'p-hacking,' leads to a spurious excess of statistically significant results in the published literature and should be vigorously avoided." To remedy such problems, the ASA voiced a consistent mandate of transparency: "P-values and related analyses" should not be reported "selectively"; "[c]onducting multiple analyses of the data and reporting only those with certain" p-values "renders the reported p-values essentially uninterpretable" [14]; and, as to eliminating observations, "authors" should disclose which "observations" were "included." The commentors did not abide by these principles: although we explained at length why certain results ought not be compared, they neglect to include the full study. They ran no analysis that disfavors their position, choosing instead to test exclusively the comparisons that favor their desired outcomes (clean performance by professionals, and probative value for their work). So regardless of the process preceding the creation of their commentary, it runs afoul of the transparency principle, and that concealment lies at the malignant heart of their inferences. Had they done as we did and considered the data overall, they could not have claimed contradiction in this way.
Consider Table 1: the p-values we generated from the results overall uniformly exceed their threshold by wide margins. In other words, were we content to rely on significance thresholds, we could have dispensed with our handwringing and caveats around specificity: in a world where bars to qualification are non-existent and a p-value is all that is necessary to compare layperson to examiner, we could remain firm that Duez et al. therefore offers little to labs when it comes to questions of ground truth eliminations. Whether the commentors knew this and refused to say so, stubbornly failed to run tests that would not reinforce their preexisting beliefs, or arrived at their approach in ignorance, we cannot say.
Beyond these inferential inadequacies, the commentary forcefully brings us back to PCAST's emphasis on the involvement (during validation testing) of "independent third parties" with no "stake in the outcome" [3]. While law enforcement has chafed at that requirement [18-20], we see this exchange as emblematic of PCAST's wisdom. Those with a financial stake in the virtual comparison microscopy techniques they sought to validate now defend them in this commentary. Their study has become a component of the company's marketing, appearing on its website below the text "Cadre's scanning hardware, VCM software, and algorithms" have been "developed, validated, and peer reviewed" in "journals" [21]. In our view, only the sway of such interests can explain several concerning aspects of the commentary that would be a struggle for disinterested researchers to produce: like its arguments from authority (their extended recitation of awards received), or their refusal to acknowledge that an Illinois judge found one of them (Mr. Weller) not "remarkably credible" but "lack[ing] objectivity in virtually every area," discounting, in part for that reason, his highly compensated testimony [22]. No doubt, with finances on the line, they also try to buttress their position by pointing to a follow-up study [23] whose author acknowledged under oath that it involved no comparisons where "similarities, features of the kind" that might "dupe" or "trick" an examiner into "making a false positive" were present [24]. We do not pretend to independence or disinterest ourselves (given our long careers defending the indigent accused and confronting the admissibility of this evidence), but our allegiance is transparently disclosed, and we have fully supplied our raw data (including linked data) for scrutiny by other researchers. Of the commentors, whose interest stems not just from philosophy but from dollars and cents, we cannot demand as much. Given their prior work, that would be quite a leap.
Language: English
More unjustified inferences from limited data in
Richard E. Gutierrez
Law Probability and Risk,
Journal Year:
2024,
Volume and Issue:
23(1)
Published: Jan. 1, 2024
Abstract
In recent years, multiple scholars have criticized the design of studies exploring the accuracy of firearms examination methods. Rosenblum et al. extend those criticisms to the work of Guyll et al. on practitioner performance when comparing fired cartridge cases. But while they thoroughly dissect issues regarding equiprobability bias and positive predictive values in that study, they do not delve as deeply into other areas, such as variability in participant performance, as well as the sampling of participants and test samples, that further undercut the ability to generalize Guyll et al.'s results. This commentary extends what Rosenblum et al. began and explores how the low rates of error reported by Guyll et al. likely underestimate the potential for misidentifications in casework. Ultimately, given their reliance on samples of convenience, the authors should not have gone beyond descriptive statistics, but instead drew conclusive inferences to classify firearms examination as "a highly valid forensic technique."
Language: English
Inconclusive Conclusions in Forensic Science: Rejoinders to Scurich, Morrison, Sinha & Gutierrez
Hal R. Arkes,
Jonathan J. Koehler
SSRN Electronic Journal,
Journal Year:
2023,
Volume and Issue:
unknown
Published: Jan. 1, 2023
We agree with Scurich (2023) that when an examiner knows he or she is being tested, the results of such a test are highly suspect. If examiners can avoid making errors by deeming a comparison to be inconclusive, and if inconclusives are never deemed indicative of error, then "strategic" examiners can inflate their accuracy levels by rendering an inconclusive decision for any difficult test. Such a test will not provide an unbiased measure of an examiner's accuracy. But this is not reason enough to change the way accuracy is measured. We support the view expressed in Morrison (2023), but in our paper we accepted the world as it currently exists, one in which examiners use categorical conclusions. Finally, Sinha and Gutierrez doubt that blind testing would resolve all of the issues mentioned in their final sentence. We think it would be a major step in the right direction.
Language: English