Cited by Balancing validity and reliability as a function of sampling variability in forensic voice comparison

Accuracy of comparison decisions by forensic firearms examiners

Keith L. Monson, Erich D. Smith, Eugene M. Peters, et al.

Journal of Forensic Sciences, Journal Year: 2022, Volume and Issue: 68(1), P. 86 - 100

Published: Oct. 1, 2022

This black box study assessed the performance of forensic firearms examiners in the United States. It involved three different types of firearms and 173 volunteers who performed a total of 8640 comparisons of both bullets and cartridge cases. The overall false-positive error rate was estimated as 0.656% and 0.933% for bullets and cartridge cases, respectively, while the rate of false negatives was estimated as 2.87% and 1.87%, respectively. The majority of errors were made by a limited number of examiners. Because chi-square tests of independence strongly suggest that error probabilities are not the same for each examiner, these are maximum-likelihood estimates based on the beta-binomial probability model and do not depend on an assumption of equal examiner-specific error rates. Corresponding 95% confidence intervals are (0.305%, 1.42%) and (0.548%, 1.57%) for false positives for bullets and cartridge cases, respectively, and (1.89%, 4.26%) and (1.16%, 2.99%) for false negatives for bullets and cartridge cases, respectively. The results of this study are consistent with prior studies, despite its comprehensive design and challenging specimens.

Language: English


Validity of forensic cartridge-case comparisons

Max Guyll, Stephanie Madon, Yueran Yang, et al.

Proceedings of the National Academy of Sciences, Journal Year: 2023, Volume and Issue: 120(20)

Published: May 8, 2023

This article presents key findings from a research project that evaluated the validity and probative value of cartridge-case comparisons under field-based conditions. Decisions provided by 228 trained firearm examiners across the US showed that forensic cartridge-case comparison is characterized by low error rates. However, inconclusive decisions constituted over one-fifth of all decisions rendered, complicating evaluation of the technique's ability to yield unambiguously correct decisions. Specifically, restricting evaluation to only the conclusive decisions of identification and elimination yielded true-positive and true-negative rates exceeding 99%, but incorporating inconclusives caused these values to drop to 93.4% and 63.5%, respectively. The asymmetric effect on the two rates occurred because inconclusive decisions were rendered six times more frequently for different-source than same-source comparisons. Considering probative value, which is a decision's usefulness for determining a comparison's ground-truth state, conclusive decisions predicted their corresponding ground-truth states with near perfection. Likelihood ratios (LRs) further showed that conclusive decisions greatly increase the odds of a comparison's ground-truth state matching the state asserted by the decision. Inconclusive decisions also possessed probative value, predicting different-source status and having an LR indicating that they increase the odds of different-source status. The study also manipulated comparison difficulty by using two firearm models that produce dissimilar cartridge-case markings. The model chosen for being more difficult received more inconclusive decisions for same-source comparisons, resulting in a lower true-positive rate compared to the less difficult model. Relatedly, inconclusive decisions for the less difficult model were more strongly predictive of different-source status.

Language: English


Misuse of statistical method results in highly biased interpretation of forensic evidence in Guyll et al. (2023)

Michael Rosenblum, Elizabeth T. Chin, Elizabeth L. Ogburn, et al.

Law Probability and Risk, Journal Year: 2024, Volume and Issue: 23(1)

Published: Jan. 1, 2024

Since the National Academy of Sciences released their report outlining paths for improving reliability, standards, and policies in the forensic sciences (NAS, 2009), there has been heightened interest in evaluating the scientific validity of forensic science disciplines. Guyll et al. (2023) seek to evaluate the validity of forensic cartridge-case comparisons. They conducted an experiment to test the accuracy of firearms examiners. They then describe how triers of fact such as a judge or jury in a criminal case, who are initially unbiased and have not yet seen any evidence, should apply the results to the case at hand. Specifically, Guyll et al. (2023) use Bayes' rule to calculate the posterior probability that a cartridge case found at a crime scene was fired from a reference gun (often linked to the defendant), given the decision of the examiner. A key input to this calculation is the prior odds that the cartridge case was fired from the reference gun, which they set to 1 and claim to be unbiased. However, as we explain below, a prior odds of 1 is typically highly biased against the defendant and can lead judges and jurors in criminal trials to grossly misunderstand how to interpret the evidence. It is imperative to address the erroneous statistical argument in Guyll et al. (2023), as it is being presented by the prosecution in an ongoing homicide case (DC Superior Court, 2023). We also discuss some other aspects of the study design and analysis. Our focus is on specific issues and is not exhaustive.
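The sensitivity to the prior described in this abstract can be illustrated with the odds form of Bayes' rule, where posterior odds equal prior odds times the likelihood ratio. The sketch below is illustrative only: the likelihood ratio of 100 and the alternative prior of 1/999 are hypothetical values, not figures from either paper, and the function name is invented for this example.

```python
def posterior_probability(prior_odds: float, likelihood_ratio: float) -> float:
    """Odds form of Bayes' rule: posterior odds = prior odds * LR."""
    posterior_odds = prior_odds * likelihood_ratio
    # Convert odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical likelihood ratio for an "identification" decision.
LR = 100.0

# Prior odds of 1 correspond to a 50% prior probability that the
# reference gun fired the cartridge case.
p_even = posterior_probability(1.0, LR)

# Hypothetical alternative: the reference gun is one of 1000 equally
# plausible candidates, i.e. prior odds of 1/999.
p_small = posterior_probability(1.0 / 999.0, LR)

print(f"prior odds 1     -> posterior probability {p_even:.3f}")
print(f"prior odds 1/999 -> posterior probability {p_small:.3f}")
```

With the same examiner decision, the two priors give very different posterior probabilities, which is why the choice of prior odds matters so much to the argument.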

Language: English


The false promise of firearms examination validation studies: Lay controls, simplistic comparisons, and the failure to soundly measure misidentification rates

Richard E. Gutierrez, Emily J. Prokesch

Journal of Forensic Sciences, Journal Year: 2024, Volume and Issue: 69(4), P. 1334 - 1349

Published: April 29, 2024

Several studies have recently attempted to estimate practitioner accuracy when comparing fired ammunition. But whether this research has included sufficiently challenging comparisons dependent upon expertise for accurate conclusions regarding source remains largely unexplored in the literature. Control groups of lay people comprise one means of vetting this question, of assessing whether comparison samples were at least challenging enough to distinguish between experts and novices. This article therefore utilizes such a group, specifically 82 attorneys, as a post hoc control and juxtaposes their performance on a comparison set of cartridge case images from one commonly cited study (Duez et al. J Forensic Sci. 2018;63:1069–1084) with that of the original participant pool of professionals. Despite lacking the kind of formalized training and experience common to the latter, our lay participants displayed an ability, generally, to distinguish between cartridge cases fired by the same versus different guns in the 327 comparisons they performed. And while their accuracy rates lagged substantially behind those of the professionals on same-source comparisons, their performance on different-source comparisons was essentially indistinguishable from that of trained examiners. This indicates that although the study we vetted may provide useful information about professional accuracy when performing same-source comparisons, it has little to offer in terms of measuring examiners' ability to distinguish between cartridge cases fired by different guns. If similar issues pervade other accuracy studies, then there is little reason to rely on the false-positive rates they have generated.

Language: English


The Hawthorne effect in studies of firearm and toolmark examiners

Nicholas Scurich, Thomas D. Albright, Peter Stout, et al.

Journal of Forensic Sciences, Journal Year: 2025, Volume and Issue: unknown

Published: April 10, 2025

The Hawthorne effect refers to the tendency of individuals to behave differently when they know they are being studied. In the forensic science domain, concerns have been raised about the “strategic examiner,” where the examiner uses different decision thresholds depending on whether in a test situation or working an actual case. The blind testing conducted by the Houston Forensic Science Center (“HFSC”) in firearms examination presents a unique opportunity to test the hypothesis that the rate of inconclusive calls differs for discovered vs. undiscovered blind tests of firearm examination. Over 5 years, 529 test item comparisons were filtered into casework at the HFSC. The inconclusive rate for discovered items was 56.4%, while the inconclusive rate for undiscovered items was 39.3%. Thus, the percentage of inconclusive calls was 43.5% higher among discovered items than undiscovered items. This pattern of results held for bullet comparisons (83% vs. 59%) and cartridge case comparisons (29% vs. 20%) and for both same-source and different-source comparisons. These findings corroborate concerns that examiners behave differently when they know they are being tested and demonstrate the necessity of blind testing if the research goal is to evaluate the performance of examiners conducting actual casework.
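The 43.5% figure in this abstract is a relative difference between the two inconclusive rates, not a difference in percentage points. A minimal arithmetic check follows, assuming the 56.4% rate belongs to discovered items and the 39.3% rate to undiscovered items; the variable names are illustrative.

```python
# Inconclusive rates (%) quoted in the abstract; the assignment of each
# figure to discovered vs. undiscovered items is an assumption here.
discovered = 56.4
undiscovered = 39.3

# Absolute difference, in percentage points.
absolute_diff = discovered - undiscovered

# Relative difference, as a percent of the undiscovered rate.
relative_diff = 100.0 * absolute_diff / undiscovered

print(f"absolute difference: {absolute_diff:.1f} percentage points")
print(f"relative difference: {relative_diff:.1f}%")
```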

Language: English


Repeatability and reproducibility of comparison decisions by firearms examiners

Keith L. Monson, Erich D. Smith, Eugene M. Peters, et al.

Journal of Forensic Sciences, Journal Year: 2023, Volume and Issue: 68(5), P. 1721 - 1740

Published: July 2, 2023

In a comprehensive study to assess various aspects of the performance of qualified forensic firearms examiners, volunteer examiners compared both bullets and cartridge cases fired from three different types of firearms. They rendered opinions on each comparison according to the Association of Firearm & Tool Mark Examiners (AFTE) Range of Conclusions, as Identification, Inconclusive (A, B, or C), Elimination, or Unsuitable. In this part of the study, comparison sets used previously to characterize the overall accuracy of examiners were blindly resubmitted to examiners to assess the repeatability (105 examiners; 5700 comparisons of bullets and cartridge cases) and reproducibility (191 examiners of bullets, 193 examiners of cartridge cases; 5790 comparisons) of firearms examinations. Data gathered using the prevailing AFTE Range were also recategorized into two hypothetical scoring systems. Consistently positive differences between observed agreement and expected agreement indicate that the repeatability and reproducibility of examiners exceed chance agreement. When averaged over bullets and cartridge cases, the repeatability of comparison decisions (involving all five levels of the AFTE Range) was 78.3% for known matches and 64.5% for known nonmatches. Similarly averaged reproducibility was 67.3% for known matches and 36.5% for known nonmatches. For both repeatability and reproducibility, many of the observed disagreements were between a definitive and an inconclusive category. Examiner decisions are reliable and trustworthy in the sense that identifications are unlikely when examiners are comparing non-matching items, and eliminations are unlikely when they are comparing matching items.

Language: English


Accuracy and reproducibility of forensic tire examination decisions

Nicole Richetelli, Jan LeMay, Kensley M. Dunagan, et al.

Forensic Science International, Journal Year: 2024, Volume and Issue: 358, P. 112009 - 112009

Published: March 28, 2024

Language: English


An overview of log likelihood ratio cost in forensic science – Where is it used and what values can we expect?

Stijn van Lierop, Daniel Ramos, Marjan Sjerps, et al.

Forensic Science International Synergy, Journal Year: 2024, Volume and Issue: 8, P. 100466 - 100466

Published: Jan. 1, 2024

There is increasing support for reporting evidential strength as a likelihood ratio (LR) and increasing interest in (semi-)automated LR systems. The log-likelihood ratio cost (Cllr) is a popular metric for such systems, penalizing misleading LRs further from 1 more. Cllr = 0 indicates perfection while Cllr = 1 indicates an uninformative system. However, beyond this, what constitutes a "good" Cllr is unclear. Aiming to provide handles on when a Cllr is "good", we studied 136 publications on (semi-)automated LR systems. Results show Cllr use heavily depends on the field, e.g., being absent in DNA analysis. Despite more publications on automated LR systems over time, the proportion reporting Cllr remains stable. Noticeably, Cllr values lack clear patterns and depend on the area, analysis and dataset. As LR systems become more prevalent, comparing them becomes crucial. This is hampered by different studies using different datasets. We advocate using public benchmark datasets to advance the field.
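The two reference points mentioned in this abstract (Cllr = 0 for perfection, Cllr = 1 for an uninformative system) follow from the commonly used definition of the metric: half the sum of the mean of log2(1 + 1/LR) over same-source comparisons and the mean of log2(1 + LR) over different-source comparisons. The sketch below applies that definition to hypothetical LR values invented for illustration; none of the numbers come from the paper.

```python
import math


def cllr(lrs_same_source, lrs_different_source):
    """Log-likelihood ratio cost for two lists of likelihood ratios."""
    # Same-source comparisons are penalized when the LR is small.
    penalty_ss = sum(math.log2(1.0 + 1.0 / lr) for lr in lrs_same_source)
    penalty_ss /= len(lrs_same_source)
    # Different-source comparisons are penalized when the LR is large.
    penalty_ds = sum(math.log2(1.0 + lr) for lr in lrs_different_source)
    penalty_ds /= len(lrs_different_source)
    return 0.5 * (penalty_ss + penalty_ds)


# Hypothetical systems, for illustration only.
uninformative = cllr([1.0, 1.0], [1.0, 1.0])        # always reports LR = 1
good = cllr([100.0, 1000.0], [0.01, 0.001])         # LRs point the right way
misleading = cllr([0.01, 100.0], [100.0, 0.01])     # half the LRs mislead

print(f"uninformative: {uninformative:.3f}")
print(f"good:          {good:.3f}")
print(f"misleading:    {misleading:.3f}")
```

A system that always reports LR = 1 scores exactly 1, strongly correct LRs push the value toward 0, and strongly misleading LRs can push it above 1.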

Language: English


The influence of perceived difficulty, availability of marks, and examination time on the conclusions of firearms examiners

Keith L. Monson, Erich D. Smith, Eugene M. Peters, et al.

Journal of Forensic Sciences, Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 14, 2025

Concurrent with studies on the accuracy, repeatability, and reproducibility of decisions based on comparisons of fired bullets and cartridge cases, we also collected the opinions of the participating examiners as to the characteristics of the specimens provided and the difficulty of making comparisons. Examiners rated the ease with which they determined every conclusion (easy, average, hard) and estimated qualitatively the amount of visual information available to them in determining a conclusion (limited, some, extensive). Comparisons deemed hard were perceived generally to have somewhat fewer markings conducive for assessment, while those where the markings were limited produced a larger number of inconclusive determinations. Perceived difficulty increased with wider separation in firing order (within or between three defined segments of 700–850 total firings). The repeatability of these qualitative assessments exceeded 60% and their average reproducibility was ~50%. Examination times did not vary significantly when rendering conclusions of identification, elimination, or inconclusive, although identifications of bullets appear to have taken slightly longer than those for cartridge cases. Hard comparisons, and those where markings were limited, were not treated substantially differently from any other types of comparison. No correlation was found between the accuracy of conclusions and the number of comparisons attempted. These results tend to contradict assertions by critics that examiners are tempted to declare inconclusive to save time or to avoid an elimination or identification conclusion, that the specimens are non-representative of casework, or that the results are affected by the degree of examiner participation.

Language: English


Shining a Light on Forensic Black-Box Studies

Kori Khan, Alicia L. Carriquiry

Statistics and Public Policy, Journal Year: 2023, Volume and Issue: 10(1)

Published: May 23, 2023

Forensic science plays a critical role in the United States criminal justice system. For decades, many feature-based fields of forensic science, such as firearm and toolmark identification, developed outside the scientific community's purview. The results of these studies are widely relied on by judges nationwide. However, this reliance is misplaced. Black-box studies to date suffer from inappropriate sampling methods and high rates of missingness. Current black-box studies ignore both problems in arriving at the error rate estimates presented to courts. We explore the impact of each type of limitation using available data from black-box studies and court materials. We show that black-box studies rely on non-representative samples of examiners. Using a case study of a popular ballistics study, we find evidence that these non-representative samples may commit fewer errors than the wider population from which they came. We also find evidence that the missingness in black-box studies is non-ignorable. Using data from a recent latent print study, we show that ignoring this missingness likely results in systematic underestimates of error rates. Finally, we offer concrete steps to overcome these limitations.

Language: English
