Cross entropy and log likelihood ratio cost as performance measures for multi‐conclusion categorical outcomes scales

Eric M. Warren, John C. Handley, H. David Sheets, et al.

Journal of Forensic Sciences, Journal Year: 2024, Volume and Issue: unknown

Published: Dec. 10, 2024

Abstract: The inconclusive category in forensics reporting is the appropriate response in many cases, but it poses challenges in estimating an “error rate”. We discuss the use of a class of information‐theoretic measures related to cross entropy as an alternative set of metrics that allows for performance evaluation of results presented using multi‐category scales. This paper shows how this class of metrics, and in particular the log likelihood ratio cost, which is already in use with forensic methods and in machine learning communities, can be readily adapted to widely used multiple-conclusion scales. Bayesian credible intervals on these metrics can be estimated using numerical methods. An application to published test results is shown. It is demonstrated, using these results, that reducing the number of categories in a proficiency test from five or six to three increases the cross entropy, indicating that the higher number of categories was justified, as they increased the level of agreement with ground truth.

Language: English

Why the post-identification era is long overdue: Commentary on the current controversy over forensic feature comparison as applied to forensic firearms examination
Alex Biedermann, Christophe Champod

The International Journal of Evidence & Proof, Journal Year: 2024, Volume and Issue: unknown

Published: Sept. 5, 2024

In this commentary, we critically review recurring arguments for and against the discipline of forensic feature comparison as applied to firearms examination from various commentators within and outside forensic science. One of the mainstream criticisms that we address, among others, is that the field cannot demonstrate sufficient proficiency and robustness based on empirical (i.e., black-box) studies. While the lack of empirically demonstrated examiner proficiency is a valid concern and a powerful concept in the short term (e.g., in admissibility proceedings), many critics reduce their discussion solely to the need to measure proficiency through error rates. However, the exclusive focus on aggregate expert performance metrics, here referred to as diagnosticism, remains a surface-level perspective. It provides an incomplete account because these metrics do not represent, but are often confused with, the notion of the evidentiary value of findings, i.e., the observations made on examined items in individual cases. We argue that diagnosticism should be contrasted and complemented with selectivity, the diagnostic capacity of observed marks and features on examined items. Forensic scientists should report on selectivity (the probative value of findings) and be probed on their ability to quantify it. By ceasing to express source attribution opinions (identification/individualisation), which are now widely exposed as unscientific, forensic feature comparison disciplines could move further into the long-awaited post-identification era pioneered by other fields such as forensic genetics.

Language: English

Citations

1

Incorrect statistical reasoning in Guyll et al. leads to biased claims about strength of forensic evidence
Michael Rosenblum, Elizabeth T. Chin, Elizabeth L. Ogburn, et al.

Proceedings of the National Academy of Sciences, Journal Year: 2024, Volume and Issue: 121(45)

Published: Oct. 28, 2024

Language: English

Citations

1

More unjustified inferences from limited data in

Richard E. Gutierrez

Law Probability and Risk, Journal Year: 2024, Volume and Issue: 23(1)

Published: Jan. 1, 2024

Abstract: In recent years, multiple scholars have criticized the design of studies exploring the accuracy of firearms examination methods. Rosenblum et al. extend those criticisms to the work of Guyll et al. on practitioner performance when comparing fired cartridge cases. But while Rosenblum et al. thoroughly dissect issues regarding equiprobability bias and positive predictive values in the Guyll et al. study, they do not delve as deeply into other areas, such as variability in participant performance and the sampling of participants and test samples, that further undercut the ability to generalize Guyll et al.’s results. This commentary extends what Rosenblum et al. began and explores how the low rates of error reported by Guyll et al. likely underestimate the potential for misidentifications in casework. Ultimately, given their convenience samples, the authors should not have gone beyond descriptive statistics to instead draw conclusive inferences that classify firearms examination as “a highly valid forensic technique.”

Language: English

Citations

0

Methodological problems in every black-box study of forensic firearm comparisons
Maria Cuellar, Susan VanderPlas, Amanda Luby, et al.

Law Probability and Risk, Journal Year: 2024, Volume and Issue: 23(1)

Published: Jan. 1, 2024

Abstract: Reviews conducted by the National Academy of Sciences (2009) and the President’s Council of Advisors on Science and Technology (2016) concluded that the field of forensic firearm comparisons has not been demonstrated to be scientifically valid. Scientific validity requires adequately designed studies of examiner performance in terms of accuracy, repeatability, and reproducibility. Researchers have performed “black-box” studies with the goal of estimating these performance measures. As statisticians with expertise in experimental design, we conducted a literature search of such studies to date and then evaluated the design and statistical analysis methods used in each study. Our conclusion is that all studies in our literature search have methodological flaws that are so grave that they render the studies invalid, that is, incapable of establishing the scientific validity of firearms examination. Notably, error rates among firearms examiners, both collectively and individually, remain unknown. Therefore, statements about the common origin of bullets or cartridge cases that are based on examination of “individual” characteristics do not have a scientific basis. We provide some recommendations for future studies.

Language: English

Citations

0
