Using Psychophysics and Signal Detection Theory to Improve Eyewitness Testimony
Post by Cody Walters
What’s the science?
Eyewitness testimony plays a major role in the criminal justice system, yet 70% of people who are wrongfully convicted are imprisoned in large part because of witness misidentification. Although some research has sought ways to improve eyewitness performance, traditional police lineups remain prone to various forms of decisional bias. This week in Nature Communications, Gepshtein and colleagues explored a more nuanced lineup methodology that avoids the pitfalls associated with traditional lineups and provides finer-grained metrics for gauging eyewitness reliability.
How did they do it?
Law enforcement has traditionally used simultaneous and sequential lineups, in which the witness is presented with all the suspects at once or one at a time, respectively. The authors introduced a third lineup type: the paired comparison design. Each participant watched a brief movie clip depicting a crime being committed. The following day, participants were shown the same six faces and told that the perpetrator (called the ‘target’) may or may not be present in the lineup. During the paired comparison lineup, participants viewed two suspect photographs at a time and were asked to indicate which of the two more closely resembled the perpetrator. This method of relative judgment between two stimuli is a well-established technique in the field of psychophysics (i.e., the study of the relationship between physical stimuli and perception). The authors then fit a line to the resulting voting data to construct a voting function, where the slope of the line was used to quantify voting consistency: a slope of zero would indicate that every face in the lineup was voted for equally often (low voting consistency), whereas a steep slope would indicate that some faces were voted for far more often than others (high voting consistency).
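To make the idea of a voting function concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' analysis: the function names (`tally_votes`, `voting_slope`), the choice to sort faces by vote count before fitting, and the use of an ordinary least-squares line are all assumptions made for this example, and the votes are invented.

```python
from itertools import combinations

def tally_votes(n_faces, choices):
    """Count how often each face was picked.

    `choices` maps each pair of face indices (i, j) to the index of the
    face the participant said looked more like the perpetrator.
    """
    votes = [0] * n_faces
    for _pair, winner in choices.items():
        votes[winner] += 1
    return votes

def voting_slope(votes):
    """Steepness of a least-squares line fit to votes sorted high to low.

    Assumption for this sketch: faces are ranked by vote count and the
    line is fit to vote count against rank. Zero means every face got the
    same number of votes; larger values mean more uneven voting.
    """
    ys = sorted(votes, reverse=True)
    xs = list(range(len(ys)))
    n = len(ys)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return abs(sxy / sxx)

# Invented example: six faces, all 15 pairs shown once, and a perfectly
# consistent participant who always prefers the lower-numbered face.
consistent_choices = {pair: min(pair) for pair in combinations(range(6), 2)}
consistent_votes = tally_votes(6, consistent_choices)

# Invented flat tally: every face chosen equally often across rounds.
flat_votes = [5, 5, 5, 5, 5, 5]

print(consistent_votes, voting_slope(consistent_votes))
print(flat_votes, voting_slope(flat_votes))
```

In this toy case the consistent participant produces a steep voting function and the flat tally produces a slope of zero, which mirrors the low-versus-high consistency contrast described above.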
Lastly, the authors used receiver operating characteristic (ROC) curves to quantify the voting distribution data. To provide a simple example of how to construct an ROC curve, assume you have two partially overlapping distributions representing votes for face 1 (the perpetrator) and votes for face 2 (a filler), with the voting score represented along the x-axis. You can set an arbitrary cutoff (i.e., a decision criterion) between those two distributions, meaning that a classifier will treat every vote score to the right of that cutoff as belonging to the rightmost category. However, since the distributions overlap, there will necessarily be misclassifications. Multiple such cutoffs can be placed along the two distributions, and for each cutoff the hit rate (the proportion of perpetrator scores correctly classified) is plotted against the false-alarm rate (the proportion of filler scores misclassified as the perpetrator) to form a curve. The area under the ROC curve can then be used as a metric for classification accuracy, with 0.5 corresponding to chance and 1 to perfect classification.
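The ROC construction described above can be sketched in a few lines of Python. This is a generic illustration of the technique, not the authors' code: the function names `roc_points` and `area_under_curve` and the two small score lists are made up for the example, and the area is computed with the trapezoid rule.

```python
def roc_points(target_scores, filler_scores):
    """Sweep a cutoff across all observed scores, from high to low.

    At each cutoff, scores at or above it are classified as 'target'.
    Returns a list of (false_alarm_rate, hit_rate) points from (0, 0)
    to (1, 1).
    """
    cutoffs = sorted(set(target_scores) | set(filler_scores), reverse=True)
    points = [(0.0, 0.0)]
    for cutoff in cutoffs:
        hit_rate = sum(s >= cutoff for s in target_scores) / len(target_scores)
        fa_rate = sum(s >= cutoff for s in filler_scores) / len(filler_scores)
        points.append((fa_rate, hit_rate))
    return points

def area_under_curve(points):
    """Trapezoid-rule area under the ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Invented, partially overlapping vote scores for illustration only.
target_scores = [4, 5, 5, 3]   # votes for the perpetrator's face
filler_scores = [1, 2, 3, 2]   # votes for a filler face

points = roc_points(target_scores, filler_scores)
print(points)
print(area_under_curve(points))
```

Because the two toy distributions overlap only at a score of 3, the area comes out close to, but below, 1. Feeding in two identical distributions gives an area of 0.5, the chance level.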
What did they find?
The authors found that the paired comparison lineup resulted in the same rate of target identification as traditional lineups, with the added benefit of a lower lineup rejection rate (the rate at which participants decline to select any suspect). An additional advantage of the paired comparison design is that it provides access to information about the consistency with which each face is selected as well as the strength of participants’ recognition memory (i.e., the degree to which a given face matches their memory of the perpetrator). This is because the consistency of a participant’s votes over multiple rounds of paired comparisons is inversely proportional to the variance of the recognition memory.
The authors plotted the average voting scores for each lineup face separately for subjects whose highest voting score correctly identified the perpetrator and for subjects whose highest voting score did not. Unsurprisingly, among participants who correctly identified the perpetrator, the voting score distribution for the perpetrator was significantly above the distributions for the other lineup faces. Counterintuitively, however, the same was true among subjects who incorrectly identified the perpetrator. This result is explained by the consistency with which subjects voted: though the highest voting scores among the incorrect subjects were for non-target faces, these subjects often ranked the target face as their second choice. This unique feature of the paired comparison lineup provides access to the hidden structure of recognition memory, allowing experimenters to infer the target face from aggregate voting data even when it was not the top-ranked face for individual subjects.
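A toy Python example shows how a target can emerge from aggregate votes even when no individual subject ranks it first. The numbers below are invented purely for illustration and are not data from the study; the helper names `mean_scores` and `top_face` are likewise hypothetical.

```python
def mean_scores(per_subject_votes):
    """Average the vote count for each face across subjects."""
    n_subjects = len(per_subject_votes)
    return [sum(face_votes) / n_subjects for face_votes in zip(*per_subject_votes)]

def top_face(scores):
    """Index of the face with the highest score."""
    return max(range(len(scores)), key=scores.__getitem__)

# Invented votes for six faces (face 0 is the target). Each subject's
# top pick is a different non-target face, with the target second.
subjects = [
    [4, 5, 3, 2, 1, 0],  # top pick: face 1
    [4, 2, 5, 3, 1, 0],  # top pick: face 2
    [4, 1, 2, 5, 3, 0],  # top pick: face 3
]

individual_picks = [top_face(votes) for votes in subjects]
aggregate_pick = top_face(mean_scores(subjects))

print(individual_picks)  # each subject's top-ranked face
print(aggregate_pick)    # top-ranked face after averaging
```

Each subject's errors point at a different filler, so they wash out in the average, while the consistently high (second-place) votes for the target add up.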
The authors also generated ROC curves from simultaneous and sequential lineup data and showed that the paired comparison ROC curve matches (and potentially outperforms) them in classification accuracy, without the bias that comes from subjects having to make a definitive identification. Importantly, the paired comparison lineup method allows experimenters to create ROC curves for individual subjects, an option that is unavailable with traditional lineups.
What’s the impact?
The authors studied a novel lineup design that leverages principles from psychophysics and analyses from signal detection theory. This new approach provides a method for determining the strength of either individual or aggregate eyewitness recognition memory in a probabilistic manner, an improvement on existing methods that require a definitive decision criterion. This work has the potential to pave the way for a more effective, science-based approach to eyewitness testimony.
Gepshtein et al. A perceptual scaling approach to eyewitness identification. Nature Communications (2020).