Detecting Fake Videos With AI and Human Judgment

Post by Lina Teichmann

The takeaway

Videos manipulated with neural networks can make it very difficult to distinguish fiction from reality. These ‘deepfakes’ can be used to spread false information and therefore pose a serious challenge to society. Combining artificial intelligence (AI) models with human judgment detects deepfakes more accurately than either models or human judgment alone.

What's the science?

Machine-manipulated videos called ‘deepfakes’ can have harmful consequences when they go undetected. Leading AI models can be used to detect fake videos; however, in a large deepfake-detection competition, the leading model achieved an accuracy of only 65%, where chance is 50%. This week in PNAS, Groh and colleagues examined how humans, AI models, and the combination of the two perform at deepfake detection. They investigated the error patterns and the specific strengths and weaknesses of AI models and large groups of humans when distinguishing real from fake videos.
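
To put those numbers in context: deepfake detection is a binary (real vs. fake) classification task, so a detector that guesses at random is expected to be right about half the time. Here is a minimal sketch with made-up labels, not the competition data:

```python
import random

# Hypothetical ground-truth labels: 1 = fake, 0 = real (invented for illustration).
labels = [random.randint(0, 1) for _ in range(1000)]

# A random guesser lands near 50% accuracy, the chance level for two classes.
guesses = [random.randint(0, 1) for _ in labels]
chance_acc = sum(g == y for g, y in zip(guesses, labels)) / len(labels)

print(f"random guessing: {chance_acc:.0%}")  # ~50%; the competition's leading model reached 65%
```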

How did they do it?

The authors ran two separate experiments. In Experiment 1, a deepfake video was shown side by side with its authentic counterpart, and participants had to indicate which one was fake. In Experiment 2, participants viewed a single video and rated how confident they were that it was real or fake. In this experiment, participants could also see the AI model’s prediction and update their confidence rating accordingly. Both experiments included randomized manipulations that either evoked a specific emotion or obstructed facial features in the video. These manipulations tested whether incidental emotion affects deepfake detection and whether humans’ specialized face-processing abilities influence their performance.
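
As a rough illustration of the Experiment 2 setup, the sketch below has a participant report a confidence that a video is fake, see the model’s predicted probability, and revise toward it. The fixed blending weight is an assumption made here for illustration; participants in the study were free to update their ratings however they saw fit.

```python
def revise_confidence(human_conf: float, model_prob: float, weight: float = 0.5) -> float:
    """Blend a participant's confidence that a video is fake (0-1) with the
    model's predicted probability. The fixed weight is a toy assumption;
    participants in the study revised their ratings freely."""
    return (1 - weight) * human_conf + weight * model_prob

# The participant leans "real" (0.2), but the model says "likely fake" (0.9),
# so the revised rating moves toward the model's prediction.
print(revise_confidence(0.2, 0.9))  # 0.55
```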

What did they find?

The leading AI model correctly identified 80% of a test set of videos as either real or fake. This was similar to the accuracy of the participants’ responses averaged per video, although there was some variance between individuals. While the human crowd and the model achieved similar overall accuracy, the types of mistakes they made differed. Participants who had access to the model’s predictions outperformed both the model and participants who did not see those predictions. However, inaccurate model predictions had a negative impact on humans’ ability to detect deepfakes. There was some evidence that eliciting anger in participants lowered their ability to detect deepfake videos, highlighting why deepfakes on social media may be even more problematic. The results also show that face processing is critical for people to distinguish fake from real videos.
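
One way to read “responses averaged per video” is as crowd wisdom: pool many individual judgments of the same clip and score the majority view against the ground truth. A minimal sketch with invented responses, not the study’s data:

```python
# Each entry: one video's ground-truth label plus several participants' guesses
# (1 = "fake", 0 = "real"). All values are invented for illustration.
videos = [
    {"label": 1, "responses": [1, 1, 0, 1, 1]},
    {"label": 0, "responses": [0, 1, 0, 0, 0]},
    {"label": 1, "responses": [0, 1, 1, 0, 1]},
]

# Average each video's responses and call the video "fake" if the mean exceeds 0.5.
correct = 0
for v in videos:
    crowd_says_fake = sum(v["responses"]) / len(v["responses"]) > 0.5
    correct += crowd_says_fake == (v["label"] == 1)

print(f"crowd accuracy: {correct / len(videos):.0%}")
```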

What's the impact?

The emergence of neural networks in computer vision has created many new opportunities, such as improving medical diagnoses and enhancing the accuracy of forensic examinations. On the flip side, neural networks can generate pictures and videos that look authentic but are fake. Classifying machine-manipulated videos as fake is critical to fighting the spread of misinformation. Overall, this study highlights that groups of humans are just as capable of detecting deepfakes as leading AI models. Groh and colleagues illuminate how we can best combine the strengths of AI models and humans to flag deepfakes and ultimately overcome a massive challenge for society at large.