Using a Framework to Evaluate the Performance of Explainable AIs on Deepfake Detection Models

Session Number

CMPS 22

Advisor(s)

Dr. Yan Yan

Mr. Junyi Wu, Illinois Institute of Technology

Discipline

Computer Science

Start Date

17-4-2024 10:25 AM

End Date

17-4-2024 10:40 AM

Abstract

As the number of deepfake image generators has skyrocketed, so has the number of deepfake detection models. However, these detection models are black boxes, and their explainability, which is key to building trust with users, remains underexplored. To address this issue, we studied how humans can understand a model's justification for its decisions by integrating explainable AI (XAI) methods with a deepfake detection model. We tested the model on deepfaked images of human faces using two XAI methods: Gradient-weighted Class Activation Mapping (Grad-CAM) and Local Interpretable Model-Agnostic Explanations (LIME). We observed that the XAI methods commonly identified deepfakes by picking up on lighting inconsistencies in the fake images. Furthermore, we wanted to determine which XAI method offers users clearer insight into a deepfake detection model's outputs. To this end, we developed a framework to evaluate the two widely used XAI methods, Grad-CAM and LIME, focusing on interpretability, explainability, clarity, and distraction. Based on this evaluation, we found that Grad-CAM yields more effective explanations than LIME.
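The sketch below illustrates the kind of integration the abstract describes: attaching Grad-CAM and LIME to an image classifier and extracting both explanations for one face image. It is a minimal illustration under stated assumptions, not the authors' implementation: the actual deepfake detector is not shown here, so a torchvision ResNet-18 stands in as a hypothetical backbone, and the target layer, sample counts, and the synthetic input image are placeholders.

```python
import numpy as np
import torch
import torch.nn.functional as F
from torchvision import models, transforms
from lime import lime_image

# Stand-in classifier: the paper's detector is not reproduced here, so an
# ImageNet ResNet-18 serves purely as a placeholder backbone.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
target_layer = model.layer4[-1]  # last convolutional block (assumed target layer)

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def grad_cam(img_tensor, class_idx):
    """Minimal Grad-CAM: weight the target layer's activations by the
    spatially averaged gradients of the chosen class score."""
    acts, grads = [], []
    h1 = target_layer.register_forward_hook(lambda m, i, o: acts.append(o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    score = model(img_tensor.unsqueeze(0))[0, class_idx]
    model.zero_grad()
    score.backward()
    h1.remove(); h2.remove()
    weights = grads[0].mean(dim=(2, 3), keepdim=True)           # pooled gradients per channel
    cam = F.relu((weights * acts[0]).sum(dim=1, keepdim=True))  # weighted activation map
    cam = F.interpolate(cam, size=img_tensor.shape[1:], mode="bilinear",
                        align_corners=False)
    return (cam / (cam.max() + 1e-8)).squeeze().detach().numpy()  # heatmap in [0, 1]

def predict_fn(images):
    """Batch prediction in the (N, H, W, C) -> class-probability form LIME expects."""
    batch = torch.stack([preprocess(img.astype(np.uint8)) for img in images])
    with torch.no_grad():
        return torch.softmax(model(batch), dim=1).numpy()

if __name__ == "__main__":
    # Hypothetical input: replace with a real face crop, e.g. np.array(Image.open("face.png")).
    face = (np.random.rand(224, 224, 3) * 255).astype(np.uint8)

    explainer = lime_image.LimeImageExplainer()
    explanation = explainer.explain_instance(face, predict_fn,
                                             top_labels=1, num_samples=200)
    _, lime_mask = explanation.get_image_and_mask(explanation.top_labels[0],
                                                  positive_only=True, num_features=5)

    cam_map = grad_cam(preprocess(face), class_idx=explanation.top_labels[0])
    print("Grad-CAM heatmap:", cam_map.shape, "LIME mask:", lime_mask.shape)
```

The two methods differ in mechanism: Grad-CAM reuses the network's own gradients and convolutional activations to produce a class-specific heatmap, while LIME perturbs superpixels of the input and fits a local surrogate model, which is the contrast the evaluation framework's interpretability, clarity, and distraction criteria are designed to compare.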
