Using a Framework to Evaluate the Performance of Explainable AIs on Deepfake Detection Models
Session Number
CMPS 22
Advisor(s)
Dr. Yan Yan
Mr. Junyi Wu, Illinois Institute of Technology
Discipline
Computer Science
Start Date
17-4-2024 10:25 AM
End Date
17-4-2024 10:40 AM
Abstract
As the number of deepfake image generators has skyrocketed, so has the number of deepfake detection models. However, these detection models are black boxes, and explainability, which is key to building trust with users, remains underexplored. To address this issue, we studied how humans can understand a model's justification for its decisions by integrating explainable AI (XAI) methods with a deepfake detection model. We tested the model on deepfaked images of human faces using two XAI methods: Gradient-weighted Class Activation Mapping (Grad-CAM) and Local Interpretable Model-Agnostic Explanations (LIME). We observed that the XAI methods commonly identified deepfakes by picking up on lighting inconsistencies in the faked images. Furthermore, we wanted to understand which XAI method offers users clearer insight into a deepfake detection model's outputs. To this end, we developed a framework to evaluate the two widely used XAI methods, Grad-CAM and LIME, focusing on interpretability, explainability, clarity, and distraction. Based on this evaluation, we found that Grad-CAM yields more effective explanations than LIME.
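The abstract does not specify the detection model or how the two XAI methods were wired to it, so the following is only a minimal illustrative sketch of the general setup it describes: a convolutional classifier explained with Grad-CAM (implemented directly via forward/backward hooks) and with the open-source `lime` package. A stock ResNet-18 stands in for the unspecified deepfake detector; all names and parameters here are assumptions, not the authors' implementation.

```python
# Sketch: applying Grad-CAM and LIME to a CNN image classifier.
# ASSUMPTIONS: PyTorch/torchvision detector (ResNet-18 stand-in), uint8 HxWx3
# face crops, and the `lime` package (pip install lime).
import numpy as np
import torch
import torch.nn.functional as F
from torchvision import models, transforms
from lime import lime_image

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# --- Grad-CAM: hook the last conv block to capture activations and gradients ---
feats, grads = {}, {}
model.layer4.register_forward_hook(lambda m, i, o: feats.update(a=o))
model.layer4.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

def grad_cam(img_np):
    """img_np: HxWx3 uint8 face crop -> HxW heatmap in [0, 1]."""
    x = preprocess(img_np).unsqueeze(0)
    score = model(x).max()                      # logit of the predicted class
    model.zero_grad()
    score.backward()
    w = grads["a"].mean(dim=(2, 3), keepdim=True)          # channel weights
    cam = F.relu((w * feats["a"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=img_np.shape[:2],
                        mode="bilinear", align_corners=False)[0, 0]
    return (cam / (cam.max() + 1e-8)).detach().numpy()

# --- LIME: perturb superpixels and fit a local surrogate around one image ---
def predict_fn(batch_np):
    """Batch of HxWx3 images -> class probabilities, as LIME expects."""
    x = torch.stack([preprocess(img.astype(np.uint8)) for img in batch_np])
    with torch.no_grad():
        return F.softmax(model(x), dim=1).numpy()

def lime_mask(img_np):
    explainer = lime_image.LimeImageExplainer()
    exp = explainer.explain_instance(img_np, predict_fn,
                                     top_labels=1, num_samples=1000)
    _, mask = exp.get_image_and_mask(exp.top_labels[0],
                                     positive_only=True, num_features=5)
    return mask  # superpixel regions that most support the prediction
```

Overlaying `grad_cam(img)` and `lime_mask(img)` on the same face crop is the kind of side-by-side output the evaluation framework would score for interpretability, explainability, clarity, and distraction.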