Date of Award
6-10-2025
Document Type
Thesis
Publisher
Santa Clara : Santa Clara University, 2025
Department
Computer Science and Engineering
First Advisor
David C. Anastasiu
Abstract
From elections to wars to people’s personal lives, deepfakes have become ever more present, blurring the distinction between reality and fiction. Many deep learning-based deepfake detection methods lack generalizability. They often overfit to the deepfake generation techniques in their training distribution and fail to detect deepfakes produced by techniques outside of it. One approach to the generalization problem has been to use vision-language models, such as the pretrained CLIP model, to extract features that generalize across different deepfake generation techniques, including diffusion models and GANs. However, most CLIP approaches only consider image-level features for detection, such as artifacts from blending or generative models, rather than video-level features such as flickering or temporal inconsistencies. The importance of capturing both spatial and temporal information is demonstrated by two state-of-the-art detection methods: AltFreezing, which uses a 3D CNN and an alternating weight freezing strategy to train both spatial and temporal weights; and TALL, which uses a Swin Transformer and a 2D thumbnail layout to capture spatial information within frames and temporal information across consecutive frames. To compete with the state of the art in video deepfake detection, we propose two methods to integrate temporal information with the spatial information that CLIP is already able to capture. The first is to combine CLIP with a transformer trained from scratch. The second is to combine CLIP with TimeSformer. We furthermore investigate X-CLIP, a variant of CLIP with video understanding capabilities, and the addition of its multiframe integration transformer for deepfake detection.
Recommended Citation
Tong, Timothy and Lucas, Abem, "Deepfake Detection" (2025). Computer Science and Engineering Senior Theses. 318.
https://scholarcommons.scu.edu/cseng_senior/318
