Implemented a Deep Learning Model to detect Real and fake Images.
I have implemented deepfake image classification using Progan dataset. It involves reading and preprocessing images from a ProGAN dataset, extracting features using the CLIP (Contrastive Language-Image Pre-training) model, visualizing feature embeddings, and implementing image classification using nearest neighbor and linear classification approaches.
- Data Reading
- Preprocessing
- Feature Extraction
- T-SNE Plot
- Nearest Neighbor Approach
- Linear Classification Approach
- Accuracy
- Reading data from ProGan (Real and Fake Images) for Training Phase
- It iterate over each file and Reads the image using cv2.imread()
- It checks if the image is in RGB Format and appends the image if it satisfies this criterion
- Now we Resize the image into (224,224,3) dimensions
-
- JPEG Compression : It converts the format of image to .jpg with 80%
-
- Gaussian Blur : It applies a gaussian blur filter with kernel size=(5,5) and std dev=0 to half of the images using np.random.rand()<0.5
Effect of preprocessing on some random image
Features from the images are extracted using the CLIP (Contrastive Language-Image Pre-training) model. The features are then saved for further analysis and classification tasks.
Funcion:
- It finds the nearest neighbor to a given test feature vector among a set of feature vectors using cosine similarity.
Parameters:
- test: The test feature vector for which the nearest neighbor is to be found.
- features: A 2D array containing feature vectors among which to find the nearest neighbor.
Returns:
- nn_sim: The cosine similarity between the test feature vector and its nearest neighbor.
Testing Phase
- After training on ProGan now I am testing on Glide-100 and Laion
- The test.pt file is loaded and converted to a NumPy array (which contains Images from Glide and Laion)
- A loop iterates over each feature vector in features_test.
- For each feature vector, the nearest neighbors are found in both the real and fake feature sets using the find_nn() function.
- That class which has less cosine similarity which means it is more closer to test image is assigned to the test image
For Linear Classification, I am training a neural network using TensorFlow's Keras API (FCNN network)
-
Data Preparation:
features_real
andfeatures_fake
are concatenated intoX_train
and corresponding labelsY_train
are defined. Then both are converted into numpy arrays -
Hyperparameters:
- Batch Size: 30 (as instructed in the research paper)
- Epochs: 7 (The model acheives a training accuracy of 1 so trained on less no of epochs)
-
Model Structure: It consists of two fully connected layers:
- First layer has 768 units and uses the ReLU activation function.
- Output layer has 1 unit with a sigmoid activation function, for binary classification problems.
-
Model Compilation: The model is compiled using the Adam optimizer and binary crossentropy loss function.
- Used Adam as optimizer
- Used Binary crossentropy loss as Loss Function
- I had trained on Progan image set
- Nearest Neighbor Method
Dataset | Accuracy | SOTA |
---|---|---|
Glide-100 and Laion | 75.3% | 87% |
- Linear Classification
Dataset | Accuracy | SOTA |
---|---|---|
CycleGan | 97.3% | 98.5% |
Dalle and Laion | 92.9% | 86.78% |
Glide-100 and Laion | 86.9% | 79% |
If the detected image is Fake, then it detects the source of the Fake image (General Adversive Networks (GAN) or Diffusion Models (DM)).
The model takes in the weights of a pretrained RESNET-50 model and then it fine-tunes those weights for our task by freezing the starting layers.
Level-1: Achieves 87% accuracy to detect real and Fake Images (It was trained on Dalle Fake images and Laiom Real images and tested on Glide fake images).
Level-2: The model trained for detecting the source of Fake image achieves a Testing Accuracy of 93%.
Diffusion Models: Link
GAN's: Link
You can load the .pt files from the Data
Folder
The model created for Linear Classification in Paper4 is present with named model_lc.h5