
🤖 Unfake Classification Model for Spoofing Detection


Unfake-Official/classifier


🤖 Classifier

python tensorflow keras

About • Getting Started • Training • Inference • Contribute

📌 About the project

Artificial intelligence (A.I.) tools are becoming increasingly efficient and globally accessible. However, some of these technologies can be harmful when used maliciously, and deepfakes are one example. Deepfakes are a type of synthetic media that generates realistic content and can clone an individual's identity, which can then be used to spread fake news, damage reputations, and enable fraud and security breaches. There is therefore a need for ways to verify whether a piece of media is real or artificially synthesized. Although technologies exist that address this need, detecting audio deepfakes remains a challenge: existing detection is less effective for speech in Portuguese, and its effectiveness on noisy audio is questionable. Unfake therefore aims to develop an A.I. model capable of identifying whether an audio clip contains human or synthetic speech. We hope this will allow lay users to identify deepfakes robustly and effectively, contributing to a safer and more reliable digital environment, and will encourage future research in the area using the data obtained in the project.


🚀 Getting started

Prerequisites

Here is a list of all prerequisites necessary for running the project locally:

Cloning

git clone https://github.com/Unfake-Official/classifier.git

Starting

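First, create a virtual environment and activate it (the activation command below is for Windows; on Linux or macOS, use source .venv/bin/activate instead):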

python -m venv .venv
.venv/Scripts/activate

Next, install all dependencies:

pip install -r requirements.txt

Training

If you want to use our preprocessed dataset for training, download it from the portufake repository and unzip it.

To train the model with your own data instead, create a folder containing two subdirectories named:

  • real: contains real speaker recording spectrograms
  • fake: contains audio deepfake spectrograms
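Before training, it can help to confirm that a folder matches this layout. The following is a minimal sketch and not part of the repository: the helper name `check_dataset_layout` is hypothetical, and it assumes the spectrograms are stored as `.png`, `.jpg` or `.jpeg` files.

```python
from pathlib import Path

# Assumption: spectrograms are stored as common image files.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def check_dataset_layout(dataset_path):
    """Return the number of image files found in real/ and fake/.

    Raises FileNotFoundError if either subdirectory is missing.
    """
    root = Path(dataset_path)
    counts = {}
    for label in ("real", "fake"):
        class_dir = root / label
        if not class_dir.is_dir():
            raise FileNotFoundError(f"Missing subdirectory: {class_dir}")
        # Count only files whose extension looks like an image.
        counts[label] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

Calling `check_dataset_layout("path/to/dataset")` returns a dictionary with one count per class, which also makes a strong imbalance between real and fake samples easy to spot.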

Then, go to cnn/train.py and change:

  1. EPOCHS, BATCH_SIZE and VALIDATION_SPLIT to the values you want to use as hyperparameters.
  2. IMG_SIZE to the target image size in the format (WIDTH, HEIGHT).
  3. CHECKPOINT_PATH to the model's path, if you want to train from a checkpoint.
  4. METRICS_PATH to the path of the image where the accuracy and loss charts will be plotted.
  5. CSV_PATH to the path of the csv file where the accuracy and loss values will be saved.
  6. DATASET_PATH to the path of the dataset containing the real and fake folders as mentioned above.
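As an illustration, the edited constants might look like the sketch below. Every value here is a hypothetical placeholder rather than the project's default, the metrics and dataset paths are invented for the example, and the `validate_config` helper is not part of cnn/train.py; it only shows the kind of sanity check that can catch a mistyped value before a long training run.

```python
# Hypothetical values for the constants listed above; adjust them to your setup.
EPOCHS = 20
BATCH_SIZE = 32
VALIDATION_SPLIT = 0.2  # fraction of the dataset held out for validation
IMG_SIZE = (256, 256)  # (WIDTH, HEIGHT)
CHECKPOINT_PATH = "cnn/checkpoints/unfake"
METRICS_PATH = "metrics/metrics.png"  # assumed location
CSV_PATH = "metrics/metrics.csv"  # assumed location
DATASET_PATH = "dataset"  # folder containing real/ and fake/


def validate_config(epochs, batch_size, validation_split, img_size):
    """Return a list of human-readable problems (empty if none are found)."""
    problems = []
    if epochs < 1:
        problems.append("EPOCHS must be at least 1")
    if batch_size < 1:
        problems.append("BATCH_SIZE must be at least 1")
    if not 0.0 < validation_split < 1.0:
        problems.append("VALIDATION_SPLIT must be strictly between 0 and 1")
    if len(img_size) != 2 or any(side < 1 for side in img_size):
        problems.append("IMG_SIZE must be a (WIDTH, HEIGHT) pair of positive integers")
    return problems


if __name__ == "__main__":
    for problem in validate_config(EPOCHS, BATCH_SIZE, VALIDATION_SPLIT, IMG_SIZE):
        print(problem)
```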

Next, run the file:

python cnn/train.py

The model is saved after each epoch. You can follow the training process in the terminal, which shows status messages and progress bars.

Inference

If you want to use our pretrained model for inference, download it from the unfake repository, unzip it and place it in the following directory: cnn/checkpoints

Now, if you want to use your own model, be aware that the code expects a tf.keras model saved with model.export().

Then, go to cnn/inference.py and change:

  1. CHECKPOINT_PATH to the model's path (example: cnn/checkpoints/unfake)
  2. AUDIO_PATH to the path of the audio you want to classify as a real recording or a deepfake (all common audio formats are accepted, such as .wav, .mp3, .ogg and .flac).

Next, run the file:

python cnn/inference.py

The response will be printed as follows: Your audio is probably <real/false> with <percentage>% confidence.
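To show how a message of this shape could be produced from a model score, here is a minimal sketch. It is not the code in cnn/inference.py: the function name `format_prediction`, the assumption that the model yields a single probability of the audio being fake, the 0.5 threshold, the label strings and the two-decimal formatting are all illustrative choices.

```python
def format_prediction(fake_probability):
    """Build the result message from a probability that the audio is fake.

    Assumption: the model yields one value in [0, 1], and values of 0.5
    or more are treated as "fake".
    """
    if not 0.0 <= fake_probability <= 1.0:
        raise ValueError("fake_probability must be between 0 and 1")
    if fake_probability >= 0.5:
        label, confidence = "fake", fake_probability
    else:
        # Confidence in "real" is the complement of the fake probability.
        label, confidence = "real", 1.0 - fake_probability
    return f"Your audio is probably {label} with {confidence * 100:.2f}% confidence."


if __name__ == "__main__":
    print(format_prediction(0.9))
```

Reporting the complement for the "real" class keeps the printed confidence at 50% or above, so it always refers to the label actually shown.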

📫 Contribute

If you want to contribute to this project, start by creating a branch named as shown below. Then, make your changes and follow the commit patterns. Finally, open a pull request.

  1. git clone https://github.com/Unfake-Official/classifier.git
  2. git checkout -b feature/NAME
  3. Follow commit patterns
  4. Open a Pull Request explaining the problem solved or the feature added. If there are visual changes, attach screenshots, then wait for the review!

Documentations that might help

📝 How to create a Pull Request

💾 Commit pattern