About • Getting Started • Training • Inference • Contribute
The rapid evolution of artificial intelligence (A.I.) tools is making them increasingly efficient and globally accessible. However, some of these technologies can be harmful when used maliciously, and that includes deepfakes. Deepfakes are a type of synthetic media that generates realistic content and can clone an individual's identity, using it to spread fake news, damage reputations and enable fraud and security breaches. There is therefore a need for ways to verify whether a piece of media is real or artificially synthesized. Although technologies that meet this need exist, detecting audio deepfakes remains a challenge: current detectors are less effective on speech in Portuguese and their performance on noisy audio is questionable. Unfake aims to develop an A.I. model capable of identifying whether an audio clip contains human or synthetic speech. In this way, we hope to enable lay users to identify deepfakes in a robust and effective way, contributing to a safer and more reliable digital environment, as well as encouraging future research in the area using the data obtained in the project.
To run the project locally, you need Git and Python installed. Start by cloning the repository:
git clone https://github.com/Unfake-Official/classifier.git
Next, create a virtual environment and activate it:
python -m venv .venv
.venv/Scripts/activate
On Linux or macOS, activate it with source .venv/bin/activate instead.
Next, install all dependencies:
pip install -r requirements.txt
If you want to use our preprocessed dataset for training, download it from the portufake repository and unzip it.
Now, to train the model with your own data, create a folder containing two subdirectories named:
- real: contains spectrograms of real speaker recordings
- fake: contains spectrograms of audio deepfakes
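For reference, here is a minimal sketch of how you might sanity-check that layout before training. This script is not part of the repository; the dataset path and the .png extension are assumptions, so adjust them to your setup:

```python
# check_dataset.py - hypothetical helper, not shipped with the project.
from pathlib import Path

DATASET_PATH = Path("data/my_dataset")  # assumed location of your dataset folder

for label in ("real", "fake"):
    folder = DATASET_PATH / label
    # Spectrograms are assumed to be stored as .png images in each class folder.
    images = list(folder.glob("*.png"))
    print(f"{label}: {len(images)} spectrogram(s) in {folder}")
```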
Then, go to cnn/train.py and change:
- EPOCHS, BATCH_SIZE and VALIDATION_SPLIT to the values you want to use as hyperparameters.
- IMG_SIZE to the target image size in the format (WIDTH, HEIGHT).
- CHECKPOINT_PATH to the model's path, if you want to train from a checkpoint.
- METRICS_PATH to the path of the image where the accuracy and loss charts will be plotted.
- CSV_PATH to the path of the CSV file where the accuracy and loss values will be saved.
- DATASET_PATH to the path of the dataset containing the real and fake folders mentioned above.
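As an illustration only, the edited block at the top of cnn/train.py might end up looking like this; all values and paths below are example assumptions, not project defaults:

```python
# Example configuration for cnn/train.py - values and paths are illustrative.
EPOCHS = 30                                   # number of training epochs
BATCH_SIZE = 32                               # samples per batch
VALIDATION_SPLIT = 0.2                        # fraction of data held out for validation
IMG_SIZE = (224, 224)                         # (WIDTH, HEIGHT) of the input spectrograms
CHECKPOINT_PATH = "cnn/checkpoints/unfake"    # set only if resuming from a checkpoint
METRICS_PATH = "cnn/metrics/training.png"     # image with the accuracy and loss charts
CSV_PATH = "cnn/metrics/training.csv"         # CSV with the accuracy and loss values
DATASET_PATH = "data/my_dataset"              # folder containing the real/ and fake/ subfolders
```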
Next, run the file:
python cnn/train.py
The model will be saved after each epoch. You can track training in the terminal, where progress bars and status messages are printed.
If you want to use our pretrained model for inference, download it from the unfake repository, unzip it and place it in the following directory: cnn/checkpoints
Now, if you want to use your own model, be aware that the code expects a tf.keras model saved with model.export().
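If you are unsure what that looks like, here is a minimal sketch of exporting a Keras model in that format. The architecture, input shape and paths are placeholders, and it assumes TensorFlow 2.13+ where model.export() is available:

```python
import tensorflow as tf

# Placeholder architecture - substitute your own trained model.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),       # assumed spectrogram input shape
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),    # real vs. fake probability
])

# ... compile and train the model here ...

# model.export() writes a TensorFlow SavedModel directory,
# which is what the inference script loads.
model.export("cnn/checkpoints/my_model")
```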
Then, go to cnn/inference.py and change:
- CHECKPOINT_PATH to the model's path (example: cnn/checkpoints/unfake).
- AUDIO_PATH to the path of the audio you want to classify as a real recording or a deepfake (all common audio formats are accepted, such as .wav, .mp3, .ogg and .flac).
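As an example, the edited constants in cnn/inference.py could look like this; both paths are illustrative assumptions:

```python
# Example configuration for cnn/inference.py - paths are illustrative.
CHECKPOINT_PATH = "cnn/checkpoints/unfake"   # exported SavedModel directory
AUDIO_PATH = "samples/my_recording.wav"      # audio file you want to classify
```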
Next, run the file:
python cnn/inference.py
The response will be printed as follows:
Your audio is probably <real/fake> with <percentage>% confidence.
If you want to contribute to this project, start by cloning the repository and creating a branch named as shown below. Then, make your changes and follow the commit patterns. Finally, open a pull request.
git clone https://github.com/Unfake-Official/classifier.git
git checkout -b feature/NAME
- Follow commit patterns
- Open a Pull Request explaining the problem solved or the feature added; if applicable, attach screenshots of visual modifications and wait for the review!