About • Getting Started • Training • Inference • Contribute
The rapid evolution of artificial intelligence (A.I.) tools is making them increasingly efficient and globally accessible. However, some of these technologies can be harmful when used maliciously, and that includes deepfakes. Deepfakes are a type of synthetic media that generates realistic content and can clone an individual's identity, using it to spread fake news, damage reputations and enable fraud and security breaches. There is therefore a need for ways to verify whether a piece of media is real or artificially synthesized. Although technologies that meet this need exist, detecting audio deepfakes remains a challenge: current detectors are less effective on speech in Portuguese and their performance on noisy audio is questionable. Unfake aims to develop an A.I. model capable of identifying whether an audio clip contains human or synthetic speech. In this way, we hope to enable lay users to identify deepfakes in a robust and effective way, contributing to a safer and more reliable digital environment, as well as encouraging future research in the area using the data obtained in the project.
To run the project locally, you need Git and Python installed. Start by cloning the repository:
git clone https://github.com/Unfake-Official/classifier.git
Next, create a virtual environment and activate it:
python -m venv .venv
.venv/Scripts/activate
On Linux or macOS, activate it with source .venv/bin/activate instead.
Next, install all dependencies:
pip install -r requirements.txt
If you want to use our preprocessed dataset for training, download it from the portufake repository and unzip it.
Now, to train the model with your own data, create a folder containing two subdirectories named:
- real: contains spectrograms of real speaker recordings
- fake: contains spectrograms of audio deepfakes
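For reference, here is a minimal sketch of how you might sanity-check that layout before training. This script is not part of the repository; the dataset path and the .png extension are assumptions, so adjust them to your setup:

```python
# check_dataset.py - hypothetical helper, not shipped with the project.
from pathlib import Path

DATASET_PATH = Path("data/my_dataset")  # assumed location of your dataset folder

for label in ("real", "fake"):
    folder = DATASET_PATH / label
    # Spectrograms are assumed to be stored as .png images in each class folder.
    images = list(folder.glob("*.png"))
    print(f"{label}: {len(images)} spectrogram(s) in {folder}")
```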
Then, go to cnn/train.py and change:
- EPOCHS, BATCH_SIZE and VALIDATION_SPLIT to the values you want to use as hyperparameters.
- IMG_SIZE to the target image size in the format (WIDTH, HEIGHT).
- CHECKPOINT_PATH to the model's path, if you want to train from a checkpoint.
- METRICS_PATH to the path of the image where the accuracy and loss charts will be plotted.
- CSV_PATH to the path of the CSV file where the accuracy and loss values will be saved.
- DATASET_PATH to the path of the dataset containing the real and fake folders mentioned above.
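As an illustration only, the edited block at the top of cnn/train.py might end up looking like this; all values and paths below are example assumptions, not project defaults:

```python
# Example configuration for cnn/train.py - values and paths are illustrative.
EPOCHS = 30                                   # number of training epochs
BATCH_SIZE = 32                               # samples per batch
VALIDATION_SPLIT = 0.2                        # fraction of data held out for validation
IMG_SIZE = (224, 224)                         # (WIDTH, HEIGHT) of the input spectrograms
CHECKPOINT_PATH = "cnn/checkpoints/unfake"    # set only if resuming from a checkpoint
METRICS_PATH = "cnn/metrics/training.png"     # image with the accuracy and loss charts
CSV_PATH = "cnn/metrics/training.csv"         # CSV with the accuracy and loss values
DATASET_PATH = "data/my_dataset"              # folder containing the real/ and fake/ subfolders
```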
Next, run the file:
python cnn/train.py
The model will be saved after each epoch. You can track training in the terminal, where progress bars and status messages are printed.
If you want to use our pretrained model for inference, download it from the unfake repository, unzip it and place it in the following directory: cnn/checkpoints
Now, if you want to use your own model, be aware that the code expects a tf.keras model saved with model.export().
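If you are unsure what that looks like, here is a minimal sketch of exporting a Keras model in that format. The architecture, input shape and paths are placeholders, and it assumes TensorFlow 2.13+ where model.export() is available:

```python
import tensorflow as tf

# Placeholder architecture - substitute your own trained model.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),       # assumed spectrogram input shape
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),    # real vs. fake probability
])

# ... compile and train the model here ...

# model.export() writes a TensorFlow SavedModel directory,
# which is what the inference script loads.
model.export("cnn/checkpoints/my_model")
```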
Then, go to cnn/inference.py and change:
- CHECKPOINT_PATH to the model's path (example: cnn/checkpoints/unfake).
- AUDIO_PATH to the path of the audio you want to classify as a real recording or a deepfake (all common audio formats are accepted, such as .wav, .mp3, .ogg and .flac).
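As an example, the edited constants in cnn/inference.py could look like this; both paths are illustrative assumptions:

```python
# Example configuration for cnn/inference.py - paths are illustrative.
CHECKPOINT_PATH = "cnn/checkpoints/unfake"   # exported SavedModel directory
AUDIO_PATH = "samples/my_recording.wav"      # audio file you want to classify
```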
Next, run the file:
python cnn/inference.py
The response will be printed as follows:
Your audio is probably <real/fake> with <percentage>% confidence.
If you want to contribute to this project, start by cloning the repository and creating a branch named as shown below. Then, make your changes and follow the commit patterns. Finally, open a pull request.
git clone https://github.com/Unfake-Official/classifier.git
git checkout -b feature/NAME
- Follow commit patterns
- Open a Pull Request explaining the problem solved or the feature added; if applicable, attach screenshots of visual modifications and wait for the review!