This baseline uses existing technologies to create a clean-slate, end-to-end personalized virtual voice platform from only two inputs: a recording of the user's voice and a sample of the target voice. In its current locally-running version, the platform is a script that runs Facebook's Flashlight for automatic speech recognition (ASR) and then SV2TTS to generate the speech in the target voice.
To get started, you will need to set up Flashlight and SV2TTS before running `pipeline_script.sh`.
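For orientation, here is a minimal sketch of what such a two-stage pipeline can look like. Every name in it (the Docker image, the Flashlight binary path, and the SV2TTS wrapper script and its flags) is a placeholder; it is not the contents of the actual `pipeline_script.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the two-stage pipeline; the real pipeline_script.sh
# may differ. Binary names, flags, and scripts below are placeholders.
set -euo pipefail

USER_AUDIO="$1"     # recording of the user's speech (what to say)
TARGET_VOICE="$2"   # sample of the target voice (how to say it)

# Stage 1: transcribe the user's speech with Flashlight ASR inside Docker.
# Substitute Flashlight's actual inference binary and flags from its docs.
docker run --rm -v "$PWD:/data" flml/flashlight \
    /path/to/flashlight_asr_inference "/data/$USER_AUDIO" > transcript.txt

# Stage 2: feed the transcript and the target voice to SV2TTS.
# The stock toolbox's demo_cli.py is interactive; a scripted wrapper is assumed.
python sv2tts_synthesize.py --text "$(cat transcript.txt)" \
    --voice "$TARGET_VOICE" --out cloned_output.wav
```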
- This project runs Flashlight with Docker. Please make sure you have Docker installed and running before using the baseline; a quick sanity check is shown after this list.
- Follow the instructions in the overview to install the example trained models from AWS S3. You do not need the LibriSpeech audio samples for this pipeline, but you may choose to download them as well for testing.
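As a sanity check for the Docker requirement above, something like the following works. The `flml/flashlight` image name and tag are assumptions based on Flashlight's published Docker images, so check Flashlight's documentation for the current tag.

```bash
# Confirm the Docker daemon is reachable (this fails if Docker is not running).
docker info > /dev/null && echo "Docker is up"

# Pull a Flashlight image (tag assumed; see Flashlight's docs for current tags).
docker pull flml/flashlight:cpu-latest
```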
Python 3.6 or 3.7 is needed to run the toolbox.
- Install PyTorch (>=1.0.1).
- Install ffmpeg.
- Run `pip install -r requirements.txt` to install the remaining necessary packages (example commands for these steps follow the list).
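Concretely, on a Debian-like system the steps above might look like the following. The exact PyTorch install command depends on your Python and CUDA setup (see pytorch.org), so treat this as a sketch rather than a canonical recipe.

```bash
# Check that you are on a supported Python version (3.6 or 3.7).
python --version

# Install PyTorch (>=1.0.1); pick the command matching your platform at pytorch.org.
pip install torch

# Install ffmpeg (Debian/Ubuntu shown; on macOS, `brew install ffmpeg`).
sudo apt-get install -y ffmpeg

# Install the remaining Python dependencies.
pip install -r requirements.txt
```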
Download the latest pretrained models here.
By the end of the installation process, you should have a model folder in your local repo.
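As a quick check that the download landed, you can list that folder; the name `model` is taken from the sentence above and may differ in your checkout.

```bash
# Verify the pretrained models are in place (folder name assumed).
ls model
```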
Before you download any dataset, you can begin by testing your configuration with:

```
python demo_cli.py
```

If all tests pass, you're good to go.
For playing with the toolbox alone, I only recommend downloading `LibriSpeech/train-clean-100`. Extract the contents as `<datasets_root>/LibriSpeech/train-clean-100`, where `<datasets_root>` is a directory of your choosing. Other datasets are supported in the toolbox; see here. You're free not to download any dataset, but then you will need your own data as audio files, or you will have to record it with the toolbox.
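For example, assuming `wget` and the official OpenSLR mirror (URL an assumption you should verify), downloading and extracting could look like:

```bash
# Download LibriSpeech train-clean-100 from OpenSLR (URL assumed current; ~6 GB).
wget https://www.openslr.org/resources/12/train-clean-100.tar.gz

# Extract under your chosen <datasets_root>; the archive already contains
# the LibriSpeech/train-clean-100 directory structure.
tar -xzf train-clean-100.tar.gz -C <datasets_root>
```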
You can then try the toolbox:

```
python demo_toolbox.py -d <datasets_root>
```

or

```
python demo_toolbox.py
```

depending on whether you downloaded any datasets. If you are running an X-server or if you have the error `Aborted (core dumped)`, see this issue.