A text-to-speech command line tool backed by Azure Cognitive Services.

pviotti/sayit

SayIt 📢

SayIt is a cross-platform command line tool that pronounces written text. You can use it to create audio recordings of your text files or to improve your pronunciation in a foreign language.

Setup

SayIt uses Azure Cognitive Services as its backend to provide high-quality audio, so it requires an Azure subscription, which you can get for free. As of 2021, the Azure Cognitive Services free tier includes 5 hours of text-to-speech per month, which is often enough for personal use.

You can download SayIt from the releases section. SayIt is currently distributed both as a self-contained .NET executable (which means you won't need to install the .NET runtime to use it) and as a framework-dependent .NET executable.

On first use, you must run the setup wizard (./sayit --setup) and enter the configuration parameters of your Azure Cognitive Services resource: the subscription key (which you can find in the Azure portal) and the region identifier (listed in the Azure documentation). SayIt stores these parameters in the current user's configuration folder (e.g. ~/.config/ on Linux) as an App Setting XML file.

Usage

$ ./sayit --help
USAGE: sayit [--help] [--version] [--setup] [--list-voices] [--list-formats]
             [--voice <voice>] [--format <format>] [--output <output>] [<input>]

INPUT:

    <input>               the text to be pronounced
                          (if missing, sayit will try to read it from stdin)

OPTIONS:

    --version             print sayit version
    --setup               setup the configuration file
    --list-voices, -lv    list the available voice shorthands
                          with their corresponding voice ids
    --list-formats, -lf   list the available output format shorthands
                          with their corresponding output format ids
    --voice, -v <voice>   the voice shorthand
    --format, -f <format> the audio output format shorthand
    --output, -o <output> the path of the output file
    --help                display this list of options.
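
The options above can be combined in a few common ways. The following is a minimal sketch, not part of SayIt itself: the function names `say_text`, `say_file` and `save_speech` and the `SAYIT` variable are hypothetical helpers introduced here for illustration, and they use only the flags shown in the help output. The voice and format arguments are shorthands, which you would look up first with `./sayit --list-voices` and `./sayit --list-formats`.

```shell
# Path to the sayit executable (assumption: it sits in the current directory).
SAYIT="${SAYIT:-./sayit}"

# Pronounce a text passed as a single argument.
say_text() {
    "$SAYIT" "$1"
}

# Pronounce the contents of a text file by feeding it to sayit on stdin
# (sayit reads stdin when no <input> argument is given).
say_file() {
    "$SAYIT" < "$1"
}

# Write the audio to a file.
# Arguments: <voice shorthand> <format shorthand> <output path> <text>
save_speech() {
    "$SAYIT" --voice "$1" --format "$2" --output "$3" "$4"
}
```

For instance, `save_speech <voice> <format> hello.mp3 "Hello"` would pass the chosen shorthands and the output path through to `sayit`. The placeholders are left unfilled on purpose, since the actual shorthand values are those printed by the two `--list-*` options.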

SayIt supports these settings:

  • languages: English, Italian, French, German, Spanish, Hindi, Portuguese, Russian, Japanese and Chinese (Mandarin).
  • output formats: audio-16khz-32kbitrate-mono-mp3, audio-16khz-64kbitrate-mono-mp3, audio-16khz-128kbitrate-mono-mp3, audio-24khz-96kbitrate-mono-mp3, audio-24khz-160kbitrate-mono-mp3, audio-24khz-48kbitrate-mono-mp3, riff-8khz-16bit-mono-pcm, riff-16khz-16bit-mono-pcm, riff-24khz-16bit-mono-pcm.

NB: some languages and output formats might not be supported by your Azure Cognitive Services resource, depending on its region (see the Azure documentation on regional availability).

NB: the selection of supported voices and formats is somewhat arbitrary. I welcome suggestions and contributions.
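
When scripting around SayIt, it can help to derive a file extension or sample rate from an output format id like those listed above. The sketch below is an illustration only: `ext_for_format` and `sample_rate_khz` are hypothetical helpers, not SayIt features, and they assume what the id names suggest, namely that ids ending in `-mp3` denote MP3 audio and that `riff-...-pcm` ids denote PCM audio in a RIFF (WAV) container. Note that they operate on the full format ids, not on the shorthands accepted by `--format`.

```shell
# Map an output format id to a conventional file extension.
# Assumption: "-mp3" ids are MP3 audio, "riff-...-pcm" ids are WAV files.
ext_for_format() {
    case "$1" in
        audio-*-mp3) echo "mp3" ;;
        riff-*-pcm)  echo "wav" ;;
        *)           return 1 ;;   # unknown id: signal failure
    esac
}

# Extract the sample rate in kHz, i.e. the second dash-separated field
# of the id with its "khz" suffix removed.
sample_rate_khz() {
    rate=$(printf '%s\n' "$1" | cut -d- -f2)
    case "$rate" in
        *khz) echo "${rate%khz}" ;;
        *)    return 1 ;;          # field is not a sample rate
    esac
}
```

Returning a non-zero status for unrecognized ids lets a calling script stop early instead of writing a file with a misleading extension.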