Releases · echogarden-project/echogarden

03 Mar 06:49

rotemdan

v2.3.3

ce2c59a

v2.3.3 Latest


Fixes

Kokoro (synthesis): apply English phoneme substitutions to more closely follow the Misaki phonemizer output. Words containing diphthongs like ɔɪ, such as "noise" and "annoy", used to be mispronounced by the US English voices with ʌ-like sounds, like "naise" and "annay". The reason was that the special diphthong token the model was trained on (Y in this case) wasn't mapped correctly (this problem also occurs in many JavaScript ports of Kokoro, including the port made by its original author). The issue should now be fixed, and pronunciation should be more accurate and consistent in general.
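
To illustrate the kind of substitution described above, here is a minimal sketch that maps IPA diphthongs to single-character tokens before the phoneme string is passed to the model. Only the ɔɪ to Y pair comes from this release note; the other pairs, the table name and the function name are assumptions made for illustration, and this is not the project's actual code.

```typescript
// Hypothetical substitution table from IPA diphthongs to single-character tokens.
// Only the pair for ɔɪ is stated in the release note; the rest are assumed.
const diphthongSubstitutions: [string, string][] = [
  ['ɔɪ', 'Y'], // from the release note
  ['eɪ', 'A'], // assumed for illustration
  ['aɪ', 'I'], // assumed for illustration
  ['aʊ', 'W'], // assumed for illustration
  ['oʊ', 'O'], // assumed for illustration
]

// Replace every occurrence of each diphthong with its single-character token.
function applyDiphthongSubstitutions(ipa: string): string {
  let result = ipa

  for (const [diphthong, token] of diphthongSubstitutions) {
    result = result.split(diphthong).join(token)
  }

  return result
}

console.log(applyDiphthongSubstitutions('nɔɪz'))
```

If the ɔɪ entry were missing or mapped to the wrong token, the model would receive a symbol sequence it was not trained on for that sound, which matches the mispronunciation described in the fix.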

Enhancements

Synthesis: added many more words (mispronounced by eSpeak-NG) to the English correction lexicon

Full Changelog: v2.3.2...v2.3.3


27 Feb 05:22

rotemdan

v2.3.2

a5ca02a

v2.3.2

Fixes

Work around eSpeak-NG marker issue when multiple square brackets are included in a fragment

Full Changelog: v2.3.1...v2.3.2


25 Feb 11:39

rotemdan

v2.3.1

1506920

v2.3.1

Fixes

Fix unreported issue of files with empty content failing to write to disk (how wasn't this issue reported?)

Full Changelog: v2.3.0...v2.3.1


25 Feb 10:37

rotemdan

v2.3.0

897815e

v2.3.0

Features

Add Deepgram cloud TTS engine

Enhancements

Deepgram STT: convert to Opus codec (48 kbit/s) when sending audio to server. Add option for adding punctuation (enabled by default)
Updated and improved ElevenLabs TTS engine. Add options for selecting model (documented here) and optional seed.
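
As a rough illustration of the Opus conversion mentioned for Deepgram STT, the sketch below builds an FFmpeg argument list that encodes audio with libopus at 48 kbit/s. The flags are standard FFmpeg options, but the function name and file paths are hypothetical, and the arguments Echogarden actually passes may differ.

```typescript
// Build FFmpeg arguments for encoding an input file to Opus at a given bitrate.
// The function name and paths are hypothetical; this is not Echogarden's own code.
function buildOpusEncodeArgs(inputPath: string, outputPath: string, bitrateKbps = 48): string[] {
  return [
    '-i', inputPath,           // input file
    '-c:a', 'libopus',         // audio codec: Opus
    '-b:a', `${bitrateKbps}k`, // target audio bitrate
    outputPath,                // output file (container inferred from extension)
  ]
}

console.log(buildOpusEncodeArgs('speech.wav', 'speech.ogg').join(' '))
```

Sending a compressed stream instead of raw PCM reduces upload size, which is the likely motivation for the change.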

Fixes

Fix ElevenLabs options casing to match the documentation

Full Changelog: v2.2.0...v2.3.0


23 Feb 10:10

rotemdan

v2.2.1

9679c9f

v2.2.1

Fixes

Synthesis: fix regression caused by converting ":" to ",". eSpeak-NG misbehaves and skips markers when a fragment like "::" is converted to ",,". Instead, convert to only a single "," regardless of the ":" count
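
The fix above can be sketched as a single regular-expression replacement that collapses any run of colons into one comma. The function name is hypothetical, and the real preprocessing may handle whitespace and other characters differently.

```typescript
// Replace each run of one or more colons with a single comma,
// so that "::" never turns into ",,".
function collapseColonsToComma(text: string): string {
  return text.replace(/:+/g, ',')
}

console.log(collapseColonsToComma('std::vector: a container'))
```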

Full Changelog: v2.2.0...v2.2.1


22 Feb 20:37

rotemdan

v2.2.0

9985621

v2.2.0

Features

Add support for Deepgram STT by @DoneMaster in #95

Enhancements

Synthesis: updated pronunciation lexicons
Synthesis: rewrite IPA to Kirshenbaum table
eSpeak-NG synthesis and phonemization: prevent pronunciation of angle brackets and colons in parts

Fixes

Fix references to ElevenLabs.ts
Add missing format properties to FFMpeg codec parameters

New contributors

@DoneMaster made their first contribution in #95

Full Changelog: v2.1.2...v2.2.0

Contributors

DoneMaster


14 Feb 18:24

rotemdan

v2.1.2

22f476f

v2.1.2

Enhancements

Expanded heteronym and word lexicons

Fixes

Fix incorrect logical operator that caused phoneme timelines of words extracted from eSpeak event output to be incorrectly merged in some cases

Changes in the lexicon format

Change naming of lexicon properties from succeededBy and notSucceededBy to followedBy and notFollowedBy (the deprecated property names are still read in code, as a fallback, to ensure backward compatibility)
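
A backward-compatible read of the renamed properties might look like the sketch below. The four property names come from this release note; the entry type, the assumption that the values are string arrays, and the helper names are illustrative only.

```typescript
// Hypothetical shape of a lexicon entry's context conditions.
// The value type (string[]) is an assumption.
interface LexiconEntryContext {
  followedBy?: string[]
  notFollowedBy?: string[]

  // Deprecated names, still read as a fallback
  succeededBy?: string[]
  notSucceededBy?: string[]
}

// Prefer the new property name and fall back to the deprecated one.
function getFollowedBy(entry: LexiconEntryContext): string[] | undefined {
  return entry.followedBy ?? entry.succeededBy
}

function getNotFollowedBy(entry: LexiconEntryContext): string[] | undefined {
  return entry.notFollowedBy ?? entry.notSucceededBy
}

console.log(getFollowedBy({ succeededBy: ['example'] }))
```

With this pattern, lexicons written against the old names keep working, while entries that specify both names use the new one.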

Documentation

Several changes and fixes to the documentation
Added a new guide for enabling the cuda ONNX execution provider in Linux and Windows Subsystem for Linux (WSL)

Full Changelog: v2.1.1...v2.1.2


13 Feb 10:55

rotemdan

v2.1.1

615046c

v2.1.1

Enhancements

Added a new default pronunciation lexicon for English (located at data/lexicons/words.en.json), containing corrections for words mispronounced or inaccurately pronounced by eSpeak-NG. For example, "vs." will now be pronounced as "versus" rather than "vee ess". With the higher-quality Kokoro voices, these subtle corrections become more important, since the Kokoro model generally follows the specified IPA more faithfully and can therefore benefit from more accurate input
Some updates to the Heteronym lexicon, including corrections to the disambiguation logic for the word "learned" (deciding between verb l ˈɜː ɹ n d and adjective l ˈɜː ɹ n ɪ d)

Full Changelog: v2.1.0...v2.1.1


12 Feb 16:30

rotemdan

v2.1.0

4e6f41e

v2.1.0

Features

Added the Kokoro TTS engine: a new, high-quality open-source local synthesis model based on StyleTTS 2. All currently available voices and languages are supported (English US and UK, Spanish, French, Hindi, Italian, Brazilian Portuguese and Chinese), except for Japanese (due to limitations of eSpeak-NG phonemization for Japanese)
Added the Gnuspeech TTS engine (WebAssembly): legacy English-only speech synthesizer based on articulatory synthesis techniques (initially released in 2002)

Fixes

Clarified log message for sentences to use "part" time instead of "segment" time

Enhancements

Added newer OpenAI synthesis voices "Ash", "Coral" and "Sage"
Some small additions to the English heteronym lexicon

Pull requests merged

Synthesis.ts: fixed typo in symbol name by @Boorj in #91
Development.md: fixed a typo in a link by @kbulygin in #88
Options.md: Align, Whisper: added a note about timestampAccuracy by @kbulygin in #89

New Contributors

@Boorj made their first contribution in #91
@kbulygin made their first contribution in #88

Full Changelog: v2.0.14...v2.1.0

Contributors

Boorj and kbulygin


24 Dec 16:46

rotemdan

v2.0.14

d9c9c40

v2.0.14

Fixes

Whisper tokenizer: attempt to work around #85 by accepting a token that's one beyond the valid range (51865 for a multilingual model, 51864 for an English-only model).
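
The relaxed range check described above could be sketched as follows. The two token IDs are taken from this release note; the function name and the way the model type is passed are assumptions, and the actual tokenizer code may be structured differently.

```typescript
// Accept token IDs up to and including the ID that is one beyond the valid range:
// 51865 for a multilingual model, 51864 for an English-only model.
function isAcceptedTokenId(tokenId: number, isMultilingual: boolean): boolean {
  const oneBeyondValidRange = isMultilingual ? 51865 : 51864

  return Number.isInteger(tokenId) && tokenId >= 0 && tokenId <= oneBeyondValidRange
}

console.log(isAcceptedTokenId(51865, true))
```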

Full Changelog: v2.0.13...v2.0.14
