Hi, is there a way, preferably when making an API request, to shorten the pauses between words and/or sentences? The voices are rather slow sometimes, which triggers an "end of speech detected" when I run STT on the TTS output.
Hi @kaosbeat
The only real way I know of to shorten pauses with the XTTS model is to use a reference sample WAV in which the speaker talks faster. From anecdotal evidence, if the same person speaks slowly in the reference sample, the generated speech is slower, and it is faster when the reference sample is spoken a bit faster.
Beyond that, removing commas, semicolons, etc. will remove pauses.
There is also a generation speed setting, which speeds up the whole of the TTS generation, though I've not played with it much myself.
https://docs.coqui.ai/en/latest/models/xtts.html#inference-parameters
You could manually introduce speed into tts_server.py by adding "speed": 1.6, (or a number of your choosing) by placing it in …
However, do note what the Coqui documentation says about this setting.
Thanks
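For what it's worth, the two text-side suggestions above could be sketched roughly like this in Python. This is only an illustration: the helper names strip_pause_punctuation and with_speed are made up here, the settings dictionary and its "temperature" key are hypothetical stand-ins (where "speed" actually goes in tts_server.py is not shown above), and 1.6 is simply the example value from the reply.

```python
import re

# Punctuation assumed to cause audible pauses (commas and semicolons only).
PAUSE_PUNCTUATION = re.compile(r"[,;]")


def strip_pause_punctuation(text: str) -> str:
    """Replace commas and semicolons with spaces, then collapse repeated whitespace.

    Note: this also splits numbers such as "1,000" into "1 000".
    """
    without_marks = PAUSE_PUNCTUATION.sub(" ", text)
    return re.sub(r"\s+", " ", without_marks).strip()


def with_speed(generation_settings: dict, speed: float = 1.6) -> dict:
    """Return a copy of a hypothetical settings dict with a "speed" entry added."""
    updated = dict(generation_settings)  # leave the caller's dict untouched
    updated["speed"] = speed
    return updated


if __name__ == "__main__":
    print(strip_pause_punctuation("Well, yes; I think so, anyway."))
    print(with_speed({"temperature": 0.7}))
```

You would run the text through strip_pause_punctuation before sending it to the TTS request, and treat with_speed only as a picture of what adding "speed": 1.6, to an existing set of generation settings looks like.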