  • Vicomtech



TTS Towards Human-Sounding Speech

Python 3,205 237 Updated Mar 27, 2025

Open TTS models, built for streaming on the edge

Jupyter Notebook 39 3 Updated Mar 16, 2025

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python 423 23 Updated Mar 28, 2025

LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python 485 34 Updated Mar 12, 2025

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 1,180 128 Updated Mar 24, 2025

MARS5 speech model (TTS) from CAMB.AI

Jupyter Notebook 2,644 217 Updated Aug 1, 2024

Foundational model for human-like, expressive TTS

Python 4,077 682 Updated Jul 30, 2024

Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3

Python 396 42 Updated Sep 13, 2024

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 43,366 4,823 Updated Mar 26, 2025

Code to apply microprosodic effects to pitch contours used for articulatory speech synthesis.

Python 3 Updated Jul 28, 2022

Easy-to-Use Speech MOS predictors

Python 272 16 Updated Oct 24, 2023

Easily train a good VC model with voice data <= 10 mins!

Python 28,281 4,000 Updated Nov 24, 2024

📋 A list of open LLMs available for commercial use.

11,852 826 Updated Feb 13, 2025

WavJourney: Compositional Audio Creation with LLMs

Python 534 43 Updated Sep 28, 2023

Forked from NVIDIA/tacotron2 and merged with Rayhane-mamah/Tacotron-2

Python 81 38 Updated Nov 22, 2020

Unicode to ASCII transliteration - C Elixir Go Java JS Julia PHP Python Ruby Rust Shell .NET

Kotlin 301 25 Updated Feb 17, 2025

Spanish verbs with their conjugations

TSQL 29 4 Updated Aug 20, 2019

A fast, local neural text to speech system

C++ 8,349 629 Updated Mar 3, 2025

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Python 2,091 526 Updated Jul 27, 2024

Qualtric or Qualtreat? Generate Qualtrics listening tests for Text-To-Speech evaluations.

Python 35 16 Updated Jun 25, 2024

ICASSP 2023 Accepted

Python 189 14 Updated May 6, 2024

AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python 4,621 376 Updated Dec 4, 2024

Official implementation of FCL-taco2: Fast, Controllable and Lightweight version of Tacotron2 @ ICASSP 2021

Python 39 6 Updated Jul 17, 2021
Python 56 4 Updated Jan 13, 2023

HuBERT content encoders for: A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion

Python 350 55 Updated Oct 1, 2024

A collection of utilities for handling IPA phones.

Python 25 2 Updated Sep 24, 2023

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

Python 1,984 562 Updated Oct 27, 2023

phoneme tokenizer and grapheme-to-phoneme model for 8k languages

Python 156 15 Updated Jun 9, 2023

[INTERSPEECH'2022] Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning

Python 81 11 Updated Nov 4, 2022

Acoustic models for: A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion

Python 102 25 Updated Jul 12, 2023