Releases · 2noise/ChatTTS · GitHub

05 Nov 05:14

fumiama

v0.2.1: doc(conda): pinning python version (#816) Latest

New

NPU support (#777) by @shen-shanshan

Fixed

Missing remove_interjections in the Chinese normalizer (#783) by @IrisSally

Optimized

Change gamma to weight in DVAE for compatibility (#733) by @zly-idleness
Migrate all models to safetensors (b9b007e) by @fumiama
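
The gamma-to-weight rename changes a parameter name, so a checkpoint saved under the old name would no longer match by key. The notes do not say how older checkpoints are handled; the following is only a sketch of one possible key remap. The function name and the example keys are hypothetical and are not taken from the ChatTTS code.

```python
from typing import Any, Dict


def rename_gamma_to_weight(state: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a state dict with trailing 'gamma' keys renamed to 'weight'.

    Hypothetical helper: only keys that are exactly 'gamma' or end in '.gamma'
    are touched, so unrelated names that merely contain 'gamma' are left alone.
    """
    renamed: Dict[str, Any] = {}
    for key, value in state.items():
        if key == "gamma" or key.endswith(".gamma"):
            key = key[: -len("gamma")] + "weight"
        renamed[key] = value
    return renamed


# Example keys are made up for illustration.
old = {"decoder.0.gamma": [1.0], "decoder.0.bias": [0.0]}
new = rename_gamma_to_weight(old)
```

The input dict is left unmodified, which makes it easy to compare old and new key sets before loading.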

Contributors

fumiama, shen-shanshan, and 2 other contributors

09 Oct 15:10

fumiama

v0.2.0

New

Simple HTTP API example (#530) by @briancaffey
Command-line streaming inference example (#512) by @ZaymeShaw
Batch Vocos decoding (builds a matrix sized to the longest token sequence and fills the rest with 0) (6e18575) by @fumiama
Adaptation to new VQ Encoder (9f0b7a0) by @fumiama
ZeroShot support (b4da237) by @fumiama
Initial support for vLLM (8e6184e) by @ylzz1997
Added --source and --custom_path parameters to command-line examples (#669) by @weedge
Added inplace parameter to Speaker class for fine-tuning (#679) by @ain-soph
Added experimental parameter to Chat.load function (#682) by @ain-soph
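
The batch Vocos decoding entry above describes sizing a matrix to the longest sequence and filling the remainder with 0. A minimal sketch of that padding idea, using plain Python lists rather than tensors, might look like the following. The function name pad_batch is an assumption for illustration, not part of the ChatTTS API.

```python
from typing import List, Tuple


def pad_batch(
    seqs: List[List[float]], pad_value: float = 0.0
) -> Tuple[List[List[float]], List[int]]:
    """Right-pad every sequence to the length of the longest one.

    Returns the padded rows plus the original lengths, so the caller can
    trim the padding off again after batched decoding.
    """
    lengths = [len(s) for s in seqs]
    max_len = max(lengths, default=0)
    padded = [list(s) + [pad_value] * (max_len - len(s)) for s in seqs]
    return padded, lengths


batch, lengths = pad_batch([[0.1, 0.2, 0.3], [0.5], [0.7, 0.8]])
```

Keeping the original lengths alongside the padded matrix is one way to recover per-item outputs after a single batched call.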

Fixed

Intermittent glitches in streaming inference audio (7ee5426) by @fumiama
WebUI generated a different accent on each run even with identical parameters (3edd47c) by @fumiama
Normalizer changed the tag format, causing the model to read the tag aloud (c6bae90) by @fumiama
Replaced inaccessible GitCode mirror (c06f1d4) by @fumiama
Error in handling repetition penalty idx (#738) by @niuzheng168

Optimized

Completely removed pretrain_models dictionary (77c7e20) by @fumiama
Made tokenizer a standalone class (77c7e20) by @fumiama
Normalizer now removes all unsupported characters to avoid inference errors (0f47a87) by @fumiama
Removed config folder, settings are now directly embedded into the code for easier changes (27331c3) by @fumiama
Removed extra whitespace at the end of streaming inference (#564) by @Ox0400
Added manual_seed parameter, which passes a generator directly to multinomial instead of seeding before the call, so the global torch environment is unaffected (e675a59) by @fumiama
Tokenizer loading switched from a .pt file to the built-in from_pretrained method, eliminating the risk of loading malicious code (80b24e6) by @fumiama
Made the speaker class standalone and moved spk_stat-related content into it; because the model is small, its values are written directly into the settings class (f3dcd97) by @fumiama
Chat.load now sets compile=False by default (7e33889) by @fumiama
Switched GPT to a safetensors model (8a503fd) by @fumiama
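
The manual_seed entry above is about sampling from a dedicated generator rather than reseeding global state. The sketch below shows the same design idea with Python's standard random module as a stand-in; it is an analogy, not ChatTTS or PyTorch code. The function name sample_index is hypothetical.

```python
import random
from typing import List, Optional


def sample_index(weights: List[float], manual_seed: Optional[int] = None) -> int:
    """Draw one index according to weights.

    With manual_seed set, a private random.Random instance is used, so the
    module-level generator (the "environment") is never reseeded or advanced.
    """
    rng = random.Random(manual_seed) if manual_seed is not None else random
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


random.seed(123)
state_before = random.getstate()
a = sample_index([0.2, 0.5, 0.3], manual_seed=42)
b = sample_index([0.2, 0.5, 0.3], manual_seed=42)
state_after = random.getstate()  # unchanged by the two seeded draws
```

The two seeded draws agree with each other, and other code relying on the global generator sees no change in its random stream.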

Dependencies

Changed code license to open-source AGPL3.0 (9f402ba)

Contributors

weedge, niuzheng168, and 6 other contributors

04 Jul 05:31

fumiama

v0.1.1

New

Apple MPS GPU support (experimental, off by default) (#261, #472) by @rasonyang
Replacement of rare characters (Chinese characters) (#350) by @6drf21e
New local loading mode; the original local mode is renamed to custom (#361) by @fumiama
Core supports streaming inference (#360) by @Ox0400
WebUI supports streaming inference (#380) by @v3ucn
User customizable logger (#398) by @fumiama
CMD supports batch inference (#366) by @Ox0400
Customizable DVAE coef parameter (#405) by @fumiama
download_models unload API (4dd1f88) by @fumiama
Normalizer is now registration-based: users can register their own interfaces that meet the requirements (#420) by @fumiama
Improved type annotations; all dict parameters are now dataclasses for easier auto-completion when calling (#422) by @fumiama
Interruptible inference, which returns the part inferred so far (#433) by @fumiama
Experimental: NVIDIA TransformerEngine support (#496) by @fumiama
show_tqdm parameter for infer (3836db8) by @fumiama
Experimental: flash_attention_2 support (c109089) by @fumiama
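
The registration-based normalizer entry above can be pictured as a small registry keyed by language. The sketch below is a generic illustration of that pattern under assumed names; NormalizerRegistry, register, and normalize are hypothetical and do not reflect the actual ChatTTS interface or its requirements.

```python
from typing import Callable, Dict

NormalizeFn = Callable[[str], str]


class NormalizerRegistry:
    """Maps a language code to a user-supplied text normalizer."""

    def __init__(self) -> None:
        self._by_lang: Dict[str, NormalizeFn] = {}

    def register(self, lang: str, fn: NormalizeFn) -> bool:
        """Accept fn only if it is callable; report whether it was registered."""
        if not callable(fn):
            return False
        self._by_lang[lang] = fn
        return True

    def normalize(self, lang: str, text: str) -> str:
        """Apply the registered normalizer, or return text unchanged if none."""
        fn = self._by_lang.get(lang)
        return fn(text) if fn is not None else text


registry = NormalizerRegistry()
registry.register("en", lambda s: " ".join(s.split()).lower())
result = registry.normalize("en", "  Hello   World ")
```

Falling back to the unchanged text when nothing is registered keeps the normalizer optional for callers who do not need one.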

Fixed

Normalizer initialization error (#343) by @fumiama
Compile error handling (#377, #413) by @asamaayako
[spk_emb] could be added to the text during refine_text() (#464) by @fumiama
Inconsistent voice when infer is given a list of texts (#492) by @fumiama
infer could occasionally return None audio (#511) by @fumiama

Optimized

DVAE tensor operation process (#273) by @ain-soph
MPS inference sound quality (#373) by @LeoN0425
Added _ prefix for internal calls (4dd1f88) by @fumiama
Renamed check_model to has_loaded (4dd1f88) by @fumiama
Renamed load_model to load (#432) by @fumiama
Verify file hash when customizing model loading path to prevent tampering (#453) by @fumiama
Default output to mp3 format (#449) by @fumiama
Changed spk_emb to str type for easy customization, copying, and sharing of voices (#463) by @fumiama
Removed useless tensor dimension swap in DVAE (#488) by @charSLee013
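
The hash-verification entry above guards against tampered model files in a custom path. The notes do not state which hash or file layout is used, so the sketch below assumes SHA-256 and a caller-supplied expected digest; the function names sha256_of and verify_file are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file exists and its digest matches (assumed SHA-256)."""
    return path.is_file() and sha256_of(path) == expected_sha256.lower()
```

A caller would typically refuse to load any file for which verify_file returns False.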

Dependencies

Relaxed dependency restrictions for easier installation

Contributors

v3ucn, rasonyang, and 7 other contributors
