Releases: 2noise/ChatTTS
v0.2.1: doc(conda): pinning python version (#816)
New
- NPU support (#777) by @shen-shanshan
Fixed
- Missing `remove_interjections` in Chinese normalizer (#783) by @IrisSally
Optimized
- Change gamma to weight in DVAE for compatibility (#733) by @zly-idleness
- Migrate all models to safetensors (b9b007e) by @fumiama
v0.2.0
New
- Simple HTTP API example (#530) by @briancaffey
- Command-line streaming inference example (#512) by @ZaymeShaw
- Batch Vocos decoding (generate matrix by longest token, fill the rest with 0) (6e18575) by @fumiama
- Adaptation to new VQ Encoder (9f0b7a0) by @fumiama
- ZeroShot support (b4da237) by @fumiama
- Initial support for vLLM (8e6184e) by @ylzz1997
- Added `--source` and `--custom_path` parameters to command-line examples (#669) by @weedge
- Added `inplace` parameter to `Speaker` class for fine-tuning (#679) by @ain-soph
- Added `experimental` parameter to `Chat.load` function (#682) by @ain-soph
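The batch decoding entry above describes building one matrix sized by the longest token sequence and filling the remainder with 0. A minimal sketch of that padding idea in plain Python follows; the function names `pad_to_longest` and `trim_to_lengths` are hypothetical and not part of ChatTTS, and the real code presumably operates on tensors rather than lists.

```python
from typing import List, Sequence, Tuple


def pad_to_longest(
    batch: Sequence[Sequence[int]], pad_value: int = 0
) -> Tuple[List[List[int]], List[int]]:
    """Right-pad every sequence to the length of the longest one.

    Returns the rectangular matrix plus the original lengths, so the
    padding can be cut off again after batched processing.
    """
    lengths = [len(seq) for seq in batch]
    width = max(lengths, default=0)
    padded = [list(seq) + [pad_value] * (width - len(seq)) for seq in batch]
    return padded, lengths


def trim_to_lengths(rows: Sequence[Sequence[int]], lengths: Sequence[int]) -> List[List[int]]:
    """Drop the padded tail of each row using the recorded lengths."""
    return [list(row[:n]) for row, n in zip(rows, lengths)]


if __name__ == "__main__":
    matrix, lengths = pad_to_longest([[1, 2, 3], [4], [5, 6]])
    print(matrix)
    print(trim_to_lengths(matrix, lengths))
```

Keeping the original lengths next to the padded matrix is what allows each result to be trimmed back to its own size once the whole batch has been processed in a single pass.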
Fixed
- Intermittent glitches in streaming inference audio (7ee5426) by @fumiama
- Different accent per generation even under the same parameters in WebUI (3edd47c) by @fumiama
- `normalizer` changed tag format, causing model to read out the tag (c6bae90) by @fumiama
- Replaced inaccessible GitCode mirror (c06f1d4) by @fumiama
- Error in handling repetition penalty idx (#738) by @niuzheng168
Optimized
- Completely removed `pretrain_models` dictionary (77c7e20) by @fumiama
- Made `tokenizer` a standalone class (77c7e20) by @fumiama
- `normalizer` will remove all unsupported characters now to avoid inference errors (0f47a87) by @fumiama
- Removed `config` folder, settings are now directly embedded into the code for easier changes (27331c3) by @fumiama
- Removed extra whitespace at the end of streaming inference (#564) by @Ox0400
- Added `manual_seed` parameter, which directly provides `generator` to `multinomial`, avoiding impact on torch environment (e675a59) by @fumiama
- `tokenizer` loading switched from `.pt` to the built-in `from_pretrained` method to eliminate potential malicious code loading (80b24e6) by @fumiama
- Made `speaker` class standalone, placing `spk_stat` related content within it, and directly wrote its values into the settings class due to its small size (f3dcd97) by @fumiama
- `Chat.load` will set `compile=False` now by default (7e33889) by @fumiama
- Switched GPT to `safetensor` model (8a503fd) by @fumiama
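The `manual_seed` entry above is about seeding one sampling call without touching global random state. The sketch below illustrates the same design with Python's standard `random` module as a stand-in; it is an analogy, not the ChatTTS code. In PyTorch the corresponding pattern would be passing a seeded `torch.Generator` to `torch.multinomial` through its `generator` argument. The function name `sample_index` is hypothetical.

```python
import random
from typing import Optional, Sequence


def sample_index(weights: Sequence[float], seed: Optional[int] = None) -> int:
    """Draw one index with probability proportional to `weights`.

    With a seed, a private generator is created for this call only, so
    the module-level random state is left untouched. Without a seed,
    the shared global generator is used as usual.
    """
    indices = range(len(weights))
    if seed is None:
        return random.choices(indices, weights=weights, k=1)[0]
    rng = random.Random(seed)  # local generator, independent of global state
    return rng.choices(indices, weights=weights, k=1)[0]


if __name__ == "__main__":
    before = random.getstate()
    first = sample_index([0.1, 0.2, 0.7], seed=42)
    second = sample_index([0.1, 0.2, 0.7], seed=42)
    print(first == second)                 # same seed, same draw
    print(random.getstate() == before)     # global state unchanged
```

The benefit of this design is that reproducible sampling in one component does not silently change the random behaviour of unrelated code sharing the same process.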
Dependencies
- Changed code license to open-source AGPL3.0 (9f402ba)
v0.1.1
New
- Apple MPS GPU (Experimental, off by default) (#261, #472) by @rasonyang
- Replacement of rare characters (Chinese characters) (#350) by @6drf21e
- `local` loading mode, renamed original `local` to `custom` (#361) by @fumiama
- Core supports streaming inference (#360) by @Ox0400
- WebUI supports streaming inference (#380) by @v3ucn
- User customizable logger (#398) by @fumiama
- CMD supports batch inference (#366) by @Ox0400
- Customizable DVAE coef parameter (#405) by @fumiama
- `download_models`, `unload` API (4dd1f88) by @fumiama
- Normalizer changed to registration type, users can register interfaces that meet the requirements (#420) by @fumiama
- Improved type annotations, all dict parameters changed to dataclass for easy auto-completion when calling (#422) by @fumiama
- Interruptible inference process, which returns the part inferred so far (#433) by @fumiama
- Experimental: NVIDIA TransformerEngine support (#496) by @fumiama
- Infer parameter `show_tqdm` (3836db8) by @fumiama
- Experimental: flash_attention_2 support (c109089) by @fumiama
Fixed
- Normalizer initialization error (#343) by @fumiama
- Compile error handling (#377, #413) by @asamaayako
- Possible addition of `[spk_emb]` when refining text (#464) by @fumiama
- Inconsistent tone when inferring a list of texts (#492) by @fumiama
- Possible return of None voice when inferring (#511) by @fumiama
Optimized
- DVAE tensor operation process (#273) by @ain-soph
- MPS inference sound quality (#373) by @LeoN0425
- Added `_` prefix for internal calls (4dd1f88) by @fumiama
- Renamed `check_model` to `has_loaded` (4dd1f88) by @fumiama
- Renamed `load_model` to `load` (#432) by @fumiama
- Verify file hash when customizing model loading path to prevent tampering (#453) by @fumiama
- Default output to mp3 format (#449) by @fumiama
- Changed spk_emb to str type for easy customization, copying, and sharing of tones (#463) by @fumiama
- Removed useless tensor dimension swap in DVAE (#488) by @charSLee013
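The hash verification entry above (#453) can be pictured with a short sketch using only the standard library. It assumes SHA-256 digests and a simple name-to-digest mapping; the actual algorithm, file names, and manifest format used by ChatTTS are not stated in these notes, and `sha256_of_file` and `verify_files` are hypothetical helper names.

```python
import hashlib
import hmac
from pathlib import Path
from typing import Dict, List


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_files(root: Path, expected: Dict[str, str]) -> List[str]:
    """Return the names of files that are missing or whose digest differs.

    `expected` maps a relative file name to its expected hex digest
    (an assumed manifest shape, for illustration only).
    """
    bad = []
    for name, want in expected.items():
        target = root / name
        if not target.is_file():
            bad.append(name)
            continue
        if not hmac.compare_digest(sha256_of_file(target), want.lower()):
            bad.append(name)
    return bad
```

A loader following this idea would refuse to use a custom model directory whenever `verify_files` returns a non-empty list.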
Dependencies
- Relaxed dependency restrictions for easier installation