api.py cannot generate audio, and api_v2.py fails to start #2013

Wu1905 · 2025-02-07T10:28:38Z

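After updating the project, there are several conflicts in api.py:
The clean_text method takes only two parameters, but api.py passes it three.
vq_model does not record version, which causes an error in get_tts_wav.
cleaned_text_to_sequence is called with the wrong number of arguments.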
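
A call-site mismatch like the ones listed above can be checked without running inference by binding the intended arguments against the function's signature. The sketch below uses the standard-library `inspect` module. The `clean_text` defined here is a hypothetical two-parameter stand-in, not the project's actual function, and its parameter names are assumptions.

```python
import inspect


def clean_text(text, language):
    """Hypothetical stand-in with two parameters (names are assumptions)."""
    return text.strip(), language


def call_matches_signature(func, *args, **kwargs):
    """Return True if func could be called with the given arguments."""
    try:
        # bind() raises TypeError when the arguments do not fit the signature.
        inspect.signature(func).bind(*args, **kwargs)
        return True
    except TypeError:
        return False


# Two arguments fit the stand-in; three do not.
print(call_matches_signature(clean_text, "hello", "en"))        # True
print(call_matches_signature(clean_text, "hello", "en", "v2"))  # False
```

Running the same check against the real functions imported by api.py would show which call sites no longer match after the update.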

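In GPT-SoVITS/GPT_SoVITS/module/models.py, the refer variable in the decode method of the SynthesizerTrn class comes back as an array type, which makes the subsequent processing fail (this may have been caused by an update to some package).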

api_v2.py raises an error on startup:

Loading Text2Semantic weights from GPT_SoVITS/pretrained_models/gsv-v2final-pretrained/s1bert25hz-5kh-longer-epoch=12-step=369668.ckpt
Loading VITS weights from GPT_SoVITS/pretrained_models/gsv-v2final-pretrained/s2G2333k.pth
/opt/anaconda3/envs/GPTSoVits/lib/python3.9/site-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Traceback (most recent call last):
File "/opt/GPT_SoVITS/api_v2.py", line 143, in <module>
tts_pipeline = TTS(tts_config)
File "/opt/GPT_SoVITS/GPT_SoVITS/TTS_infer_pack/TTS.py", line 252, in __init__
self._init_models()
File "/opt/GPT_SoVITS/GPT_SoVITS/TTS_infer_pack/TTS.py", line 278, in _init_models
self.init_vits_weights(self.configs.vits_weights_path)
File "/opt/GPT_SoVITS/GPT_SoVITS/TTS_infer_pack/TTS.py", line 336, in init_vits_weights
vits_model.load_state_dict(dict_s2["weight"], strict=False)
File "/opt/anaconda3/envs/GPTSoVits/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for SynthesizerTrn:
size mismatch for enc_p.text_embedding.weight: copying a param with shape torch.Size([732, 192]) from checkpoint, the shape in current model is torch.Size([322, 192]).
size mismatch for ref_enc.spectral.0.fc.weight: copying a param with shape torch.Size([128, 704]) from checkpoint, the shape in current model is torch.Size([128, 1025]).
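
The two `size mismatch` lines say that the model was constructed with different dimensions than the checkpoint contains. That may point to a configuration or version setting that does not match the v2 pretrained weights, though the log alone does not confirm this. As a rough aid for reading such errors, the sketch below parses the message with a regular expression and lists each parameter with its checkpoint shape and model shape. The helper name `parse_size_mismatches` is hypothetical, and the pattern assumes the message format shown in the traceback above.

```python
import re

# Matches one "size mismatch for <name>: ..." entry from a load_state_dict error.
_PATTERN = re.compile(
    r"size mismatch for (?P<name>[\w.]+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[\d, ]*)\]\) from checkpoint, "
    r"the shape in current model is torch\.Size\(\[(?P<model>[\d, ]*)\]\)"
)


def _to_shape(text):
    """Convert a string such as '732, 192' into the tuple (732, 192)."""
    return tuple(int(part) for part in text.split(",") if part.strip())


def parse_size_mismatches(message):
    """Return (parameter name, checkpoint shape, model shape) for each entry."""
    return [
        (m["name"], _to_shape(m["ckpt"]), _to_shape(m["model"]))
        for m in _PATTERN.finditer(message)
    ]


# The two entries copied from the error message in this report.
error_text = (
    "size mismatch for enc_p.text_embedding.weight: copying a param with shape "
    "torch.Size([732, 192]) from checkpoint, the shape in current model is "
    "torch.Size([322, 192]).\n"
    "size mismatch for ref_enc.spectral.0.fc.weight: copying a param with shape "
    "torch.Size([128, 704]) from checkpoint, the shape in current model is "
    "torch.Size([128, 1025])."
)

for name, ckpt_shape, model_shape in parse_size_mismatches(error_text):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

Listing the shapes side by side makes it easier to see which constructor arguments (here the text embedding size and the reference encoder input size) differ from what the checkpoint expects.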

Here are my dependencies:

absl-py==2.1.0
addict==2.4.0
aiofiles==23.2.1
aiohttp==3.9.5
aiosignal==1.3.1
aliyun-python-sdk-core==2.15.1
aliyun-python-sdk-kms==2.16.3
altair==5.3.0
annotated-types==0.7.0
antlr4-python3-runtime==4.9.3
anyio==4.4.0
async-timeout==4.0.3
attrs==23.2.0
audioread==3.0.1
av==12.3.0
Brotli @ file:///croot/brotli-split_1714483155106/work
certifi @ file:///croot/certifi_1738623731865/work/certifi
cffi==1.16.0
chardet==5.2.0
charset-normalizer @ file:///croot/charset-normalizer_1721748349566/work
click==8.1.7
cn2an==0.5.22
coloredlogs==15.0.1
contourpy==1.2.1
crcmod==1.7
cryptography==43.0.0
ctranslate2==4.3.1
cycler==0.12.1
Cython==0.29.37
datasets==2.20.0
decorator==5.1.1
dill==0.3.8
Distance==0.1.3
dnspython==2.6.1
editdistance==0.8.1
einops==0.8.0
email_validator==2.2.0
exceptiongroup==1.2.2
fastapi==0.111.1
fastapi-cli==0.0.4
faster-whisper==1.0.3
ffmpeg-python==0.2.0
ffmpy==0.3.2
filelock @ file:///croot/filelock_1700591183607/work
flatbuffers==24.3.25
fonttools==4.53.1
frozenlist==1.4.1
fsspec==2024.5.0
funasr==1.0.27
future==1.0.0
g2p-en==2.1.0
g2pk2==0.0.3
gast==0.6.0
gmpy2 @ file:///tmp/build/80754af9/gmpy2_1645438755360/work
gradio==4.24.0
gradio_client==0.14.0
grpcio==1.65.1
h11==0.14.0
hdbscan==0.8.37
httpcore==1.0.5
httptools==0.6.1
httpx==0.27.0
huggingface-hub==0.24.2
humanfriendly==10.0
hydra-core==1.3.2
idna @ file:///croot/idna_1714398848350/work
importlib_metadata==8.2.0
importlib_resources==6.4.0
inflect==7.3.1
jaconv==0.4.0
jamo==0.4.1
jieba==0.42.1
jieba_fast==0.53
Jinja2 @ file:///croot/jinja2_1716993405101/work
jmespath==0.10.0
joblib==1.4.2
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
kaldiio==2.18.0
kiwisolver==1.4.5
ko-pron==1.3
LangSegment==0.3.3
librosa==0.9.2
lightning-utilities==0.11.6
linkify-it-py==2.0.3
llvmlite==0.39.1
Markdown==3.6
markdown-it-py==2.2.0
MarkupSafe @ file:///croot/markupsafe_1704205993651/work
matplotlib==3.9.1
mdit-py-plugins==0.3.3
mdurl==0.1.2
mkl-fft @ file:///croot/mkl_fft_1695058164594/work
mkl-random @ file:///croot/mkl_random_1695059800811/work
mkl-service==2.4.0
modelscope==1.10.0
more-itertools==10.3.0
mpmath @ file:///croot/mpmath_1690848262763/work
multidict==6.0.5
multiprocess==0.70.16
networkx @ file:///croot/networkx_1717597493534/work
nltk==3.8.1
numba==0.56.4
numpy==1.23.4
omegaconf==2.3.0
onnxruntime==1.18.1
onnxruntime-gpu==1.19.2
openai-whisper==20240930
OpenCC==1.1.1
orjson==3.10.6
oss2==2.18.6
packaging==24.1
pandas==2.2.2
pillow @ file:///croot/pillow_1721059439630/work
platformdirs==4.2.2
pooch==1.8.2
proces==0.1.7
protobuf==4.25.4
psutil==6.0.0
py3langid==0.2.2
pyarrow==17.0.0
pyarrow-hotfix==0.6
pycparser==2.22
pycryptodome==3.20.0
pydantic==2.8.2
pydantic_core==2.20.1
pydub==0.25.1
Pygments==2.18.0
pynndescent==0.5.13
pyopenjtalk==0.3.4
pyparsing==3.1.2
pypinyin==0.51.0
PySocks @ file:///tmp/build/80754af9/pysocks_1605305812635/work
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-mecab-ko==1.3.7
python-mecab-ko-dic==2.1.1.post2
python-multipart==0.0.9
pytorch-lightning==2.3.3
pytorch-wpe==0.0.1
pytz==2024.1
PyYAML @ file:///croot/pyyaml_1698096049011/work
referencing==0.35.1
regex==2024.7.24
requests @ file:///croot/requests_1721410876868/work
resampy==0.4.3
rich==13.7.1
rotary-embedding-torch==0.8.6
rpds-py==0.19.1
ruff==0.9.5
safetensors==0.4.3
scikit-learn==1.5.1
scipy==1.13.1
semantic-version==2.10.0
sentencepiece==0.2.0
shellingham==1.5.4
simplejson==3.19.2
six==1.16.0
sniffio==1.3.1
sortedcontainers==2.4.0
soundfile==0.12.1
starlette==0.37.2
sympy @ file:///croot/sympy_1701397643339/work
tensorboard==2.17.0
tensorboard-data-server==0.7.2
tensorboardX==2.6.2.2
threadpoolctl==3.5.0
tiktoken==0.8.0
ToJyutping==3.2.0
tokenizers==0.19.1
tomli==2.0.1
tomlkit==0.12.0
toolz==0.12.1
torch==2.1.1
torch-complex==0.4.4
torchaudio==2.1.1
torchmetrics==1.4.0.post0
torchvision==0.16.1
tqdm==4.66.4
transformers==4.43.3
triton==2.1.0
typeguard==4.3.0
typer==0.12.3
typing_extensions @ file:///croot/typing_extensions_1715268824938/work
tzdata==2024.1
uc-micro-py==1.0.3
umap==0.1.1
umap-learn==0.5.7
urllib3 @ file:///croot/urllib3_1718912636303/work
uvicorn==0.30.3
uvloop==0.19.0
watchfiles==0.22.0
websockets==11.0.3
Werkzeug==3.0.3
wordsegment==1.3.1
xxhash==3.4.1
yapf==0.40.2
yarl==1.9.4
zipp==3.19.2

HengIIQing · 2025-02-09T17:03:51Z

runtime\python api_v2.py -a 127.0.0.1 -p 9880 -c GPT_SoVITS/configs/tts_infer.yaml

tdouguo · 2025-03-06T15:12:35Z

Same here: the latest integrated package does not work for me either.

hadestyz · 2025-03-14T02:18:53Z

Is there any solution for this yet?
