Generated voice is not a valid speech #722

cod3r0k · 2025-02-03T10:55:56Z

Hi, I trained my model and then followed the steps below for inference, but the generated audio contains no recognizable voice. Why?

Step 1: Training

python fish_speech/train.py --config-name text2semantic_finetune project="run3" +lora@model.model.lora_config=r_8_alpha_16

Step 2: Inference

Prepare the model for inference:

python tools/llama/merge_lora.py --lora-config r_8_alpha_16 --base-weight checkpoints/fish-speech-1.5 --lora-weight results/run3/checkpoints/step_000045900.ckpt --output checkpoints/fish-speech-1.5-yth-lora-2A100

Then, as described in the documentation (https://speech.fish.audio/inference/#1-generate-prompt-from-voice):

python fish_speech/models/vqgan/inference.py -i "paimon.wav" --checkpoint-path "checkpoints/fish-speech-1.5-yth-lora-2A100/model.pth"

I cannot hear any valid voice in the result (https://drive.google.com/file/d/1w3MPQ6jL0Mc5qneBF2fgR9G7-aoTiBtP/view?usp=sharing).

The next step described in the documentation (https://speech.fish.audio/inference/#2-generate-semantic-tokens-from-text) also does not generate a valid voice for me:

python fish_speech/models/text2semantic/inference.py --text "The text you want to convert" --prompt-text "Your reference text" --prompt-tokens "fake.npy" --checkpoint-path "checkpoints/fish-speech-1.5-yth-lora-2A100/" --num-samples 2 --compile

and

python fish_speech/models/vqgan/inference.py -i "codes_0.npy" --checkpoint-path "checkpoints/fish-speech-1.5-yth-lora-2A100/model.pth"

which returns:

2025-02-03 10:52:33.867 | INFO     | __main__:main:99 - Processing precomputed indices from codes_0.npy
2025-02-03 10:52:34.328 | INFO     | __main__:main:113 - Generated audio of shape torch.Size([1, 1, 112640]), equivalent to 2.55 seconds from 55 features, features/second: 21.53
2025-02-03 10:52:34.332 | INFO     | __main__:main:120 - Saved audio to fake.wav
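
As a sanity check on the log above, the reported duration and feature rate can be reproduced from the sample count and feature count. This is only a sketch: the 44100 Hz sample rate is an assumption (the log does not state it, but it is the value that reproduces the printed numbers), and `summarize_decode_log` is a hypothetical helper, not part of fish-speech.

```python
def summarize_decode_log(num_samples, num_features, sample_rate=44100):
    """Recompute the figures printed by the decoder log.

    sample_rate is an assumption; it is not stated in the log itself.
    """
    duration_s = num_samples / sample_rate
    return {
        "duration_s": duration_s,
        "features_per_s": num_features / duration_s,
        "samples_per_feature": num_samples / num_features,
    }


# Values taken from the log: torch.Size([1, 1, 112640]) and 55 features.
stats = summarize_decode_log(num_samples=112640, num_features=55)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Under that assumption the numbers are self-consistent, which suggests the decoder ran to completion and that the problem lies in the content of the tokens or in the weights being loaded, not in the length of the output.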

The resulting fake.wav is attached at [Google Drive link]. Could you help me understand why it does not produce valid speech?
