update pretrained checkpoint usage

2025-08-09 01:52:41 +00:00 · 2025-01-22 10:06:22 +00:00 · 2025-01-22 10:06:22 +00:00 · 9416aa5a7c
commit 9416aa5a7c
parent 18fc1a332c
1 changed file with 16 additions and 1 deletion
--- a/egs/wenetspeech4tts/TTS/README.md
+++ b/egs/wenetspeech4tts/TTS/README.md
@@ -79,6 +79,7 @@ Preparation:
 ```
 bash prepare.sh --stage 5 --stop_stage 6
 ```
+(Note: To be compatible with the official F5-TTS checkpoint, we directly use `vocab.txt` from [here](https://github.com/SWivid/F5-TTS/blob/129014c5b43f135b0100d49a0c6804dd4cf673e1/data/Emilia_ZH_EN_pinyin/vocab.txt). To generate your own `vocab.txt`, you may refer to [the script](https://github.com/SWivid/F5-TTS/blob/main/src/f5_tts/train/datasets/prepare_emilia.py).)

 The training command is given below:

@@ -96,7 +97,7 @@ python3 f5-tts/train.py --max-duration 700 --filter-min-duration 0.5 --filter-ma
      --exp-dir ${exp_dir} --world-size ${world_size}
 ```

-To inference, use:
+To run inference with the F5-Small model trained with Icefall on WenetSpeech4TTS, use:
 ```
 huggingface-cli login
 huggingface-cli download --local-dir seed_tts_eval yuekai/seed_tts_eval --repo-type dataset
@@ -116,6 +117,20 @@ accelerate launch f5-tts/infer.py --nfe 16 --model-path $model_path --manifest-f
 bash local/compute_wer.sh $output_dir $manifest
 ```

+To run inference with the official F5-Base model trained on Emilia, use:
+```
+huggingface-cli login
+huggingface-cli download --local-dir seed_tts_eval yuekai/seed_tts_eval --repo-type dataset
+huggingface-cli download --local-dir F5-TTS SWivid/F5-TTS
+huggingface-cli download nvidia/bigvgan_v2_24khz_100band_256x --local-dir bigvgan_v2_24khz_100band_256x
+
+manifest=./seed_tts_eval/seedtts_testset/zh/meta.lst
+model_path=./F5-TTS/F5TTS_Base_bigvgan/model_1250000.pt
+
+accelerate launch f5-tts/infer.py --nfe 16 --model-path $model_path --manifest-file $manifest --output-dir $output_dir
+bash local/compute_wer.sh $output_dir $manifest
+```
+
 # Credits
 - [VALL-E](https://github.com/lifeiteng/vall-e)
 - [F5-TTS](https://github.com/SWivid/F5-TTS)
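
Note on the added F5-Base block: it passes `$output_dir` to `f5-tts/infer.py` and `local/compute_wer.sh`, but the hunk itself never sets that variable. It may simply be carried over from the earlier F5-Small block, which this diff does not show in full. A minimal sketch of setting it explicitly before the `accelerate launch` line, assuming a hypothetical results path `./results_f5_base` (not taken from the README):

```shell
# Hypothetical output location; the diff does not say where results should go.
output_dir=./results_f5_base
mkdir -p "$output_dir"

# Paths copied from the added README block.
manifest=./seed_tts_eval/seedtts_testset/zh/meta.lst
model_path=./F5-TTS/F5TTS_Base_bigvgan/model_1250000.pt

# Warn early if one of the download steps was skipped.
for f in "$manifest" "$model_path"; do
  [ -f "$f" ] || echo "missing: $f" >&2
done
```

With these variables set, the `accelerate launch` and `compute_wer.sh` lines from the block can be run unchanged.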