update readme

root 2025-02-21 02:36:41 +00:00
parent 5c21518870
commit 93225563cd

@@ -140,9 +140,7 @@ bash local/compute_wer.sh $output_dir $manifest
 # F5-TTS-Semantic-Token
-./f5-tts contains the code for training F5-TTS-Semantic-Token. We replaced the text tokens in F5-TTS with pretrained cosyvoice2 semantic tokens.
-We observed faster convergence and better prosody modeling results by doing this.
+./f5-tts contains the code for training F5-TTS-Semantic-Token. We replaced the text tokens in F5-TTS with pretrained cosyvoice2 semantic tokens. During inference, we use the pretrained CosyVoice2 LLM to predict the semantic tokens for target audios. We observed that this approach leads to faster convergence and improved prosody modeling results.
 Generated samples and training logs of wenetspeech basic 7k hours data can be found [here](https://huggingface.co/yuekai/f5-tts-semantic-token-small-wenetspeech4tts-basic/tree/main).