update readme

2025-02-21 02:36:41 +00:00 · 2025-02-21 02:36:41 +00:00 · 93225563cd
commit 93225563cd
parent 5c21518870
1 changed file with 1 addition and 3 deletions
--- a/egs/wenetspeech4tts/TTS/README.md
+++ b/egs/wenetspeech4tts/TTS/README.md
@@ -140,9 +140,7 @@ bash local/compute_wer.sh $output_dir $manifest

 # F5-TTS-Semantic-Token

-./f5-tts contains the code for training F5-TTS-Semantic-Token. We replaced the text tokens in F5-TTS with pretrained cosyvoice2 semantic tokens.
-
-We observed faster convergence and better prosody modeling results by doing this.
+./f5-tts contains the code for training F5-TTS-Semantic-Token. We replaced the text tokens in F5-TTS with pretrained cosyvoice2 semantic tokens. During inference, we use the pretrained CosyVoice2 LLM to predict the semantic tokens for target audios. We observed that this approach leads to faster convergence and improved prosody modeling results.

 Generated samples and training logs of wenetspeech basic 7k hours data can be found [here](https://huggingface.co/yuekai/f5-tts-semantic-token-small-wenetspeech4tts-basic/tree/main).