mirror of https://github.com/k2-fsa/icefall.git (synced 2025-12-09 14:05:33 +00:00)
update readme
parent 5c21518870
commit 93225563cd
@@ -140,9 +140,7 @@ bash local/compute_wer.sh $output_dir $manifest
# F5-TTS-Semantic-Token
-./f5-tts contains the code for training F5-TTS-Semantic-Token. We replaced the text tokens in F5-TTS with pretrained cosyvoice2 semantic tokens.
-
-We observed faster convergence and better prosody modeling results by doing this.
+./f5-tts contains the code for training F5-TTS-Semantic-Token. We replaced the text tokens in F5-TTS with pretrained cosyvoice2 semantic tokens. During inference, we use the pretrained CosyVoice2 LLM to predict the semantic tokens for target audios. We observed that this approach leads to faster convergence and improved prosody modeling results.
Generated samples and training logs of wenetspeech basic 7k hours data can be found [here](https://huggingface.co/yuekai/f5-tts-semantic-token-small-wenetspeech4tts-basic/tree/main).