Fangjun Kuang 2bca7032af
Update RNNLM training scripts (#720)
* Update RNNLM training scripts

* Fix a typo

* Fix CI
2022-12-01 15:57:43 +08:00

Description

(Note: the experiments here cover language modeling only.)

ptb is short for Penn Treebank.

About the Penn Treebank corpus:

  • This corpus is free for research purposes
  • ptb.train.txt: training set
  • ptb.valid.txt: development set (use it only for tuning hyperparameters, not for training)
  • ptb.test.txt: test set, used for reporting perplexity
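
As a reminder of what "reporting perplexity" means here, the following is a minimal sketch, assuming the model's total negative log-likelihood over the test set is measured in nats (natural log) and summed over all tokens. The function name `perplexity` and its arguments are illustrative and are not taken from the training scripts in this directory.

```python
import math


def perplexity(total_nll: float, num_tokens: int) -> float:
    """Return exp(average per-token negative log-likelihood).

    total_nll: sum of -log p(token) over the evaluated text, in nats.
    num_tokens: number of tokens that contributed to total_nll.
    """
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    return math.exp(total_nll / num_tokens)


# Sanity check: a model that assigns probability 1/4 to every token
# has a perplexity of 4 (up to floating-point error), whatever the text length.
uniform_nll = 10 * math.log(4)
uniform_ppl = perplexity(uniform_nll, 10)
```

Whether end-of-sentence tokens count toward `num_tokens` changes the reported number, so that convention should be stated whenever results on ptb.test.txt are compared.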

You can download the dataset from one of the following URLs: