icefall

Author	SHA1	Message	Date
Daniel Povey	ab7c940803	Include changes from Liyong about padding conformer module.	2022-09-30 18:37:31 +08:00
Daniel Povey	38f89053bd	Introduce feature mask per frame	2022-09-29 17:31:04 +08:00
Daniel Povey	056b9a4f9a	Apply single_prob mask, so sometimes we just get one layer as output.	2022-09-29 15:29:37 +08:00
Daniel Povey	d8f7310118	Add print statement	2022-09-29 14:15:29 +08:00
Daniel Povey	d398f0ed70	Decrease random_prob from 0.5 to 0.333	2022-09-29 13:55:33 +08:00
Daniel Povey	461ad3655a	Implement AttentionCombine as replacement for RandomCombine	2022-09-29 13:44:03 +08:00
Zengwei Yao	f3ad32777a	Gradient filter for training lstm model (#564) * init files * add gradient filter module * refact getting median value * add cutoff for grad filter * delete comments * apply gradient filter in LSTM module, to filter both input and params * fix typing and refactor * filter with soft mask * rename lstm_transducer_stateless2 to lstm_transducer_stateless3 * fix typos, and update RESULTS.md * minor fix * fix return typing * fix typo	2022-09-29 11:15:43 +08:00
LIyong.Guo	923b60a7c6	padding zeros (#591)	2022-09-28 21:20:33 +08:00
Daniel Povey	d6ef1bec5f	Change subsampling factor from 1 to 2	2022-09-28 21:10:13 +08:00
Daniel Povey	14a2603ada	Bug fix	2022-09-28 20:59:24 +08:00
Daniel Povey	e5666628bd	Bug fix	2022-09-28 20:58:34 +08:00
Daniel Povey	df795912ed	Try to reproduce baseline but with current code with 2 encoder stacks, as a baseline	2022-09-28 20:56:40 +08:00
Fangjun Kuang	3b5846effa	Update kaldifeat in CI tests (#583)	2022-09-28 20:51:06 +08:00
Daniel Povey	1005ff35ba	Fix w.r.t. uneven upsampling	2022-09-28 13:57:26 +08:00
Daniel Povey	10a3061025	Simplify downsampling and upsampling	2022-09-28 13:49:11 +08:00
Daniel Povey	01af88c2f6	Various fixes	2022-09-27 16:09:30 +08:00
Daniel Povey	d34eafa623	Closer to working..	2022-09-27 15:47:58 +08:00
Daniel Povey	e5a0d8929b	Remove unused out_balancer member	2022-09-27 13:10:59 +08:00
Daniel Povey	6b12f20995	Remove out_balancer and out_norm from conv modules	2022-09-27 12:25:11 +08:00
Daniel Povey	76e66408c5	Some cosmetic improvements	2022-09-27 11:08:44 +08:00
Daniel Povey	71b3756ada	Use half the dim per head, in self_attn layers.	2022-09-24 15:40:44 +08:00
Daniel Povey	ce3f59d9c7	Use dropout in attention, on attn weights.	2022-09-22 19:18:50 +08:00
Daniel Povey	24aea947d2	Fix issues where grad is None, and unused-grad cases	2022-09-22 19:18:16 +08:00
Daniel Povey	c16f795962	Avoid error in ddp by using last module's scores	2022-09-22 18:52:16 +08:00
Daniel Povey	0f85a3c2e5	Implement persistent attention scores	2022-09-22 18:47:16 +08:00
Daniel Povey	03a77f8ae5	Merge branch 'scaled_adam_exp7c' into scaled_adam_exp11c	2022-09-22 18:15:44 +08:00
Daniel Povey	ceadfad48d	Reduce debug freq	2022-09-22 12:30:49 +08:00
Daniel Povey	1d20c12bc0	Increase max_var_per_eig to 0.2	2022-09-22 12:28:35 +08:00
Fangjun Kuang	9ae2f3a3c5	Small fixes to the transducer training doc (#575)	2022-09-21 14:20:49 +08:00
Fangjun Kuang	099cd3a215	support exporting to ncnn format via PNNX (#571)	2022-09-20 22:52:49 +08:00
Daniel Povey	e2fdfe990c	Loosen limit on param_max_rms, from 2.0 to 3.0; change how param_min_rms is applied.	2022-09-20 15:20:43 +08:00
Daniel Povey	6eb9a0bc9b	Halve max_var_per_eig to 0.05	2022-09-20 14:39:17 +08:00
Daniel Povey	cd5ac76a05	Add max-var-per-eig in encoder layers	2022-09-20 14:22:07 +08:00
Daniel Povey	db1f4ccdd1	4x scale on max-eig constraint	2022-09-20 14:20:13 +08:00
Teo Wen Shen	436942211c	Adding Dockerfile for Ubuntu18.04-pytorch1.12.1-cuda11.3-cudnn8 (#572) * Changed Dockerfile * Update Dockerfile * Dockerfile * Update README.md * Add Dockerfiles * Update README.md Removed misleading CUDA version, as the Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8 Dockerfile can only support CUDA versions >11.0.	2022-09-20 10:52:24 +08:00
Daniel Povey	3d72a65de8	Implement max-eig-proportion..	2022-09-19 10:26:37 +08:00
Daniel Povey	5f27cbdb44	Merge branch 'scaled_adam_exp4_max_var_per_eig' into scaled_adam_exp7 # Conflicts: # egs/librispeech/ASR/pruned_transducer_stateless7/conformer.py	2022-09-18 21:23:59 +08:00
Daniel Povey	0f567e27a5	Add max_var_per_eig in self-attn	2022-09-18 21:22:01 +08:00
Daniel Povey	eb77fa7aaa	Restore min_positive,max_positive limits on linear_pos projection	2022-09-18 14:38:30 +08:00
Daniel Povey	69404f61ef	Use scalar_lr_scale for scalars as well as sizes.	2022-09-18 14:12:27 +08:00
Daniel Povey	76031a7c1d	Loosen some limits of activation balancers	2022-09-18 13:59:44 +08:00
Daniel Povey	3122637266	Use ScaledLinear where I previously had StructuredLinear	2022-09-17 13:18:58 +08:00
Daniel Povey	4a2b940321	Remove StructuredLinear,StructuredConv1d	2022-09-17 13:14:08 +08:00
Daniel Povey	1a184596b6	A little code refactoring	2022-09-16 20:56:21 +08:00
Fangjun Kuang	97b3fc53aa	Add LSTM for the multi-dataset setup. (#558) * Add LSTM for the multi-dataset setup. * Add results * fix style issues * add missing file	2022-09-16 18:40:25 +08:00
Daniel Povey	bb1bee4a7b	Improve how quartiles are printed	2022-09-16 17:30:03 +08:00
Daniel Povey	5f55f80fbb	Configure train.py with clipping_scale=2.0	2022-09-16 17:19:52 +08:00
Daniel Povey	8298333bd2	Implement gradient clipping.	2022-09-16 16:52:46 +08:00
Daniel Povey	8f876b3f54	Remove batching from ScaledAdam, in preparation to add gradient norm clipping	2022-09-16 15:42:56 +08:00
Daniel Povey	3b450c2682	Bug fix in train.py, fix optimizer name	2022-09-16 14:10:42 +08:00

1448 Commits