* Add streaming feature extractor.
* Parallel streaming decode with greedy search.
* Fix typos.
* Use torch.stack() to replace torch.cat().