mirror of
https://github.com/csukuangfj/kaldifeat.git
synced 2025-08-10 18:42:17 +00:00
Add documentation (#39)
* Add documentation.
* Build doc in GitHub actions.
* Fix a typo.
* Minor fixes.
This commit is contained in:
parent
ae17334ce2
commit
c6e643951f
62
.github/workflows/build-doc.yml
vendored
Normal file
@@ -0,0 +1,62 @@
+# Copyright 2022 Xiaomi Corp. (author: Fangjun Kuang)
+
+# See ../../LICENSE for clarification regarding multiple authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# refer to https://github.com/actions/starter-workflows/pull/47/files
+
+# You can access it at https://csukuangfj.github.io/kaldifeat
+name: Generate doc
+on:
+  push:
+    branches:
+    - master
+    - doc
+
+jobs:
+  build-doc:
+    runs-on: ${{ matrix.os }}
+    strategy:
+      fail-fast: false
+      matrix:
+        os: [ubuntu-latest]
+        python-version: [3.8]
+    steps:
+      # refer to https://github.com/actions/checkout
+      - uses: actions/checkout@v2
+        with:
+          fetch-depth: 0
+
+      - name: Setup Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Display Python version
+        run: python -c "import sys; print(sys.version)"
+
+      - name: Build doc
+        shell: bash
+        run: |
+          cd doc
+          python3 -m pip install -r ./requirements.txt
+          make html
+          touch build/html/.nojekyll
+
+      - name: Deploy
+        uses: peaceiris/actions-gh-pages@v3
+        with:
+          github_token: ${{ secrets.GITHUB_TOKEN }}
+          publish_dir: ./doc/build/html
+          publish_branch: gh-pages
102
README.md
@@ -7,6 +7,10 @@
 <img src="/doc/source/images/cuda_ge_10.1-orange.svg">
 </div>
 
+[](https://csukuangfj.github.io/kaldifeat/)
+
+**Documentation**: <https://csukuangfj.github.io/kaldifeat>
+
 <sub>
 <table>
 <tr>
@@ -277,98 +281,6 @@ See <https://github.com/k2-fsa/sherpa/blob/master/sherpa/bin/pruned_stateless_em
 
 # Installation
 
-## From conda (Only for Linux)
-
-Supported versions of Python, PyTorch, and CUDA toolkit are listed below:
-
-[](/doc/source/images/python_ge_3.6-blue.svg)
-[](/doc/source/images/pytorch_ge_1.5.0-green.svg)
-[](/doc/source/images/cuda_ge_10.1-orange.svg)
-
-```bash
-conda install -c kaldifeat -c pytorch -c conda-forge kaldifeat python=3.8 cudatoolkit=11.1 pytorch=1.8.1
-```
-
-You can select the supported Python version, CUDA toolkit version and PyTorch version as you wish.
-
-To install a CPU version, use
-
-```bash
-conda install -c kaldifeat -c pytorch cpuonly kaldifeat python=3.8 pytorch=1.8.1
-```
-
-
-**Note:** If you want to install `kaldifeat` on macOS or Windows, please
-use either `pip install -v kaldifeat` or compile `kaldifeat` from source.
-
-## From PyPi with pip (Support Linux/macOS/Windows)
-
-You need to install PyTorch and CMake first.
-CMake 3.11 is known to work. Other CMake versions may also work.
-PyTorch 1.6.0 and above are known to work. Other PyTorch versions may also work.
-
-```bash
-pip install -v kaldifeat
-```
-
-## From source (Support Linux/macOS/Windows)
-
-The following are the commands to compile `kaldifeat` from source.
-We assume that you have installed `CMake` and PyTorch.
-CMake 3.11 is known to work. Other CMake versions may also work.
-PyTorch 1.6.0 and above are known to work. Other PyTorch versions may also work.
-
-```bash
-mkdir /some/path
-git clone https://github.com/csukuangfj/kaldifeat.git
-cd kaldifeat
-python setup.py install
-```
-
-To test whether `kaldifeat` was installed successfully, you can run:
-```bash
-python3 -c "import kaldifeat; print(kaldifeat.__version__)"
-```
-
-## FAQs
-
-### How to install a CUDA version
-
-You need to first install a CUDA version of PyTorch and then install `kaldifeat`.
-
-**Note**: You can use a CUDA version of kaldifeat on machines with no GPUs.
-
-
-### How to install a CPU-only version
-
-You need to first install a CPU version of PyTorch and then install `kaldifeat`.
-
-### How to fix `Caffe2: Cannot find cuDNN library`
-
-```
-Your installed Caffe2 version uses cuDNN but I cannot find the cuDNN
-libraries. Please set the proper cuDNN prefixes and / or install cuDNN.
-```
-
-This error happens when you want to install a CUDA version of kaldifeat
-by `pip install kaldifeat` or from source.
-
-You need to first install cuDNN. Assume you have installed cuDNN to the
-path `/path/to/cudnn`. You can fix the error by the following commands.
-
-(1) Fix for installation using `pip install`
-
-```bash
-export KALDIFEAT_CMAKE_ARGS="-DCMAKE_BUILD_TYPE=Release -DCUDNN_LIBRARY_PATH=/path/to/cudnn/lib/libcudnn.so -DCUDNN_INCLUDE_PATH=/path/to/cudnn/include"
-pip install --verbose kaldifeat
-```
-
-(2) Fix for installation from source
-
-```bash
-mkdir /some/path
-git clone https://github.com/csukuangfj/kaldifeat.git
-cd kaldifeat
-export KALDIFEAT_CMAKE_ARGS="-DCMAKE_BUILD_TYPE=Release -DCUDNN_LIBRARY_PATH=/path/to/cudnn/lib/libcudnn.so -DCUDNN_INCLUDE_PATH=/path/to/cudnn/include"
-python setup.py install
-```
+Refer to
+<https://csukuangfj.github.io/kaldifeat>
+for installation.
0
doc/source/_static/.gitkeep
Normal file
@@ -1,72 +0,0 @@
-#!/usr/bin/env python3
-
-# Copyright 2021 Xiaomi Corporation (authors: Fangjun Kuang)
-
-import numpy as np
-import soundfile as sf
-import torch
-
-import kaldifeat
-
-
-def read_wave(filename) -> torch.Tensor:
-    """Read a wave file and return it as a 1-D tensor.
-
-    Note:
-      You don't need to scale it to [-32768, 32767].
-      We use scaling here to follow the approach in Kaldi.
-
-    Args:
-      filename:
-        Filename of a sound file.
-    Returns:
-      Return a 1-D tensor containing audio samples.
-    """
-    with sf.SoundFile(filename) as sf_desc:
-        sampling_rate = sf_desc.samplerate
-        assert sampling_rate == 16000
-        data = sf_desc.read(dtype=np.float32, always_2d=False)
-    data *= 32768
-    return torch.from_numpy(data)
-
-
-def test_fbank():
-    device = torch.device("cpu")
-    if torch.cuda.is_available():
-        device = torch.device("cuda", 0)
-
-    wave0 = read_wave("test_data/test.wav")
-    wave1 = read_wave("test_data/test2.wav")
-
-    wave0 = wave0.to(device)
-    wave1 = wave1.to(device)
-
-    opts = kaldifeat.FbankOptions()
-    opts.frame_opts.dither = 0
-    opts.device = device
-
-    fbank = kaldifeat.Fbank(opts)
-
-    # We can compute fbank features in batches
-    features = fbank([wave0, wave1])
-    assert isinstance(features, list), f"{type(features)}"
-    assert len(features) == 2
-
-    # We can also compute fbank features for a single wave
-    features0 = fbank(wave0)
-    features1 = fbank(wave1)
-
-    assert torch.allclose(features[0], features0)
-    assert torch.allclose(features[1], features1)
-
-    # To compute fbank features for only a specified frame
-    audio_frames = fbank.convert_samples_to_frames(wave0)
-    feature_frame_1 = fbank.compute(audio_frames[1])
-    feature_frame_10 = fbank.compute(audio_frames[10])
-
-    assert torch.allclose(features0[1], feature_frame_1)
-    assert torch.allclose(features0[10], feature_frame_10)
-
-
-if __name__ == "__main__":
-    test_fbank()
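The removed example indexes feature rows by frame (`features0[1]`, `features0[10]`), so it can help to know how many frames a waveform yields. The sketch below is only an illustration. It assumes kaldifeat follows Kaldi's framing rule, as the project's compatibility claim suggests. The helper name `num_frames` is hypothetical and is not part of kaldifeat.

```python
def num_frames(
    num_samples: int,
    samp_freq: float = 16000,
    frame_length_ms: float = 25.0,
    frame_shift_ms: float = 10.0,
    snip_edges: bool = True,
) -> int:
    """Return the number of frames for a waveform (Kaldi-style rule).

    Hypothetical helper for illustration; the defaults mirror the
    default FbankOptions values quoted elsewhere in this commit.
    """
    frame_length = int(samp_freq * frame_length_ms / 1000)  # samples per frame
    frame_shift = int(samp_freq * frame_shift_ms / 1000)  # samples per shift
    if snip_edges:
        # Only frames that fit completely inside the waveform are kept.
        if num_samples < frame_length:
            return 0
        return 1 + (num_samples - frame_length) // frame_shift
    # Otherwise the count depends only on the frame shift.
    return (num_samples + frame_shift // 2) // frame_shift


if __name__ == "__main__":
    print(num_frames(16000))                    # one second, snip_edges=True
    print(num_frames(16000, snip_edges=False))  # one second, snip_edges=False
```

With the default options, frame 10 exists only if the waveform has at least 400 + 10 * 160 = 2000 samples, so the `audio_frames[10]` lookup in the example needs at least 0.125 s of 16 kHz audio.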
@@ -59,7 +59,7 @@ templates_path = ["_templates"]
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
 # This pattern also affects html_static_path and html_extra_path.
-exclude_patterns = []
+exclude_patterns = ["images/*.md"]
 
 source_suffix = {
     ".rst": "restructuredtext",
@@ -102,3 +102,35 @@ html_theme_options = {
     "prev_next_buttons_location": "bottom",
     "style_external_links": True,
 }
+
+rst_epilog = """
+.. _kaldifeat: https://github.com/csukuangfj/kaldifeat
+.. _Kaldi: https://github.com/kaldi-asr/kaldi
+.. _PyTorch: https://pytorch.org/
+.. _kaldifeat.Fbank: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/fbank.py#L10
+.. _kaldifeat.Mfcc: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/mfcc.py#L10
+.. _kaldifeat.Plp: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/plp.py#L10
+.. _kaldifeat.Spectrogram: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/spectrogram.py#L9
+.. _kaldifeat.OnlineFbank: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/fbank.py#L16
+.. _kaldifeat.OnlineMfcc: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/mfcc.py#L16
+.. _kaldifeat.OnlinePlp: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/plp.py#L16
+.. _compute-fbank-feats: https://github.com/kaldi-asr/kaldi/blob/master/src/featbin/compute-fbank-feats.cc
+.. _compute-mfcc-feats: https://github.com/kaldi-asr/kaldi/blob/master/src/featbin/compute-mfcc-feats.cc
+.. _compute-plp-feats: https://github.com/kaldi-asr/kaldi/blob/master/src/featbin/compute-plp-feats.cc
+.. _compute-spectrogram-feats: https://github.com/kaldi-asr/kaldi/blob/master/src/featbin/compute-spectrogram-feats.cc
+.. _kaldi::OnlineFbank: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/online-feature.h#L160
+.. _kaldi::OnlineMfcc: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/online-feature.h#L158
+.. _kaldi::OnlinePlp: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/online-feature.h#L159
+.. _kaldifeat.FbankOptions: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/feature-fbank.h#L19
+.. _kaldi::FbankOptions: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/feature-fbank.h#L41
+.. _kaldifeat.MfccOptions: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/feature-mfcc.h#L22
+.. _kaldi::MfccOptions: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/feature-mfcc.h#L38
+.. _kaldifeat.PlpOptions: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/feature-plp.h#L24
+.. _kaldi::PlpOptions: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/feature-plp.h#L42
+.. _kaldifeat.SpectrogramOptions: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/feature-spectrogram.h#L18
+.. _kaldi::SpectrogramOptions: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/feature-spectrogram.h#L38
+.. _kaldifeat.FrameExtractionOptions: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/feature-window.h#L30
+.. _kaldi::FrameExtractionOptions: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/feature-window.h#L35
+.. _kaldifeat.MelBanksOptions: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/mel-computations.h#L17
+.. _kaldi::MelBanksOptions: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/mel-computations.h#L43
+"""
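Sphinx appends `rst_epilog` to every source file it reads, which is why the `.rst` files in this commit can write `` `kaldifeat`_ `` or `` `Kaldi`_ `` without defining the link targets locally. As a rough sketch of one possible way to keep such a list maintainable, the targets could be generated from a dictionary. `LINK_TARGETS` and `make_rst_epilog` are hypothetical names, not part of the commit, and only three of the simple targets are shown.

```python
# Hypothetical refactor of the rst_epilog block; the names here are
# illustrative and the URLs are copied from the hunk above.
LINK_TARGETS = {
    "kaldifeat": "https://github.com/csukuangfj/kaldifeat",
    "Kaldi": "https://github.com/kaldi-asr/kaldi",
    "PyTorch": "https://pytorch.org/",
}


def make_rst_epilog(targets: dict) -> str:
    """Render one explicit hyperlink target per (name, url) pair."""
    lines = [f".. _{name}: {url}" for name, url in targets.items()]
    # Leading and trailing newlines keep the epilog separated from the
    # content of each source file it is appended to.
    return "\n" + "\n".join(lines) + "\n"


rst_epilog = make_rst_epilog(LINK_TARGETS)

if __name__ == "__main__":
    print(rst_epilog)
```

Writing the targets out literally, as the commit does, has the advantage that the string can be read and searched directly in `conf.py`.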
@@ -6,19 +6,11 @@
 kaldifeat
 =========
 
-`kaldifeat <https://github.com/csukuangfj/kaldifeat>`_ implements
-feature extraction algorithms **compatible** with kaldi using PyTorch, supporting CUDA
-as well as autograd.
-
-Currently, only fbank features are supported.
-It can produce the same feature output as ``compute-fbank-feats`` (from kaldi)
-when given the same options.
-
-
 
 .. toctree::
    :maxdepth: 2
-   :caption: Contents:
+   :caption: Contents
 
+   intro
    installation
-   usage
+   usage/index
@@ -1,19 +1,38 @@
 Installation
 ============
 
+- |os_types|
+- |python_versions|
+- |pytorch_versions|
+- |cuda_versions|
+
+.. caution::
+
+  `kaldifeat`_ depends on `PyTorch`_. `PyTorch`_ >= 1.5.0 is known to work.
+
+  Please first install `PyTorch`_ before you install `kaldifeat`_.
+
+.. hint::
+
+  To install a CPU version of `kaldifeat`_, please install a CPU version
+  of `PyTorch`_.
+
+  To install a CUDA version of `kaldifeat`_, please install a CUDA version
+  of `PyTorch`_. CUDA >= 10.1 is known to work.
+
 .. _from source:
 
 Install kaldifeat from source
 -----------------------------
 
-You have to install ``cmake`` and ``PyTorch`` first.
+You have to install ``cmake`` and `PyTorch`_ first.
 
 - ``cmake`` 3.11 is known to work. Other CMake versions may also work.
-- ``PyTorch`` 1.8.1 is known to work. Other PyTorch versions may also work.
+- `PyTorch`_ >= 1.5.0 is known to work. Other PyTorch versions may also work.
 - Python >= 3.6
 
 
-The commands to install ``kaldifeat`` from source are:
+The commands to install `kaldifeat`_ from source are:
 
 .. code-block:: bash
 
@@ -21,7 +40,7 @@ The commands to install ``kaldifeat`` from source are:
   cd kaldifeat
   python3 setup.py install
 
-To test that you have installed ``kaldifeat`` successfully, please run:
+To test that you have installed `kaldifeat`_ successfully, please run:
 
 .. code-block:: bash
 
@@ -29,26 +48,120 @@ To test that you have installed ``kaldifeat`` successfully, please run:
 
 It should print the version, e.g., ``1.0``.
 
+.. _from PyPI:
 
 Install kaldifeat from PyPI
 ---------------------------
 
-The pre-built ``kaldifeat`` hosted on PyPI uses PyTorch 1.8.1.
-If you install ``kaldifeat`` using pip, it will replace your locally
-installed PyTorch automatically with PyTorch 1.8.1.
-
-If you don't want this happen, please `Install kaldifeat from source`_.
-
-The command to install ``kaldifeat`` from PyPI is:
+The command to install `kaldifeat`_ from PyPI is:
 
 .. code-block:: bash
 
-  pip install kaldifeat
+  pip install --verbose kaldifeat
 
-To test that you have installed ``kaldifeat`` successfully, please run:
+To test that you have installed `kaldifeat`_ successfully, please run:
 
 .. code-block:: bash
 
   python3 -c "import kaldifeat; print(kaldifeat.__version__)"
 
 It should print the version, e.g., ``1.0``.
+
+Install kaldifeat from conda (Only for Linux)
+---------------------------------------------
+
+.. hint::
+
+  Installation using ``conda`` supports only Linux. For macOS and Windows,
+  please use either :ref:`from source` or :ref:`from PyPI`.
+
+The command to install `kaldifeat` using ``conda`` is
+
+.. code-block:: bash
+
+  conda install -c kaldifeat -c pytorch -c conda-forge kaldifeat python=3.8 cudatoolkit=11.1 pytorch=1.8.1
+
+You can select the supported Python version, CUDA toolkit version and `PyTorch`_ version as you wish.
+
+To install a CPU version of `kaldifeat`, use:
+
+.. code-block:: bash
+
+  conda install -c kaldifeat -c pytorch cpuonly kaldifeat python=3.8 pytorch=1.8.1
+
+.. caution::
+
+  If you encounter issues about missing GLIBC after installing `kaldifeat`_
+  with ``conda``, please consider :ref:`from source` or :ref:`from PyPI`.
+  The reason is that the package was built using Ubuntu 18.04 and your system's
+  GLIBC is older.
+
+
+.. |os_types| image:: ./images/os-green.svg
+  :alt: Supported operating systems
+
+.. |python_versions| image:: ./images/python_ge_3.6-blue.svg
+  :alt: Supported python versions
+
+.. |cuda_versions| image:: ./images/cuda_ge_10.1-orange.svg
+  :alt: Supported cuda versions
+
+.. |pytorch_versions| image:: ./images/pytorch_ge_1.5.0-green.svg
+  :alt: Supported pytorch versions
+
+To test that you have installed `kaldifeat`_ successfully, please run:
+
+.. code-block:: bash
+
+  python3 -c "import kaldifeat; print(kaldifeat.__version__)"
+
+It should print the version, e.g., ``1.0``.
+
+FAQs
+----
+
+How to install a CUDA version of kaldifeat
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+You need to first install a CUDA version of `PyTorch`_ and then install `kaldifeat`_.
+
+.. note::
+
+  You can use a CUDA version of `kaldifeat`_ on machines with no GPUs.
+
+How to install a CPU version of kaldifeat
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+You need to first install a CPU version of `PyTorch`_ and then install `kaldifeat`_.
+
+How to fix `Caffe2: Cannot find cuDNN library`
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block::
+
+  Your installed Caffe2 version uses cuDNN but I cannot find the cuDNN
+  libraries. Please set the proper cuDNN prefixes and / or install cuDNN.
+
+You will have such an error when you want to install a CUDA version of `kaldifeat`_
+by ``pip install kaldifeat`` or from source.
+
+You need to first install cuDNN. Assume you have installed cuDNN to the
+path ``/path/to/cudnn``. You can fix the error by using ``one`` of the following
+commands.
+
+(1) Fix for installation using ``pip install``
+
+.. code-block:: bash
+
+  export KALDIFEAT_CMAKE_ARGS="-DCMAKE_BUILD_TYPE=Release -DCUDNN_LIBRARY_PATH=/path/to/cudnn/lib/libcudnn.so -DCUDNN_INCLUDE_PATH=/path/to/cudnn/include"
+  pip install --verbose kaldifeat
+
+(2) Fix for installation from source
+
+.. code-block:: bash
+
+  mkdir /some/path
+  git clone https://github.com/csukuangfj/kaldifeat.git
+  cd kaldifeat
+  export KALDIFEAT_CMAKE_ARGS="-DCMAKE_BUILD_TYPE=Release -DCUDNN_LIBRARY_PATH=/path/to/cudnn/lib/libcudnn.so -DCUDNN_INCLUDE_PATH=/path/to/cudnn/include"
+  python setup.py install
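Both cuDNN fixes above set the same `KALDIFEAT_CMAKE_ARGS` value and differ only in the install command that follows. The sketch below shows how that value could be assembled from a cuDNN prefix in a build script. The environment variable name and the three CMake flags are taken from the FAQ text. The helper `cudnn_cmake_args` is hypothetical, and the `lib/libcudnn.so` and `include` layout under the prefix is the one the FAQ assumes.

```python
import os
from pathlib import PurePosixPath


def cudnn_cmake_args(cudnn_prefix: str) -> str:
    """Build the KALDIFEAT_CMAKE_ARGS string for a given cuDNN prefix.

    Hypothetical helper; it only reproduces the flags shown in the FAQ.
    """
    prefix = PurePosixPath(cudnn_prefix)
    return " ".join(
        [
            "-DCMAKE_BUILD_TYPE=Release",
            f"-DCUDNN_LIBRARY_PATH={prefix / 'lib' / 'libcudnn.so'}",
            f"-DCUDNN_INCLUDE_PATH={prefix / 'include'}",
        ]
    )


# Copy the current environment and add the variable, so it can be passed
# to a child process (for example via subprocess.run(..., env=env)).
env = dict(os.environ, KALDIFEAT_CMAKE_ARGS=cudnn_cmake_args("/path/to/cudnn"))

if __name__ == "__main__":
    print(env["KALDIFEAT_CMAKE_ARGS"])
```

The resulting `env` would then be handed to whichever install command applies, `pip install --verbose kaldifeat` or `python setup.py install`.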
103
doc/source/intro.rst
Normal file
@@ -0,0 +1,103 @@
+Introduction
+============
+
+`kaldifeat`_ implements
+speech feature extraction algorithms **compatible** with `Kaldi`_ using `PyTorch`_,
+supporting CUDA as well as autograd.
+
+`kaldifeat`_ has the following features:
+
+- Fully compatible with `Kaldi`_
+
+  .. note::
+
+    The underlying C++ code is copied & modified from `Kaldi`_ directly.
+    It is rewritten with `PyTorch` C++ APIs.
+
+- Provide not only ``C++ APIs`` but also ``Python APIs``
+
+  .. note::
+
+    You can access `kaldifeat`_ from ``Python``.
+
+- Support autograd
+- Support ``CUDA`` and ``CPU``
+
+  .. note::
+
+    You can use CUDA for feature extraction.
+
+- Support ``online`` (i.e., ``streaming``) and ``offline`` (i.e., ``non-streaming``)
+  feature extraction
+- Support chunk-based processing
+
+  .. note::
+
+    This is especially useful if you want to process audios of several
+    hours long, which may cause OOM if you send them for computation at once.
+    With chunk-based processing, you can process audios of arbitrary length.
+
+- Support batch processing
+
+  .. note::
+
+    With `kaldifeat`_ you can extract features for a batch of audios
+
+
+.. see https://sublime-and-sphinx-guide.readthedocs.io/en/latest/tables.html
+
+Currently implemented speech features and their counterparts in `Kaldi`_ are
+listed in the following table.
+
+.. list-table:: Supported speech features
+   :widths: 50 50
+   :header-rows: 1
+
+   * - Supported speech features
+     - Counterpart in `Kaldi`_
+   * - `kaldifeat.Fbank`_
+     - `compute-fbank-feats`_
+   * - `kaldifeat.Mfcc`_
+     - `compute-mfcc-feats`_
+   * - `kaldifeat.Plp`_
+     - `compute-plp-feats`_
+   * - `kaldifeat.Spectrogram`_
+     - `compute-spectrogram-feats`_
+   * - `kaldifeat.OnlineFbank`_
+     - `kaldi::OnlineFbank`_
+   * - `kaldifeat.OnlineMfcc`_
+     - `kaldi::OnlineMfcc`_
+   * - `kaldifeat.OnlinePlp`_
+     - `kaldi::OnlinePlp`_
+
+Each feature computer needs an option. The following table lists the options
+for each computer and the corresponding options in `Kaldi`_.
+
+.. hint::
+
+  Note that we reuse the parameter names from `Kaldi`_.
+
+  Also, both online feature computers and offline feature computers share the
+  same option.
+
+.. list-table:: Feature computer options
+   :widths: 50 50
+   :header-rows: 1
+
+   * - Options in `kaldifeat`_
+     - Corresponding options in `Kaldi`_
+   * - `kaldifeat.FbankOptions`_
+     - `kaldi::FbankOptions`_
+   * - `kaldifeat.MfccOptions`_
+     - `kaldi::MfccOptions`_
+   * - `kaldifeat.PlpOptions`_
+     - `kaldi::PlpOptions`_
+   * - `kaldifeat.SpectrogramOptions`_
+     - `kaldi::SpectrogramOptions`_
+   * - `kaldifeat.FrameExtractionOptions`_
+     - `kaldi::FrameExtractionOptions`_
+   * - `kaldifeat.MelBanksOptions`_
+     - `kaldi::MelBanksOptions`_
+
+Read more to learn how to install `kaldifeat`_ and how to use each feature
+computer.
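The introduction says chunk-based processing bounds memory use for very long recordings. This commit does not show how kaldifeat chunks its input, so the sketch below is a generic illustration of the idea, not the library's implementation. It splits a sequence of frames into fixed-size ranges so that each range can be processed on its own and the results concatenated. The helper name `iter_chunks` is hypothetical.

```python
from typing import Iterator, Tuple


def iter_chunks(num_frames: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) frame ranges covering [0, num_frames).

    Hypothetical helper for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, num_frames, chunk_size):
        # The last chunk may be shorter than chunk_size.
        yield start, min(start + chunk_size, num_frames)


if __name__ == "__main__":
    for start, end in iter_chunks(num_frames=10, chunk_size=4):
        print(start, end)
```

Peak memory then scales with `chunk_size` rather than with the length of the recording, at the cost of one extra concatenation at the end.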
@ -1,212 +0,0 @@
|
|||||||
Usage
|
|
||||||
=====
|
|
||||||
|
|
||||||
Let us first see the help message of kaldi's ``compute-fbank-feats``:
|
|
||||||
|
|
||||||
.. code-block:: bash
|
|
||||||
|
|
||||||
$ compute-fbank-feats
|
|
||||||
|
|
||||||
Create Mel-filter bank (FBANK) feature files.
|
|
||||||
Usage: compute-fbank-feats [options...] <wav-rspecifier> <feats-wspecifier>
|
|
||||||
|
|
||||||
Options:
|
|
||||||
--allow-downsample : If true, allow the input waveform to have a higher frequency than the specified --sample-frequency (and we'll downsample). (bool, default = false)
|
|
||||||
--allow-upsample : If true, allow the input waveform to have a lower frequency than the specified --sample-frequency (and we'll upsample). (bool, default = false)
|
|
||||||
--blackman-coeff : Constant coefficient for generalized Blackman window. (float, default = 0.42)
|
|
||||||
--channel : Channel to extract (-1 -> expect mono, 0 -> left, 1 -> right) (int, default = -1)
|
|
||||||
--debug-mel : Print out debugging information for mel bin computation (bool, default = false)
|
|
||||||
--dither : Dithering constant (0.0 means no dither). If you turn this off, you should set the --energy-floor option, e.g. to 1.0 or 0.1 (float, default = 1)
|
|
||||||
--energy-floor : Floor on energy (absolute, not relative) in FBANK computation. Only makes a difference if --use-energy=true; only necessary if --dither=0.0. Suggested values: 0.1 or 1.0 (float, default = 0)
|
|
||||||
--frame-length : Frame length in milliseconds (float, default = 25)
|
|
||||||
--frame-shift : Frame shift in milliseconds (float, default = 10)
|
|
||||||
--high-freq : High cutoff frequency for mel bins (if <= 0, offset from Nyquist) (float, default = 0)
|
|
||||||
--htk-compat : If true, put energy last. Warning: not sufficient to get HTK compatible features (need to change other parameters). (bool, default = false)
|
|
||||||
--low-freq : Low cutoff frequency for mel bins (float, default = 20)
|
|
||||||
--max-feature-vectors : Memory optimization. If larger than 0, periodically remove feature vectors so that only this number of the latest feature vectors is retained. (int, default = -1)
|
|
||||||
--min-duration : Minimum duration of segments to process (in seconds). (float, default = 0)
|
|
||||||
--num-mel-bins : Number of triangular mel-frequency bins (int, default = 23)
|
|
||||||
--output-format : Format of the output files [kaldi, htk] (string, default = "kaldi")
|
|
||||||
--preemphasis-coefficient : Coefficient for use in signal preemphasis (float, default = 0.97)
|
|
||||||
--raw-energy : If true, compute energy before preemphasis and windowing (bool, default = true)
|
|
||||||
--remove-dc-offset : Subtract mean from waveform on each frame (bool, default = true)
|
|
||||||
--round-to-power-of-two : If true, round window size to power of two by zero-padding input to FFT. (bool, default = true)
|
|
||||||
--sample-frequency : Waveform data sample frequency (must match the waveform file, if specified there) (float, default = 16000)
|
|
||||||
--snip-edges : If true, end effects will be handled by outputting only frames that completely fit in the file, and the number of frames depends on the frame-length. If false, the number of frames depends only on the frame-shift, and we reflect the data at the ends. (bool, default = true)
|
|
||||||
--subtract-mean : Subtract mean of each feature file [CMS]; not recommended to do it this way. (bool, default = false)
|
|
||||||
--use-energy : Add an extra dimension with energy to the FBANK output. (bool, default = false)
|
|
||||||
--use-log-fbank : If true, produce log-filterbank, else produce linear. (bool, default = true)
|
|
||||||
--use-power : If true, use power, else use magnitude. (bool, default = true)
|
|
||||||
--utt2spk : Utterance to speaker-id map (if doing VTLN and you have warps per speaker) (string, default = "")
|
|
||||||
--vtln-high : High inflection point in piecewise linear VTLN warping function (if negative, offset from high-mel-freq (float, default = -500)
|
|
||||||
--vtln-low : Low inflection point in piecewise linear VTLN warping function (float, default = 100)
|
|
||||||
--vtln-map : Map from utterance or speaker-id to vtln warp factor (rspecifier) (string, default = "")
|
|
||||||
--vtln-warp : Vtln warp factor (only applicable if vtln-map not specified) (float, default = 1)
|
|
||||||
--window-type : Type of window ("hamming"|"hanning"|"povey"|"rectangular"|"sine"|"blackmann") (string, default = "povey")
|
|
||||||
--write-utt2dur : Wspecifier to write duration of each utterance in seconds, e.g. 'ark,t:utt2dur'. (string, default = "")
|
|
||||||
|
|
||||||
Standard options:
|
|
||||||
--config : Configuration file to read (this option may be repeated) (string, default = "")
|
|
||||||
--help : Print out usage message (bool, default = false)
|
|
||||||
--print-args : Print the command line arguments (to stderr) (bool, default = true)
|
|
||||||
--verbose : Verbose level (higher->more logging) (int, default = 0)
|
|
||||||
|
|
||||||
FbankOptions
|
|
||||||
------------
|
|
||||||
|
|
||||||
``kaldifeat`` reuses the same options from kaldi's ``compute-fbank-feats``.
|
|
||||||
|
|
||||||
The following shows the default values of ``kaldifeat.FbankOptions``:
|
|
||||||
|
|
||||||
.. code-block:: python
|
|
||||||
|
|
||||||
>>> import kaldifeat
|
|
||||||
>>> fbank_opts = kaldifeat.FbankOptions()
|
|
||||||
>>> print(fbank_opts)
|
|
||||||
frame_opts:
|
|
||||||
samp_freq: 16000
|
|
||||||
frame_shift_ms: 10
|
|
||||||
frame_length_ms: 25
|
|
||||||
dither: 1
|
|
||||||
preemph_coeff: 0.97
|
|
||||||
remove_dc_offset: 1
|
|
||||||
window_type: povey
|
|
||||||
round_to_power_of_two: 1
|
|
||||||
blackman_coeff: 0.42
|
|
||||||
snip_edges: 1
|
|
||||||
|
|
||||||
|
|
||||||
mel_opts:
|
|
||||||
num_bins: 23
|
|
||||||
low_freq: 20
|
|
||||||
high_freq: 0
|
|
||||||
vtln_low: 100
|
|
||||||
vtln_high: -500
|
|
||||||
debug_mel: 0
|
|
||||||
htk_mode: 0
|
|
||||||
|
|
||||||
use_energy: 0
|
|
||||||
energy_floor: 0
|
|
||||||
raw_energy: 1
|
|
||||||
htk_compat: 0
|
|
||||||
use_log_fbank: 1
|
|
||||||
use_power: 1
|
|
||||||
device: cpu
|
|
||||||
|
|
||||||
It consists of three parts:
|
|
||||||
|
|
||||||
- ``frame_opts``
|
|
||||||
|
|
||||||
Options in this part are accessed by ``frame_opts.xxx``. For example, to access
|
|
||||||
the sample rate, you use:
|
|
||||||
|
|
||||||
.. code-block:: python
|
|
||||||
|
|
||||||
>>> fbank_opts = kaldifeat.FbankOptions()
|
|
||||||
>>> print(fbank_opts.frame_opts.samp_freq)
|
|
||||||
16000.0
|
|
||||||
|
|
||||||
- ``mel_opts``
|
|
||||||
|
|
||||||
Options in this part are accessed by ``mel_opts.xxx``. For example, to access
|
|
||||||
the number of mel bins, you use:
|
|
||||||
|
|
||||||
.. code-block:: python
|
|
||||||
|
|
||||||
>>> fbank_opts = kaldifeat.FbankOptions()
|
|
||||||
>>> print(fbank_opts.mel_opts.num_bins)
|
|
||||||
23
|
|
||||||
|
|
||||||
- fbank related
|
|
||||||
|
|
||||||
Options in this part are accessed directly. For example, to access the device
|
|
||||||
field, you use:
|
|
||||||
|
|
||||||
.. code-block:: python
|
|
||||||
|
|
||||||
>>> print(fbank_opts.device)
|
|
||||||
cpu
|
|
||||||
>>> fbank_opts.device = 'cuda:0'
|
|
||||||
>>> print(fbank_opts.device)
|
|
||||||
cuda:0
|
|
||||||
>>> import torch
|
|
||||||
>>> fbank_opts.device = torch.device('cuda', 0)
|
|
||||||
>>> print(fbank_opts.device)
|
|
||||||
cuda:0
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
To change the sample rate to 8000, you can use:
|
|
||||||
|
|
||||||
.. code-block:: python
|
|
||||||
|
|
||||||
>>> fbank_opts = kaldifeat.FbankOptions()
|
|
||||||
>>> print(fbank_opts.frame_opts.samp_freq)
|
|
||||||
16000.0
|
|
||||||
>>> fbank_opts.frame_opts.samp_freq = 8000
|
|
||||||
>>> print(fbank_opts.frame_opts.samp_freq)
|
|
||||||
8000.0
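Because ``frame_length_ms`` and ``frame_shift_ms`` are given in milliseconds, changing ``samp_freq`` also changes the number of samples per frame. The following is a minimal sketch in plain Python, not part of ``kaldifeat``. The helper name ``frame_sizes`` is hypothetical, and the sketch assumes that ``round_to_power_of_two`` pads the window to the next power of two, as the ``--round-to-power-of-two`` help text describes:

```python
def frame_sizes(samp_freq, frame_length_ms=25.0, frame_shift_ms=10.0):
    """Return (window size, frame shift, padded window size) in samples.

    Hypothetical helper for illustration only; not a kaldifeat API.
    """
    window = int(samp_freq * frame_length_ms / 1000)
    shift = int(samp_freq * frame_shift_ms / 1000)
    # Assumption: the FFT input is zero-padded to the next power of two.
    padded = 1
    while padded < window:
        padded *= 2
    return window, shift, padded


print(frame_sizes(16000))  # (400, 160, 512)
print(frame_sizes(8000))   # (200, 80, 256)
```

With the default 25 ms window, halving the sample rate halves the window, the shift, and the padded FFT size.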
|
|
||||||
|
|
||||||
To change ``snip_edges`` to ``False``, you can use:
|
|
||||||
|
|
||||||
.. code-block:: python
|
|
||||||
|
|
||||||
>>> fbank_opts.frame_opts.snip_edges = False
|
|
||||||
>>> print(fbank_opts.frame_opts.snip_edges)
|
|
||||||
False
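The effect of ``snip_edges`` on the number of output frames can be sketched in plain Python. This is an illustration under the rule stated in the ``--snip-edges`` help text (only frames that fit completely when true; a count that depends only on the frame shift when false). The function ``num_frames`` is a hypothetical helper, not a ``kaldifeat`` API, and the rounding used for the ``snip_edges=False`` case is an assumption about how Kaldi counts frames:

```python
def num_frames(num_samples, samp_freq=16000, frame_length_ms=25.0,
               frame_shift_ms=10.0, snip_edges=True):
    """Estimate the number of frames for a waveform of num_samples samples."""
    frame_length = int(samp_freq * frame_length_ms / 1000)
    frame_shift = int(samp_freq * frame_shift_ms / 1000)
    if snip_edges:
        # Only frames that fit completely inside the waveform are kept.
        if num_samples < frame_length:
            return 0
        return 1 + (num_samples - frame_length) // frame_shift
    # Assumption: the count depends only on the frame shift, rounded to nearest.
    return (num_samples + frame_shift // 2) // frame_shift


# 19200 samples correspond to 1.2 seconds at 16 kHz.
print(num_frames(19200, snip_edges=True))   # 118
print(num_frames(19200, snip_edges=False))  # 120
```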
|
|
||||||
|
|
||||||
To change the number of mel bins to 80, you can use:
|
|
||||||
|
|
||||||
.. code-block:: python
|
|
||||||
|
|
||||||
>>> print(fbank_opts.mel_opts.num_bins)
|
|
||||||
23
|
|
||||||
>>> fbank_opts.mel_opts.num_bins = 80
|
|
||||||
>>> print(fbank_opts.mel_opts.num_bins)
|
|
||||||
80
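To get a feel for what ``num_bins`` controls, the sketch below places the centers of triangular mel bins between ``low_freq`` and ``high_freq``. It is plain Python and not part of ``kaldifeat``. The mel formula ``1127 * ln(1 + f / 700)`` and the even spacing of bins on the mel scale are assumptions about how Kaldi lays out its filter bank, and all function names are hypothetical:

```python
import math


def mel(freq_hz):
    """Convert Hz to mel (assumed formula: 1127 * ln(1 + f / 700))."""
    return 1127.0 * math.log(1.0 + freq_hz / 700.0)


def inverse_mel(mel_value):
    """Convert mel back to Hz (inverse of the formula above)."""
    return 700.0 * (math.exp(mel_value / 1127.0) - 1.0)


def mel_bin_centers(num_bins=23, low_freq=20.0, high_freq=0.0,
                    samp_freq=16000.0):
    """Return the center frequency in Hz of each triangular mel bin."""
    nyquist = samp_freq / 2
    if high_freq <= 0:
        # A non-positive high_freq is an offset from the Nyquist frequency.
        high_freq += nyquist
    mel_low, mel_high = mel(low_freq), mel(high_freq)
    # num_bins triangles share num_bins + 1 equal steps on the mel scale.
    delta = (mel_high - mel_low) / (num_bins + 1)
    return [inverse_mel(mel_low + (b + 1) * delta) for b in range(num_bins)]


centers = mel_bin_centers(num_bins=80)
print(len(centers), round(centers[0], 1), round(centers[-1], 1))
```

Raising ``num_bins`` from 23 to 80 keeps the same frequency range but divides it into narrower, more closely spaced bins.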
|
|
||||||
|
|
||||||
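To change the device to ``cuda``, assign a string or a ``torch.device`` to ``fbank_opts.device``, e.g. ``fbank_opts.device = 'cuda:0'``, as in the ``device`` example above.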
|
|
||||||
|
|
||||||
|
|
||||||
Fbank
|
|
||||||
-----
|
|
||||||
|
|
||||||
The following shows how to use ``kaldifeat.Fbank`` to compute
|
|
||||||
the fbank features of sound files.
|
|
||||||
|
|
||||||
First, let us generate two sound files using ``sox``:
|
|
||||||
|
|
||||||
.. code-block:: bash
|
|
||||||
|
|
||||||
# generate a wav of 1.2 seconds, containing a sine-wave
|
|
||||||
# swept from 300 Hz to 3300 Hz
|
|
||||||
sox -n -r 16000 -b 16 test.wav synth 1.2 sine 300-3300
|
|
||||||
|
|
||||||
# another sound file of 0.5 seconds
|
|
||||||
sox -n -r 16000 -b 16 test2.wav synth 0.5 sine 300-3300
|
|
||||||
|
|
||||||
.. hint::
|
|
||||||
|
|
||||||
You can find the above two files by visiting the following two links:
|
|
||||||
|
|
||||||
- `test.wav <https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/tests/test_data/test.wav>`_
|
|
||||||
- `test2.wav <https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/tests/test_data/test2.wav>`_
|
|
||||||
|
|
||||||
The `following code <https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/tests/test_fbank.py>`_
|
|
||||||
shows the usage of ``kaldifeat.Fbank``.
|
|
||||||
|
|
||||||
It shows:
|
|
||||||
|
|
||||||
- How to read a sound file. Note that audio samples are scaled to the 16-bit integer range [-32768, 32767].
|
|
||||||
The intention is to produce the same output as Kaldi. You don't need to scale them if
|
|
||||||
you don't care about compatibility with Kaldi.
|
|
||||||
|
|
||||||
- ``kaldifeat.Fbank`` supports CUDA as well as CPU
|
|
||||||
|
|
||||||
- ``kaldifeat.Fbank`` supports processing sound files in a batch as well as accepting
|
|
||||||
a single sound file
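The scaling mentioned in the first item can be sketched as follows. Many audio readers return floating-point samples normalized to roughly [-1, 1], whereas Kaldi works on 16-bit integer sample values, so the samples are multiplied by 32768. The function names below are hypothetical and the code uses plain Python lists rather than tensors:

```python
import math


def normalized_sine(num_samples, samp_freq=16000, freq_hz=300.0):
    """Return a sine wave as floats in [-1, 1], as many audio readers do."""
    return [math.sin(2 * math.pi * freq_hz * n / samp_freq)
            for n in range(num_samples)]


def scale_like_kaldi(samples):
    """Scale normalized samples to the 16-bit integer range used by Kaldi."""
    return [s * 32768 for s in samples]


wave = normalized_sine(num_samples=400)
scaled = scale_like_kaldi(wave)
print(max(abs(s) for s in wave) <= 1.0)        # True
print(max(abs(s) for s in scaled) <= 32768.0)  # True
```

If compatibility with Kaldi does not matter, this step can be skipped and the normalized samples used directly.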
|
|
||||||
|
|
||||||
|
|
||||||
.. literalinclude:: ./code/test_fbank.py
|
|
||||||
:caption: Demo of ``kaldifeat.Fbank``
|
|
||||||
:language: python
|
|
46
doc/source/usage/code/compute-fbank-feats-help.txt
Normal file
@ -0,0 +1,46 @@
|
|||||||
|
compute-fbank-feats
|
||||||
|
|
||||||
|
Create Mel-filter bank (FBANK) feature files.
|
||||||
|
Usage: compute-fbank-feats [options...] <wav-rspecifier> <feats-wspecifier>
|
||||||
|
|
||||||
|
Options:
|
||||||
|
--allow-downsample : If true, allow the input waveform to have a higher frequency than the specified --sample-frequency (and we'll downsample). (bool, default = false)
|
||||||
|
--allow-upsample : If true, allow the input waveform to have a lower frequency than the specified --sample-frequency (and we'll upsample). (bool, default = false)
|
||||||
|
--blackman-coeff : Constant coefficient for generalized Blackman window. (float, default = 0.42)
|
||||||
|
--channel : Channel to extract (-1 -> expect mono, 0 -> left, 1 -> right) (int, default = -1)
|
||||||
|
--debug-mel : Print out debugging information for mel bin computation (bool, default = false)
|
||||||
|
--dither : Dithering constant (0.0 means no dither). If you turn this off, you should set the --energy-floor option, e.g. to 1.0 or 0.1 (float, default = 1)
|
||||||
|
--energy-floor : Floor on energy (absolute, not relative) in FBANK computation. Only makes a difference if --use-energy=true; only necessary if --dither=0.0. Suggested values: 0.1 or 1.0 (float, default = 0)
|
||||||
|
--frame-length : Frame length in milliseconds (float, default = 25)
|
||||||
|
--frame-shift : Frame shift in milliseconds (float, default = 10)
|
||||||
|
--high-freq : High cutoff frequency for mel bins (if <= 0, offset from Nyquist) (float, default = 0)
|
||||||
|
--htk-compat : If true, put energy last. Warning: not sufficient to get HTK compatible features (need to change other parameters). (bool, default = false)
|
||||||
|
--low-freq : Low cutoff frequency for mel bins (float, default = 20)
|
||||||
|
--max-feature-vectors : Memory optimization. If larger than 0, periodically remove feature vectors so that only this number of the latest feature vectors is retained. (int, default = -1)
|
||||||
|
--min-duration : Minimum duration of segments to process (in seconds). (float, default = 0)
|
||||||
|
--num-mel-bins : Number of triangular mel-frequency bins (int, default = 23)
|
||||||
|
--output-format : Format of the output files [kaldi, htk] (string, default = "kaldi")
|
||||||
|
--preemphasis-coefficient : Coefficient for use in signal preemphasis (float, default = 0.97)
|
||||||
|
--raw-energy : If true, compute energy before preemphasis and windowing (bool, default = true)
|
||||||
|
--remove-dc-offset : Subtract mean from waveform on each frame (bool, default = true)
|
||||||
|
--round-to-power-of-two : If true, round window size to power of two by zero-padding input to FFT. (bool, default = true)
|
||||||
|
--sample-frequency : Waveform data sample frequency (must match the waveform file, if specified there) (float, default = 16000)
|
||||||
|
--snip-edges : If true, end effects will be handled by outputting only frames that completely fit in the file, and the number of frames depends on the frame-length. If false, the number of frames depends only on the frame-shift, and we reflect the data at the ends. (bool, default = true)
|
||||||
|
--subtract-mean : Subtract mean of each feature file [CMS]; not recommended to do it this way. (bool, default = false)
|
||||||
|
--use-energy : Add an extra dimension with energy to the FBANK output. (bool, default = false)
|
||||||
|
--use-log-fbank : If true, produce log-filterbank, else produce linear. (bool, default = true)
|
||||||
|
--use-power : If true, use power, else use magnitude. (bool, default = true)
|
||||||
|
--utt2spk : Utterance to speaker-id map (if doing VTLN and you have warps per speaker) (string, default = "")
|
||||||
|
--vtln-high : High inflection point in piecewise linear VTLN warping function (if negative, offset from high-mel-freq (float, default = -500)
|
||||||
|
--vtln-low : Low inflection point in piecewise linear VTLN warping function (float, default = 100)
|
||||||
|
--vtln-map : Map from utterance or speaker-id to vtln warp factor (rspecifier) (string, default = "")
|
||||||
|
--vtln-warp : Vtln warp factor (only applicable if vtln-map not specified) (float, default = 1)
|
||||||
|
--window-type : Type of window ("hamming"|"hanning"|"povey"|"rectangular"|"sine"|"blackmann") (string, default = "povey")
|
||||||
|
--write-utt2dur : Wspecifier to write duration of each utterance in seconds, e.g. 'ark,t:utt2dur'. (string, default = "")
|
||||||
|
|
||||||
|
Standard options:
|
||||||
|
--config : Configuration file to read (this option may be repeated) (string, default = "")
|
||||||
|
--help : Print out usage message (bool, default = false)
|
||||||
|
--print-args : Print the command line arguments (to stderr) (bool, default = true)
|
||||||
|
--verbose : Verbose level (higher->more logging) (int, default = 0)
|
||||||
|
|
65
doc/source/usage/code/fbank_options-1.txt
Normal file
@ -0,0 +1,65 @@
|
|||||||
|
$ python3
|
||||||
|
Python 3.8.0 (default, Oct 28 2019, 16:14:01)
|
||||||
|
[GCC 8.3.0] on linux
|
||||||
|
Type "help", "copyright", "credits" or "license" for more information.
|
||||||
|
>>> import kaldifeat
|
||||||
|
>>> opts = kaldifeat.FbankOptions()
|
||||||
|
>>> print(opts)
|
||||||
|
frame_opts:
|
||||||
|
samp_freq: 16000
|
||||||
|
frame_shift_ms: 10
|
||||||
|
frame_length_ms: 25
|
||||||
|
dither: 1
|
||||||
|
preemph_coeff: 0.97
|
||||||
|
remove_dc_offset: 1
|
||||||
|
window_type: povey
|
||||||
|
round_to_power_of_two: 1
|
||||||
|
blackman_coeff: 0.42
|
||||||
|
snip_edges: 1
|
||||||
|
max_feature_vectors: -1
|
||||||
|
|
||||||
|
|
||||||
|
mel_opts:
|
||||||
|
num_bins: 23
|
||||||
|
low_freq: 20
|
||||||
|
high_freq: 0
|
||||||
|
vtln_low: 100
|
||||||
|
vtln_high: -500
|
||||||
|
debug_mel: 0
|
||||||
|
htk_mode: 0
|
||||||
|
|
||||||
|
use_energy: 0
|
||||||
|
energy_floor: 0
|
||||||
|
raw_energy: 1
|
||||||
|
htk_compat: 0
|
||||||
|
use_log_fbank: 1
|
||||||
|
use_power: 1
|
||||||
|
device: cpu
|
||||||
|
|
||||||
|
>>> print(opts.dither)
|
||||||
|
Traceback (most recent call last):
|
||||||
|
File "<stdin>", line 1, in <module>
|
||||||
|
AttributeError: '_kaldifeat.FbankOptions' object has no attribute 'dither'
|
||||||
|
>>>
|
||||||
|
>>> print(opts.frame_opts.dither)
|
||||||
|
1.0
|
||||||
|
>>> opts.frame_opts.dither = 0 # disable dither
|
||||||
|
>>> print(opts.frame_opts.dither)
|
||||||
|
0.0
|
||||||
|
>>> import torch
|
||||||
|
>>> print(opts.device)
|
||||||
|
cpu
|
||||||
|
>>> opts.device = 'cuda:0'
|
||||||
|
>>> print(opts.device)
|
||||||
|
cuda:0
|
||||||
|
>>> opts.device = torch.device('cuda', 1)
|
||||||
|
>>> print(opts.device)
|
||||||
|
cuda:1
|
||||||
|
>>> opts.device = 'cpu'
|
||||||
|
>>> print(opts.device)
|
||||||
|
cpu
|
||||||
|
>>> print(opts.mel_opts.num_bins)
|
||||||
|
23
|
||||||
|
>>> opts.mel_opts.num_bins = 80
|
||||||
|
>>> print(opts.mel_opts.num_bins)
|
||||||
|
80
|
1
doc/source/usage/code/test_fbank_options.py
Symbolic link
@ -0,0 +1 @@
|
|||||||
|
../../../../kaldifeat/python/tests/test_fbank_options.py
|
3
doc/source/usage/fbank.rst
Normal file
@ -0,0 +1,3 @@
|
|||||||
|
kaldifeat.Fbank
|
||||||
|
===============
|
||||||
|
|
52
doc/source/usage/fbank_options.rst
Normal file
@ -0,0 +1,52 @@
|
|||||||
|
kaldifeat.FbankOptions
|
||||||
|
======================
|
||||||
|
|
||||||
|
If you want to construct an instance of `kaldifeat.Fbank`_ or
|
||||||
|
`kaldifeat.OnlineFbank`_, you have to provide an instance of
|
||||||
|
`kaldifeat.FbankOptions`_.
|
||||||
|
|
||||||
|
The following code shows how to construct an instance of `kaldifeat.FbankOptions`_.
|
||||||
|
|
||||||
|
.. literalinclude:: ./code/fbank_options-1.txt
|
||||||
|
:caption: Usage of `kaldifeat.FbankOptions`_
|
||||||
|
:emphasize-lines: 6,8,22,37
|
||||||
|
:language: python
|
||||||
|
|
||||||
|
Note that we reuse the option names of `compute-fbank-feats`_ from `Kaldi`_:
|
||||||
|
|
||||||
|
.. code-block:: bash
|
||||||
|
|
||||||
|
$ compute-fbank-feats --help
|
||||||
|
|
||||||
|
|
||||||
|
.. literalinclude:: ./code/compute-fbank-feats-help.txt
|
||||||
|
:caption: Output of ``compute-fbank-feats --help``
|
||||||
|
|
||||||
|
Please refer to the output of ``compute-fbank-feats --help`` for the meaning
|
||||||
|
of each field of `kaldifeat.FbankOptions`_.
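As an example of reading the help text, two of the default values are offsets rather than absolute frequencies: ``high_freq <= 0`` is an offset from the Nyquist frequency, and a negative ``vtln_high`` is an offset from the high mel cutoff. The sketch below resolves them to absolute values in Hz. The helper ``resolve_cutoffs`` is hypothetical and not part of ``kaldifeat``:

```python
def resolve_cutoffs(samp_freq=16000.0, high_freq=0.0, vtln_high=-500.0):
    """Return (high_freq, vtln_high) as absolute frequencies in Hz."""
    nyquist = samp_freq / 2
    if high_freq <= 0:
        # "if <= 0, offset from Nyquist"
        high_freq += nyquist
    if vtln_high < 0:
        # "if negative, offset from high-mel-freq"
        vtln_high += high_freq
    return high_freq, vtln_high


print(resolve_cutoffs())        # (8000.0, 7500.0)
print(resolve_cutoffs(8000.0))  # (4000.0, 3500.0)
```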
|
||||||
|
|
||||||
|
One thing worth noting is that `kaldifeat.FbankOptions`_ has a field ``device``,
|
||||||
|
which is an instance of ``torch.device``. You can assign it either a string, e.g.,
|
||||||
|
``"cpu"`` or ``"cuda:0"``, or an instance of ``torch.device``, e.g., ``torch.device("cpu")`` or
|
||||||
|
``torch.device("cuda", 1)``.
|
||||||
|
|
||||||
|
.. hint::
|
||||||
|
|
||||||
|
You can use this field to control whether the feature computer
|
||||||
|
constructed from it performs computation on CPU or CUDA.
|
||||||
|
|
||||||
|
.. caution::
|
||||||
|
|
||||||
|
If you use a CUDA device, make sure that you have installed a CUDA version
|
||||||
|
of `PyTorch`_.
|
||||||
|
|
||||||
|
Example usage
|
||||||
|
-------------
|
||||||
|
|
||||||
|
The following code from
|
||||||
|
`<https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/tests/test_fbank_options.py>`_
|
||||||
|
demonstrates the usage of `kaldifeat.FbankOptions`_:
|
||||||
|
|
||||||
|
.. literalinclude:: ./code/test_fbank_options.py
|
||||||
|
:caption: Example usage of `kaldifeat.FbankOptions`_
|
||||||
|
:language: python
|
11
doc/source/usage/index.rst
Normal file
@ -0,0 +1,11 @@
|
|||||||
|
Usage
|
||||||
|
=====
|
||||||
|
|
||||||
|
This section describes how to use feature computers in `kaldifeat`_.
|
||||||
|
|
||||||
|
.. toctree::
|
||||||
|
:maxdepth: 2
|
||||||
|
|
||||||
|
fbank_options
|
||||||
|
fbank
|
||||||
|
online_fbank
|
3
doc/source/usage/online_fbank.rst
Normal file
@ -0,0 +1,3 @@
|
|||||||
|
kaldifeat.OnlineFbank
|
||||||
|
=====================
|
||||||
|
|