Compare commits

...

95 Commits

Author SHA1 Message Date
Fangjun Kuang
9165e73a3b
Support torch 2.8.0 (#120) 2025-08-07 21:09:15 +08:00
Fangjun Kuang
8cda08e0c5
support torch 2.7.1 (#119) 2025-06-05 11:30:38 +08:00
Fangjun Kuang
6edd6a8370
support torch 2.7.0 (#118) 2025-06-05 10:47:36 +08:00
Fangjun Kuang
0ecdee6e88
Support Linux arm64 (#114) 2025-03-07 14:22:18 +08:00
Fangjun Kuang
3f79fbbd6d
Support torch 2.6.0 (#112) 2025-01-30 08:45:49 +08:00
Fangjun Kuang
c1aefb643e
Support numpy 2.0 and torch 2.5.1 (#110) 2024-10-30 12:45:45 +08:00
Fangjun Kuang
e485c5749b
Support torch 2.5.0 (#109) 2024-10-18 11:04:11 +08:00
Fangjun Kuang
e05540cfad
Allow all zero rows in mel computation matrix. (#107) 2024-09-14 12:02:20 +08:00
Fangjun Kuang
5f9602afab
support for torch 2.4.1 (#106) 2024-09-07 17:32:29 +08:00
Fangjun Kuang
a39a011de9
Update readme for how to install kaldifeat (#105) 2024-08-09 14:55:47 +08:00
Karel Vesely
eeefc87660
switch C++14 -> C++17, so it becomes compatible with Pytorch 2.4 (#103) 2024-08-01 06:45:02 +08:00
Fangjun Kuang
0355ed8ced
support torch2.4.0 (#102) 2024-07-25 15:17:44 +08:00
Fangjun Kuang
f36767ed1d
support torch 2.3.1 (#100) 2024-06-06 11:00:41 +08:00
Fangjun Kuang
40cc0a4a2c
Support torch 2.3.0 (#99) 2024-04-25 21:28:17 +08:00
Fangjun Kuang
843763fd05
support torch 2.2.2 (#98) 2024-04-25 20:49:43 +08:00
Fangjun Kuang
5dee672556
support torch 2.2.1 (#96) 2024-02-23 13:48:33 +08:00
Fangjun Kuang
2a8a993f15
Fix building wheels for torch 2.2.0. (#95)
See also https://github.com/pytorch/pytorch/issues/120020
2024-02-18 20:30:18 +08:00
Fangjun Kuang
2e042b356e
Support torch 2.2.0 (#94) 2024-02-10 21:27:19 +08:00
Fangjun Kuang
b75abef4c0
update install doc (#90) 2024-01-14 14:28:44 +08:00
Fangjun Kuang
4fd12b48b8
Support torch 2.1.2 (#89) 2023-12-21 21:43:52 +08:00
Dadoou
f834ad861d
Update whisper-fbank.h: correct Dim() function (#88) 2023-12-19 15:55:20 +08:00
Fangjun Kuang
352aa43ce3
support torch 2.1.1 (#85) 2023-11-26 17:04:52 +08:00
Fangjun Kuang
c1adbda9f0
Update README.md 2023-11-09 12:48:57 +08:00
Fangjun Kuang
2624da8275
support whisper v3 (#84) 2023-11-09 12:45:56 +08:00
Fangjun Kuang
20379449fc
Update readme to include whisper (#83) 2023-11-08 19:36:22 +08:00
Fangjun Kuang
01aed93b1b
Support computing features for whisper (#82) 2023-11-08 19:22:32 +08:00
Fangjun Kuang
7912c2f442
Fix building wheels for torch 2.1.x (#81) 2023-11-03 10:16:32 +08:00
Fangjun Kuang
46ab855aa5
Release v1.25.1 (#79) 2023-10-13 10:16:37 +08:00
Fangjun Kuang
b30f385898 fix test errors 2023-07-26 17:04:56 +08:00
Fangjun Kuang
9fae1afa65 minor fixes 2023-07-26 17:02:00 +08:00
Fangjun Kuang
5a5895d20b more fixes 2023-07-26 16:59:15 +08:00
Fangjun Kuang
6f5ab0ee14 fix a typo in CI 2023-07-26 16:57:09 +08:00
Fangjun Kuang
dde905597c add tests for pre-compiled wheels 2023-07-26 16:55:13 +08:00
Fangjun Kuang
3c78253a56 remove htmls from doc 2023-07-26 16:41:32 +08:00
Fangjun Kuang
64b6e7fcc5 fix doc building 2023-07-26 16:33:06 +08:00
Fangjun Kuang
114ea348a7 fix building doc 2023-07-26 16:31:35 +08:00
Fangjun Kuang
af6e61a5a3
Release v1.25 (#76) 2023-07-26 12:38:22 +08:00
Fangjun Kuang
b65c799b4e
fix typos (#75) 2023-07-25 11:50:14 +08:00
Fangjun Kuang
6c60c618e6
Build more wheels (#74) 2023-07-24 16:43:52 +08:00
Fangjun Kuang
1868136b40
Fix ci (#73) 2023-07-24 09:46:44 +08:00
Fangjun Kuang
6be131252a
Update installation doc (#72) 2023-07-24 09:22:08 +08:00
trunglebka
17861dce6b
Add missing header due to gcc/g++ 13 changes (#71) 2023-06-15 15:47:37 +08:00
Fangjun Kuang
9b79da9128
release v1.24 (#69) 2023-05-04 14:40:32 +08:00
Fangjun Kuang
bf2047023f
Add alternative URLs for cmake deps (#67)
* Add alternative URLs for cmake deps

* add missing files
2023-03-09 15:05:27 +08:00
Fangjun Kuang
9fdc43df23 add a link to https://github.com/csukuangfj/kaldi-native-fbank 2023-01-06 13:38:51 +08:00
trunglebka
749eff994f
Prevent cmake from trying to link with system MKL (#64)
Co-authored-by: Trung Le <trungle@x3english.com>
2022-12-16 17:07:22 +08:00
Fangjun Kuang
94a567c638
Change how options are displayed (#62)
* Change the way how FrameExtractionOptions is displayed

* small fixes

* Change the way how MelBankOptions is displayed

* Change the way how SpectrogramOptions is displayed

* Change the way how FbankOptions is displayed

* Change the way how MfccOptions is displayed

* Change the way how PlpOptions is displayed

* Export num_frames

* release v1.22
2022-12-03 13:37:55 +08:00
Fangjun Kuang
8ee0c34d3a
Test torch 1.13.0 (#61) 2022-11-30 17:16:10 +08:00
Fangjun Kuang
5c7566f2e2 Release v1.21 2022-11-30 13:35:53 +08:00
veelion
a1f252c981
changes for using C++ API by other projects (#60)
Co-authored-by: veelion <veelion@gmail>
2022-11-30 12:59:38 +08:00
Fangjun Kuang
c38f0252d5
Use -j4 by default for make (#58) 2022-10-28 11:54:55 +08:00
Fangjun Kuang
4b1411b04f minor fixes 2022-08-24 14:35:26 +08:00
Fangjun Kuang
428571a274 Fix typos 2022-08-20 14:31:30 +08:00
Fangjun Kuang
3692f9a54c
Release v1.19 (#55) 2022-08-20 11:05:57 +08:00
Fangjun Kuang
06b5574bcb
Support building conda packages. (#54)
* Build CPU version for linux

* Build conda package for macos

* Support windows
2022-08-20 11:03:54 +08:00
Fangjun Kuang
2bd09f4b18
Release v1.18 (#48) 2022-08-01 15:47:41 +08:00
Fangjun Kuang
01f30d2e34
Support "make install" (#47)
* Support "make install"

* fix a typo
2022-08-01 15:14:43 +08:00
Fangjun Kuang
e78f081327
Print TORCH_DIR for debug. (#44)
* Print TORCH_DIR for debug.

* fix style issues

* test torch 1.12.0
2022-07-20 10:08:47 +08:00
Fangjun Kuang
e15b2ae6f5
Release v1.17 (#41) 2022-07-04 16:30:10 +08:00
Wei Kang
cc1cd67c3e
Fix android build (#40) 2022-07-04 16:28:34 +08:00
Fangjun Kuang
c6e643951f
Add documentation (#39)
* Add documentation.

* Build doc in GitHub actions.

* Fix a typo.

* Minor fixes.
2022-07-03 15:17:38 +08:00
Fangjun Kuang
ae17334ce2
Release v1.16. (#38)
* Release v1.16.

* Reduce number of test runs.
2022-06-04 15:00:09 +08:00
Fangjun Kuang
48d96b9250
Release GIL for online feature extraction. (#37) 2022-06-04 12:14:35 +08:00
Fangjun Kuang
1a43b6117d
Support windows (#36)
* Add nightly tests.

* Fix CI tests.

* Install numpy in CI.

* Add nightly ubuntu cpu tests.

* add nightly tests for macos

* Fix CI for macOS.

* Support building CPU conda packages for macOS and Ubuntu.

* Disable building conda packages for macOS.

* Add tests for conda packages.

* Revert "Disable building conda packages for macOS."

This reverts commit 9c3f6ebe98b390cfd13314686ab6a518a7ad3482.

* Disable building conda packages for macOS.

* Add tests for conda packages.

* Enable all tests.

* Run tests automatically.

* Add windows cuda tests.

* Fix errors.

* Minor fixes.

* Download cudnn for windows.

* Fix installing cuDNN for windows.

* Minor fixes.

* Fix using cuDNN.

* Typo fixes.

* Update readme.

* Minor fixes.

* small fixes.
2022-06-04 12:02:51 +08:00
Fangjun Kuang
0c0e24298c Minor release. 2022-06-01 23:38:20 +08:00
Fangjun Kuang
46535d5e6d Fix installation on macOS. 2022-06-01 23:37:33 +08:00
Fangjun Kuang
88627631b6 Minor release. 2022-06-01 23:01:59 +08:00
Fangjun Kuang
febd091b24 Fix release. 2022-06-01 21:42:59 +08:00
Fangjun Kuang
a2d21601bc Minor release. 2022-06-01 21:40:03 +08:00
Fangjun Kuang
800a6ff1d9
Merge pull request #35 from csukuangfj/fix-conda-build
Fix conda build
2022-06-01 19:18:47 +08:00
Fangjun Kuang
32f4ba6c8e Minor fixes. 2022-06-01 19:18:01 +08:00
Fangjun Kuang
25b70c9cd5 Fix conda build. 2022-06-01 19:06:57 +08:00
Fangjun Kuang
fa6404ad19 Fix setup.py 2022-06-01 18:57:03 +08:00
Fangjun Kuang
0348fda69f
Merge pull request #34 from csukuangfj/fix-windows
Fix  build on windows
2022-06-01 18:34:33 +08:00
Fangjun Kuang
9f8eaa6cd4 Fix building on windows. 2022-06-01 18:32:44 +08:00
Fangjun Kuang
d1a473a069 Fix CI tests and compiler warnings 2022-06-01 18:21:30 +08:00
Fangjun Kuang
6e41bb2039 Minor fixes. 2022-06-01 18:04:39 +08:00
Fangjun Kuang
ca273828f1 Update scripts to install cuda/cudnn/torch. 2022-06-01 18:01:31 +08:00
Fangjun Kuang
3e2979ec45 Add CI for windows. 2022-06-01 18:01:30 +08:00
Fangjun Kuang
41bd60a538
Merge pull request #33 from csukuangfj/check-torch
Check the runtime version of PyTorch before importing kaldifeat.
2022-06-01 18:01:11 +08:00
Fangjun Kuang
da74f96e4d Check the runtime version of PyTorch before importing kaldifeat. 2022-06-01 17:44:46 +08:00
Fangjun Kuang
fc78bc2be7 Release v1.15. 2022-05-23 00:56:05 +08:00
Fangjun Kuang
2f65906554
Merge pull request #31 from csukuangfj/release-GIL
Release GIL for offline feature extractors.
2022-05-23 00:54:04 +08:00
Fangjun Kuang
78454c23f3 Release GIL for offline feature extractors. 2022-05-22 16:33:01 +08:00
Fangjun Kuang
13f2917e69 Release v1.14. 2022-04-08 16:43:14 +08:00
Fangjun Kuang
1efbb6dbf3 Export OnlineFeature and OfflineFeature. 2022-04-08 16:42:16 +08:00
Fangjun Kuang
1313e9dc61
Merge pull request #30 from csukuangfj/fix-windows
Fix building on windows
2022-04-08 16:41:10 +08:00
Fangjun Kuang
2ea9ff75b4 minor fixes. 2022-04-03 17:09:14 +08:00
Fangjun Kuang
d4f2a89b50 Minor fixes. 2022-04-03 15:56:59 +08:00
Fangjun Kuang
49039c074a Add CI for windows. 2022-04-03 15:38:09 +08:00
Fangjun Kuang
08dd6c52de Minor fixes. 2022-04-03 15:29:22 +08:00
Fangjun Kuang
2f9fc99e6b Fix building on windows 2022-04-03 14:56:42 +08:00
Fangjun Kuang
07d11476c1 Fix building on windows 2022-04-03 14:50:24 +08:00
Fangjun Kuang
00e550c11b Disable build shared libs on Windows. 2022-04-03 14:36:24 +08:00
Fangjun Kuang
959da88b6e Fix building on Windows. 2022-04-03 14:26:25 +08:00
110 changed files with 10603 additions and 919 deletions
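
Several of the commits above touch the Python feature-extraction API (whisper features in #82, all-zero mel rows in #107, GIL release in #31 and #37, torch 2.8.0 support in #120). As a hedged illustration only — class and option names follow the kaldifeat README and may differ between releases — a minimal offline fbank computation looks roughly like this:

# Minimal sketch of kaldifeat's offline fbank API (not taken from this diff);
# option names follow the kaldifeat README and may vary between versions.
import torch
import kaldifeat

opts = kaldifeat.FbankOptions()
opts.frame_opts.samp_freq = 16000   # sampling rate of the input waveform
opts.mel_opts.num_bins = 80         # number of mel filterbank bins
opts.device = torch.device("cpu")   # "cuda:0" also works with a CUDA build

fbank = kaldifeat.Fbank(opts)

# kaldifeat expects a 1-D float32 tensor of samples, normalized to [-1, 1].
wave = torch.rand(16000) * 2 - 1    # one second of dummy audio
features = fbank(wave)              # shape: (num_frames, num_bins)
print(features.shape)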

(modified file)

@@ -3,8 +3,10 @@ max-line-length = 80
 exclude =
     .git,
+    doc,
     build,
     build_release,
+    cmake/cmake_extension.py,
     kaldifeat/python/kaldifeat/__init__.py
 ignore =

.github/workflows/build-doc.yml (new file)

@@ -0,0 +1,81 @@
# Copyright 2022 Xiaomi Corp. (author: Fangjun Kuang)
# See ../../LICENSE for clarification regarding multiple authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# refer to https://github.com/actions/starter-workflows/pull/47/files
# You can access it at https://csukuangfj.github.io/kaldifeat
name: Generate doc
on:
push:
branches:
- master
- doc
workflow_dispatch:
jobs:
build-doc:
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest]
python-version: [3.8]
steps:
# refer to https://github.com/actions/checkout
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Setup Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Display Python version
run: python -c "import sys; print(sys.version)"
- name: Update wheels
shell: bash
run: |
export KALDIFEAT_DIR=$PWD
ls -lh $KALDIFEAT_DIR
export GIT_LFS_SKIP_SMUDGE=1
export GIT_CLONE_PROTECTION_ACTIVE=false
git clone https://huggingface.co/csukuangfj/kaldifeat huggingface
cd huggingface
./run.sh
- name: Build doc
shell: bash
run: |
cd doc
git status
python3 -m pip install -r ./requirements.txt
make html
cp source/cpu.html build/html/
cp source/cuda.html build/html/
cp source/cpu-cn.html build/html/
cp source/cuda-cn.html build/html/
touch build/html/.nojekyll
- name: Deploy
uses: peaceiris/actions-gh-pages@v3
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
publish_dir: ./doc/build/html
publish_branch: gh-pages

(deleted file)

@@ -1,163 +0,0 @@
# Copyright 2021 Xiaomi Corp. (author: Fangjun Kuang)
# See ../../LICENSE for clarification regarding multiple authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# refer to https://github.com/actions/starter-workflows/pull/47/files
name: build_conda_cuda
on:
push:
branches:
- conda-cuda
env:
KALDIFEAT_BUILD_TYPE: Release
jobs:
build_conda_cuda:
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [ubuntu-18.04]
# anaconda does not support 3.9 as of 2021.05.08
python-version: [3.6, 3.7, 3.8, 3.9]
# python-version: [3.6, 3.7, 3.8]
cuda: ["10.1", "10.2", "11.0", "11.1"]
# from https://download.pytorch.org/whl/torch_stable.html
#
# PyTorch 1.9.0 supports: 10.2 (default), 11.1
# PyTorch 1.8.1 supports: cuda 10.1, 10.2 (default), 11.1
# PyTorch 1.8.0 supports: cuda 10.1, 10.2 (default), 11.1
# PyTorch 1.7.x supports: cuda 10.1, 10.2 (default), 11.0, 9.2 (not included in this setup)
# PyTorch 1.6.0 supports: cuda 10.1, 10.2 (default), 9.2 (not included in this setup)
# PyTorch 1.5.x supports: cuda 10.1, 10.2 (default), 9.2 (not included in this setup)
#
# PyTorch 1.8.x and 1.7.1 support 3.6, 3.7, 3.8, 3.9
# PyTorch 1.7.0, 1.6.0, and 1.5.x support 3.6, 3.7, 3.8
#
# Other PyTorch versions are not tested
#
# torch: ["1.5.0", "1.5.1", "1.6.0", "1.7.0", "1.7.1", "1.8.0", "1.8.1"]
# 1.5.x is removed because there are compilation errors.
# See
# https://github.com/csukuangfj/k2/runs/2533830771?check_suite_focus=true
# and
# https://github.com/NVIDIA/apex/issues/805
torch: ["1.6.0", "1.7.0", "1.7.1", "1.8.0", "1.8.1", "1.9.0"]
exclude:
# - cuda: "11.0" # exclude 11.0 for [1.5.0, 1.5.1, 1.6.0, 1.8.0, 1.8.1, 1.9.0]
# torch: "1.5.0"
# - cuda: "11.0"
# torch: "1.5.1"
- cuda: "11.0"
torch: "1.6.0"
- cuda: "11.0"
torch: "1.8.0"
- cuda: "11.0"
torch: "1.8.1"
- cuda: "11.0"
torch: "1.9.0"
# - cuda: "11.1" # exclude 11.1 for [1.5.0, 1.5.1, 1.6.0, 1.7.0, 1.7.1]
# torch: "1.5.0"
# - cuda: "11.1"
# torch: "1.5.1"
- cuda: "11.1"
torch: "1.6.0"
- cuda: "11.1"
torch: "1.7.0"
- cuda: "11.1"
torch: "1.7.1"
- cuda: "10.1" # exclude 10.1 for [1.9.0]
torch: "1.9.0"
- python-version: 3.9 # exclude Python 3.9 for [1.5.0, 1.5.1, 1.6.0, 1.7.0]
torch: "1.5.0"
- python-version: 3.9
torch: "1.5.1"
- python-version: 3.9
torch: "1.6.0"
- python-version: 3.9
torch: "1.7.0"
steps:
# refer to https://github.com/actions/checkout
- uses: actions/checkout@v2
with:
fetch-depth: 0
- name: Install CUDA Toolkit ${{ matrix.cuda }}
shell: bash -l {0}
env:
cuda: ${{ matrix.cuda }}
run: |
source ./scripts/github_actions/install_cuda.sh
echo "CUDA_HOME=${CUDA_HOME}" >> $GITHUB_ENV
echo "${CUDA_HOME}/bin" >> $GITHUB_PATH
echo "LD_LIBRARY_PATH=${CUDA_HOME}/lib:${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}" >> $GITHUB_ENV
- name: Display NVCC version
shell: bash -l {0}
run: |
which nvcc
nvcc --version
- uses: conda-incubator/setup-miniconda@v2
with:
auto-update-conda: true
python-version: ${{ matrix.python-version }}
activate-environment: kaldifeat
- name: Display Python version
shell: bash -l {0}
run: |
python3 -c "import sys; print(sys.version)"
which python3
- name: Install conda dependencies
shell: bash -l {0}
run: |
conda install -y -q anaconda-client
conda install -y -q conda-build
conda install -y -q bs4 requests tqdm
conda install -y -q -c pytorch -c conda-forge pytorch=${{ matrix.torch }} cudatoolkit=${{ matrix.cuda }}
- name: Display conda info
shell: bash -l {0}
run: |
which conda
conda env list
conda info
nproc
- name: Download cudnn 8.0
shell: bash -l {0}
env:
cuda: ${{ matrix.cuda }}
run: |
./scripts/github_actions/install_cudnn.sh
- name: Build kaldifeat
shell: bash -l {0}
env:
KALDIFEAT_CUDA_VERSION: ${{ matrix.cuda }}
KALDIFEAT_PYTHON_VERSION: ${{ matrix.python-version}}
KALDIFEAT_TORCH_VERSION: ${{ matrix.torch }}
KALDIFEAT_CONDA_TOKEN: ${{ secrets.KALDIFEAT_CONDA_TOKEN}}
KALDIFEAT_IS_GITHUB_ACTIONS: 1
KALDIFEAT_IS_FOR_CONDA: 1
run: |
export KALDIFEAT_BUILD_TYPE=$KALDIFEAT_BUILD_TYPE
./scripts/build_conda.sh

.github/workflows/macos-cpu-wheels.yml (new file)

@@ -0,0 +1,121 @@
name: build-wheels-cpu-macos
on:
push:
branches:
# - wheel
- torch-2.8.0
tags:
- '*'
workflow_dispatch:
concurrency:
group: build-wheels-cpu-macos-${{ github.ref }}
cancel-in-progress: true
jobs:
generate_build_matrix:
# see https://github.com/pytorch/pytorch/pull/50633
runs-on: ubuntu-latest
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Generating build matrix
id: set-matrix
run: |
# outputting for debugging purposes
# python ./scripts/github_actions/generate_build_matrix.py --for-macos
# MATRIX=$(python ./scripts/github_actions/generate_build_matrix.py --for-macos)
python ./scripts/github_actions/generate_build_matrix.py --for-macos --test-only-latest-torch
MATRIX=$(python ./scripts/github_actions/generate_build_matrix.py --for-macos --test-only-latest-torch)
echo "::set-output name=matrix::${MATRIX}"
build_wheels_macos_cpu:
needs: generate_build_matrix
name: ${{ matrix.torch }} ${{ matrix.python-version }}
runs-on: macos-14
strategy:
fail-fast: false
matrix:
${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
shell: bash
run: |
pip install -q torch==${{ matrix.torch}} cmake numpy wheel>=0.40.0 twine setuptools
- name: Build wheel
shell: bash
run: |
python3 setup.py bdist_wheel
mkdir wheelhouse
cp -v dist/* wheelhouse
- name: Display wheels (before fix)
shell: bash
run: |
ls -lh ./wheelhouse/
- name: Fix wheel platform tag
run: |
# See https://github.com/glencoesoftware/zeroc-ice-py-macos-x86_64/pull/3/files
# See:
# * https://github.com/pypa/wheel/issues/406
python -m wheel tags \
--platform-tag=macosx_11_0_arm64 \
--remove wheelhouse/*.whl
- name: Display wheels (after fix)
shell: bash
run: |
ls -lh ./wheelhouse/
- name: Upload Wheel
uses: actions/upload-artifact@v4
with:
name: torch-${{ matrix.torch }}-python-${{ matrix.python-version }}-macos-latest-cpu
path: wheelhouse/*.whl
# https://huggingface.co/docs/hub/spaces-github-actions
- name: Publish to huggingface
if: github.repository_owner == 'csukuangfj'
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
uses: nick-fields/retry@v2
with:
max_attempts: 20
timeout_seconds: 200
shell: bash
command: |
git config --global user.email "csukuangfj@gmail.com"
git config --global user.name "Fangjun Kuang"
rm -rf huggingface
export GIT_LFS_SKIP_SMUDGE=1
git clone https://huggingface.co/csukuangfj/kaldifeat huggingface
cd huggingface
git pull
d=cpu/1.25.5.dev20241029/macos
mkdir -p $d
cp -v ../wheelhouse/*.whl ./$d
git status
git lfs track "*.whl"
git add .
git commit -m "upload macos wheel for torch ${{ matrix.torch }} python ${{ matrix.python-version }}"
git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/kaldifeat main
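
The wheel and test workflows in this comparison share one pattern: a generate_build_matrix job runs scripts/github_actions/generate_build_matrix.py, captures its single-line JSON output, and publishes it as a job output that the build job consumes through fromJson(). The script itself is not part of this diff; the sketch below only illustrates, under that assumption, the JSON shape such a script must print for matrix: ${{ fromJson(...) }} to work. The field names (torch, python-version) mirror how the workflows reference the matrix and are otherwise hypothetical.

# Hypothetical sketch of a build-matrix generator; the real
# scripts/github_actions/generate_build_matrix.py is not shown in this diff.
# GitHub Actions accepts a whole matrix object from fromJson(), so printing
# {"include": [...]} as a single line is enough for the workflows above.
import json

def generate_matrix(test_only_latest_torch: bool = True) -> dict:
    combos = [
        # Each entry becomes one job; field names mirror matrix.torch and
        # matrix.python-version in the workflows.
        {"torch": "2.8.0", "python-version": "3.12"},
        {"torch": "2.8.0", "python-version": "3.13"},
    ]
    if not test_only_latest_torch:
        combos.append({"torch": "2.7.1", "python-version": "3.12"})
    return {"include": combos}

if __name__ == "__main__":
    # The workflow captures this line via MATRIX=$(python ...) and then
    # writes it to the step output consumed by fromJson().
    print(json.dumps(generate_matrix()))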

(modified file)

@@ -20,25 +20,27 @@ on:
   push:
     tags:
       - '*'
+  workflow_dispatch:
 jobs:
   pypi:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
       - name: Setup Python
        uses: actions/setup-python@v2
        with:
-          python-version: 3.6
+          python-version: 3.8
       - name: Install Python dependencies
        shell: bash
        run: |
          python3 -m pip install --upgrade pip
          python3 -m pip install wheel twine setuptools
+          python3 -m pip install torch==1.10.0+cpu -f https://download.pytorch.org/whl/torch_stable.html
       - name: Build
        shell: bash

(deleted file)

@@ -1,121 +0,0 @@
# Copyright 2021 Xiaomi Corp. (author: Fangjun Kuang)
# See ../../LICENSE for clarification regarding multiple authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
name: Publish to PyPI
on:
push:
tags:
- '*'
jobs:
pypi:
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [ubuntu-18.04, macos-10.15]
cuda: ["10.1"]
torch: ["1.7.1"]
python-version: [3.6, 3.7, 3.8]
steps:
- uses: actions/checkout@v2
with:
fetch-depth: 0
- name: Setup Python
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Install GCC 7
if: startsWith(matrix.os, 'ubuntu')
run: |
sudo apt-get install -y gcc-7 g++-7
echo "CC=/usr/bin/gcc-7" >> $GITHUB_ENV
echo "CXX=/usr/bin/g++-7" >> $GITHUB_ENV
- name: Install CUDA Toolkit ${{ matrix.cuda }}
if: startsWith(matrix.os, 'ubuntu')
shell: bash
env:
cuda: ${{ matrix.cuda }}
run: |
source ./scripts/github_actions/install_cuda.sh
echo "CUDA_HOME=${CUDA_HOME}" >> $GITHUB_ENV
echo "${CUDA_HOME}/bin" >> $GITHUB_PATH
echo "LD_LIBRARY_PATH=${CUDA_HOME}/lib:${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}" >> $GITHUB_ENV
- name: Display NVCC version
if: startsWith(matrix.os, 'ubuntu')
run: |
which nvcc
nvcc --version
- name: Install PyTorch ${{ matrix.torch }}
if: startsWith(matrix.os, 'ubuntu')
env:
cuda: ${{ matrix.cuda }}
torch: ${{ matrix.torch }}
shell: bash
run: |
python3 -m pip install --upgrade pip
python3 -m pip install wheel twine typing_extensions
python3 -m pip install bs4 requests tqdm
./scripts/github_actions/install_torch.sh
python3 -c "import torch; print('torch version:', torch.__version__)"
- name: Install PyTorch ${{ matrix.torch }}
if: startsWith(matrix.os, 'macos')
shell: bash
run: |
python3 -m pip install -qq --upgrade pip
python3 -m pip install -qq wheel twine typing_extensions
python3 -m pip install -qq torch==${{ matrix.torch }}
python3 -c "import torch; print('torch version:', torch.__version__)"
- name: Download cudnn 8.0
if: startsWith(matrix.os, 'ubuntu')
env:
cuda: ${{ matrix.cuda }}
run: |
./scripts/github_actions/install_cudnn.sh
- name: Build pip packages
shell: bash
env:
KALDIFEAT_IS_FOR_PYPI: 1
run: |
tag=$(python3 -c "import sys; print(''.join(sys.version[:3].split('.')))")
export KALDIFEAT_MAKE_ARGS="-j2"
python3 setup.py bdist_wheel --python-tag=py${tag}
ls -lh dist/
- name: Publish wheels to PyPI
env:
TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
run: |
twine upload dist/kaldifeat-*.whl
- name: Upload Wheel
uses: actions/upload-artifact@v2
with:
name: torch-${{ matrix.torch }}-python-${{ matrix.python-version }}-${{ matrix.os }}
path: dist/*.whl

(new file)

@@ -0,0 +1,85 @@
# Copyright 2021 Xiaomi Corp. (author: Fangjun Kuang)
# See ../../LICENSE for clarification regarding multiple authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
name: Run tests macos cpu
on:
push:
branches:
- master
pull_request:
branches:
- master
jobs:
generate_build_matrix:
# see https://github.com/pytorch/pytorch/pull/50633
runs-on: ubuntu-latest
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Generating build matrix
id: set-matrix
run: |
# outputting for debugging purposes
python scripts/github_actions/generate_build_matrix.py --test-only-latest-torch
MATRIX=$(python scripts/github_actions/generate_build_matrix.py --test-only-latest-torch)
echo "::set-output name=matrix::${MATRIX}"
run_tests_macos_cpu:
needs: generate_build_matrix
runs-on: macos-latest
strategy:
fail-fast: false
matrix:
${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Setup Python
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Install PyTorch ${{ matrix.torch }}
shell: bash
run: |
python3 -m pip install -qq --upgrade pip
python3 -m pip install -qq wheel twine typing_extensions soundfile numpy
python3 -m pip install -qq torch==${{ matrix.torch }} -f https://download.pytorch.org/whl/torch_stable.html || python3 -m pip install -qq torch==${{ matrix.torch }} -f https://download.pytorch.org/whl/torch/
python3 -c "import torch; print('torch version:', torch.__version__)"
- name: Build
shell: bash
run: |
mkdir build_release
cd build_release
cmake -DCMAKE_CXX_STANDARD=17 ..
make VERBOSE=1 -j3
- name: Run tests
shell: bash
run: |
cd build_release
ctest --output-on-failure

(new file)

@@ -0,0 +1,88 @@
# Copyright 2021 Xiaomi Corp. (author: Fangjun Kuang)
# See ../../LICENSE for clarification regarding multiple authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
name: Run tests ubuntu cpu
on:
push:
branches:
- master
pull_request:
branches:
- master
jobs:
generate_build_matrix:
# see https://github.com/pytorch/pytorch/pull/50633
runs-on: ubuntu-latest
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Generating build matrix
id: set-matrix
run: |
# outputting for debugging purposes
python scripts/github_actions/generate_build_matrix.py --test-only-latest-torch
MATRIX=$(python scripts/github_actions/generate_build_matrix.py --test-only-latest-torch)
echo "::set-output name=matrix::${MATRIX}"
run_tests_ubuntu_cpu:
needs: generate_build_matrix
runs-on: ubuntu-18.04
strategy:
fail-fast: false
matrix:
${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Setup Python
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Install PyTorch ${{ matrix.torch }}
shell: bash
run: |
sudo apt-get update
sudo apt-get install -y libsndfile1-dev libsndfile1 ffmpeg
python3 -m pip install --upgrade pip
python3 -m pip install wheel twine typing_extensions soundfile
python3 -m pip install bs4 requests tqdm numpy
python3 -m pip install -qq torch==${{ matrix.torch }}+cpu -f https://download.pytorch.org/whl/torch_stable.html || python3 -m pip install -qq torch==${{ matrix.torch }}+cpu -f https://download.pytorch.org/whl/torch/
python3 -c "import torch; print('torch version:', torch.__version__)"
- name: Build
shell: bash
run: |
mkdir build_release
cd build_release
cmake -DCMAKE_CXX_STANDARD=17 ..
make VERBOSE=1 -j3
- name: Run tests
shell: bash
run: |
cd build_release
ctest --output-on-failure

(modified file)

@@ -14,28 +14,45 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-name: Run tests
+name: Run tests ubuntu cuda
 on:
   push:
     branches:
       - master
   pull_request:
     branches:
       - master
 jobs:
-  run_tests:
-    runs-on: ${{ matrix.os }}
+  generate_build_matrix:
+    # see https://github.com/pytorch/pytorch/pull/50633
+    runs-on: ubuntu-latest
+    outputs:
+      matrix: ${{ steps.set-matrix.outputs.matrix }}
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+      - name: Generating build matrix
+        id: set-matrix
+        run: |
+          # outputting for debugging purposes
+          python scripts/github_actions/generate_build_matrix.py --enable-cuda --test-only-latest-torch
+          MATRIX=$(python scripts/github_actions/generate_build_matrix.py --enable-cuda --test-only-latest-torch)
+          echo "::set-output name=matrix::${MATRIX}"
+  run_tests_ubuntu_cuda:
+    needs: generate_build_matrix
+    runs-on: ubuntu-latest
     strategy:
       fail-fast: false
       matrix:
-        os: [ubuntu-18.04, macos-10.15]
-        cuda: ["10.1"]
-        torch: ["1.7.1"]
-        python-version: [3.6, 3.7, 3.8]
+        ${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v4
         with:
           fetch-depth: 0
@@ -44,16 +61,7 @@ jobs:
         with:
           python-version: ${{ matrix.python-version }}
-      - name: Install GCC 7
-        if: startsWith(matrix.os, 'ubuntu')
-        run: |
-          sudo apt-get install -y gcc-7 g++-7
-          echo "CC=/usr/bin/gcc-7" >> $GITHUB_ENV
-          echo "CXX=/usr/bin/g++-7" >> $GITHUB_ENV
       - name: Install CUDA Toolkit ${{ matrix.cuda }}
-        if: startsWith(matrix.os, 'ubuntu')
         shell: bash
         env:
           cuda: ${{ matrix.cuda }}
@@ -64,39 +72,26 @@ jobs:
           echo "LD_LIBRARY_PATH=${CUDA_HOME}/lib:${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}" >> $GITHUB_ENV
       - name: Display NVCC version
-        if: startsWith(matrix.os, 'ubuntu')
         run: |
           which nvcc
           nvcc --version
       - name: Install PyTorch ${{ matrix.torch }}
-        if: startsWith(matrix.os, 'ubuntu')
         env:
           cuda: ${{ matrix.cuda }}
           torch: ${{ matrix.torch }}
         shell: bash
         run: |
-          sudo apt update
-          sudo apt install libsndfile1-dev libsndfile1 ffmpeg
+          sudo apt-get update
+          sudo apt-get install -y libsndfile1-dev libsndfile1 ffmpeg
           python3 -m pip install --upgrade pip
           python3 -m pip install wheel twine typing_extensions soundfile
-          python3 -m pip install bs4 requests tqdm
+          python3 -m pip install bs4 requests tqdm numpy
           ./scripts/github_actions/install_torch.sh
           python3 -c "import torch; print('torch version:', torch.__version__)"
-      - name: Install PyTorch ${{ matrix.torch }}
-        if: startsWith(matrix.os, 'macos')
-        shell: bash
-        run: |
-          python3 -m pip install -qq --upgrade pip
-          python3 -m pip install -qq wheel twine typing_extensions soundfile
-          python3 -m pip install -qq torch==${{ matrix.torch }}
-          python3 -c "import torch; print('torch version:', torch.__version__)"
       - name: Download cudnn 8.0
-        if: startsWith(matrix.os, 'ubuntu')
         env:
           cuda: ${{ matrix.cuda }}
         run: |
@@ -107,11 +102,11 @@ jobs:
         run: |
           mkdir build_release
           cd build_release
-          cmake ..
-          make VERBOSE=1
+          cmake -DCMAKE_CXX_STANDARD=17 ..
+          make VERBOSE=1 -j3
       - name: Run tests
         shell: bash
         run: |
           cd build_release
-          ctest -R py --output-on-failure
+          ctest --output-on-failure

(new file)

@@ -0,0 +1,121 @@
# Copyright 2021 Xiaomi Corp. (author: Fangjun Kuang)
# See ../../LICENSE for clarification regarding multiple authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
name: Run tests windows cpu
on:
push:
branches:
- master
pull_request:
branches:
- master
jobs:
generate_build_matrix:
# see https://github.com/pytorch/pytorch/pull/50633
runs-on: ubuntu-latest
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Generating build matrix
id: set-matrix
run: |
# outputting for debugging purposes
python scripts/github_actions/generate_build_matrix.py --test-only-latest-torch
MATRIX=$(python scripts/github_actions/generate_build_matrix.py --test-only-latest-torch)
echo "::set-output name=matrix::${MATRIX}"
run_tests_windows_cpu:
# see https://github.com/actions/virtual-environments/blob/win19/20210525.0/images/win/Windows2019-Readme.md
needs: generate_build_matrix
runs-on: windows-latest
strategy:
fail-fast: false
matrix:
${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
# see https://github.com/microsoft/setup-msbuild
- name: Add msbuild to PATH
uses: microsoft/setup-msbuild@v1.0.2
- name: Setup Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Display Python version
run: python -c "import sys; print(sys.version)"
- name: Install PyTorch ${{ matrix.torch }}
run: |
pip3 install -qq torch==${{ matrix.torch }}+cpu -f https://download.pytorch.org/whl/torch_stable.html || pip3 install -qq torch==${{ matrix.torch }}+cpu -f https://download.pytorch.org/whl/torch/
pip3 install -qq wheel twine dataclasses numpy typing_extensions soundfile
- name: Display CMake version
run: |
cmake --version
cmake --help
- name: Configure CMake
shell: bash
run: |
mkdir build_release
cd build_release
cmake -DCMAKE_BUILD_TYPE=$BUILD_TYPE ..
ls -lh
- name: Build kaldifeat
run: |
cd build_release
cmake --build -DCMAKE_CXX_STANDARD=17 . --target _kaldifeat --config Release
- name: Display generated files
shell: bash
run: |
cd build_release
ls -lh lib/*/*
- name: Build wheel
shell: bash
run: |
python3 setup.py bdist_wheel
ls -lh dist/
pip install ./dist/*.whl
python3 -c "import kaldifeat; print(kaldifeat.__version__)"
- name: Upload Wheel
uses: actions/upload-artifact@v4
with:
name: python-${{ matrix.python-version }}-${{ matrix.os }}-cpu
path: dist/*.whl
- name: Build tests
shell: bash
run: |
cd build_release
cmake -DCMAKE_CXX_STANDARD=17 --build . --target ALL_BUILD --config Release
ls -lh bin/*/*
ctest -C Release --verbose --output-on-failure

(new file)

@@ -0,0 +1,173 @@
# Copyright 2021 Xiaomi Corp. (author: Fangjun Kuang)
# See ../../LICENSE for clarification regarding multiple authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
name: Run tests windows cuda
on:
push:
branches:
- master
pull_request:
branches:
- master
jobs:
generate_build_matrix:
# see https://github.com/pytorch/pytorch/pull/50633
runs-on: ubuntu-latest
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Generating build matrix
id: set-matrix
run: |
# outputting for debugging purposes
python scripts/github_actions/generate_build_matrix.py --enable-cuda --for-windows --test-only-latest-torch
MATRIX=$(python scripts/github_actions/generate_build_matrix.py --enable-cuda --for-windows --test-only-latest-torch)
echo "::set-output name=matrix::${MATRIX}"
run_tests_windows_cuda:
needs: generate_build_matrix
runs-on: windows-latest
strategy:
fail-fast: false
matrix:
${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
# see https://github.com/microsoft/setup-msbuild
- name: Add msbuild to PATH
uses: microsoft/setup-msbuild@v1.0.2
- name: Setup Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Display Python version
run: python -c "import sys; print(sys.version)"
# See https://github.com/Jimver/cuda-toolkit/blob/master/src/links/windows-links.ts
# for available CUDA versions
- uses: Jimver/cuda-toolkit@v0.2.7
id: cuda-toolkit
with:
cuda: ${{ matrix.cuda }}
- name: Display CUDA version
shell: bash
run: |
echo "Installed cuda version is: ${{ steps.cuda-toolkit.outputs.cuda }}"
echo "Cuda install location: ${{steps.cuda-toolkit.outputs.CUDA_PATH}}"
nvcc --version
- name: Remove CUDA installation package
shell: bash
run: |
rm "C:/hostedtoolcache/windows/cuda_installer-windows/${{ matrix.cuda }}/x64/cuda_installer_${{ matrix.cuda }}.exe"
- name: Download cuDNN
shell: bash
run: |
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/csukuangfj/cudnn-for-windows
cd cudnn-for-windows
git lfs pull --include="cudnn-windows-x86_64-8.4.1.50_cuda11.6-archive.zip"
unzip cudnn-windows-x86_64-8.4.1.50_cuda11.6-archive.zip
rm cudnn-windows-x86_64-8.4.1.50_cuda11.6-archive.zip
ls -lh *
ls -lh */*
echo "PWD: $PWD"
- name: Install PyTorch ${{ matrix.torch }}
shell: bash
run: |
version=${{ matrix.cuda }}
major=${version:0:2}
minor=${version:3:1}
v=${major}${minor}
if [ ${v} -eq 102 ]; then v=""; else v="+cu${v}"; fi
python3 -m pip install -qq --upgrade pip
python3 -m pip install -qq wheel twine numpy typing_extensions
python3 -m pip install -qq dataclasses soundfile numpy
python3 -m pip install -qq torch==${{ matrix.torch }}${v} -f https://download.pytorch.org/whl/torch_stable.html numpy || python3 -m pip install -qq torch==${{ matrix.torch }}${v} -f https://download.pytorch.org/whl/torch/ numpy
python3 -c "import torch; print('torch version:', torch.__version__)"
python3 -m torch.utils.collect_env
- name: Display CMake version
run: |
cmake --version
cmake --help
- name: Configure CMake
shell: bash
run: |
echo "PWD: $PWD"
ls -lh
mkdir build_release
cd build_release
cmake -DCMAKE_BUILD_TYPE=$BUILD_TYPE -DCUDNN_INCLUDE_PATH=d:/a/kaldifeat/kaldifeat/cudnn-for-windows/cudnn-windows-x86_64-8.4.1.50_cuda11.6-archive/include -DCUDNN_LIBRARY_PATH=d:/a/kaldifeat/kaldifeat/cudnn-for-windows/cudnn-windows-x86_64-8.4.1.50_cuda11.6-archive/lib/cudnn.lib ..
ls -lh
- name: Build kaldifeat
shell: bash
run: |
cd build_release
cmake --build . --target _kaldifeat --config Release
- name: Display generated files
shell: bash
run: |
cd build_release
ls -lh lib/*/*
- name: Build wheel
shell: bash
run: |
echo $PWD
ls -lh ./*
export KALDIFEAT_CMAKE_ARGS="-DCMAKE_BUILD_TYPE=$BUILD_TYPE -DCUDNN_INCLUDE_PATH=d:/a/kaldifeat/kaldifeat/cudnn-for-windows/cudnn-windows-x86_64-8.4.1.50_cuda11.6-archive/include -DCUDNN_LIBRARY_PATH=d:/a/kaldifeat/kaldifeat/cudnn-for-windows/cudnn-windows-x86_64-8.4.1.50_cuda11.6-archive/lib/cudnn.lib"
python3 setup.py bdist_wheel
ls -lh dist/
pip install ./dist/*.whl
python3 -c "import kaldifeat; print(kaldifeat.__version__)"
- name: Upload Wheel
uses: actions/upload-artifact@v4
with:
name: python-${{ matrix.python-version }}-${{ matrix.os }}-cuda-${{ matrix.cuda }}
path: dist/*.whl
- name: Build tests
shell: bash
run: |
cd build_release
cmake -DCMAKE_CXX_STANDARD=17 --build . --target ALL_BUILD --config Release
ls -lh bin/*/*
ctest -C Release --verbose --output-on-failure

(modified file)

@@ -29,12 +29,12 @@ jobs:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
-        os: [ubuntu-18.04, macos-10.15]
-        python-version: [3.7, 3.8, 3.9]
+        os: [ubuntu-latest, macos-latest]
+        python-version: ["3.8"]
       fail-fast: false
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v4
         with:
           fetch-depth: 0

.github/workflows/test-wheels.yml (new file)

@@ -0,0 +1,67 @@
name: Test pre-compiled wheels
on:
workflow_dispatch:
inputs:
torch_version:
description: "torch version, e.g., 2.0.1"
required: true
kaldifeat_version:
description: "kaldifeat version, e.g., 1.25.0.dev20230726"
required: true
jobs:
Test_pre_compiled_wheels:
name: ${{ matrix.os }} ${{ github.event.inputs.torch_version }} ${{ github.event.inputs.kaldifeat_version }} ${{ matrix.python-version }}
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, windows-latest, macos-latest]
python-version: ["3.8", "3.9", "3.10"]
steps:
# refer to https://github.com/actions/checkout
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Setup Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Display Python version
run: python -c "import sys; print(sys.version)"
- name: Install dependencies
shell: bash
run: |
pip install numpy
- name: Install torch
if: startsWith(matrix.os, 'macos')
shell: bash
run: |
pip install torch==${{ github.event.inputs.torch_version }}
- name: Install torch
if: startsWith(matrix.os, 'ubuntu') || startsWith(matrix.os, 'windows')
shell: bash
run: |
pip install torch==${{ github.event.inputs.torch_version }}+cpu -f https://download.pytorch.org/whl/torch_stable.html || pip install torch==${{ github.event.inputs.torch_version }}+cpu -f https://download.pytorch.org/whl/torch/
- name: Install kaldifeat
shell: bash
run: |
pip install kaldifeat==${{ github.event.inputs.kaldifeat_version }}+cpu.torch${{ github.event.inputs.torch_version }} -f https://csukuangfj.github.io/kaldifeat/cpu.html
- name: Run tests
shell: bash
run: |
cd kaldifeat/python/tests
python3 -c "import kaldifeat; print(kaldifeat.__file__)"
python3 -c "import kaldifeat; print(kaldifeat.__version__)"
python3 ./test_fbank_options.py
python3 ./test_mfcc_options.py

(new file)

@@ -0,0 +1,168 @@
name: build-wheels-cpu-arm64-ubuntu
on:
push:
branches:
# - wheel
- torch-2.8.0
tags:
- '*'
workflow_dispatch:
concurrency:
group: build-wheels-cpu-arm64-ubuntu-${{ github.ref }}
cancel-in-progress: true
jobs:
generate_build_matrix:
# see https://github.com/pytorch/pytorch/pull/50633
runs-on: ubuntu-latest
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Generating build matrix
id: set-matrix
run: |
# outputting for debugging purposes
# python ./scripts/github_actions/generate_build_matrix.py --for-arm64
# MATRIX=$(python ./scripts/github_actions/generate_build_matrix.py --for-arm64)
python ./scripts/github_actions/generate_build_matrix.py --test-only-latest-torch --for-arm64
MATRIX=$(python ./scripts/github_actions/generate_build_matrix.py --test-only-latest-torch --for-arm64)
echo "::set-output name=matrix::${MATRIX}"
build-manylinux-wheels:
needs: generate_build_matrix
name: ${{ matrix.torch }} ${{ matrix.python-version }}
runs-on: ubuntu-22.04-arm
strategy:
fail-fast: false
matrix:
${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
# see https://github.com/pytorch/test-infra/blob/9e3d392690719fac85bad0c9b67f530e48375ca1/tools/scripts/generate_binary_build_matrix.py
# https://github.com/pytorch/builder/tree/main/manywheel
# https://github.com/pytorch/builder/pull/476
# https://github.com/k2-fsa/k2/issues/733
# https://github.com/pytorch/pytorch/pull/50633 (generate build matrix)
- name: Run the build process with Docker
uses: addnab/docker-run-action@v3
with:
image: ${{ matrix.image }}
options: -v ${{ github.workspace }}:/var/www -e IS_2_28=${{ matrix.is_2_28 }} -e PYTHON_VERSION=${{ matrix.python-version }} -e TORCH_VERSION=${{ matrix.torch }}
run: |
echo "pwd: $PWD"
uname -a
id
cat /etc/*release
gcc --version
python3 --version
which python3
ls -lh /opt/python/
echo "---"
ls -lh /opt/python/cp*
ls -lh /opt/python/*/bin
echo "---"
find /opt/python/cp* -name "libpython*"
echo "-----"
find /opt/_internal/cp* -name "libpython*"
echo "-----"
find / -name "libpython*"
echo "----"
ls -lh /usr/lib64/libpython3.so
# cp36-cp36m
# cp37-cp37m
# cp38-cp38
# cp39-cp39
# cp310-cp310
# cp311-cp311
# cp312-cp312
# cp313-cp313
# cp313-cp313t (no gil)
if [[ $PYTHON_VERSION == "3.6" ]]; then
python_dir=/opt/python/cp36-cp36m
export PYTHONPATH=/opt/python/cp36-cp36m/lib/python3.6/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.7" ]]; then
python_dir=/opt/python/cp37-cp37m
export PYTHONPATH=/opt/python/cp37-cp37m/lib/python3.7/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.8" ]]; then
python_dir=/opt/python/cp38-cp38
export PYTHONPATH=/opt/python/cp38-cp38/lib/python3.8/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.9" ]]; then
python_dir=/opt/python/cp39-cp39
export PYTHONPATH=/opt/python/cp39-cp39/lib/python3.9/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.10" ]]; then
python_dir=/opt/python/cp310-cp310
export PYTHONPATH=/opt/python/cp310-cp310/lib/python3.10/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.11" ]]; then
python_dir=/opt/python/cp311-cp311
export PYTHONPATH=/opt/python/cp311-cp311/lib/python3.11/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.12" ]]; then
python_dir=/opt/python/cp312-cp312
export PYTHONPATH=/opt/python/cp312-cp312/lib/python3.12/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.13" ]]; then
python_dir=/opt/python/cp313-cp313
export PYTHONPATH=/opt/python/cp313-cp313/lib/python3.13/site-packages:$PYTHONPATH
else
echo "Unsupported Python version $PYTHON_VERSION"
exit 1
fi
export PYTHON_INSTALL_DIR=$python_dir
export PATH=$PYTHON_INSTALL_DIR/bin:$PATH
python3 --version
which python3
/var/www/scripts/github_actions/build-ubuntu-cpu-arm64.sh
- name: Display wheels
shell: bash
run: |
ls -lh ./wheelhouse/
# https://huggingface.co/docs/hub/spaces-github-actions
- name: Publish to huggingface
if: github.repository_owner == 'csukuangfj'
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
uses: nick-fields/retry@v2
with:
max_attempts: 20
timeout_seconds: 200
shell: bash
command: |
git config --global user.email "csukuangfj@gmail.com"
git config --global user.name "Fangjun Kuang"
rm -rf huggingface
export GIT_LFS_SKIP_SMUDGE=1
git clone https://huggingface.co/csukuangfj/kaldifeat huggingface
cd huggingface
git pull
d=cpu/1.25.5.dev20250307/linux-arm64
mkdir -p $d
cp -v ../wheelhouse/*.whl ./$d
git status
git lfs track "*.whl"
git add .
git commit -m "upload ubuntu-arm64-cpu wheel for torch ${{ matrix.torch }} python ${{ matrix.python-version }}"
git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/kaldifeat main

.github/workflows/ubuntu-cpu-wheels.yml (new file)

@@ -0,0 +1,168 @@
name: build-wheels-cpu-ubuntu
on:
push:
branches:
# - wheel
- torch-2.8.0
tags:
- '*'
workflow_dispatch:
concurrency:
group: build-wheels-cpu-ubuntu-${{ github.ref }}
cancel-in-progress: true
jobs:
generate_build_matrix:
# see https://github.com/pytorch/pytorch/pull/50633
runs-on: ubuntu-latest
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Generating build matrix
id: set-matrix
run: |
# outputting for debugging purposes
# python ./scripts/github_actions/generate_build_matrix.py
# MATRIX=$(python ./scripts/github_actions/generate_build_matrix.py)
python ./scripts/github_actions/generate_build_matrix.py --test-only-latest-torch
MATRIX=$(python ./scripts/github_actions/generate_build_matrix.py --test-only-latest-torch)
echo "::set-output name=matrix::${MATRIX}"
build-manylinux-wheels:
needs: generate_build_matrix
name: ${{ matrix.torch }} ${{ matrix.python-version }}
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
# see https://github.com/pytorch/test-infra/blob/9e3d392690719fac85bad0c9b67f530e48375ca1/tools/scripts/generate_binary_build_matrix.py
# https://github.com/pytorch/builder/tree/main/manywheel
# https://github.com/pytorch/builder/pull/476
# https://github.com/k2-fsa/k2/issues/733
# https://github.com/pytorch/pytorch/pull/50633 (generate build matrix)
- name: Run the build process with Docker
uses: addnab/docker-run-action@v3
with:
image: ${{ matrix.image }}
options: -v ${{ github.workspace }}:/var/www -e IS_2_28=${{ matrix.is_2_28 }} -e PYTHON_VERSION=${{ matrix.python-version }} -e TORCH_VERSION=${{ matrix.torch }}
run: |
echo "pwd: $PWD"
uname -a
id
cat /etc/*release
gcc --version
python3 --version
which python3
ls -lh /opt/python/
echo "---"
ls -lh /opt/python/cp*
ls -lh /opt/python/*/bin
echo "---"
find /opt/python/cp* -name "libpython*"
echo "-----"
find /opt/_internal/cp* -name "libpython*"
echo "-----"
find / -name "libpython*"
echo "----"
ls -lh /usr/lib64/libpython3.so || true
# cp36-cp36m
# cp37-cp37m
# cp38-cp38
# cp39-cp39
# cp310-cp310
# cp311-cp311
# cp312-cp312
# cp313-cp313
# cp313-cp313t (no gil)
if [[ $PYTHON_VERSION == "3.6" ]]; then
python_dir=/opt/python/cp36-cp36m
export PYTHONPATH=/opt/python/cp36-cp36m/lib/python3.6/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.7" ]]; then
python_dir=/opt/python/cp37-cp37m
export PYTHONPATH=/opt/python/cp37-cp37m/lib/python3.7/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.8" ]]; then
python_dir=/opt/python/cp38-cp38
export PYTHONPATH=/opt/python/cp38-cp38/lib/python3.8/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.9" ]]; then
python_dir=/opt/python/cp39-cp39
export PYTHONPATH=/opt/python/cp39-cp39/lib/python3.9/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.10" ]]; then
python_dir=/opt/python/cp310-cp310
export PYTHONPATH=/opt/python/cp310-cp310/lib/python3.10/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.11" ]]; then
python_dir=/opt/python/cp311-cp311
export PYTHONPATH=/opt/python/cp311-cp311/lib/python3.11/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.12" ]]; then
python_dir=/opt/python/cp312-cp312
export PYTHONPATH=/opt/python/cp312-cp312/lib/python3.12/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.13" ]]; then
python_dir=/opt/python/cp313-cp313
export PYTHONPATH=/opt/python/cp313-cp313/lib/python3.13/site-packages:$PYTHONPATH
else
echo "Unsupported Python version $PYTHON_VERSION"
exit 1
fi
export PYTHON_INSTALL_DIR=$python_dir
export PATH=$PYTHON_INSTALL_DIR/bin:$PATH
python3 --version
which python3
/var/www/scripts/github_actions/build-ubuntu-cpu.sh
- name: Display wheels
shell: bash
run: |
ls -lh ./wheelhouse/
# https://huggingface.co/docs/hub/spaces-github-actions
- name: Publish to huggingface
if: github.repository_owner == 'csukuangfj'
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
uses: nick-fields/retry@v2
with:
max_attempts: 20
timeout_seconds: 200
shell: bash
command: |
git config --global user.email "csukuangfj@gmail.com"
git config --global user.name "Fangjun Kuang"
rm -rf huggingface
export GIT_LFS_SKIP_SMUDGE=1
git clone https://huggingface.co/csukuangfj/kaldifeat huggingface
cd huggingface
git pull
d=cpu/1.25.5.dev20250307/linux-x64
mkdir -p $d
cp -v ../wheelhouse/*.whl ./$d
git status
git lfs track "*.whl"
git add .
git commit -m "upload ubuntu-cpu wheel for torch ${{ matrix.torch }} python ${{ matrix.python-version }}"
git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/kaldifeat main

.github/workflows/ubuntu-cuda-wheels.yml (new file)

@@ -0,0 +1,194 @@
name: build-wheels-cuda-ubuntu
on:
push:
branches:
- wheel
# - torch-2.7.1
tags:
- '*'
workflow_dispatch:
concurrency:
group: build-wheels-cuda-ubuntu-${{ github.ref }}
cancel-in-progress: true
jobs:
generate_build_matrix:
# see https://github.com/pytorch/pytorch/pull/50633
runs-on: ubuntu-latest
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Generating build matrix
id: set-matrix
run: |
# outputting for debugging purposes
# python ./scripts/github_actions/generate_build_matrix.py --enable-cuda
# MATRIX=$(python ./scripts/github_actions/generate_build_matrix.py --enable-cuda)
python ./scripts/github_actions/generate_build_matrix.py --enable-cuda --test-only-latest-torch
MATRIX=$(python ./scripts/github_actions/generate_build_matrix.py --enable-cuda --test-only-latest-torch)
echo "::set-output name=matrix::${MATRIX}"
build-manylinux-wheels:
needs: generate_build_matrix
name: ${{ matrix.torch }} ${{ matrix.python-version }} cuda${{ matrix.cuda }}
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Free space
shell: bash
run: |
df -h
rm -rf /opt/hostedtoolcache
df -h
echo "pwd: $PWD"
echo "github.workspace ${{ github.workspace }}"
# see https://github.com/pytorch/test-infra/blob/9e3d392690719fac85bad0c9b67f530e48375ca1/tools/scripts/generate_binary_build_matrix.py
# https://github.com/pytorch/builder/tree/main/manywheel
# https://github.com/pytorch/builder/pull/476
# https://github.com/k2-fsa/k2/issues/733
# https://github.com/pytorch/pytorch/pull/50633 (generate build matrix)
- name: Run the build process with Docker
uses: addnab/docker-run-action@v3
with:
image: ${{ matrix.image }}
options: -v ${{ github.workspace }}:/var/www -e IS_2_28=${{ matrix.is_2_28 }} -e PYTHON_VERSION=${{ matrix.python-version }} -e TORCH_VERSION=${{ matrix.torch }} -e CUDA_VERSION=${{ matrix.cuda }}
run: |
echo "pwd: $PWD"
uname -a
id
cat /etc/*release
gcc --version
python3 --version
which python3
ls -lh /opt/python/
echo "---"
ls -lh /opt/python/cp*
ls -lh /opt/python/*/bin
echo "---"
find /opt/python/cp* -name "libpython*"
echo "-----"
find /opt/_internal/cp* -name "libpython*"
echo "-----"
find / -name "libpython*"
# cp36-cp36m
# cp37-cp37m
# cp38-cp38
# cp39-cp39
# cp310-cp310
# cp311-cp311
# cp312-cp312
# cp313-cp313
# cp313-cp313t (no gil)
if [[ $PYTHON_VERSION == "3.6" ]]; then
python_dir=/opt/python/cp36-cp36m
export PYTHONPATH=/opt/python/cp36-cp36m/lib/python3.6/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.7" ]]; then
python_dir=/opt/python/cp37-cp37m
export PYTHONPATH=/opt/python/cp37-cp37m/lib/python3.7/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.8" ]]; then
python_dir=/opt/python/cp38-cp38
export PYTHONPATH=/opt/python/cp38-cp38/lib/python3.8/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.9" ]]; then
python_dir=/opt/python/cp39-cp39
export PYTHONPATH=/opt/python/cp39-cp39/lib/python3.9/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.10" ]]; then
python_dir=/opt/python/cp310-cp310
export PYTHONPATH=/opt/python/cp310-cp310/lib/python3.10/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.11" ]]; then
python_dir=/opt/python/cp311-cp311
export PYTHONPATH=/opt/python/cp311-cp311/lib/python3.11/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.12" ]]; then
python_dir=/opt/python/cp312-cp312
export PYTHONPATH=/opt/python/cp312-cp312/lib/python3.12/site-packages:$PYTHONPATH
elif [[ $PYTHON_VERSION == "3.13" ]]; then
python_dir=/opt/python/cp313-cp313
export PYTHONPATH=/opt/python/cp313-cp313/lib/python3.13/site-packages:$PYTHONPATH
else
echo "Unsupported Python version $PYTHON_VERSION"
exit 1
fi
export PYTHON_INSTALL_DIR=$python_dir
export PATH=$PYTHON_INSTALL_DIR/bin:$PATH
# There are no libpython.so inside $PYTHON_INSTALL_DIR
# since they are statically linked.
python3 --version
which python3
pushd /usr/local
rm cuda
ln -s cuda-$CUDA_VERSION cuda
popd
which nvcc
nvcc --version
cp /var/www/scripts/github_actions/install_torch.sh .
chmod +x install_torch.sh
/var/www/scripts/github_actions/build-ubuntu-cuda.sh
- name: Display wheels
shell: bash
run: |
ls -lh ./wheelhouse/
- name: Upload Wheel
if: false
uses: actions/upload-artifact@v4
with:
name: torch-${{ matrix.torch }}-python-${{ matrix.python-version }}-ubuntu-latest-cuda-is_2_28-${{ matrix.is_2_28 }}
path: wheelhouse/*.whl
# https://huggingface.co/docs/hub/spaces-github-actions
- name: Publish to huggingface
if: github.repository_owner == 'csukuangfj'
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
uses: nick-fields/retry@v2
with:
max_attempts: 20
timeout_seconds: 200
shell: bash
command: |
git config --global user.email "csukuangfj@gmail.com"
git config --global user.name "Fangjun Kuang"
rm -rf huggingface
export GIT_LFS_SKIP_SMUDGE=1
git clone https://huggingface.co/csukuangfj/kaldifeat huggingface
cd huggingface
git pull
d=cuda/1.25.5.dev20241029/linux
mkdir -p $d
cp -v ../wheelhouse/*.whl ./$d
git status
git lfs track "*.whl"
git add .
git commit -m "upload ubuntu-cuda wheel for torch ${{ matrix.torch }} python ${{ matrix.python-version }}"
git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/kaldifeat main

View File

@ -0,0 +1,108 @@
name: build-wheels-cpu-win64
on:
push:
branches:
# - wheel
- torch-2.8.0
tags:
- '*'
workflow_dispatch:
concurrency:
group: build-wheels-cpu-win64-${{ github.ref }}
cancel-in-progress: true
jobs:
generate_build_matrix:
# see https://github.com/pytorch/pytorch/pull/50633
runs-on: ubuntu-latest
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Generating build matrix
id: set-matrix
run: |
# outputting for debugging purposes
# python ./scripts/github_actions/generate_build_matrix.py --for-windows
# MATRIX=$(python ./scripts/github_actions/generate_build_matrix.py --for-windows)
python ./scripts/github_actions/generate_build_matrix.py --for-windows --test-only-latest-torch
MATRIX=$(python ./scripts/github_actions/generate_build_matrix.py --for-windows --test-only-latest-torch)
echo "::set-output name=matrix::${MATRIX}"
build_wheels_win64_cpu:
needs: generate_build_matrix
name: ${{ matrix.torch }} ${{ matrix.python-version }}
runs-on: windows-latest
strategy:
fail-fast: false
matrix:
${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
shell: bash
run: |
pip install -q torch==${{ matrix.torch}} cmake numpy wheel>=0.40.0 twine setuptools
pip install torch==${{ matrix.torch}}+cpu -f https://download.pytorch.org/whl/torch_stable.html cmake numpy || pip install torch==${{ matrix.torch}}+cpu -f https://download.pytorch.org/whl/torch/ cmake numpy
- name: Build wheel
shell: bash
run: |
python3 setup.py bdist_wheel
mkdir wheelhouse
cp -v dist/* wheelhouse
- name: Display wheels
shell: bash
run: |
ls -lh ./wheelhouse/
- name: Upload Wheel
uses: actions/upload-artifact@v4
with:
name: torch-${{ matrix.torch }}-python-${{ matrix.python-version }}-windows-latest-cpu
path: wheelhouse/*.whl
# https://huggingface.co/docs/hub/spaces-github-actions
- name: Publish to huggingface
if: github.repository_owner == 'csukuangfj'
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
uses: nick-fields/retry@v2
with:
max_attempts: 20
timeout_seconds: 200
shell: bash
command: |
git config --global user.email "csukuangfj@gmail.com"
git config --global user.name "Fangjun Kuang"
rm -rf huggingface
export GIT_LFS_SKIP_SMUDGE=1
git clone https://huggingface.co/csukuangfj/kaldifeat huggingface
cd huggingface
git pull
d=cpu/1.25.5.dev20241029/windows
mkdir -p $d
cp -v ../wheelhouse/*.whl ./$d
git status
git lfs track "*.whl"
git add .
git commit -m "upload windows-cpu wheel for torch ${{ matrix.torch }} python ${{ matrix.python-version }}"
git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/kaldifeat main

.gitignore vendored
View File

@ -5,3 +5,6 @@ dist/
__pycache__/
test-1hour.wav
path.sh
+torch_version.py
+cpu*.html
+cuda*.html

View File

@ -1,4 +1,7 @@
# Copyright (c) 2021 Xiaomi Corporation (author: Fangjun Kuang)
+if (CMAKE_VERSION VERSION_GREATER_EQUAL "4.0.0")
+  set(CMAKE_POLICY_VERSION_MINIMUM 3.5)
+endif()
cmake_minimum_required(VERSION 3.8 FATAL_ERROR)
@ -6,7 +9,8 @@ project(kaldifeat)
# remember to change the version in
# scripts/conda/kaldifeat/meta.yaml
-set(kaldifeat_VERSION "1.13")
+# scripts/conda-cpu/kaldifeat/meta.yaml
+set(kaldifeat_VERSION "1.25.5")
set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/lib")
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/lib")
@ -15,29 +19,49 @@ set(CMAKE_RUNTIME_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/bin")
set(CMAKE_SKIP_BUILD_RPATH FALSE)
set(BUILD_RPATH_USE_ORIGIN TRUE)
set(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE)
-set(CMAKE_INSTALL_RPATH "$ORIGIN")
-set(CMAKE_BUILD_RPATH "$ORIGIN")
+if(NOT APPLE)
+  set(kaldifeat_rpath_origin "$ORIGIN")
+else()
+  set(kaldifeat_rpath_origin "@loader_path")
+endif()
+set(CMAKE_INSTALL_RPATH ${kaldifeat_rpath_origin})
+set(CMAKE_BUILD_RPATH ${kaldifeat_rpath_origin})
if(NOT CMAKE_BUILD_TYPE)
  message(STATUS "No CMAKE_BUILD_TYPE given, default to Release")
  set(CMAKE_BUILD_TYPE Release)
endif()
-set(CMAKE_CXX_STANDARD 14 CACHE STRING "The C++ version to be used.")
+if (NOT CMAKE_CXX_STANDARD)
+  set(CMAKE_CXX_STANDARD 17 CACHE STRING "The C++ version to be used.")
+endif()
+message(STATUS "C++ Standard version: ${CMAKE_CXX_STANDARD}")
set(CMAKE_CXX_EXTENSIONS OFF)
-message(STATUS "C++ Standard version: ${CMAKE_CXX_STANDARD}")
-list(APPEND CMAKE_MODULE_PATH ${CMAKE_SOURCE_DIR}/cmake/Modules)
-list(APPEND CMAKE_MODULE_PATH ${CMAKE_SOURCE_DIR}/cmake)
-option(kaldifeat_BUILD_TESTS "Whether to build tests or not" ON)
+list(APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/cmake/Modules)
+list(APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/cmake)
+option(BUILD_SHARED_LIBS "Whether to build shared libraries" ON)
+option(kaldifeat_BUILD_TESTS "Whether to build tests or not" OFF)
+option(kaldifeat_BUILD_PYMODULE "Whether to build python module or not" ON)
+message(STATUS "BUILD_SHARED_LIBS ${BUILD_SHARED_LIBS}")
-if(CMAKE_SOURCE_DIR STREQUAL PROJECT_SOURCE_DIR)
-  include(pybind11)
+if(BUILD_SHARED_LIBS AND MSVC)
+  set(CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS ON)
endif()
+if(kaldifeat_BUILD_PYMODULE)
+  include(pybind11)
+endif()
+# to prevent cmake from trying to link with system installed mkl since we not directly use it
+# mkl libraries should be linked with pytorch already
+# ref: https://github.com/pytorch/pytorch/blob/master/cmake/public/mkl.cmake
+set(CMAKE_DISABLE_FIND_PACKAGE_MKL TRUE)
include(torch)
if(kaldifeat_BUILD_TESTS)
@ -45,5 +69,52 @@ if(kaldifeat_BUILD_TESTS)
  enable_testing()
endif()
-include_directories(${CMAKE_SOURCE_DIR})
+if(WIN32)
+  # disable various warnings for MSVC
+  # 4624: destructor was implicitly defined as deleted because a base class destructor is inaccessible or deleted
+  set(disabled_warnings
+    /wd4624
+  )
+  message(STATUS "Disabled warnings: ${disabled_warnings}")
+  foreach(w IN LISTS disabled_warnings)
+    string(APPEND CMAKE_CXX_FLAGS " ${w} ")
+  endforeach()
+endif()
+include_directories(${CMAKE_CURRENT_SOURCE_DIR})
+message(STATUS "CMAKE_CXX_FLAGS: ${CMAKE_CXX_FLAGS}")
+message(STATUS "CMAKE_INSTALL_PREFIX: ${CMAKE_INSTALL_PREFIX}")
add_subdirectory(kaldifeat)
+# TORCH_VERSION is defined in cmake/torch.cmake
+configure_file(
+  ${PROJECT_SOURCE_DIR}/kaldifeat/python/kaldifeat/torch_version.py.in
+  ${PROJECT_SOURCE_DIR}/kaldifeat/python/kaldifeat/torch_version.py @ONLY
+)
+configure_file(
+  ${PROJECT_SOURCE_DIR}/cmake/kaldifeatConfigVersion.cmake.in
+  ${PROJECT_BINARY_DIR}/kaldifeatConfigVersion.cmake
+  @ONLY
+)
+configure_file(
+  ${PROJECT_SOURCE_DIR}/cmake/kaldifeatConfig.cmake.in
+  ${PROJECT_BINARY_DIR}/kaldifeatConfig.cmake
+  @ONLY
+)
+install(FILES
+  ${PROJECT_BINARY_DIR}/kaldifeatConfigVersion.cmake
+  ${PROJECT_BINARY_DIR}/kaldifeatConfig.cmake
+  DESTINATION share/cmake/kaldifeat
+)
+install(FILES
+  ${PROJECT_SOURCE_DIR}/kaldifeat/python/kaldifeat/torch_version.py
+  DESTINATION ./
+)

README.md
View File

@ -1,12 +1,33 @@
# kaldifeat
<div align="center">
-<img src="/doc/source/images/os.svg">
+<img src="/doc/source/images/os-green.svg">
<img src="/doc/source/images/python_ge_3.6-blue.svg">
<img src="/doc/source/images/pytorch_ge_1.5.0-green.svg">
<img src="/doc/source/images/cuda_ge_10.1-orange.svg">
</div>
+[![Documentation Status](https://github.com/csukuangfj/kaldifeat/actions/workflows/build-doc.yml/badge.svg)](https://csukuangfj.github.io/kaldifeat/)
+**Documentation**: <https://csukuangfj.github.io/kaldifeat>
+**Note**: If you are looking for a version that does not depend on PyTorch,
+please see <https://github.com/csukuangfj/kaldi-native-fbank>
+# Installation
+Refer to
+<https://csukuangfj.github.io/kaldifeat/installation/from_wheels.html>
+for installation.
+> Never use `pip install kaldifeat`
+> Never use `pip install kaldifeat`
+> Never use `pip install kaldifeat`
<sub>
<table>
<tr>
@ -16,6 +37,36 @@
<th>Usage</th>
</tr>
+<tr>
+<td>Fbank for <a href="https://github.com/openai/whisper">Whisper</a></td>
+<td><code>kaldifeat.WhisperFbankOptions</code></td>
+<td><code>kaldifeat.WhisperFbank</code></td>
+<td>
+<pre lang="python">
+opts = kaldifeat.WhisperFbankOptions()
+opts.device = torch.device('cuda', 0)
+fbank = kaldifeat.WhisperFbank(opts)
+features = fbank(wave)
+</pre>
+See <a href="https://github.com/csukuangfj/kaldifeat/pull/82">#82</a>
+</td>
+</tr>
+<tr>
+<td>Fbank for <a href="https://github.com/openai/whisper">Whisper-V3</a></td>
+<td><code>kaldifeat.WhisperFbankOptions</code></td>
+<td><code>kaldifeat.WhisperFbank</code></td>
+<td>
+<pre lang="python">
+opts = kaldifeat.WhisperFbankOptions()
+opts.num_mels = 128
+opts.device = torch.device('cuda', 0)
+fbank = kaldifeat.WhisperFbank(opts)
+features = fbank(wave)
+</pre>
+</td>
+</tr>
<tr>
<td>FBANK</td>
<td><code>kaldifeat.FbankOptions</code></td>
@ -268,67 +319,11 @@ See <https://github.com/k2-fsa/k2/blob/v2.0-pre/k2/torch/csrc/features.cu>.
See <https://github.com/lhotse-speech/lhotse/blob/master/lhotse/features/kaldifeat.py>.
-# Installation
-## From conda (Only for Linux + CUDA)
-Supported versions of Python, PyTorch, and CUDA toolkit are listed below:
-[![Supported Python versions](/doc/source/images/python-3.6_3.7_3.8-blue.svg)](/doc/source/images/python-3.6_3.7_3.8-blue.svg)
-[![Supported PyTorch versions](/doc/source/images/pytorch-1.6.0_1.7.0_1.7.1_1.8.0_1.8.1_1.9.0-green.svg)](/doc/source/images/pytorch-1.6.0_1.7.0_1.7.1_1.8.0_1.8.1_1.9.0-green.svg)
-[![Supported CUDA versions](/doc/source/images/cuda-10.1_10.2_11.0_11.1-orange.svg)](/doc/source/images/cuda-10.1_10.2_11.0_11.1-orange.svg)
-```bash
-conda install -c kaldifeat -c pytorch -c conda-forge kaldifeat python=3.8 cudatoolkit=11.1 pytorch=1.8.1
-```
-You can select the supported Python version, CUDA toolkit version and PyTorch version as you wish.
-**Note:** If you want a CPU only version or want to install `kaldifeat` on other operating systems,
-e.g., macOS, please use `pip install` or compile `kaldifeat` from source.
-## From PyPi with pip
-You need to install PyTorch and CMake first.
-CMake 3.11 is known to work. Other CMake versions may also work.
-PyTorch 1.6.0 and above are known to work. Other PyTorch versions may also work.
-```bash
-pip install -v kaldifeat
-```
-## From source
-The following are the commands to compile `kaldifeat` from source.
-We assume that you have installed `CMake` and PyTorch.
-CMake 3.11 is known to work. Other CMake versions may also work.
-PyTorch 1.6.0 and above are known to work. Other PyTorch versions may also work.
-```bash
-mkdir /some/path
-git clone https://github.com/csukuangfj/kaldifeat.git
-cd kaldifeat
-python setup.py install
-```
-To test whether `kaldifeat` was installed successfully, you can run:
-```bash
-python3 -c "import kaldifeat; print(kaldifeat.__version__)"
-```
-## FAQs
-### How to install a CUDA version
-There are two approaches:
-- (1) Install using `conda`. It always installs a CUDA version of kaldifeat.
-- (2) Install a CUDA version of PyTorch and then install kaldifeat from source
-or use `pip install kaldifeat`.
-### How to install a CPU-only version
-You have to first install a CPU-only version of PyTorch and then install kaldifeat
-from source or use `pip install kaldifeat`.
+## sherpa
+[sherpa](https://github.com/k2-fsa/sherpa) uses kaldifeat for streaming feature
+extraction.
+See <https://github.com/k2-fsa/sherpa/blob/master/sherpa/bin/pruned_stateless_emformer_rnnt2/decode.py>

View File

@ -8,9 +8,15 @@ import sys
from pathlib import Path
import setuptools
+import torch
from setuptools.command.build_ext import build_ext
+def get_pytorch_version():
+    # if it is 1.7.1+cuda101, then strip +cuda101
+    return torch.__version__.split("+")[0]
def is_for_pypi():
    ans = os.environ.get("KALDIFEAT_IS_FOR_PYPI", None)
    return ans is not None
@ -20,6 +26,10 @@ def is_macos():
    return platform.system() == "Darwin"
+def is_windows():
+    return platform.system() == "Windows"
try:
    from wheel.bdist_wheel import bdist_wheel as _bdist_wheel
@ -35,7 +45,6 @@ try:
            # -linux_x86_64.whl
            self.root_is_pure = False
except ImportError:
    bdist_wheel = None
@ -63,34 +72,67 @@ class BuildExtension(build_ext):
        if cmake_args == "":
            cmake_args = "-DCMAKE_BUILD_TYPE=Release"
-        if make_args == "" and system_make_args == "":
-            print("For fast compilation, run:")
-            print('export KALDIFEAT_MAKE_ARGS="-j"; python setup.py install')
+        extra_cmake_args = " -Dkaldifeat_BUILD_TESTS=OFF "
+        extra_cmake_args += f" -DCMAKE_INSTALL_PREFIX={Path(self.build_lib).resolve()}/kaldifeat "  # noqa
+        major, minor = get_pytorch_version().split(".")[:2]
+        print("major, minor", major, minor)
+        major = int(major)
+        minor = int(minor)
+        if major > 2 or (major == 2 and minor >= 1):
+            extra_cmake_args += f" -DCMAKE_CXX_STANDARD=17 "
        if "PYTHON_EXECUTABLE" not in cmake_args:
            print(f"Setting PYTHON_EXECUTABLE to {sys.executable}")
            cmake_args += f" -DPYTHON_EXECUTABLE={sys.executable}"
-        build_cmd = f"""
-        cd {self.build_temp}
-        cmake {cmake_args} {kaldifeat_dir}
-        make {make_args} _kaldifeat
-        """
-        print(f"build command is:\n{build_cmd}")
-        ret = os.system(build_cmd)
-        if ret != 0:
-            raise Exception(
-                "\nBuild kaldifeat failed. Please check the error message.\n"
-                "You can ask for help by creating an issue on GitHub.\n"
-                "\nClick:\n\thttps://github.com/csukuangfj/kaldifeat/issues/new\n"  # noqa
-            )
-        lib_so = glob.glob(f"{self.build_temp}/lib/*kaldifeat*.so")
-        lib_so += glob.glob(f"{self.build_temp}/lib/*kaldifeat*.dylib")  # macOS
-        for so in lib_so:
-            print(f"Copying {so} to {self.build_lib}/")
-            shutil.copy(f"{so}", f"{self.build_lib}/")
+        cmake_args += extra_cmake_args
+        if is_windows():
+            build_cmd = f"""
+        cmake {cmake_args} -B {self.build_temp} -S {kaldifeat_dir}
+        cmake --build {self.build_temp} --target _kaldifeat --config Release -- -m
+        cmake --build {self.build_temp} --target install --config Release -- -m
+            """
+            print(f"build command is:\n{build_cmd}")
+            ret = os.system(
+                f"cmake {cmake_args} -B {self.build_temp} -S {kaldifeat_dir}"
+            )
+            if ret != 0:
+                raise Exception("Failed to configure kaldifeat")
+            ret = os.system(
+                f"cmake --build {self.build_temp} --target _kaldifeat --config Release -- -m"
+            )
+            if ret != 0:
+                raise Exception("Failed to build kaldifeat")
+            ret = os.system(
+                f"cmake --build {self.build_temp} --target install --config Release -- -m"
+            )
+            if ret != 0:
+                raise Exception("Failed to install kaldifeat")
+        else:
+            if make_args == "" and system_make_args == "":
+                print("For fast compilation, run:")
+                print('export KALDIFEAT_MAKE_ARGS="-j"; python setup.py install')
+                make_args = " -j4 "
+                print("Setting make_args to '-j4'")
+            build_cmd = f"""
+        cd {self.build_temp}
+        cmake {cmake_args} {kaldifeat_dir}
+        make {make_args} _kaldifeat install
+            """
+            print(f"build command is:\n{build_cmd}")
+            ret = os.system(build_cmd)
+            if ret != 0:
+                raise Exception(
+                    "\nBuild kaldifeat failed. Please check the error message.\n"
+                    "You can ask for help by creating an issue on GitHub.\n"
+                    "\nClick:\n\thttps://github.com/csukuangfj/kaldifeat/issues/new\n"  # noqa
                )

View File

@ -24,8 +24,28 @@ function(download_googltest)
  include(FetchContent)
-  set(googletest_URL "https://github.com/google/googletest/archive/release-1.10.0.tar.gz")
-  set(googletest_HASH "SHA256=9dc9157a9a1551ec7a7e43daea9a694a0bb5fb8bec81235d8a1e6ef64c716dcb")
+  set(googletest_URL "https://github.com/google/googletest/archive/refs/tags/v1.13.0.tar.gz")
+  set(googletest_URL2 "https://huggingface.co/csukuangfj/k2-cmake-deps/resolve/main/googletest-1.13.0.tar.gz")
+  set(googletest_HASH "SHA256=ad7fdba11ea011c1d925b3289cf4af2c66a352e18d4c7264392fead75e919363")
+  # If you don't have access to the Internet,
+  # please pre-download googletest
+  set(possible_file_locations
+    $ENV{HOME}/Downloads/googletest-1.13.0.tar.gz
+    ${PROJECT_SOURCE_DIR}/googletest-1.13.0.tar.gz
+    ${PROJECT_BINARY_DIR}/googletest-1.13.0.tar.gz
+    /tmp/googletest-1.13.0.tar.gz
+    /star-fj/fangjun/download/github/googletest-1.13.0.tar.gz
+  )
+  foreach(f IN LISTS possible_file_locations)
+    if(EXISTS ${f})
+      set(googletest_URL "${f}")
+      file(TO_CMAKE_PATH "${googletest_URL}" googletest_URL)
+      set(googletest_URL2)
+      break()
+    endif()
+  endforeach()
  set(BUILD_GMOCK ON CACHE BOOL "" FORCE)
  set(INSTALL_GTEST OFF CACHE BOOL "" FORCE)
@ -33,13 +53,15 @@ function(download_googltest)
  set(gtest_force_shared_crt ON CACHE BOOL "" FORCE)
  FetchContent_Declare(googletest
-    URL ${googletest_URL}
+    URL
+      ${googletest_URL}
+      ${googletest_URL2}
    URL_HASH ${googletest_HASH}
  )
  FetchContent_GetProperties(googletest)
  if(NOT googletest_POPULATED)
-    message(STATUS "Downloading googletest")
+    message(STATUS "Downloading googletest from ${googletest_URL}")
    FetchContent_Populate(googletest)
  endif()
  message(STATUS "googletest is downloaded to ${googletest_SOURCE_DIR}")

View File

@ -0,0 +1,65 @@
# Findkaldifeat
# -------------
#
# Finds the kaldifeat library
#
# This will define the following variables:
#
# KALDIFEAT_FOUND -- True if the system has the kaldifeat library
# KALDIFEAT_INCLUDE_DIRS -- The include directories for kaldifeat
# KALDIFEAT_LIBRARIES -- Libraries to link against
# KALDIFEAT_CXX_FLAGS -- Additional (required) compiler flags
# KALDIFEAT_TORCH_VERSION_MAJOR -- The major version of PyTorch used to compile kaldifeat
# KALDIFEAT_TORCH_VERSION_MINOR -- The minor version of PyTorch used to compile kaldifeat
# KALDIFEAT_VERSION -- The version of kaldifeat
#
# and the following imported targets:
#
# kaldifeat_core
# This file is modified from pytorch/cmake/TorchConfig.cmake.in
set(KALDIFEAT_CXX_FLAGS "@CMAKE_CXX_FLAGS@")
set(KALDIFEAT_TORCH_VERSION_MAJOR @KALDIFEAT_TORCH_VERSION_MAJOR@)
set(KALDIFEAT_TORCH_VERSION_MINOR @KALDIFEAT_TORCH_VERSION_MINOR@)
set(KALDIFEAT_VERSION @kaldifeat_VERSION@)
if(DEFINED ENV{KALDIFEAT_INSTALL_PREFIX})
set(KALDIFEAT_INSTALL_PREFIX $ENV{KALDIFEAT_INSTALL_PREFIX})
else()
# Assume we are in <install-prefix>/share/cmake/kaldifeat/kaldifeatConfig.cmake
get_filename_component(CMAKE_CURRENT_LIST_DIR "${CMAKE_CURRENT_LIST_FILE}" PATH)
get_filename_component(KALDIFEAT_INSTALL_PREFIX "${CMAKE_CURRENT_LIST_DIR}/../../../" ABSOLUTE)
endif()
set(KALDIFEAT_INCLUDE_DIRS ${KALDIFEAT_INSTALL_PREFIX}/include)
set(KALDIFEAT_LIBRARIES kaldifeat_core)
foreach(lib IN LISTS KALDIFEAT_LIBRARIES)
find_library(location_${lib} ${lib}
PATHS
"${KALDIFEAT_INSTALL_PREFIX}/lib"
"${KALDIFEAT_INSTALL_PREFIX}/lib64"
)
if(NOT MSVC)
add_library(${lib} SHARED IMPORTED)
else()
add_library(${lib} STATIC IMPORTED)
endif()
set_target_properties(${lib} PROPERTIES
INTERFACE_INCLUDE_DIRECTORIES "${KALDIFEAT_INCLUDE_DIRS}"
IMPORTED_LOCATION "${location_${lib}}"
CXX_STANDARD 14
)
set_property(TARGET ${lib} PROPERTY INTERFACE_COMPILE_OPTIONS @CMAKE_CXX_FLAGS@)
endforeach()
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(kaldifeat DEFAULT_MSG
location_kaldifeat_core
)

View File

@ -0,0 +1,12 @@
# This file is modified from pytorch/cmake/TorchConfigVersion.cmake.in
set(PACKAGE_VERSION "@kaldifeat_VERSION@")
# Check whether the requested PACKAGE_FIND_VERSION is compatible
if("${PACKAGE_VERSION}" VERSION_LESS "${PACKAGE_FIND_VERSION}")
set(PACKAGE_VERSION_COMPATIBLE FALSE)
else()
set(PACKAGE_VERSION_COMPATIBLE TRUE)
if("${PACKAGE_VERSION}" VERSION_EQUAL "${PACKAGE_FIND_VERSION}")
set(PACKAGE_VERSION_EXACT TRUE)
endif()
endif()

View File

@ -8,20 +8,39 @@ function(download_pybind11)
  include(FetchContent)
-  set(pybind11_URL "https://github.com/pybind/pybind11/archive/v2.6.0.tar.gz")
-  set(pybind11_HASH "SHA256=90b705137b69ee3b5fc655eaca66d0dc9862ea1759226f7ccd3098425ae69571")
+  set(pybind11_URL "https://github.com/pybind/pybind11/archive/refs/tags/v2.12.0.tar.gz")
+  set(pybind11_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/pybind11-2.12.0.tar.gz")
+  set(pybind11_HASH "SHA256=bf8f242abd1abcd375d516a7067490fb71abd79519a282d22b6e4d19282185a7")
+  # If you don't have access to the Internet,
+  # please pre-download pybind11
+  set(possible_file_locations
+    $ENV{HOME}/Downloads/pybind11-2.12.0.tar.gz
+    ${CMAKE_SOURCE_DIR}/pybind11-2.12.0.tar.gz
+    ${CMAKE_BINARY_DIR}/pybind11-2.12.0.tar.gz
+    /tmp/pybind11-2.12.0.tar.gz
+    /star-fj/fangjun/download/github/pybind11-2.12.0.tar.gz
+  )
+  foreach(f IN LISTS possible_file_locations)
+    if(EXISTS ${f})
+      set(pybind11_URL "${f}")
+      file(TO_CMAKE_PATH "${pybind11_URL}" pybind11_URL)
+      set(pybind11_URL2)
+      break()
+    endif()
+  endforeach()
-  set(double_quotes "\"")
-  set(dollar "\$")
-  set(semicolon "\;")
  FetchContent_Declare(pybind11
-    URL ${pybind11_URL}
+    URL
+      ${pybind11_URL}
+      ${pybind11_URL2}
    URL_HASH ${pybind11_HASH}
  )
  FetchContent_GetProperties(pybind11)
  if(NOT pybind11_POPULATED)
-    message(STATUS "Downloading pybind11")
+    message(STATUS "Downloading pybind11 from ${pybind11_URL}")
    FetchContent_Populate(pybind11)
  endif()
  message(STATUS "pybind11 is downloaded to ${pybind11_SOURCE_DIR}")

View File

@ -8,6 +8,7 @@ execute_process(
  OUTPUT_STRIP_TRAILING_WHITESPACE
  OUTPUT_VARIABLE TORCH_DIR
)
+message(STATUS "TORCH_DIR: ${TORCH_DIR}")
list(APPEND CMAKE_PREFIX_PATH "${TORCH_DIR}")
find_package(Torch REQUIRED)

View File

@ -1,6 +1,6 @@
dataclasses
recommonmark
-sphinx
+sphinx<7.0
sphinx-autodoc-typehints
sphinx_rtd_theme
sphinxcontrib-bibtex

View File

View File

@ -1,72 +0,0 @@
#!/usr/bin/env python3
# Copyright 2021 Xiaomi Corporation (authors: Fangjun Kuang)
import numpy as np
import soundfile as sf
import torch
import kaldifeat
def read_wave(filename) -> torch.Tensor:
"""Read a wave file and return it as a 1-D tensor.
Note:
You don't need to scale it to [-32768, 32767].
We use scaling here to follow the approach in Kaldi.
Args:
filename:
Filename of a sound file.
Returns:
Return a 1-D tensor containing audio samples.
"""
with sf.SoundFile(filename) as sf_desc:
sampling_rate = sf_desc.samplerate
assert sampling_rate == 16000
data = sf_desc.read(dtype=np.float32, always_2d=False)
data *= 32768
return torch.from_numpy(data)
def test_fbank():
device = torch.device("cpu")
if torch.cuda.is_available():
device = torch.device("cuda", 0)
wave0 = read_wave("test_data/test.wav")
wave1 = read_wave("test_data/test2.wav")
wave0 = wave0.to(device)
wave1 = wave1.to(device)
opts = kaldifeat.FbankOptions()
opts.frame_opts.dither = 0
opts.device = device
fbank = kaldifeat.Fbank(opts)
# We can compute fbank features in batches
features = fbank([wave0, wave1])
assert isinstance(features, list), f"{type(features)}"
assert len(features) == 2
# We can also compute fbank features for a single wave
features0 = fbank(wave0)
features1 = fbank(wave1)
assert torch.allclose(features[0], features0)
assert torch.allclose(features[1], features1)
# To compute fbank features for only a specified frame
audio_frames = fbank.convert_samples_to_frames(wave0)
feature_frame_1 = fbank.compute(audio_frames[1])
feature_frame_10 = fbank.compute(audio_frames[10])
assert torch.allclose(features0[1], feature_frame_1)
assert torch.allclose(features0[10], feature_frame_10)
if __name__ == "__main__":
test_fbank()

View File

@ -59,7 +59,7 @@ templates_path = ["_templates"]
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
-exclude_patterns = []
+exclude_patterns = ["images/*.md"]
source_suffix = {
    ".rst": "restructuredtext",
@ -102,3 +102,35 @@ html_theme_options = {
    "prev_next_buttons_location": "bottom",
    "style_external_links": True,
}
rst_epilog = """
.. _kaldifeat: https://github.com/csukuangfj/kaldifeat
.. _Kaldi: https://github.com/kaldi-asr/kaldi
.. _PyTorch: https://pytorch.org/
.. _kaldifeat.Fbank: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/fbank.py#L10
.. _kaldifeat.Mfcc: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/mfcc.py#L10
.. _kaldifeat.Plp: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/plp.py#L10
.. _kaldifeat.Spectrogram: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/spectrogram.py#L9
.. _kaldifeat.OnlineFbank: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/fbank.py#L16
.. _kaldifeat.OnlineMfcc: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/mfcc.py#L16
.. _kaldifeat.OnlinePlp: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/plp.py#L16
.. _compute-fbank-feats: https://github.com/kaldi-asr/kaldi/blob/master/src/featbin/compute-fbank-feats.cc
.. _compute-mfcc-feats: https://github.com/kaldi-asr/kaldi/blob/master/src/featbin/compute-mfcc-feats.cc
.. _compute-plp-feats: https://github.com/kaldi-asr/kaldi/blob/master/src/featbin/compute-plp-feats.cc
.. _compute-spectrogram-feats: https://github.com/kaldi-asr/kaldi/blob/master/src/featbin/compute-spectrogram-feats.cc
.. _kaldi::OnlineFbank: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/online-feature.h#L160
.. _kaldi::OnlineMfcc: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/online-feature.h#L158
.. _kaldi::OnlinePlp: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/online-feature.h#L159
.. _kaldifeat.FbankOptions: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/feature-fbank.h#L19
.. _kaldi::FbankOptions: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/feature-fbank.h#L41
.. _kaldifeat.MfccOptions: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/feature-mfcc.h#L22
.. _kaldi::MfccOptions: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/feature-mfcc.h#L38
.. _kaldifeat.PlpOptions: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/feature-plp.h#L24
.. _kaldi::PlpOptions: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/feature-plp.h#L42
.. _kaldifeat.SpectrogramOptions: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/feature-spectrogram.h#L18
.. _kaldi::SpectrogramOptions: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/feature-spectrogram.h#L38
.. _kaldifeat.FrameExtractionOptions: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/feature-window.h#L30
.. _kaldi::FrameExtractionOptions: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/feature-window.h#L35
.. _kaldifeat.MelBanksOptions: https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/mel-computations.h#L17
.. _kaldi::MelBanksOptions: https://github.com/kaldi-asr/kaldi/blob/master/src/feat/mel-computations.h#L43
"""

View File

@ -0,0 +1,8 @@
## File description
<https://shields.io/> is used to create the following files:
- ./os.svg
- ./python_ge_3.6-blue.svg
- ./cuda_ge_10.1-orange.svg
- ./pytorch_ge_1.5.0-green.svg

View File

@ -0,0 +1 @@
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="176" height="20" role="img" aria-label="os: Linux | macOS | Windows"><title>os: Linux | macOS | Windows</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="176" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="23" height="20" fill="#555"/><rect x="23" width="153" height="20" fill="#97ca00"/><rect width="176" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="125" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="130">os</text><text x="125" y="140" transform="scale(.1)" fill="#fff" textLength="130">os</text><text aria-hidden="true" x="985" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="1430">Linux | macOS | Windows</text><text x="985" y="140" transform="scale(.1)" fill="#fff" textLength="1430">Linux | macOS | Windows</text></g></svg>


View File

@ -1 +0,0 @@
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="118" height="20" role="img" aria-label="OS: Linux | macOS"><title>OS: Linux | macOS</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="118" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="27" height="20" fill="#555"/><rect x="27" width="91" height="20" fill="#4c1"/><rect width="118" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="145" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="170">OS</text><text x="145" y="140" transform="scale(.1)" fill="#fff" textLength="170">OS</text><text aria-hidden="true" x="715" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="810">Linux | macOS</text><text x="715" y="140" transform="scale(.1)" fill="#fff" textLength="810">Linux | macOS</text></g></svg>


View File

@ -6,19 +6,11 @@
kaldifeat
=========
-`kaldifeat <https://github.com/csukuangfj/kaldifeat>`_ implements
-feature extraction algorithms **compatible** with kaldi using PyTorch, supporting CUDA
-as well as autograd.
-Currently, only fbank features are supported.
-It can produce the same feature output as ``compute-fbank-feats`` (from kaldi)
-when given the same options.
.. toctree::
   :maxdepth: 2
-   :caption: Contents:
+   :caption: Contents
-   installation
-   usage
+   intro
+   installation/index
+   usage/index

View File

@ -1,54 +0,0 @@
Installation
============
.. _from source:
Install kaldifeat from source
-----------------------------
You have to install ``cmake`` and ``PyTorch`` first.
- ``cmake`` 3.11 is known to work. Other CMake versions may also work.
- ``PyTorch`` 1.8.1 is known to work. Other PyTorch versions may also work.
- Python >= 3.6
The commands to install ``kaldifeat`` from source are:
.. code-block:: bash
git clone https://github.com/csukuangfj/kaldifeat
cd kaldifeat
python3 setup.py install
To test that you have installed ``kaldifeat`` successfully, please run:
.. code-block:: bash
python3 -c "import kaldifeat; print(kaldifeat.__version__)"
It should print the version, e.g., ``1.0``.
Install kaldifeat from PyPI
---------------------------
The pre-built ``kaldifeat`` hosted on PyPI uses PyTorch 1.8.1.
If you install ``kaldifeat`` using pip, it will replace your locally
installed PyTorch automatically with PyTorch 1.8.1.
If you don't want this happen, please `Install kaldifeat from source`_.
The command to install ``kaldifeat`` from PyPI is:
.. code-block:: bash
pip install kaldifeat
To test that you have installed ``kaldifeat`` successfully, please run:
.. code-block:: bash
python3 -c "import kaldifeat; print(kaldifeat.__version__)"
It should print the version, e.g., ``1.0``.

View File

@ -0,0 +1,48 @@
FAQs
====
How to install a CUDA version of kaldifeat from source
------------------------------------------------------
You need to first install a CUDA version of `PyTorch`_ and then install `kaldifeat`_.
.. note::
You can use a CUDA version of `kaldifeat`_ on machines with no GPUs.
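Before compiling, one quick way to check which flavor of `PyTorch`_ is installed
is the following snippet. It is only an illustration and is not part of
`kaldifeat`_ itself:

.. code-block:: python

   import torch

   print(torch.__version__)          # e.g. 2.4.0+cu124 for a CUDA build
   print(torch.version.cuda)         # prints None for a CPU-only build
   print(torch.cuda.is_available())  # may be False on a machine without GPUs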
How to install a CPU version of kaldifeat from source
-----------------------------------------------------
You need to first install a CPU version of `PyTorch`_ and then install `kaldifeat`_.
How to fix `Caffe2: Cannot find cuDNN library`
----------------------------------------------
.. code-block::
Your installed Caffe2 version uses cuDNN but I cannot find the cuDNN
libraries. Please set the proper cuDNN prefixes and / or install cuDNN.
You may encounter this error when installing a CUDA version of `kaldifeat`_
via ``pip install kaldifeat`` or from source.
You need to first install cuDNN. Assume you have installed cuDNN to the
path ``/path/to/cudnn``. You can fix the error by using ``one`` of the following
commands.
(1) Fix for installation using ``pip install``
.. code-block:: bash
export KALDIFEAT_CMAKE_ARGS="-DCMAKE_BUILD_TYPE=Release -DCUDNN_LIBRARY_PATH=/path/to/cudnn/lib/libcudnn.so -DCUDNN_INCLUDE_PATH=/path/to/cudnn/include"
pip install --verbose kaldifeat
(2) Fix for installation from source
.. code-block:: bash
mkdir /some/path
git clone https://github.com/csukuangfj/kaldifeat.git
cd kaldifeat
export KALDIFEAT_CMAKE_ARGS="-DCMAKE_BUILD_TYPE=Release -DCUDNN_LIBRARY_PATH=/path/to/cudnn/lib/libcudnn.so -DCUDNN_INCLUDE_PATH=/path/to/cudnn/include"
python setup.py install

View File

@ -0,0 +1,47 @@
.. _from source:
Install kaldifeat from source
=============================
You have to install ``cmake`` and `PyTorch`_ first.
- ``cmake`` 3.11 is known to work. Other CMake versions may also work.
- `PyTorch`_ >= 1.5.0 is known to work. Other PyTorch versions may also work.
- Python >= 3.6
- A compiler that supports at least C++ 14 (C++ 17 is used when building against PyTorch >= 2.1)
The commands to install `kaldifeat`_ from source are:
.. code-block:: bash
git clone https://github.com/csukuangfj/kaldifeat
cd kaldifeat
python3 setup.py install
To test that you have installed `kaldifeat`_ successfully, please run:
.. code-block:: bash
python3 -c "import kaldifeat; print(kaldifeat.__version__)"
It should print the version, e.g., ``1.0``.
.. _from PyPI:
Install kaldifeat from PyPI
---------------------------
The command to install `kaldifeat`_ from PyPI is:
.. code-block:: bash
pip install --verbose kaldifeat
To test that you have installed `kaldifeat`_ successfully, please run:
.. code-block:: bash
python3 -c "import kaldifeat; print(kaldifeat.__version__)"
It should print the version, e.g., ``1.0``.

View File

@ -0,0 +1,139 @@
From pre-compiled wheels (Recommended)
=======================================
You can find pre-compiled wheels at
- CPU wheels: `<https://csukuangfj.github.io/kaldifeat/cpu.html>`_
- CUDA wheels: `<https://csukuangfj.github.io/kaldifeat/cuda.html>`_
We give a few examples below to show you how to install `kaldifeat`_ from
pre-compiled wheels.
.. hint::
The following lists only some examples. We suggest that you always select the
latest version of ``kaldifeat``.
Linux (CPU)
-----------
Suppose you want to install the following wheel:
.. code-block:: bash
https://huggingface.co/csukuangfj/kaldifeat/resolve/main/ubuntu-cpu/kaldifeat-1.25.4.dev20240725+cpu.torch2.4.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
you can use one of the following methods:
.. code-block:: bash
# method 1
pip install torch==2.4.0+cpu -f https://download.pytorch.org/whl/torch/
pip install kaldifeat==1.25.4.dev20240725+cpu.torch2.4.0 -f https://csukuangfj.github.io/kaldifeat/cpu.html
# For users from China
# If you cannot access huggingface, please use
# pip install kaldifeat==1.25.4.dev20240725+cpu.torch2.4.0 -f https://csukuangfj.github.io/kaldifeat/cpu-cn.html
# method 2
pip install torch==2.4.0+cpu -f https://download.pytorch.org/whl/torch/
wget https://huggingface.co/csukuangfj/kaldifeat/resolve/main/ubuntu-cpu/kaldifeat-1.25.4.dev20240725+cpu.torch2.4.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
# For users from China
# If you cannot access huggingface, please use
# wget https://hf-mirror.com/csukuangfj/kaldifeat/resolve/main/ubuntu-cpu/kaldifeat-1.25.4.dev20240725+cpu.torch2.4.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
pip install ./kaldifeat-1.25.4.dev20240725+cpu.torch2.4.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Windows (CPU)
--------------
Suppose you want to install the following wheel:
.. code-block:: bash
https://huggingface.co/csukuangfj/kaldifeat/resolve/main/windows-cpu/kaldifeat-1.25.4.dev20240725+cpu.torch2.4.0-cp312-cp312-win_amd64.whl
you can use one of the following methods:
.. code-block:: bash
# method 1
pip install torch==2.4.0+cpu -f https://download.pytorch.org/whl/torch/
pip install kaldifeat==1.25.4.dev20240725+cpu.torch2.4.0 -f https://csukuangfj.github.io/kaldifeat/cpu.html
# For users from China
# If you cannot access huggingface, please use
# pip install kaldifeat==1.25.4.dev20240725+cpu.torch2.4.0 -f https://csukuangfj.github.io/kaldifeat/cpu-cn.html
# method 2
pip install torch==2.4.0+cpu -f https://download.pytorch.org/whl/torch/
wget https://huggingface.co/csukuangfj/kaldifeat/resolve/main/windows-cpu/kaldifeat-1.25.4.dev20240725+cpu.torch2.4.0-cp312-cp312-win_amd64.whl
# For users from China
# If you cannot access huggingface, please use
# wget https://hf-mirror.com/csukuangfj/kaldifeat/resolve/main/windows-cpu/kaldifeat-1.25.4.dev20240725+cpu.torch2.4.0-cp312-cp312-win_amd64.whl
pip install ./kaldifeat-1.25.4.dev20240725+cpu.torch2.4.0-cp312-cp312-win_amd64.whl
macOS (CPU)
-----------
Suppose you want to install the following wheel:
.. code-block:: bash
https://huggingface.co/csukuangfj/kaldifeat/resolve/main/macos/kaldifeat-1.25.4.dev20240725+cpu.torch2.4.0-cp311-cp311-macosx_11_0_arm64.whl
you can use one of the following methods:
.. code-block:: bash
# method 1
pip install torch==2.4.0
pip install kaldifeat==1.25.4.dev20240725+cpu.torch2.4.0 -f https://csukuangfj.github.io/kaldifeat/cpu.html
# For users from China
# If you cannot access huggingface, please use
# pip install kaldifeat==1.25.4.dev20240725+cpu.torch2.4.0 -f https://csukuangfj.github.io/kaldifeat/cpu-cn.html
# method 2
pip install torch==2.4.0 -f https://download.pytorch.org/whl/torch/
wget https://huggingface.co/csukuangfj/kaldifeat/resolve/main/macos/kaldifeat-1.25.4.dev20240725+cpu.torch2.4.0-cp311-cp311-macosx_11_0_arm64.whl
# For users from China
# If you cannot access huggingface, please use
# wget https://hf-mirror.com/csukuangfj/kaldifeat/resolve/main/macos/kaldifeat-1.25.4.dev20240725+cpu.torch2.4.0-cp311-cp311-macosx_11_0_arm64.whl
pip install ./kaldifeat-1.25.4.dev20240725+cpu.torch2.4.0-cp311-cp311-macosx_11_0_arm64.whl
Linux (CUDA)
------------
Suppose you want to install the following wheel:
.. code-block:: bash
https://huggingface.co/csukuangfj/kaldifeat/resolve/main/ubuntu-cuda/kaldifeat-1.25.4.dev20240725+cuda12.4.torch2.4.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
you can use one of the following methods:
.. code-block:: bash
# method 1
pip install torch==2.4.0+cu124 -f https://download.pytorch.org/whl/torch/
pip install kaldifeat==1.25.4.dev20240725+cuda12.4.torch2.4.0 -f https://csukuangfj.github.io/kaldifeat/cuda.html
# For users from China
# If you cannot access huggingface, please use
# pip install kaldifeat==1.25.4.dev20240725+cuda12.4.torch2.4.0 -f https://csukuangfj.github.io/kaldifeat/cuda-cn.html
# method 2
pip install torch==2.4.0+cu124 -f https://download.pytorch.org/whl/torch/
wget https://huggingface.co/csukuangfj/kaldifeat/resolve/main/ubuntu-cuda/kaldifeat-1.25.4.dev20240725+cuda12.4.torch2.4.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
# For users from China
# If you cannot access huggingface, please use
# wget https://hf-mirror.com/csukuangfj/kaldifeat/resolve/main/ubuntu-cuda/kaldifeat-1.25.4.dev20240725+cuda12.4.torch2.4.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
pip install ./kaldifeat-1.25.4.dev20240725+cuda12.4.torch2.4.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

View File

@ -0,0 +1,11 @@
Installation
============
.. toctree::
:maxdepth: 3
./from_wheels.rst
./from_source.rst
./faq.rst

doc/source/intro.rst Normal file
View File

@ -0,0 +1,103 @@
Introduction
============
`kaldifeat`_ implements
speech feature extraction algorithms **compatible** with `Kaldi`_ using `PyTorch`_,
supporting CUDA as well as autograd.
`kaldifeat`_ has the following features:
- Fully compatible with `Kaldi`_
.. note::
The underlying C++ code is copied & modified from `Kaldi`_ directly
and rewritten with `PyTorch`_ C++ APIs.
- Provide not only ``C++ APIs`` but also ``Python APIs``
.. note::
You can access `kaldifeat`_ from ``Python``.
- Support autograd
- Support ``CUDA`` and ``CPU``
.. note::
You can use CUDA for feature extraction.
- Support ``online`` (i.e., ``streaming``) and ``offline`` (i.e., ``non-streaming``)
feature extraction
- Support chunk-based processing
.. note::
This is especially useful if you want to process audio recordings that are
several hours long, which may cause OOM if you send them for computation all at once.
With chunk-based processing, you can handle audio of arbitrary length.
- Support batch processing
.. note::
With `kaldifeat`_ you can extract features for a batch of audio waves in a single call, as shown in the sketch right after this list.
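The batch API can be sketched as follows. This is a minimal example that uses
random data as a stand-in for real audio; see
``kaldifeat/python/tests/test_fbank.py`` for a complete, runnable version that
reads actual wave files:

.. code-block:: python

   import torch
   import kaldifeat

   opts = kaldifeat.FbankOptions()
   opts.frame_opts.dither = 0         # make the output deterministic
   opts.device = torch.device("cpu")  # or torch.device("cuda", 0)

   fbank = kaldifeat.Fbank(opts)

   # 1-D tensors of 16 kHz samples, scaled to [-32768, 32767] to match Kaldi
   wave0 = (torch.rand(16000) - 0.5) * 2 * 32768
   wave1 = (torch.rand(8000) - 0.5) * 2 * 32768

   # Batch processing: a list of waveforms in, a list of feature matrices out
   features = fbank([wave0, wave1])
   assert isinstance(features, list) and len(features) == 2

   # A single waveform yields a single 2-D feature tensor
   features0 = fbank(wave0)
   assert torch.allclose(features[0], features0)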
.. see https://sublime-and-sphinx-guide.readthedocs.io/en/latest/tables.html
Currently implemented speech features and their counterparts in `Kaldi`_ are
listed in the following table.
.. list-table:: Supported speech features
:widths: 50 50
:header-rows: 1
* - Supported speech features
- Counterpart in `Kaldi`_
* - `kaldifeat.Fbank`_
- `compute-fbank-feats`_
* - `kaldifeat.Mfcc`_
- `compute-mfcc-feats`_
* - `kaldifeat.Plp`_
- `compute-plp-feats`_
* - `kaldifeat.Spectrogram`_
- `compute-spectrogram-feats`_
* - `kaldifeat.OnlineFbank`_
- `kaldi::OnlineFbank`_
* - `kaldifeat.OnlineMfcc`_
- `kaldi::OnlineMfcc`_
* - `kaldifeat.OnlinePlp`_
- `kaldi::OnlinePlp`_
Each feature computer needs an option. The following table lists the options
for each computer and the corresponding options in `Kaldi`_.
.. hint::
Note that we reuse the parameter names from `Kaldi`_.
Also, online feature computers and offline feature computers share the
same option class.
.. list-table:: Feature computer options
:widths: 50 50
:header-rows: 1
* - Options in `kaldifeat`_
- Corresponding options in `Kaldi`_
* - `kaldifeat.FbankOptions`_
- `kaldi::FbankOptions`_
* - `kaldifeat.MfccOptions`_
- `kaldi::MfccOptions`_
* - `kaldifeat.PlpOptions`_
- `kaldi::PlpOptions`_
* - `kaldifeat.SpectrogramOptions`_
- `kaldi::SpectrogramOptions`_
* - `kaldifeat.FrameExtractionOptions`_
- `kaldi::FrameExtractionOptions`_
* - `kaldifeat.MelBanksOptions`_
- `kaldi::MelBanksOptions`_
Read more to learn how to install `kaldifeat`_ and how to use each feature
computer.

View File

@ -1,212 +0,0 @@
Usage
=====
Let us first see the help message of kaldi's ``compute-fbank-feats``:
.. code-block:: bash
$ compute-fbank-feats
Create Mel-filter bank (FBANK) feature files.
Usage: compute-fbank-feats [options...] <wav-rspecifier> <feats-wspecifier>
Options:
--allow-downsample : If true, allow the input waveform to have a higher frequency than the specified --sample-frequency (and we'll downsample). (bool, default = false)
--allow-upsample : If true, allow the input waveform to have a lower frequency than the specified --sample-frequency (and we'll upsample). (bool, default = false)
--blackman-coeff : Constant coefficient for generalized Blackman window. (float, default = 0.42)
--channel : Channel to extract (-1 -> expect mono, 0 -> left, 1 -> right) (int, default = -1)
--debug-mel : Print out debugging information for mel bin computation (bool, default = false)
--dither : Dithering constant (0.0 means no dither). If you turn this off, you should set the --energy-floor option, e.g. to 1.0 or 0.1 (float, default = 1)
--energy-floor : Floor on energy (absolute, not relative) in FBANK computation. Only makes a difference if --use-energy=true; only necessary if --dither=0.0. Suggested values: 0.1 or 1.0 (float, default = 0)
--frame-length : Frame length in milliseconds (float, default = 25)
--frame-shift : Frame shift in milliseconds (float, default = 10)
--high-freq : High cutoff frequency for mel bins (if <= 0, offset from Nyquist) (float, default = 0)
--htk-compat : If true, put energy last. Warning: not sufficient to get HTK compatible features (need to change other parameters). (bool, default = false)
--low-freq : Low cutoff frequency for mel bins (float, default = 20)
--max-feature-vectors : Memory optimization. If larger than 0, periodically remove feature vectors so that only this number of the latest feature vectors is retained. (int, default = -1)
--min-duration : Minimum duration of segments to process (in seconds). (float, default = 0)
--num-mel-bins : Number of triangular mel-frequency bins (int, default = 23)
--output-format : Format of the output files [kaldi, htk] (string, default = "kaldi")
--preemphasis-coefficient : Coefficient for use in signal preemphasis (float, default = 0.97)
--raw-energy : If true, compute energy before preemphasis and windowing (bool, default = true)
--remove-dc-offset : Subtract mean from waveform on each frame (bool, default = true)
--round-to-power-of-two : If true, round window size to power of two by zero-padding input to FFT. (bool, default = true)
--sample-frequency : Waveform data sample frequency (must match the waveform file, if specified there) (float, default = 16000)
--snip-edges : If true, end effects will be handled by outputting only frames that completely fit in the file, and the number of frames depends on the frame-length. If false, the number of frames depends only on the frame-shift, and we reflect the data at the ends. (bool, default = true)
--subtract-mean : Subtract mean of each feature file [CMS]; not recommended to do it this way. (bool, default = false)
--use-energy : Add an extra dimension with energy to the FBANK output. (bool, default = false)
--use-log-fbank : If true, produce log-filterbank, else produce linear. (bool, default = true)
--use-power : If true, use power, else use magnitude. (bool, default = true)
--utt2spk : Utterance to speaker-id map (if doing VTLN and you have warps per speaker) (string, default = "")
--vtln-high : High inflection point in piecewise linear VTLN warping function (if negative, offset from high-mel-freq (float, default = -500)
--vtln-low : Low inflection point in piecewise linear VTLN warping function (float, default = 100)
--vtln-map : Map from utterance or speaker-id to vtln warp factor (rspecifier) (string, default = "")
--vtln-warp : Vtln warp factor (only applicable if vtln-map not specified) (float, default = 1)
--window-type : Type of window ("hamming"|"hanning"|"povey"|"rectangular"|"sine"|"blackmann") (string, default = "povey")
--write-utt2dur : Wspecifier to write duration of each utterance in seconds, e.g. 'ark,t:utt2dur'. (string, default = "")
Standard options:
--config : Configuration file to read (this option may be repeated) (string, default = "")
--help : Print out usage message (bool, default = false)
--print-args : Print the command line arguments (to stderr) (bool, default = true)
--verbose : Verbose level (higher->more logging) (int, default = 0)
FbankOptions
------------
``kaldifeat`` reuses the same options from kaldi's ``compute-fbank-feats``.
The following shows the default values of ``kaldifeat.FbankOptions``:
.. code-block:: python
>>> import kaldifeat
>>> fbank_opts = kaldifeat.FbankOptions()
>>> print(fbank_opts)
frame_opts:
samp_freq: 16000
frame_shift_ms: 10
frame_length_ms: 25
dither: 1
preemph_coeff: 0.97
remove_dc_offset: 1
window_type: povey
round_to_power_of_two: 1
blackman_coeff: 0.42
snip_edges: 1
mel_opts:
num_bins: 23
low_freq: 20
high_freq: 0
vtln_low: 100
vtln_high: -500
debug_mel: 0
htk_mode: 0
use_energy: 0
energy_floor: 0
raw_energy: 1
htk_compat: 0
use_log_fbank: 1
use_power: 1
device: cpu
It consists of three parts:
- ``frame_opts``
Options in this part are accessed by ``frame_opts.xxx``. That is, to access
the sample rate, you use:
.. code-block:: python
>>> fbank_opts = kaldifeat.FbankOptions()
>>> print(fbank_opts.frame_opts.samp_freq)
16000.0
- ``mel_opts``
Options in this part are accessed by ``mel_opts.xxx``. That is, to access
the number of mel bins, you use:
.. code-block:: python
>>> fbank_opts = kaldifeat.FbankOptions()
>>> print(fbank_opts.mel_opts.num_bins)
23
- fbank related
Options in this part are accessed directly. That is, to access the device
field, you use:
.. code-block::
>>> print(fbank_opts.device)
cpu
>>> fbank_opts.device = 'cuda:0'
>>> print(fbank_opts.device)
cuda:0
>>> import torch
>>> fbank_opts.device = torch.device('cuda', 0)
>>> print(fbank_opts.device)
cuda:0
To change the sample rate to 8000, you can use:
.. code-block:: python
>>> fbank_opts = kaldifeat.FbankOptions()
>>> print(fbank_opts.frame_opts.samp_freq)
16000.0
>>> fbank_opts.frame_opts.samp_freq = 8000
>>> print(fbank_opts.frame_opts.samp_freq)
8000.0
To change ``snip_edges`` to ``False``, you can use:
.. code-block:: python
>>> fbank_opts.frame_opts.snip_edges = False
>>> print(fbank_opts.frame_opts.snip_edges)
False
To change number of mel bins to 80, you can use:
.. code-block:: python
>>> print(fbank_opts.mel_opts.num_bins)
23
>>> fbank_opts.mel_opts.num_bins = 80
>>> print(fbank_opts.mel_opts.num_bins)
80
To change the device to ``cuda``, you can use:
Fbank
-----
The following shows how to use ``kaldifeat.Fbank`` to compute
the fbank features of sound files.
First, let us generate two sound files using ``sox``:
.. code-block:: bash
# generate a wav of two seconds, containing a sine-wave
# swept from 300 Hz to 3300 Hz
sox -n -r 16000 -b 16 test.wav synth 1.2 sine 300-3300
# another sound file with 0.5 seconds
sox -n -r 16000 -b 16 test2.wav synth 0.5 sine 300-3300
.. hint::
You can find the above two files by visiting the following two links:
- `test.wav <https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/tests/test_data/test.wav>`_
- `test2.wav <https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/tests/test_data/test2.wav>`_
The `following code <https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/tests/test_fbank.py>`_
shows the usage of ``kaldifeat.Fbank``.
It shows:
- How to read a sound file. Note that audio samples are scaled to the range [-32768, 32768].
The intention is to produce the same output as kaldi. You don't need to scale it if
you don't care about the compatibility with kaldi
- ``kaldifeat.Fbank`` supports CUDA as well as CPU
- ``kaldifeat.Fbank`` supports processing sound file in a batch as well as accepting
a single sound file
.. literalinclude:: ./code/test_fbank.py
:caption: Demo of ``kaldifeat.Fbank``
:language: python

View File

@ -0,0 +1,46 @@
compute-fbank-feats
Create Mel-filter bank (FBANK) feature files.
Usage: compute-fbank-feats [options...] <wav-rspecifier> <feats-wspecifier>
Options:
--allow-downsample : If true, allow the input waveform to have a higher frequency than the specified --sample-frequency (and we'll downsample). (bool, default = false)
--allow-upsample : If true, allow the input waveform to have a lower frequency than the specified --sample-frequency (and we'll upsample). (bool, default = false)
--blackman-coeff : Constant coefficient for generalized Blackman window. (float, default = 0.42)
--channel : Channel to extract (-1 -> expect mono, 0 -> left, 1 -> right) (int, default = -1)
--debug-mel : Print out debugging information for mel bin computation (bool, default = false)
--dither : Dithering constant (0.0 means no dither). If you turn this off, you should set the --energy-floor option, e.g. to 1.0 or 0.1 (float, default = 1)
--energy-floor : Floor on energy (absolute, not relative) in FBANK computation. Only makes a difference if --use-energy=true; only necessary if --dither=0.0. Suggested values: 0.1 or 1.0 (float, default = 0)
--frame-length : Frame length in milliseconds (float, default = 25)
--frame-shift : Frame shift in milliseconds (float, default = 10)
--high-freq : High cutoff frequency for mel bins (if <= 0, offset from Nyquist) (float, default = 0)
--htk-compat : If true, put energy last. Warning: not sufficient to get HTK compatible features (need to change other parameters). (bool, default = false)
--low-freq : Low cutoff frequency for mel bins (float, default = 20)
--max-feature-vectors : Memory optimization. If larger than 0, periodically remove feature vectors so that only this number of the latest feature vectors is retained. (int, default = -1)
--min-duration : Minimum duration of segments to process (in seconds). (float, default = 0)
--num-mel-bins : Number of triangular mel-frequency bins (int, default = 23)
--output-format : Format of the output files [kaldi, htk] (string, default = "kaldi")
--preemphasis-coefficient : Coefficient for use in signal preemphasis (float, default = 0.97)
--raw-energy : If true, compute energy before preemphasis and windowing (bool, default = true)
--remove-dc-offset : Subtract mean from waveform on each frame (bool, default = true)
--round-to-power-of-two : If true, round window size to power of two by zero-padding input to FFT. (bool, default = true)
--sample-frequency : Waveform data sample frequency (must match the waveform file, if specified there) (float, default = 16000)
--snip-edges : If true, end effects will be handled by outputting only frames that completely fit in the file, and the number of frames depends on the frame-length. If false, the number of frames depends only on the frame-shift, and we reflect the data at the ends. (bool, default = true)
--subtract-mean : Subtract mean of each feature file [CMS]; not recommended to do it this way. (bool, default = false)
--use-energy : Add an extra dimension with energy to the FBANK output. (bool, default = false)
--use-log-fbank : If true, produce log-filterbank, else produce linear. (bool, default = true)
--use-power : If true, use power, else use magnitude. (bool, default = true)
--utt2spk : Utterance to speaker-id map (if doing VTLN and you have warps per speaker) (string, default = "")
--vtln-high : High inflection point in piecewise linear VTLN warping function (if negative, offset from high-mel-freq (float, default = -500)
--vtln-low : Low inflection point in piecewise linear VTLN warping function (float, default = 100)
--vtln-map : Map from utterance or speaker-id to vtln warp factor (rspecifier) (string, default = "")
--vtln-warp : Vtln warp factor (only applicable if vtln-map not specified) (float, default = 1)
--window-type : Type of window ("hamming"|"hanning"|"povey"|"rectangular"|"sine"|"blackmann") (string, default = "povey")
--write-utt2dur : Wspecifier to write duration of each utterance in seconds, e.g. 'ark,t:utt2dur'. (string, default = "")
Standard options:
--config : Configuration file to read (this option may be repeated) (string, default = "")
--help : Print out usage message (bool, default = false)
--print-args : Print the command line arguments (to stderr) (bool, default = true)
--verbose : Verbose level (higher->more logging) (int, default = 0)
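
The options above map directly onto nested fields of kaldifeat.FbankOptions. A minimal sketch of the correspondence (assuming the Python API shown in the interactive session that follows; the exact values are illustrative):

    import kaldifeat

    opts = kaldifeat.FbankOptions()
    opts.mel_opts.num_bins = 80             # --num-mel-bins=80
    opts.frame_opts.dither = 0              # --dither=0
    opts.frame_opts.snip_edges = False      # --snip-edges=false
    opts.frame_opts.frame_length_ms = 25.0  # --frame-length=25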


@ -0,0 +1,65 @@
$ python3
Python 3.8.0 (default, Oct 28 2019, 16:14:01)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import kaldifeat
>>> opts = kaldifeat.FbankOptions()
>>> print(opts)
frame_opts:
samp_freq: 16000
frame_shift_ms: 10
frame_length_ms: 25
dither: 1
preemph_coeff: 0.97
remove_dc_offset: 1
window_type: povey
round_to_power_of_two: 1
blackman_coeff: 0.42
snip_edges: 1
max_feature_vectors: -1
mel_opts:
num_bins: 23
low_freq: 20
high_freq: 0
vtln_low: 100
vtln_high: -500
debug_mel: 0
htk_mode: 0
use_energy: 0
energy_floor: 0
raw_energy: 1
htk_compat: 0
use_log_fbank: 1
use_power: 1
device: cpu
>>> print(opts.dither)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: '_kaldifeat.FbankOptions' object has no attribute 'dither'
>>>
>>> print(opts.frame_opts.dither)
1.0
>>> opts.frame_opts.dither = 0 # disable dither
>>> print(opts.frame_opts.dither)
0.0
>>> import torch
>>> print(opts.device)
cpu
>>> opts.device = 'cuda:0'
>>> print(opts.device)
cuda:0
>>> opts.device = torch.device('cuda', 1)
>>> print(opts.device)
cuda:1
>>> opts.device = 'cpu'
>>> print(opts.device)
cpu
>>> print(opts.mel_opts.num_bins)
23
>>> opts.mel_opts.num_bins = 80
>>> print(opts.mel_opts.num_bins)
80


@ -0,0 +1 @@
../../../../kaldifeat/python/tests/test_fbank_options.py


@ -0,0 +1,3 @@
kaldifeat.Fbank
===============


@ -0,0 +1,51 @@
kaldifeat.FbankOptions
======================

If you want to construct an instance of `kaldifeat.Fbank`_ or
`kaldifeat.OnlineFbank`_, you have to provide an instance of
`kaldifeat.FbankOptions`_.

The following code shows how to construct an instance of `kaldifeat.FbankOptions`_.

.. literalinclude:: ./code/fbank_options-1.txt
   :caption: Usage of `kaldifeat.FbankOptions`_
   :emphasize-lines: 6,8,22,37

Note that we reuse the same option names as `compute-fbank-feats`_ from `Kaldi`_:

.. code-block:: bash

   $ compute-fbank-feats --help

.. literalinclude:: ./code/compute-fbank-feats-help.txt
   :caption: Output of ``compute-fbank-feats --help``

Please refer to the output of ``compute-fbank-feats --help`` for the meaning
of each field of `kaldifeat.FbankOptions`_.

One thing worth noting is that `kaldifeat.FbankOptions`_ has a field ``device``,
which is an instance of ``torch.device``. You can assign it either a string, e.g.,
``"cpu"`` or ``"cuda:0"``, or an instance of ``torch.device``, e.g., ``torch.device("cpu")`` or
``torch.device("cuda", 1)``.

.. hint::

   You can use this field to control whether the feature computer
   constructed from it performs computation on CPU or CUDA.

.. caution::

   If you use a CUDA device, make sure that you have installed a CUDA version
   of `PyTorch`_.

Example usage
-------------

The following code from
`<https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/tests/test_fbank_options.py>`_
demonstrates the usage of `kaldifeat.FbankOptions`_:

.. literalinclude:: ./code/test_fbank_options.py
   :caption: Example usage of `kaldifeat.FbankOptions`_
   :language: python
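
A minimal end-to-end sketch, assuming a 1-D float32 waveform tensor and the kaldifeat.Fbank wrapper described here (see the test files further below for complete examples):

    import torch
    import kaldifeat

    opts = kaldifeat.FbankOptions()
    opts.mel_opts.num_bins = 80
    opts.frame_opts.dither = 0
    opts.device = "cpu"

    fbank = kaldifeat.Fbank(opts)
    wave = torch.rand(16000)   # 1 second of 16 kHz audio
    features = fbank(wave)     # 2-D tensor of shape (num_frames, 80)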


@ -0,0 +1,11 @@
Usage
=====

This section describes how to use feature computers in `kaldifeat`_.

.. toctree::
   :maxdepth: 2

   fbank_options
   fbank
   online_fbank


@ -0,0 +1,3 @@
kaldifeat.OnlineFbank
=====================

get_version.py (executable file)

@ -0,0 +1,106 @@
#!/usr/bin/env python3
import datetime
import os
import platform
import re
import shutil
import torch
def is_macos():
return platform.system() == "Darwin"
def is_windows():
return platform.system() == "Windows"
def with_cuda():
if shutil.which("nvcc") is None:
return False
if is_macos():
return False
return True
def get_pytorch_version():
# if it is 1.7.1+cuda101, then strip +cuda101
return torch.__version__.split("+")[0]
def get_cuda_version():
from torch.utils import collect_env
running_cuda_version = collect_env.get_running_cuda_version(collect_env.run)
cuda_version = torch.version.cuda
if running_cuda_version is not None and cuda_version is not None:
assert cuda_version in running_cuda_version, (
f"PyTorch is built with CUDA version: {cuda_version}.\n"
f"The current running CUDA version is: {running_cuda_version}"
)
return cuda_version
def is_for_pypi():
ans = os.environ.get("KALDIFEAT_IS_FOR_PYPI", None)
return ans is not None
def is_stable():
ans = os.environ.get("KALDIFEAT_IS_STABLE", None)
return ans is not None
def is_for_conda():
ans = os.environ.get("KALDIFEAT_IS_FOR_CONDA", None)
return ans is not None
def get_package_version():
# Set a default CUDA version here so that `pip install kaldifeat`
# uses the default CUDA version.
#
default_cuda_version = "10.1" # CUDA 10.1
if with_cuda():
cuda_version = get_cuda_version()
if is_for_pypi() and default_cuda_version == cuda_version:
cuda_version = ""
pytorch_version = ""
local_version = ""
else:
cuda_version = f"+cuda{cuda_version}"
pytorch_version = get_pytorch_version()
local_version = f"{cuda_version}.torch{pytorch_version}"
else:
pytorch_version = get_pytorch_version()
local_version = f"+cpu.torch{pytorch_version}"
if is_for_conda():
local_version = ""
if is_for_pypi() and is_macos():
local_version = ""
with open("CMakeLists.txt") as f:
content = f.read()
latest_version = re.search(r"set\(kaldifeat_VERSION (.*)\)", content).group(
1
)
latest_version = latest_version.strip('"')
if not is_stable():
dt = datetime.datetime.utcnow()
package_version = f"{latest_version}.dev{dt.year}{dt.month:02d}{dt.day:02d}{local_version}"
else:
package_version = f"{latest_version}"
return package_version
if __name__ == "__main__":
print(get_package_version())
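
As a rough illustration (the version number and date here are made up): a non-stable CUDA build against CUDA 11.7 and PyTorch 2.0.1 produces something like 1.25.dev20230726+cuda11.7.torch2.0.1, a CPU-only build produces 1.25.dev20230726+cpu.torch2.0.1, and a stable PyPI build against the default CUDA version drops the local suffix and yields just 1.25.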


@ -1,2 +1,4 @@
add_subdirectory(csrc)
-add_subdirectory(python)
+if(kaldifeat_BUILD_PYMODULE)
+add_subdirectory(python)
+endif()


@ -10,14 +10,25 @@ set(kaldifeat_srcs
matrix-functions.cc
mel-computations.cc
online-feature.cc
+whisper-fbank.cc
)
-add_library(kaldifeat_core SHARED ${kaldifeat_srcs})
+add_library(kaldifeat_core ${kaldifeat_srcs})
target_link_libraries(kaldifeat_core PUBLIC ${TORCH_LIBRARIES})
target_compile_definitions(kaldifeat_core PUBLIC KALDIFEAT_TORCH_VERSION_MAJOR=${KALDIFEAT_TORCH_VERSION_MAJOR})
target_compile_definitions(kaldifeat_core PUBLIC KALDIFEAT_TORCH_VERSION_MINOR=${KALDIFEAT_TORCH_VERSION_MINOR})
+if(APPLE)
+execute_process(
+COMMAND "${PYTHON_EXECUTABLE}" -c "from distutils.sysconfig import get_python_lib; print(get_python_lib())"
+OUTPUT_STRIP_TRAILING_WHITESPACE
+OUTPUT_VARIABLE PYTHON_SITE_PACKAGE_DIR
+)
+message(STATUS "PYTHON_SITE_PACKAGE_DIR: ${PYTHON_SITE_PACKAGE_DIR}")
+target_link_libraries(kaldifeat_core PUBLIC "-L ${PYTHON_SITE_PACKAGE_DIR}/../..")
+endif()
add_executable(test_kaldifeat test_kaldifeat.cc)
target_link_libraries(test_kaldifeat PRIVATE kaldifeat_core)
@ -31,9 +42,15 @@ function(kaldifeat_add_test source)
gtest_main
)
+# NOTE: We set the working directory here so that
+# it works also on windows. The reason is that
+# the required DLLs are inside ${TORCH_DIR}/lib
+# and they can be found by the exe if the current
+# working directory is ${TORCH_DIR}\lib
add_test(NAME "Test.${name}"
COMMAND
$<TARGET_FILE:${name}>
+WORKING_DIRECTORY ${TORCH_DIR}/lib
)
endfunction()
@ -48,3 +65,29 @@ if(kaldifeat_BUILD_TESTS)
kaldifeat_add_test(${source})
endforeach()
endif()
file(MAKE_DIRECTORY
DESTINATION
${PROJECT_BINARY_DIR}/include/kaldifeat/csrc
)
file(GLOB_RECURSE all_headers *.h)
message(STATUS "All headers: ${all_headers}")
file(COPY
${all_headers}
DESTINATION
${PROJECT_BINARY_DIR}/include/kaldifeat/csrc
)
if(BUILD_SHARED_LIBS AND WIN32)
install(TARGETS kaldifeat_core
DESTINATION ../
)
endif()
install(TARGETS kaldifeat_core
DESTINATION ${CMAKE_INSTALL_LIBDIR}
)
install(FILES ${all_headers}
DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}/kaldifeat/csrc
)


@ -0,0 +1 @@
exclude_files=whisper-mel-bank.h,whisper-v3-mel-bank.h


@ -55,10 +55,17 @@ torch::Tensor OfflineFeatureTpl<F>::ComputeFeatures(const torch::Tensor &wave,
int32_t padding = frame_opts.PaddedWindowSize() - strided_input.size(1);
if (padding > 0) {
+#ifdef __ANDROID__
+auto padding_value = torch::zeros(
+{strided_input.size(0), padding},
+torch::dtype(torch::kFloat).device(strided_input.device()));
+strided_input = torch::cat({strided_input, padding_value}, 1);
+#else
strided_input = torch::nn::functional::pad(
strided_input, torch::nn::functional::PadFuncOptions({0, padding})
.mode(torch::kConstant)
.value(0));
+#endif
}
return computer_.Compute(log_energy_pre_window, vtln_warp, strided_input);


@ -65,7 +65,7 @@ torch::Tensor FbankComputer::Compute(torch::Tensor signal_raw_log_energy,
// note spectrum is in magnitude, not power, because of `abs()`
#if defined(KALDIFEAT_HAS_FFT_NAMESPACE)
// signal_frame shape: [x, 512]
-// spectrum shape [x, 257
+// spectrum shape [x, 257]
torch::Tensor spectrum = torch::fft::rfft(signal_frame).abs();
#else
// signal_frame shape [x, 512]


@ -44,20 +44,18 @@ struct FbankOptions {
std::string ToString() const {
std::ostringstream os;
-os << "frame_opts: \n";
-os << frame_opts << "\n";
-os << "\n";
-os << "mel_opts: \n";
-os << mel_opts << "\n";
-os << "use_energy: " << use_energy << "\n";
-os << "energy_floor: " << energy_floor << "\n";
-os << "raw_energy: " << raw_energy << "\n";
-os << "htk_compat: " << htk_compat << "\n";
-os << "use_log_fbank: " << use_log_fbank << "\n";
-os << "use_power: " << use_power << "\n";
-os << "device: " << device << "\n";
+os << "FbankOptions(";
+os << "frame_opts=" << frame_opts.ToString() << ", ";
+os << "mel_opts=" << mel_opts.ToString() << ", ";
+os << "use_energy=" << (use_energy ? "True" : "False") << ", ";
+os << "energy_floor=" << energy_floor << ", ";
+os << "raw_energy=" << (raw_energy ? "True" : "False") << ", ";
+os << "htk_compat=" << (htk_compat ? "True" : "False") << ", ";
+os << "use_log_fbank=" << (use_log_fbank ? "True" : "False") << ", ";
+os << "use_power=" << (use_power ? "True" : "False") << ", ";
+os << "device=\"" << device << "\")";
return os.str();
}
};


@ -53,20 +53,18 @@ struct MfccOptions {
std::string ToString() const {
std::ostringstream os;
-os << "frame_opts: \n";
-os << frame_opts << "\n";
-os << "\n";
-os << "mel_opts: \n";
-os << mel_opts << "\n";
-os << "num_ceps: " << num_ceps << "\n";
-os << "use_energy: " << use_energy << "\n";
-os << "energy_floor: " << energy_floor << "\n";
-os << "raw_energy: " << raw_energy << "\n";
-os << "cepstral_lifter: " << cepstral_lifter << "\n";
-os << "htk_compat: " << htk_compat << "\n";
-os << "device: " << device << "\n";
+os << "MfccOptions(";
+os << "frame_opts=" << frame_opts.ToString() << ", ";
+os << "mel_opts=" << mel_opts.ToString() << ", ";
+os << "num_ceps=" << num_ceps << ", ";
+os << "use_energy=" << (use_energy ? "True" : "False") << ", ";
+os << "energy_floor=" << energy_floor << ", ";
+os << "raw_energy=" << (raw_energy ? "True" : "False") << ", ";
+os << "cepstral_lifter=" << cepstral_lifter << ", ";
+os << "htk_compat=" << (htk_compat ? "True" : "False") << ", ";
+os << "device=\"" << device << "\")";
return os.str();
}
};


@ -61,23 +61,21 @@ struct PlpOptions {
std::string ToString() const {
std::ostringstream os;
-os << "frame_opts: \n";
-os << frame_opts << "\n";
-os << "\n";
-os << "mel_opts: \n";
-os << mel_opts << "\n";
-os << "lpc_order: " << lpc_order << "\n";
-os << "num_ceps: " << num_ceps << "\n";
-os << "use_energy: " << use_energy << "\n";
-os << "energy_floor: " << energy_floor << "\n";
-os << "raw_energy: " << raw_energy << "\n";
-os << "compress_factor: " << compress_factor << "\n";
-os << "cepstral_lifter: " << cepstral_lifter << "\n";
-os << "cepstral_scale: " << cepstral_scale << "\n";
-os << "htk_compat: " << htk_compat << "\n";
-os << "device: " << device << "\n";
+os << "PlpOptions(";
+os << "frame_opts=" << frame_opts.ToString() << ", ";
+os << "mel_opts=" << mel_opts.ToString() << ", ";
+os << "lpc_order=" << lpc_order << ", ";
+os << "num_ceps=" << num_ceps << ", ";
+os << "use_energy=" << (use_energy ? "True" : "False") << ", ";
+os << "energy_floor=" << energy_floor << ", ";
+os << "raw_energy=" << (raw_energy ? "True" : "False") << ", ";
+os << "compress_factor=" << compress_factor << ", ";
+os << "cepstral_lifter=" << cepstral_lifter << ", ";
+os << "cepstral_scale=" << cepstral_scale << ", ";
+os << "htk_compat=" << (htk_compat ? "True" : "False") << ", ";
+os << "device=\"" << device << "\")";
return os.str();
}
};


@ -36,13 +36,12 @@ struct SpectrogramOptions {
std::string ToString() const {
std::ostringstream os;
-os << "frame_opts: \n";
-os << frame_opts << "\n";
-os << "energy_floor: " << energy_floor << "\n";
-os << "raw_energy: " << raw_energy << "\n";
-// os << "return_raw_fft: " << return_raw_fft << "\n";
-os << "device: " << device << "\n";
+os << "SpectrogramOptions(";
+os << "frame_opts=" << frame_opts.ToString() << ", ";
+os << "energy_floor=" << energy_floor << ", ";
+os << "raw_energy=" << (raw_energy ? "True" : "False") << ", ";
+os << "return_raw_fft=" << (return_raw_fft ? "True" : "False") << ", ";
+os << "device=\"" << device << "\")";
return os.str();
}
};


@ -29,6 +29,13 @@ FeatureWindowFunction::FeatureWindowFunction(const FrameExtractionOptions &opts,
float *window_data = window.data_ptr<float>();
double a = M_2PI / (frame_length - 1);
+if (opts.window_type == "hann") {
+// see https://pytorch.org/docs/stable/generated/torch.hann_window.html
+// We assume periodic is true
+a = M_2PI / frame_length;
+}
for (int32_t i = 0; i < frame_length; i++) {
double i_fl = static_cast<double>(i);
if (opts.window_type == "hanning") {
@ -39,6 +46,8 @@ FeatureWindowFunction::FeatureWindowFunction(const FrameExtractionOptions &opts,
window_data[i] = sin(0.5 * a * i_fl);
} else if (opts.window_type == "hamming") {
window_data[i] = 0.54 - 0.46 * cos(a * i_fl);
+} else if (opts.window_type == "hann") {
+window_data[i] = 0.50 - 0.50 * cos(a * i_fl);
} else if (opts.window_type ==
"povey") { // like hamming but goes to zero at edges.
window_data[i] = pow(0.5 - 0.5 * cos(a * i_fl), 0.85);
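
As a quick sanity check of the periodic Hann formula used in this change (a sketch that only assumes stock PyTorch; it is not part of the patch itself):

    import math
    import torch

    frame_length = 400  # 25 ms at 16 kHz
    a = 2 * math.pi / frame_length  # the "hann" branch above: M_2PI / frame_length
    manual = torch.tensor([0.5 - 0.5 * math.cos(a * i) for i in range(frame_length)])
    assert torch.allclose(manual, torch.hann_window(frame_length, periodic=True), atol=1e-6)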


@ -62,21 +62,20 @@ struct FrameExtractionOptions {
}
std::string ToString() const {
std::ostringstream os;
-#define KALDIFEAT_PRINT(x) os << #x << ": " << x << "\n"
-KALDIFEAT_PRINT(samp_freq);
-KALDIFEAT_PRINT(frame_shift_ms);
-KALDIFEAT_PRINT(frame_length_ms);
-KALDIFEAT_PRINT(dither);
-KALDIFEAT_PRINT(preemph_coeff);
-KALDIFEAT_PRINT(remove_dc_offset);
-KALDIFEAT_PRINT(window_type);
-KALDIFEAT_PRINT(round_to_power_of_two);
-KALDIFEAT_PRINT(blackman_coeff);
-KALDIFEAT_PRINT(snip_edges);
-// KALDIFEAT_PRINT(allow_downsample);
-// KALDIFEAT_PRINT(allow_upsample);
-KALDIFEAT_PRINT(max_feature_vectors);
-#undef KALDIFEAT_PRINT
+os << "FrameExtractionOptions(";
+os << "samp_freq=" << samp_freq << ", ";
+os << "frame_shift_ms=" << frame_shift_ms << ", ";
+os << "frame_length_ms=" << frame_length_ms << ", ";
+os << "dither=" << dither << ", ";
+os << "preemph_coeff=" << preemph_coeff << ", ";
+os << "remove_dc_offset=" << (remove_dc_offset ? "True" : "False") << ", ";
+os << "window_type=" << '"' << window_type << '"' << ", ";
+os << "round_to_power_of_two=" << (round_to_power_of_two ? "True" : "False")
+<< ", ";
+os << "blackman_coeff=" << blackman_coeff << ", ";
+os << "snip_edges=" << (snip_edges ? "True" : "False") << ", ";
+os << "max_feature_vectors=" << max_feature_vectors << ")";
return os.str();
}
};


@ -0,0 +1,39 @@
#!/usr/bin/env python3
# Copyright (c) 2023 Xiaomi Corporation (authors: Fangjun Kuang)
import librosa
import numpy as np
def main():
m = librosa.filters.mel(sr=16000, n_fft=400, n_mels=128)
assert m.shape == (128, 201)
s = "// Auto-generated. Do NOT edit!\n\n"
s += "// Copyright (c) 2023 Xiaomi Corporation (authors: Fangjun Kuang)\n\n"
s += "\n"
s += "#ifndef KALDIFEAT_CSRC_WHISPER_V3_MEL_BANK_H_\n"
s += "#define KALDIFEAT_CSRC_WHISPER_V3_MEL_BANK_H_\n"
s += "namespace kaldifeat {\n\n"
s += f"constexpr int32_t kWhisperV3MelRows = {m.shape[0]};\n"
s += f"constexpr int32_t kWhisperV3MelCols = {m.shape[1]};\n"
s += "\n"
s += "constexpr float kWhisperV3MelArray[] = {\n"
sep = ""
for i, f in enumerate(m.reshape(-1).tolist()):
s += f"{sep}{f:.8f}"
sep = ", "
if i and i % 7 == 0:
s += ",\n"
sep = ""
s += "};\n\n"
s += "} // namespace kaldifeat\n\n"
s += "#endif // KALDIFEAT_CSRC_WHISPER_V3_MEL_BANK_H_\n"
with open("whisper-v3-mel-bank.h", "w") as f:
f.write(s)
if __name__ == "__main__":
main()


@ -0,0 +1,39 @@
#!/usr/bin/env python3
# Copyright (c) 2023 Xiaomi Corporation (authors: Fangjun Kuang)
import librosa
import numpy as np
def main():
m = librosa.filters.mel(sr=16000, n_fft=400, n_mels=80)
assert m.shape == (80, 201)
s = "// Auto-generated. Do NOT edit!\n\n"
s += "// Copyright (c) 2023 Xiaomi Corporation (authors: Fangjun Kuang)\n\n"
s += "\n"
s += "#ifndef KALDIFEAT_CSRC_WHISPER_MEL_BANK_H_\n"
s += "#define KALDIFEAT_CSRC_WHISPER_MEL_BANK_H_\n"
s += "namespace kaldifeat {\n\n"
s += f"constexpr int32_t kWhisperMelRows = {m.shape[0]};\n"
s += f"constexpr int32_t kWhisperMelCols = {m.shape[1]};\n"
s += "\n"
s += "constexpr float kWhisperMelArray[] = {\n"
sep = ""
for i, f in enumerate(m.reshape(-1).tolist()):
s += f"{sep}{f:.8f}"
sep = ", "
if i and i % 7 == 0:
s += ",\n"
sep = ""
s += "};\n\n"
s += "} // namespace kaldifeat\n\n"
s += "#endif // KALDIFEAT_CSRC_WHISPER_MEL_BANK_H_\n"
with open("whisper-mel-bank.h", "w") as f:
f.write(s)
if __name__ == "__main__":
main()
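
Each generator writes a flat constexpr float array plus its dimensions: 80 x 201 = 16,080 coefficients for the base Whisper models and 128 x 201 = 25,728 for large-v3. The generated headers are therefore large, which is presumably why two file diffs are suppressed further below and why they are excluded from style checks via the exclude_files entry above.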


@ -5,6 +5,7 @@
#ifndef KALDIFEAT_CSRC_LOG_H_
#define KALDIFEAT_CSRC_LOG_H_
+#include <cstdint>
#include <cstdlib>
#include <iostream>
#include <sstream>


@ -138,7 +138,7 @@ MelBanks::MelBanks(const MelBanksOptions &opts,
<< " and vtln-high " << vtln_high << ", versus "
<< "low-freq " << low_freq << " and high-freq " << high_freq;
-// we will transpose bins_mat_ at the end of this funciton
+// we will transpose bins_mat_ at the end of this function
bins_mat_ = torch::zeros({num_bins, num_fft_bins}, torch::kFloat);
int32_t stride = bins_mat_.strides()[0];
@ -179,12 +179,14 @@ MelBanks::MelBanks(const MelBanksOptions &opts,
last_index = i;
}
}
-KALDIFEAT_ASSERT(first_index != -1 && last_index >= first_index &&
-"You may have set num_mel_bins too large.");
+// Note: It is possible that first_index == last_index == -1 at this line.
// Replicate a bug in HTK, for testing purposes.
-if (opts.htk_mode && bin == 0 && mel_low_freq != 0.0f)
+if (opts.htk_mode && bin == 0 && mel_low_freq != 0.0f &&
+first_index != -1) {
this_bin[first_index] = 0.0f;
+}
}
if (debug_) KALDIFEAT_LOG << bins_mat_;
@ -196,6 +198,15 @@ MelBanks::MelBanks(const MelBanksOptions &opts,
}
}
+MelBanks::MelBanks(const float *weights, int32_t num_rows, int32_t num_cols,
+torch::Device device)
+: debug_(false), htk_mode_(false) {
+bins_mat_ = torch::from_blob(const_cast<float *>(weights),
+{num_rows, num_cols}, torch::kFloat)
+.t()
+.to(device);
+}
torch::Tensor MelBanks::Compute(const torch::Tensor &spectrum) const {
return torch::mm(spectrum, bins_mat_);
}


@ -36,13 +36,14 @@ struct MelBanksOptions {
std::string ToString() const {
std::ostringstream os;
-os << "num_bins: " << num_bins << "\n";
-os << "low_freq: " << low_freq << "\n";
-os << "high_freq: " << high_freq << "\n";
-os << "vtln_low: " << vtln_low << "\n";
-os << "vtln_high: " << vtln_high << "\n";
-os << "debug_mel: " << debug_mel << "\n";
-os << "htk_mode: " << htk_mode << "\n";
+os << "MelBanksOptions(";
+os << "num_bins=" << num_bins << ", ";
+os << "low_freq=" << low_freq << ", ";
+os << "high_freq=" << high_freq << ", ";
+os << "vtln_low=" << vtln_low << ", ";
+os << "vtln_high=" << vtln_high << ", ";
+os << "debug_mel=" << (debug_mel ? "True" : "False") << ", ";
+os << "htk_mode=" << (htk_mode ? "True" : "False") << ")";
return os.str();
}
};
@ -75,6 +76,17 @@ class MelBanks {
const FrameExtractionOptions &frame_opts, float vtln_warp_factor,
torch::Device device);
+// Initialize with a 2-d weights matrix
+//
+// Note: This constructor is for Whisper. It does not initialize
+// center_freqs_.
+//
+// @param weights Pointer to the start address of the matrix
+// @param num_rows It equals to number of mel bins
+// @param num_cols It equals to (number of fft bins)/2+1
+MelBanks(const float *weights, int32_t num_rows, int32_t num_cols,
+torch::Device device);
// CAUTION: we save a transposed version of bins_mat_, so return size(1) here
int32_t NumBins() const { return static_cast<int32_t>(bins_mat_.size(1)); }
@ -88,7 +100,8 @@ class MelBanks {
private:
// A 2-D matrix. Its shape is NOT [num_bins, num_fft_bins]
-// Its shape is [num_fft_bins, num_bins].
+// Its shape is [num_fft_bins, num_bins] for non-whisper.
+// For whisper, its shape is [num_fft_bins/2+1, num_bins]
torch::Tensor bins_mat_;
// center frequencies of bins, numbered from 0 ... num_bins-1.


@ -0,0 +1,88 @@
/**
* Copyright (c) 2023 Xiaomi Corporation (authors: Fangjun Kuang)
*
* See LICENSE for clarification regarding multiple authors
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include "kaldifeat/csrc/whisper-fbank.h"
#include <cmath>
#include <vector>
#include "kaldifeat/csrc/mel-computations.h"
#include "kaldifeat/csrc/whisper-mel-bank.h"
#include "kaldifeat/csrc/whisper-v3-mel-bank.h"
#ifndef M_2PI
#define M_2PI 6.283185307179586476925286766559005
#endif
namespace kaldifeat {
WhisperFbankComputer::WhisperFbankComputer(const WhisperFbankOptions &opts)
: opts_(opts) {
if (opts.num_mels == 80) {
mel_banks_ = std::make_unique<MelBanks>(kWhisperMelArray, kWhisperMelRows,
kWhisperMelCols, opts.device);
} else if (opts.num_mels == 128) {
mel_banks_ = std::make_unique<MelBanks>(
kWhisperV3MelArray, kWhisperV3MelRows, kWhisperV3MelCols, opts.device);
} else {
KALDIFEAT_ERR << "Unsupported num_mels: " << opts.num_mels
<< ". Support only 80 and 128";
}
opts_.frame_opts.samp_freq = 16000;
opts_.frame_opts.frame_shift_ms = 10;
opts_.frame_opts.frame_length_ms = 25;
opts_.frame_opts.dither = 0;
opts_.frame_opts.preemph_coeff = 0;
opts_.frame_opts.remove_dc_offset = false;
opts_.frame_opts.window_type = "hann";
opts_.frame_opts.round_to_power_of_two = false;
opts_.frame_opts.snip_edges = false;
}
torch::Tensor WhisperFbankComputer::Compute(
torch::Tensor /*signal_raw_log_energy*/, float /*vtln_warp*/,
const torch::Tensor &signal_frame) {
KALDIFEAT_ASSERT(signal_frame.dim() == 2);
KALDIFEAT_ASSERT(signal_frame.size(1) == opts_.frame_opts.PaddedWindowSize());
// note spectrum is in magnitude, not power, because of `abs()`
#if defined(KALDIFEAT_HAS_FFT_NAMESPACE)
// signal_frame shape: [x, 512]
// power shape [x, 257]
torch::Tensor power = torch::fft::rfft(signal_frame).abs().pow(2);
#else
// signal_frame shape [x, 512]
// real_imag shape [x, 257, 2],
// where [..., 0] is the real part
// [..., 1] is the imaginary part
torch::Tensor real_imag = torch::rfft(signal_frame, 1);
torch::Tensor real = real_imag.index({"...", 0});
torch::Tensor imag = real_imag.index({"...", 1});
torch::Tensor power = (real.square() + imag.square());
#endif
torch::Tensor mel_energies = mel_banks_->Compute(power);
torch::Tensor log_spec = torch::clamp_min(mel_energies, 1e-10).log10();
log_spec = torch::maximum(log_spec, log_spec.max() - 8.0);
torch::Tensor mel = (log_spec + 4.0) / 4.0;
return mel;
}
} // namespace kaldifeat


@ -0,0 +1,78 @@
/**
* Copyright (c) 2023 Xiaomi Corporation (authors: Fangjun Kuang)
*
* See LICENSE for clarification regarding multiple authors
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef KALDIFEAT_CSRC_WHISPER_FBANK_H_
#define KALDIFEAT_CSRC_WHISPER_FBANK_H_
#include <memory>
#include <string>
#include <vector>
#include "kaldifeat/csrc/feature-common.h"
#include "kaldifeat/csrc/feature-window.h"
#include "kaldifeat/csrc/mel-computations.h"
namespace kaldifeat {
struct WhisperFbankOptions {
FrameExtractionOptions frame_opts;
// for large v3, please use 128
int32_t num_mels = 80;
torch::Device device{"cpu"};
std::string ToString() const {
std::ostringstream os;
os << "WhisperFbankOptions(";
os << "frame_opts=" << frame_opts.ToString() << ", ";
os << "num_mels=" << num_mels << ", ";
os << "device=\"" << device << "\")";
return os.str();
}
};
class WhisperFbankComputer {
public:
// note: Only frame_opts.device is used. All other fields from frame_opts
// are ignored
explicit WhisperFbankComputer(const WhisperFbankOptions &opts = {});
int32_t Dim() const { return opts_.num_mels; }
const FrameExtractionOptions &GetFrameOptions() const {
return opts_.frame_opts;
}
const WhisperFbankOptions &GetOptions() const { return opts_; }
torch::Tensor Compute(torch::Tensor /*signal_raw_log_energy*/,
float /*vtln_warp*/, const torch::Tensor &signal_frame);
// if true, compute log_energy_pre_window but after dithering and dc removal
bool NeedRawLogEnergy() const { return false; }
using Options = WhisperFbankOptions;
private:
WhisperFbankOptions opts_;
std::unique_ptr<MelBanks> mel_banks_;
};
using WhisperFbank = OfflineFeatureTpl<WhisperFbankComputer>;
} // namespace kaldifeat
#endif // KALDIFEAT_CSRC_WHISPER_FBANK_H_

File diff suppressed because it is too large

File diff suppressed because it is too large


@ -9,9 +9,32 @@ pybind11_add_module(_kaldifeat
mel-computations.cc
online-feature.cc
utils.cc
+whisper-fbank.cc
)
+if(APPLE)
+execute_process(
+COMMAND "${PYTHON_EXECUTABLE}" -c "from distutils.sysconfig import get_python_lib; print(get_python_lib())"
+OUTPUT_STRIP_TRAILING_WHITESPACE
+OUTPUT_VARIABLE PYTHON_SITE_PACKAGE_DIR
+)
+message(STATUS "PYTHON_SITE_PACKAGE_DIR: ${PYTHON_SITE_PACKAGE_DIR}")
+target_link_libraries(_kaldifeat PRIVATE "-Wl,-rpath,${PYTHON_SITE_PACKAGE_DIR}")
+endif()
+if(NOT WIN32)
+target_link_libraries(_kaldifeat PRIVATE "-Wl,-rpath,${kaldifeat_rpath_origin}/kaldifeat/${CMAKE_INSTALL_LIBDIR}")
+endif()
target_link_libraries(_kaldifeat PRIVATE kaldifeat_core)
if(UNIX AND NOT APPLE)
target_link_libraries(_kaldifeat PUBLIC ${TORCH_DIR}/lib/libtorch_python.so)
-target_link_libraries(_kaldifeat PUBLIC ${PYTHON_LIBRARY})
+# target_link_libraries(_kaldifeat PUBLIC ${PYTHON_LIBRARY})
+elseif(WIN32)
+target_link_libraries(_kaldifeat PUBLIC ${TORCH_DIR}/lib/torch_python.lib)
+# target_link_libraries(_kaldifeat PUBLIC ${PYTHON_LIBRARIES})
endif()
+install(TARGETS _kaldifeat
+DESTINATION ../
+)


@ -16,6 +16,35 @@ static void PybindFbankOptions(py::module &m) {
using PyClass = FbankOptions;
py::class_<PyClass>(m, "FbankOptions")
.def(py::init<>())
.def(py::init([](const MelBanksOptions &mel_opts,
const FrameExtractionOptions &frame_opts =
FrameExtractionOptions(),
bool use_energy = false, float energy_floor = 0.0f,
bool raw_energy = true, bool htk_compat = false,
bool use_log_fbank = true, bool use_power = true,
py::object device =
py::str("cpu")) -> std::unique_ptr<FbankOptions> {
auto opts = std::make_unique<FbankOptions>();
opts->frame_opts = frame_opts;
opts->mel_opts = mel_opts;
opts->use_energy = use_energy;
opts->energy_floor = energy_floor;
opts->raw_energy = raw_energy;
opts->htk_compat = htk_compat;
opts->use_log_fbank = use_log_fbank;
opts->use_power = use_power;
std::string s = static_cast<py::str>(device);
opts->device = torch::Device(s);
return opts;
}),
py::arg("mel_opts"),
py::arg("frame_opts") = FrameExtractionOptions(),
py::arg("use_energy") = false, py::arg("energy_floor") = 0.0f,
py::arg("raw_energy") = true, py::arg("htk_compat") = false,
py::arg("use_log_fbank") = true, py::arg("use_power") = true,
py::arg("device") = py::str("cpu"))
.def_readwrite("frame_opts", &PyClass::frame_opts)
.def_readwrite("mel_opts", &PyClass::mel_opts)
.def_readwrite("use_energy", &PyClass::use_energy)
@ -53,7 +82,7 @@ static void PybindFbank(py::module &m) {
.def("dim", &PyClass::Dim)
.def_property_readonly("options", &PyClass::GetOptions)
.def("compute_features", &PyClass::ComputeFeatures, py::arg("wave"),
-py::arg("vtln_warp"))
+py::arg("vtln_warp"), py::call_guard<py::gil_scoped_release>())
.def(py::pickle(
[](const PyClass &self) -> py::dict {
return AsDict(self.GetOptions());
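
With the keyword-argument constructor bound above, the options can be built in a single call from Python. A sketch (mel_opts is the one required argument of this overload; the other values are illustrative):

    import kaldifeat

    opts = kaldifeat.FbankOptions(
        mel_opts=kaldifeat.MelBanksOptions(num_bins=80),
        frame_opts=kaldifeat.FrameExtractionOptions(dither=0),
        use_power=True,
        device="cpu",
    )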


@ -16,6 +16,35 @@ void PybindMfccOptions(py::module &m) {
using PyClass = MfccOptions;
py::class_<PyClass>(m, "MfccOptions")
.def(py::init<>())
.def(py::init([](const MelBanksOptions &mel_opts,
const FrameExtractionOptions &frame_opts =
FrameExtractionOptions(),
int32_t num_ceps = 13, bool use_energy = true,
float energy_floor = 0.0, bool raw_energy = true,
float cepstral_lifter = 22.0, bool htk_compat = false,
py::object device =
py::str("cpu")) -> std::unique_ptr<MfccOptions> {
auto opts = std::make_unique<MfccOptions>();
opts->frame_opts = frame_opts;
opts->mel_opts = mel_opts;
opts->num_ceps = num_ceps;
opts->use_energy = use_energy;
opts->energy_floor = energy_floor;
opts->raw_energy = raw_energy;
opts->cepstral_lifter = cepstral_lifter;
opts->htk_compat = htk_compat;
std::string s = static_cast<py::str>(device);
opts->device = torch::Device(s);
return opts;
}),
py::arg("mel_opts"),
py::arg("frame_opts") = FrameExtractionOptions(),
py::arg("num_ceps") = 13, py::arg("use_energy") = true,
py::arg("energy_floor") = 0.0f, py::arg("raw_energy") = true,
py::arg("cepstral_lifter") = 22.0, py::arg("htk_compat") = false,
py::arg("device") = py::str("cpu"))
.def_readwrite("frame_opts", &PyClass::frame_opts)
.def_readwrite("mel_opts", &PyClass::mel_opts)
.def_readwrite("num_ceps", &PyClass::num_ceps)
@ -53,7 +82,7 @@ static void PybindMfcc(py::module &m) {
.def("dim", &PyClass::Dim)
.def_property_readonly("options", &PyClass::GetOptions)
.def("compute_features", &PyClass::ComputeFeatures, py::arg("wave"),
-py::arg("vtln_warp"))
+py::arg("vtln_warp"), py::call_guard<py::gil_scoped_release>())
.def(py::pickle(
[](const PyClass &self) -> py::dict {
return AsDict(self.GetOptions());


@ -16,6 +16,41 @@ void PybindPlpOptions(py::module &m) {
using PyClass = PlpOptions;
py::class_<PyClass>(m, "PlpOptions")
.def(py::init<>())
.def(py::init([](const MelBanksOptions &mel_opts,
const FrameExtractionOptions &frame_opts =
FrameExtractionOptions(),
int32_t lpc_order = 12, int32_t num_ceps = 13,
bool use_energy = true, float energy_floor = 0.0,
bool raw_energy = true, float compress_factor = 0.33333,
int32_t cepstral_lifter = 22, float cepstral_scale = 1.0,
bool htk_compat = false,
py::object device =
py::str("cpu")) -> std::unique_ptr<PlpOptions> {
auto opts = std::make_unique<PlpOptions>();
opts->frame_opts = frame_opts;
opts->mel_opts = mel_opts;
opts->lpc_order = lpc_order;
opts->num_ceps = num_ceps;
opts->use_energy = use_energy;
opts->energy_floor = energy_floor;
opts->raw_energy = raw_energy;
opts->compress_factor = compress_factor;
opts->cepstral_lifter = cepstral_lifter;
opts->cepstral_scale = cepstral_scale;
opts->htk_compat = htk_compat;
std::string s = static_cast<py::str>(device);
opts->device = torch::Device(s);
return opts;
}),
py::arg("mel_opts"),
py::arg("frame_opts") = FrameExtractionOptions(),
py::arg("lpc_order") = 12, py::arg("num_ceps") = 13,
py::arg("use_energy") = true, py::arg("energy_floor") = 0.0,
py::arg("raw_energy") = true, py::arg("compress_factor") = 0.33333,
py::arg("cepstral_lifter") = 22, py::arg("cepstral_scale") = 1.0,
py::arg("htk_compat") = false, py::arg("device") = py::str("cpu"))
.def_readwrite("frame_opts", &PyClass::frame_opts)
.def_readwrite("mel_opts", &PyClass::mel_opts)
.def_readwrite("lpc_order", &PyClass::lpc_order)
@ -56,7 +91,7 @@ static void PybindPlp(py::module &m) {
.def("dim", &PyClass::Dim)
.def_property_readonly("options", &PyClass::GetOptions)
.def("compute_features", &PyClass::ComputeFeatures, py::arg("wave"),
-py::arg("vtln_warp"))
+py::arg("vtln_warp"), py::call_guard<py::gil_scoped_release>())
.def(py::pickle(
[](const PyClass &self) -> py::dict {
return AsDict(self.GetOptions());


@ -15,7 +15,27 @@ namespace kaldifeat {
static void PybindSpectrogramOptions(py::module &m) {
using PyClass = SpectrogramOptions;
py::class_<PyClass>(m, "SpectrogramOptions")
-.def(py::init<>())
+.def(py::init([](const FrameExtractionOptions &frame_opts =
FrameExtractionOptions(),
float energy_floor = 0.0, bool raw_energy = true,
bool return_raw_fft = false,
py::object device = py::str(
"cpu")) -> std::unique_ptr<SpectrogramOptions> {
auto opts = std::make_unique<SpectrogramOptions>();
opts->frame_opts = frame_opts;
opts->energy_floor = energy_floor;
opts->raw_energy = raw_energy;
opts->return_raw_fft = return_raw_fft;
std::string s = static_cast<py::str>(device);
opts->device = torch::Device(s);
return opts;
}),
py::arg("frame_opts") = FrameExtractionOptions(),
py::arg("energy_floor") = 0.0, py::arg("raw_energy") = true,
py::arg("return_raw_fft") = false,
py::arg("device") = py::str("cpu"))
.def_readwrite("frame_opts", &PyClass::frame_opts)
.def_readwrite("energy_floor", &PyClass::energy_floor)
.def_readwrite("raw_energy", &PyClass::raw_energy)
@ -53,7 +73,7 @@ static void PybindSpectrogram(py::module &m) {
.def("dim", &PyClass::Dim)
.def_property_readonly("options", &PyClass::GetOptions)
.def("compute_features", &PyClass::ComputeFeatures, py::arg("wave"),
-py::arg("vtln_warp"))
+py::arg("vtln_warp"), py::call_guard<py::gil_scoped_release>())
.def(py::pickle(
[](const PyClass &self) -> py::dict {
return AsDict(self.GetOptions());


@ -4,6 +4,7 @@
#include "kaldifeat/python/csrc/feature-window.h"
+#include <memory>
#include <string>
#include "kaldifeat/csrc/feature-window.h"
@ -14,7 +15,38 @@ namespace kaldifeat {
static void PybindFrameExtractionOptions(py::module &m) {
using PyClass = FrameExtractionOptions;
py::class_<PyClass>(m, "FrameExtractionOptions")
-.def(py::init<>())
+.def(
py::init([](float samp_freq = 16000, float frame_shift_ms = 10.0f,
float frame_length_ms = 25.0f, float dither = 1.0f,
float preemph_coeff = 0.97f, bool remove_dc_offset = true,
const std::string &window_type = "povey",
bool round_to_power_of_two = true,
float blackman_coeff = 0.42f, bool snip_edges = true,
int32_t max_feature_vectors =
-1) -> std::unique_ptr<FrameExtractionOptions> {
auto opts = std::make_unique<FrameExtractionOptions>();
opts->samp_freq = samp_freq;
opts->frame_shift_ms = frame_shift_ms;
opts->frame_length_ms = frame_length_ms;
opts->dither = dither;
opts->preemph_coeff = preemph_coeff;
opts->remove_dc_offset = remove_dc_offset;
opts->window_type = window_type;
opts->round_to_power_of_two = round_to_power_of_two;
opts->blackman_coeff = blackman_coeff;
opts->snip_edges = snip_edges;
opts->max_feature_vectors = max_feature_vectors;
return opts;
}),
py::arg("samp_freq") = 16000, py::arg("frame_shift_ms") = 10.0f,
py::arg("frame_length_ms") = 25.0f, py::arg("dither") = 1.0f,
py::arg("preemph_coeff") = 0.97f, py::arg("remove_dc_offset") = true,
py::arg("window_type") = "povey",
py::arg("round_to_power_of_two") = true,
py::arg("blackman_coeff") = 0.42f, py::arg("snip_edges") = true,
py::arg("max_feature_vectors") = -1)
.def_readwrite("samp_freq", &PyClass::samp_freq)
.def_readwrite("frame_shift_ms", &PyClass::frame_shift_ms)
.def_readwrite("frame_length_ms", &PyClass::frame_length_ms)


@ -12,6 +12,7 @@
#include "kaldifeat/python/csrc/feature-window.h"
#include "kaldifeat/python/csrc/mel-computations.h"
#include "kaldifeat/python/csrc/online-feature.h"
+#include "kaldifeat/python/csrc/whisper-fbank.h"
#include "torch/torch.h"
namespace kaldifeat {
@ -22,6 +23,7 @@ PYBIND11_MODULE(_kaldifeat, m) {
PybindFeatureWindow(m);
PybindMelComputations(m);
PybindFeatureFbank(m);
+PybindWhisperFbank(&m);
PybindFeatureMfcc(m);
PybindFeaturePlp(m);
PybindFeatureSpectrogram(m);


@ -4,6 +4,7 @@
#include "kaldifeat/python/csrc/mel-computations.h"
+#include <memory>
#include <string>
#include "kaldifeat/csrc/mel-computations.h"
@ -14,7 +15,24 @@ namespace kaldifeat {
static void PybindMelBanksOptions(py::module &m) {
using PyClass = MelBanksOptions;
py::class_<PyClass>(m, "MelBanksOptions")
-.def(py::init<>())
+.def(py::init(
[](int32_t num_bins = 25, float low_freq = 20,
float high_freq = 0, float vtln_low = 100,
float vtln_high = -500,
bool debug_mel = false) -> std::unique_ptr<MelBanksOptions> {
auto opts = std::make_unique<MelBanksOptions>();
opts->num_bins = num_bins;
opts->low_freq = low_freq;
opts->high_freq = high_freq;
opts->vtln_low = vtln_low;
opts->vtln_high = vtln_high;
return opts;
}),
py::arg("num_bins") = 25, py::arg("low_freq") = 20,
py::arg("high_freq") = 0, py::arg("vtln_low") = 100,
py::arg("vtln_high") = -500, py::arg("debug_mel") = false)
.def_readwrite("num_bins", &PyClass::num_bins)
.def_readwrite("low_freq", &PyClass::low_freq)
.def_readwrite("high_freq", &PyClass::high_freq)


@ -22,9 +22,11 @@ void PybindOnlineFeatureTpl(py::module &m, const std::string &class_name,
.def_property_readonly("num_frames_ready", &PyClass::NumFramesReady)
.def("is_last_frame", &PyClass::IsLastFrame, py::arg("frame"))
.def("get_frame", &PyClass::GetFrame, py::arg("frame"))
-.def("get_frames", &PyClass::GetFrames, py::arg("frames"))
+.def("get_frames", &PyClass::GetFrames, py::arg("frames"),
+py::call_guard<py::gil_scoped_release>())
.def("accept_waveform", &PyClass::AcceptWaveform,
-py::arg("sampling_rate"), py::arg("waveform"))
+py::arg("sampling_rate"), py::arg("waveform"),
+py::call_guard<py::gil_scoped_release>())
.def("input_finished", &PyClass::InputFinished);
}
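
Releasing the GIL in accept_waveform and get_frames mainly matters for multi-threaded streaming use. A rough usage sketch of the online interface via kaldifeat.OnlineFbank; the Python-side method names are assumed to mirror the bindings above:

    import torch
    import kaldifeat

    opts = kaldifeat.FbankOptions()
    opts.mel_opts.num_bins = 80

    online_fbank = kaldifeat.OnlineFbank(opts)

    # feed audio in chunks; argument names follow the pybind arguments above
    online_fbank.accept_waveform(sampling_rate=16000, waveform=torch.rand(1600))
    online_fbank.input_finished()

    # collect whatever frames are ready (num_frames_ready/get_frame are
    # assumed to be re-exported by the Python wrapper)
    frames = [online_fbank.get_frame(i) for i in range(online_fbank.num_frames_ready)]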


@ -123,6 +123,35 @@ py::dict AsDict(const FbankOptions &opts) {
return dict;
}
WhisperFbankOptions WhisperFbankOptionsFromDict(py::dict dict) {
WhisperFbankOptions opts;
if (dict.contains("frame_opts")) {
opts.frame_opts = FrameExtractionOptionsFromDict(dict["frame_opts"]);
}
FROM_DICT(int_, num_mels);
if (dict.contains("device")) {
opts.device = torch::Device(std::string(py::str(dict["device"])));
}
return opts;
}
py::dict AsDict(const WhisperFbankOptions &opts) {
py::dict dict;
dict["frame_opts"] = AsDict(opts.frame_opts);
AS_DICT(num_mels);
auto torch_device = py::module_::import("torch").attr("device");
dict["device"] = torch_device(opts.device.str());
return dict;
}
MfccOptions MfccOptionsFromDict(py::dict dict) {
MfccOptions opts;


@ -11,6 +11,7 @@
#include "kaldifeat/csrc/feature-spectrogram.h"
#include "kaldifeat/csrc/feature-window.h"
#include "kaldifeat/csrc/mel-computations.h"
+#include "kaldifeat/csrc/whisper-fbank.h"
#include "kaldifeat/python/csrc/kaldifeat.h"
/*
@ -36,6 +37,9 @@ py::dict AsDict(const MelBanksOptions &opts);
FbankOptions FbankOptionsFromDict(py::dict dict);
py::dict AsDict(const FbankOptions &opts);
+WhisperFbankOptions WhisperFbankOptionsFromDict(py::dict dict);
+py::dict AsDict(const WhisperFbankOptions &opts);
MfccOptions MfccOptionsFromDict(py::dict dict);
py::dict AsDict(const MfccOptions &opts);


@ -0,0 +1,84 @@
// kaldifeat/python/csrc/whisper-fbank.cc
//
// Copyright (c) 2023 Xiaomi Corporation (authors: Fangjun Kuang)
#include "kaldifeat/python/csrc/whisper-fbank.h"
#include <memory>
#include <string>
#include "kaldifeat/csrc/whisper-fbank.h"
#include "kaldifeat/python/csrc/utils.h"
namespace kaldifeat {
static void PybindWhisperFbankOptions(py::module *m) {
using PyClass = WhisperFbankOptions;
py::class_<PyClass>(*m, "WhisperFbankOptions")
.def(py::init<>())
.def(py::init([](const FrameExtractionOptions &frame_opts =
FrameExtractionOptions(),
int32_t num_mels = 80,
py::object device = py::str(
"cpu")) -> std::unique_ptr<WhisperFbankOptions> {
auto opts = std::make_unique<WhisperFbankOptions>();
opts->frame_opts = frame_opts;
opts->num_mels = num_mels;
std::string s = static_cast<py::str>(device);
opts->device = torch::Device(s);
return opts;
}),
py::arg("frame_opts") = FrameExtractionOptions(),
py::arg("num_mels") = 80, py::arg("device") = py::str("cpu"))
.def_readwrite("frame_opts", &PyClass::frame_opts)
.def_readwrite("num_mels", &PyClass::num_mels)
.def_property(
"device",
[](const PyClass &self) -> py::object {
py::object ans = py::module_::import("torch").attr("device");
return ans(self.device.str());
},
[](PyClass &self, py::object obj) -> void {
std::string s = static_cast<py::str>(obj);
self.device = torch::Device(s);
})
.def("__str__",
[](const PyClass &self) -> std::string { return self.ToString(); })
.def("as_dict",
[](const PyClass &self) -> py::dict { return AsDict(self); })
.def_static("from_dict",
[](py::dict dict) -> PyClass {
return WhisperFbankOptionsFromDict(dict);
})
.def(py::pickle(
[](const PyClass &self) -> py::dict { return AsDict(self); },
[](py::dict dict) -> PyClass {
return WhisperFbankOptionsFromDict(dict);
}));
}
static void PybindWhisperFbankImpl(py::module *m) {
using PyClass = WhisperFbank;
py::class_<PyClass>(*m, "WhisperFbank")
.def(py::init<const WhisperFbankOptions &>(), py::arg("opts"))
.def("dim", &PyClass::Dim)
.def_property_readonly("options", &PyClass::GetOptions)
.def("compute_features", &PyClass::ComputeFeatures, py::arg("wave"),
py::arg("vtln_warp"), py::call_guard<py::gil_scoped_release>())
.def(py::pickle(
[](const PyClass &self) -> py::dict {
return AsDict(self.GetOptions());
},
[](py::dict dict) -> std::unique_ptr<PyClass> {
return std::make_unique<PyClass>(WhisperFbankOptionsFromDict(dict));
}));
}
void PybindWhisperFbank(py::module *m) {
PybindWhisperFbankOptions(m);
PybindWhisperFbankImpl(m);
}
} // namespace kaldifeat
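
Because the new options class exposes as_dict/from_dict and the pickle hooks above, it can be serialized and restored. A small sketch:

    import pickle
    import kaldifeat

    opts = kaldifeat.WhisperFbankOptions(num_mels=128)
    opts2 = kaldifeat.WhisperFbankOptions.from_dict(opts.as_dict())
    opts3 = pickle.loads(pickle.dumps(opts))
    assert str(opts) == str(opts2) == str(opts3)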


@ -0,0 +1,16 @@
// kaldifeat/python/csrc/whisper-fbank.h
//
// Copyright (c) 2023 Xiaomi Corporation (authors: Fangjun Kuang)
#ifndef KALDIFEAT_PYTHON_CSRC_WHISPER_FBANK_H_
#define KALDIFEAT_PYTHON_CSRC_WHISPER_FBANK_H_
#include "kaldifeat/python/csrc/kaldifeat.h"
namespace kaldifeat {
void PybindWhisperFbank(py::module *m);
} // namespace kaldifeat
#endif // KALDIFEAT_PYTHON_CSRC_WHISPER_FBANK_H_


@ -1,4 +1,15 @@
import torch
+from .torch_version import kaldifeat_torch_version
+if torch.__version__.split("+")[0] != kaldifeat_torch_version.split("+")[0]:
+raise ImportError(
+f"kaldifeat was built using PyTorch {kaldifeat_torch_version}\n"
+f"But you are using PyTorch {torch.__version__} to run it"
+)
+from pathlib import Path as _Path
from _kaldifeat import (
FbankOptions,
FrameExtractionOptions,
@ -6,9 +17,17 @@ from _kaldifeat import (
MfccOptions,
PlpOptions,
SpectrogramOptions,
+WhisperFbankOptions,
+num_frames,
)
from .fbank import Fbank, OnlineFbank
from .mfcc import Mfcc, OnlineMfcc
+from .offline_feature import OfflineFeature
+from .online_feature import OnlineFeature
from .plp import OnlinePlp, Plp
from .spectrogram import Spectrogram
+from .whisper_fbank import WhisperFbank
+cmake_prefix_path = _Path(__file__).parent / "share" / "cmake"
+del _Path
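
Note that only the release part of the version is compared; the local build tag (+cu121, +cpu, ...) is stripped on both sides, for example:

    >>> "2.1.0+cu121".split("+")[0]
    '2.1.0'
    >>> "2.1.0+cpu".split("+")[0] == "2.1.0+cu121".split("+")[0]
    True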


@ -91,5 +91,15 @@ class OnlineFeature(object):
"""
self.computer.input_finished()
@property
def dim(self) -> int:
"""Return the feature dimension of this extractor"""
return self.computer.dim
@property
def frame_shift_in_seconds(self) -> int:
"""Return frame shift in seconds of this extractor"""
return self.computer.frame_shift_in_seconds
def __getstate__(self):
return self.opts.as_dict()


@ -0,0 +1,12 @@
# Auto generated by the toplevel CMakeLists.txt.
#
# DO NOT EDIT.
# The torch version used to build kaldifeat. We will check it against the
# torch version that is used to run kaldifeat. If they are not the same,
# `import kaldifeat` will throw.
#
# Some example values are:
# - 1.10.0+cu102
# - 1.5.0+cpu
kaldifeat_torch_version = "@TORCH_VERSION@"


@ -0,0 +1,12 @@
# Copyright (c) 2021 Xiaomi Corporation (authors: Fangjun Kuang)
import _kaldifeat
from .offline_feature import OfflineFeature
class WhisperFbank(OfflineFeature):
def __init__(self, opts: _kaldifeat.WhisperFbankOptions):
super().__init__(opts)
self.computer = _kaldifeat.WhisperFbank(opts)


@ -0,0 +1,13 @@
.PHONY: test
test:
python3 ./test_fbank.py
python3 ./test_fbank_options.py
python3 ./test_frame_extraction_options.py
python3 ./test_mel_bank_options.py
python3 ./test_mfcc.py
python3 ./test_mfcc_options.py
python3 ./test_plp.py
python3 ./test_plp_options.py
python3 ./test_spectrogram.py
python3 ./test_spectrogram_options.py


@ -12,6 +12,7 @@ import kaldifeat
def test_default():
opts = kaldifeat.FbankOptions()
+print(opts)
assert opts.frame_opts.samp_freq == 16000
assert opts.frame_opts.frame_shift_ms == 10.0
assert opts.frame_opts.frame_length_ms == 25.0


@ -9,6 +9,7 @@ import kaldifeat
def test_default():
opts = kaldifeat.FrameExtractionOptions()
+print(opts)
assert opts.samp_freq == 16000
assert opts.frame_shift_ms == 10.0
assert opts.frame_length_ms == 25.0
@ -22,7 +23,9 @@ def test_default():
def test_set_get():
-opts = kaldifeat.FrameExtractionOptions()
+opts = kaldifeat.FrameExtractionOptions(samp_freq=22150)
+assert opts.samp_freq == 22150
opts.samp_freq = 44100
assert opts.samp_freq == 44100


@ -9,6 +9,7 @@ import kaldifeat
def test_default():
opts = kaldifeat.MelBanksOptions()
+print(opts)
assert opts.num_bins == 25
assert opts.low_freq == 20
assert opts.high_freq == 0
@ -19,10 +20,12 @@ def test_default():
def test_set_get():
-opts = kaldifeat.MelBanksOptions()
-opts.num_bins = 100
+opts = kaldifeat.MelBanksOptions(num_bins=100)
assert opts.num_bins == 100
+opts.num_bins = 200
+assert opts.num_bins == 200
opts.low_freq = 22
assert opts.low_freq == 22


@ -12,6 +12,7 @@ import kaldifeat
def test_default():
opts = kaldifeat.MfccOptions()
+print(opts)
assert opts.frame_opts.samp_freq == 16000
assert opts.frame_opts.frame_shift_ms == 10.0


@ -12,6 +12,7 @@ import kaldifeat
def test_default():
opts = kaldifeat.PlpOptions()
+print(opts)
assert opts.frame_opts.samp_freq == 16000
assert opts.frame_opts.frame_shift_ms == 10.0
assert opts.frame_opts.frame_length_ms == 25.0
assert opts.frame_opts.frame_length_ms == 25.0 assert opts.frame_opts.frame_length_ms == 25.0


@ -12,6 +12,7 @@ import kaldifeat
def test_default():
opts = kaldifeat.SpectrogramOptions()
+print(opts)
assert opts.frame_opts.samp_freq == 16000
assert opts.frame_opts.frame_shift_ms == 10.0
@ -30,7 +31,8 @@ def test_default():
def test_set_get():
-opts = kaldifeat.SpectrogramOptions()
+opts = kaldifeat.SpectrogramOptions(energy_floor=10)
+assert opts.energy_floor == 10
opts.energy_floor = 1
assert opts.energy_floor == 1
@ -138,6 +140,30 @@ def test_pickle():
assert str(opts) == str(opts2)
def test_device():
opts = kaldifeat.SpectrogramOptions(device="cpu")
assert opts.device == torch.device("cpu")
opts = kaldifeat.SpectrogramOptions(device="cuda")
assert opts.device == torch.device("cuda")
opts = kaldifeat.SpectrogramOptions(device="cuda:1")
assert opts.device == torch.device("cuda:1")
print(opts)
opts = kaldifeat.SpectrogramOptions(device=torch.device("cpu"))
assert opts.device == torch.device("cpu")
opts = kaldifeat.SpectrogramOptions(device=torch.device("cuda"))
assert opts.device == torch.device("cuda")
opts = kaldifeat.SpectrogramOptions(device=torch.device("cuda:3"))
assert opts.device == torch.device("cuda:3")
opts = kaldifeat.SpectrogramOptions(device=torch.device("cuda", 2))
assert opts.device == torch.device("cuda", 2)
def main():
test_default()
test_set_get()
@ -146,6 +172,7 @@ def main():
test_from_dict_partial()
test_from_dict_full_and_as_dict()
test_pickle()
+test_device()
if __name__ == "__main__": if __name__ == "__main__":
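
A minimal usage sketch for the new device option (illustrative only; it assumes kaldifeat.Spectrogram accepts these options and follows the same construct-options-then-call pattern as the WhisperFbank test further below):

import torch
import kaldifeat

# Pick a device for the sketch; fall back to CPU when CUDA is unavailable.
device = "cuda:0" if torch.cuda.is_available() else "cpu"

opts = kaldifeat.SpectrogramOptions(device=device)
extractor = kaldifeat.Spectrogram(opts)

wave = torch.rand(16000).to(opts.device)  # 1 second of fake 16 kHz audio
feats = extractor(wave)                   # features computed on opts.device
print(feats.shape, feats.device)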


@@ -0,0 +1,48 @@
+#!/usr/bin/env python3
+# Copyright 2023 Xiaomi Corporation (authors: Fangjun Kuang)
+
+import librosa
+import torch
+
+import kaldifeat
+
+
+def get_ground_truth(x):
+    N_FFT = 400
+    HOP_LENGTH = 160
+
+    m = librosa.filters.mel(sr=16000, n_fft=400, n_mels=80)
+    m = torch.from_numpy(m)
+    # print(m.shape)  # [80, 201]
+
+    window = torch.hann_window(N_FFT)
+    stft = torch.stft(x, N_FFT, HOP_LENGTH, window=window, return_complex=True)
+    # print(stft.shape)  # [201, 301]
+
+    magnitudes = stft[..., :-1].abs() ** 2
+    # print(magnitudes.shape)  # [201, 300]
+
+    mel_spec = m @ magnitudes
+    # print(mel_spec.shape)  # [80, 300]
+
+    log_spec = torch.clamp(mel_spec, min=1e-10).log10()
+    log_spec = torch.maximum(log_spec, log_spec.max() - 8.0)
+    log_spec = (log_spec + 4.0) / 4.0
+
+    return log_spec.t()
+
+
+def test_whisper_fbank():
+    x = torch.rand(16000 * 3)
+    gt = get_ground_truth(x)
+    print(gt.shape)  # [300, 80]
+
+    opts = kaldifeat.WhisperFbankOptions(device="cpu")
+    whisper_fbank = kaldifeat.WhisperFbank(opts)
+    y = whisper_fbank(x)  # [298, 80]
+    print(y.shape)  # [298, 80]
+    # print(gt[:5, :5])
+    # print(y[:5, :5])
+
+
+if __name__ == "__main__":
+    torch.manual_seed(20231108)
+    test_whisper_fbank()
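
The shape comments above ([300, 80] for the librosa/torch.stft ground truth versus [298, 80] from WhisperFbank) come down to framing arithmetic; a short sketch of that arithmetic (attributing the 298 frames to padding-free, snip-edges-style framing is an assumption inferred from the printed shapes):

num_samples = 16000 * 3   # 48000 samples for 3 seconds of 16 kHz audio
n_fft, hop = 400, 160

# torch.stft with center=True (the default) pads the signal, giving
# 1 + num_samples // hop frames; the test then drops the last frame.
stft_frames = 1 + num_samples // hop   # 301
ground_truth_frames = stft_frames - 1  # 300

# Framing without padding (snip edges), which would explain the 298 frames
# printed for the WhisperFbank output.
no_padding_frames = 1 + (num_samples - n_fft) // hop  # 298

print(ground_truth_frames, no_padding_frames)  # 300 298

The same counts apply to the 128-mel test below, since the number of mel bins does not affect framing.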


@@ -0,0 +1,49 @@
+#!/usr/bin/env python3
+# Copyright 2023 Xiaomi Corporation (authors: Fangjun Kuang)
+
+import librosa
+import torch
+
+import kaldifeat
+
+
+def get_ground_truth(x):
+    N_FFT = 400
+    HOP_LENGTH = 160
+
+    m = librosa.filters.mel(sr=16000, n_fft=400, n_mels=128)
+    m = torch.from_numpy(m)
+    # print(m.shape)  # [128, 201]
+
+    window = torch.hann_window(N_FFT)
+    stft = torch.stft(x, N_FFT, HOP_LENGTH, window=window, return_complex=True)
+    # print(stft.shape)  # [201, 301]
+
+    magnitudes = stft[..., :-1].abs() ** 2
+    # print(magnitudes.shape)  # [201, 300]
+
+    mel_spec = m @ magnitudes
+    # print(mel_spec.shape)  # [128, 300]
+
+    log_spec = torch.clamp(mel_spec, min=1e-10).log10()
+    log_spec = torch.maximum(log_spec, log_spec.max() - 8.0)
+    log_spec = (log_spec + 4.0) / 4.0
+
+    return log_spec.t()
+
+
+def test_whisper_v3_fbank():
+    x = torch.rand(16000 * 3)
+    gt = get_ground_truth(x)
+    print(gt.shape)  # [300, 128]
+
+    opts = kaldifeat.WhisperFbankOptions(num_mels=128, device="cpu")
+    print(opts)
+    whisper_fbank = kaldifeat.WhisperFbank(opts)
+    y = whisper_fbank(x)  # [298, 128]
+    print(y.shape)  # [298, 128]
+
+    print(gt[:5, :5])
+    print(y[:5, :5])
+
+
+if __name__ == "__main__":
+    torch.manual_seed(20231109)
+    test_whisper_v3_fbank()


@@ -1,11 +0,0 @@
-[tool.isort]
-profile = "black"
-
-[tool.black]
-line-length = 80
-exclude = '''
-/(
-    \.git
-  | \.github
-)/
-'''


@@ -55,7 +55,7 @@ cd $kaldifeat_dir
 export KALDIFEAT_ROOT_DIR=$kaldifeat_dir
 echo "KALDIFEAT_ROOT_DIR: $KALDIFEAT_ROOT_DIR"
-KALDIFEAT_PYTHON_VERSION=$(python3 -c "import sys; print(sys.version[:3])")
+KALDIFEAT_PYTHON_VERSION=$(python -c "import sys; print('.'.join(sys.version.split('.')[:2]))")
 if [ -z $KALDIFEAT_CUDA_VERSION ]; then
   echo "env var KALDIFEAT_CUDA_VERSION is not set, defaults to 10.1"

scripts/build_conda_cpu.sh (new executable file)

@@ -0,0 +1,81 @@
+#!/usr/bin/env bash
+#
+# Copyright 2021 Xiaomi Corp. (author: Fangjun Kuang)
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# The following environment variables are supposed to be set by users
+#
+# - KALDIFEAT_TORCH_VERSION
+#       The PyTorch version. Example:
+#
+#           export KALDIFEAT_TORCH_VERSION=1.7.1
+#
+#       Defaults to 1.7.1 if not set.
+#
+# - KALDIFEAT_CONDA_TOKEN
+#       If not set, auto upload to anaconda.org is disabled.
+#
+#       Its value is from https://anaconda.org/kaldifeat/settings/access
+#       (You need to login as user kaldifeat to see its value)
+#
+# - KALDIFEAT_BUILD_TYPE
+#       If not set, defaults to Release.
+
+set -e
+export CONDA_BUILD=1
+
+cur_dir=$(cd $(dirname $BASH_SOURCE) && pwd)
+kaldifeat_dir=$(cd $cur_dir/.. && pwd)
+
+cd $kaldifeat_dir
+
+export KALDIFEAT_ROOT_DIR=$kaldifeat_dir
+echo "KALDIFEAT_ROOT_DIR: $KALDIFEAT_ROOT_DIR"
+
+KALDIFEAT_PYTHON_VERSION=$(python -c "import sys; print('.'.join(sys.version.split('.')[:2]))")
+
+if [ -z $KALDIFEAT_TORCH_VERSION ]; then
+  echo "env var KALDIFEAT_TORCH_VERSION is not set, defaults to 1.7.1"
+  KALDIFEAT_TORCH_VERSION=1.7.1
+fi
+
+if [ -z $KALDIFEAT_BUILD_TYPE ]; then
+  echo "env var KALDIFEAT_BUILD_TYPE is not set, defaults to Release"
+  KALDIFEAT_BUILD_TYPE=Release
+fi
+
+export KALDIFEAT_IS_FOR_CONDA=1
+
+# Example value: 3.8
+export KALDIFEAT_PYTHON_VERSION
+
+# Example value: 1.7.1
+export KALDIFEAT_TORCH_VERSION
+
+export KALDIFEAT_BUILD_TYPE
+
+if [ ! -z $KALDIFEAT_IS_GITHUB_ACTIONS ]; then
+  export KALDIFEAT_IS_GITHUB_ACTIONS
+  conda remove -q pytorch
+  conda clean -q -a
+else
+  export KALDIFEAT_IS_GITHUB_ACTIONS=0
+fi
+
+if [ -z $KALDIFEAT_CONDA_TOKEN ]; then
+  echo "Auto upload to anaconda.org is disabled since KALDIFEAT_CONDA_TOKEN is not set"
+  conda build --no-test --no-anaconda-upload -c pytorch ./scripts/conda-cpu/kaldifeat
+else
+  conda build --no-test -c pytorch --token $KALDIFEAT_CONDA_TOKEN ./scripts/conda-cpu/kaldifeat
+fi


@@ -0,0 +1,10 @@
+# this file is copied from
+# https://github.com/pytorch/builder/tree/master/conda/cpuonly
+package:
+  name: cpuonly
+  version: 1.0
+
+build:
+  track_features:
+    - cpuonly
+  noarch: generic


@@ -0,0 +1,38 @@
+package:
+  name: kaldifeat
+  version: "1.25.5"
+
+source:
+  path: "{{ environ.get('KALDIFEAT_ROOT_DIR') }}"
+
+build:
+  number: 0
+  string: cpu_py{{ environ.get('KALDIFEAT_PYTHON_VERSION') }}_torch{{ environ.get('KALDIFEAT_TORCH_VERSION') }}
+  script: conda install -y -q -c pytorch pytorch={{ environ.get('KALDIFEAT_TORCH_VERSION') }} cpuonly & {{ PYTHON }} setup.py install --single-version-externally-managed --record=record.txt
+  features:
+    - cpuonly # [not osx]
+
+requirements:
+  build:
+    - {{ compiler('c') }} # [win]
+    - {{ compiler('cxx') }} # [win]
+
+  host:
+    - anaconda-client
+    - cmake
+    - python
+    - pytorch={{ environ.get('KALDIFEAT_TORCH_VERSION') }}
+    - cpuonly
+
+  run:
+    - python
+    - pytorch={{ environ.get('KALDIFEAT_TORCH_VERSION') }}
+
+about:
+  home: https://github.com/csukuangfj/kaldifeat
+  license: Apache V2
+  license_file: LICENSE
+  summary: Kaldi-compatible feature extraction with PyTorch
+  description: |
+    Kaldi-compatible feature extraction with PyTorch,
+    supporting CUDA, batch processing, chunk processing, and autograd
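
For reference, the Jinja-templated build string above is resolved from environment variables exported by scripts/build_conda_cpu.sh; a rough sketch of the rendering, using the example values documented in that script (the values themselves are assumptions):

import os

# Example values documented in scripts/build_conda_cpu.sh.
os.environ.setdefault("KALDIFEAT_PYTHON_VERSION", "3.8")
os.environ.setdefault("KALDIFEAT_TORCH_VERSION", "1.7.1")

build_string = "cpu_py{}_torch{}".format(
    os.environ.get("KALDIFEAT_PYTHON_VERSION"),
    os.environ.get("KALDIFEAT_TORCH_VERSION"),
)
print(build_string)  # cpu_py3.8_torch1.7.1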

Some files were not shown because too many files have changed in this diff.