mirror of
https://github.com/csukuangfj/kaldifeat.git
synced 2025-08-09 10:02:20 +00:00
234 lines
14 KiB
HTML
<!DOCTYPE html>
|
|
<html class="writer-html5" lang="en">
|
|
<head>
|
|
<meta charset="utf-8" /><meta name="generator" content="Docutils 0.19: https://docutils.sourceforge.io/" />
|
|
|
|
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
|
<title>Introduction — kaldifeat 1.25.5 documentation</title>
|
|
<link rel="stylesheet" type="text/css" href="_static/pygments.css" />
|
|
<link rel="stylesheet" type="text/css" href="_static/css/theme.css" />
|
|
|
|
|
|
<script src="_static/jquery.js"></script>
|
|
<script src="_static/_sphinx_javascript_frameworks_compat.js"></script>
|
|
<script data-url_root="./" id="documentation_options" src="_static/documentation_options.js"></script>
|
|
<script src="_static/doctools.js"></script>
|
|
<script src="_static/sphinx_highlight.js"></script>
|
|
<script src="_static/js/theme.js"></script>
|
|
<link rel="index" title="Index" href="genindex.html" />
|
|
<link rel="search" title="Search" href="search.html" />
|
|
<link rel="next" title="Installation" href="installation/index.html" />
|
|
<link rel="prev" title="kaldifeat" href="index.html" />
|
|
</head>
|
|
|
|
<body class="wy-body-for-nav">
|
|
<div class="wy-grid-for-nav">
|
|
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
|
|
<div class="wy-side-scroll">
|
|
<div class="wy-side-nav-search" >
|
|
|
|
|
|
|
|
<a href="index.html" class="icon icon-home">
|
|
kaldifeat
|
|
</a>
|
|
<div role="search">
|
|
<form id="rtd-search-form" class="wy-form" action="search.html" method="get">
|
|
<input type="text" name="q" placeholder="Search docs" aria-label="Search docs" />
|
|
<input type="hidden" name="check_keywords" value="yes" />
|
|
<input type="hidden" name="area" value="default" />
|
|
</form>
|
|
</div>
|
|
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
|
|
<p class="caption" role="heading"><span class="caption-text">Contents</span></p>
|
|
<ul class="current">
|
|
<li class="toctree-l1 current"><a class="current reference internal" href="#">Introduction</a></li>
|
|
<li class="toctree-l1"><a class="reference internal" href="installation/index.html">Installation</a></li>
|
|
<li class="toctree-l1"><a class="reference internal" href="usage/index.html">Usage</a></li>
|
|
</ul>
|
|
|
|
</div>
|
|
</div>
|
|
</nav>
|
|
|
|
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" >
|
|
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
|
|
<a href="index.html">kaldifeat</a>
|
|
</nav>
|
|
|
|
<div class="wy-nav-content">
|
|
<div class="rst-content style-external-links">
|
|
<div role="navigation" aria-label="Page navigation">
|
|
<ul class="wy-breadcrumbs">
|
|
<li><a href="index.html" class="icon icon-home" aria-label="Home"></a></li>
|
|
<li class="breadcrumb-item active">Introduction</li>
|
|
|
|
</ul>
|
|
<hr/>
|
|
</div>
|
|
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
|
|
<div itemprop="articleBody">
|
|
|
|
<section id="introduction">
|
|
<h1>Introduction<a class="headerlink" href="#introduction" title="Permalink to this heading"></a></h1>
|
|
<p><a class="reference external" href="https://github.com/csukuangfj/kaldifeat">kaldifeat</a> implements
|
|
speech feature extraction algorithms <strong>compatible</strong> with <a class="reference external" href="https://github.com/kaldi-asr/kaldi">Kaldi</a> using <a class="reference external" href="https://pytorch.org/">PyTorch</a>,
|
|
supporting CUDA as well as autograd.</p>
|
|
<p><a class="reference external" href="https://github.com/csukuangfj/kaldifeat">kaldifeat</a> has the following features:</p>
|
|
<blockquote>
|
|
<div><ul>
|
|
<li><p>Fully compatible with <a class="reference external" href="https://github.com/kaldi-asr/kaldi">Kaldi</a></p>
|
|
<div class="admonition note">
|
|
<p class="admonition-title">Note</p>
|
|
<p>The underlying C++ code is adapted from <a class="reference external" href="https://github.com/kaldi-asr/kaldi">Kaldi</a>
and rewritten with the <cite>PyTorch</cite> C++ APIs.</p>
|
|
</div>
|
|
</li>
|
|
<li><p>Provide not only <code class="docutils literal notranslate"><span class="pre">C++</span> <span class="pre">APIs</span></code> but also <code class="docutils literal notranslate"><span class="pre">Python</span> <span class="pre">APIs</span></code></p>
|
|
<div class="admonition note">
|
|
<p class="admonition-title">Note</p>
|
|
<p>You can access <a class="reference external" href="https://github.com/csukuangfj/kaldifeat">kaldifeat</a> from <code class="docutils literal notranslate"><span class="pre">Python</span></code>.</p>
|
|
</div>
|
|
</li>
|
|
<li><p>Support autograd</p></li>
|
|
<li><p>Support <code class="docutils literal notranslate"><span class="pre">CUDA</span></code> and <code class="docutils literal notranslate"><span class="pre">CPU</span></code></p>
|
|
<div class="admonition note">
|
|
<p class="admonition-title">Note</p>
|
|
<p>You can use CUDA for feature extraction.</p>
|
|
</div>
|
|
</li>
|
|
<li><p>Support <code class="docutils literal notranslate"><span class="pre">online</span></code> (i.e., <code class="docutils literal notranslate"><span class="pre">streaming</span></code>) and <code class="docutils literal notranslate"><span class="pre">offline</span></code> (i.e., <code class="docutils literal notranslate"><span class="pre">non-streaming</span></code>)
|
|
feature extraction</p></li>
|
|
<li><p>Support chunk-based processing</p>
|
|
<div class="admonition note">
|
|
<p class="admonition-title">Note</p>
|
|
<p>This is especially useful for recordings that are several hours long,
which may cause out-of-memory (OOM) errors if they are processed in one pass.
With chunk-based processing, you can process audio of arbitrary length.</p>
|
|
</div>
|
|
</li>
|
|
<li><p>Support batch processing</p>
|
|
<div class="admonition note">
|
|
<p class="admonition-title">Note</p>
|
|
<p>With <a class="reference external" href="https://github.com/csukuangfj/kaldifeat">kaldifeat</a>, you can extract features for a batch of audio recordings.</p>
|
|
</div>
|
|
</li>
|
|
</ul>
|
|
</div></blockquote>
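<p>The chunk-based processing noted in the list above can be pictured with a little frame arithmetic. The sketch below is not part of kaldifeat: the helper names are hypothetical, and it assumes 16 kHz audio, a 25 ms window (400 samples), a 10 ms shift (160 samples), and Kaldi's <code class="docutils literal notranslate"><span class="pre">snip_edges=True</span></code> convention of dropping incomplete trailing frames. It shows how a long recording can be split into frame ranges that are computed one at a time.</p>

```python
# Hypothetical helpers for illustration only; these are not kaldifeat functions.


def num_frames(num_samples, frame_length=400, frame_shift=160):
    """Number of complete frames when incomplete trailing frames are dropped."""
    return max(0, 1 + (num_samples - frame_length) // frame_shift)


def frame_chunks(total_frames, chunk_size):
    """Yield (start, end) frame ranges, each covering at most chunk_size frames."""
    for start in range(0, total_frames, chunk_size):
        yield start, min(start + chunk_size, total_frames)


def samples_for(start, end, frame_length=400, frame_shift=160):
    """Sample range needed to compute frames in [start, end)."""
    return start * frame_shift, (end - 1) * frame_shift + frame_length


one_hour = 3600 * 16000                  # samples in one hour of 16 kHz audio
total = num_frames(one_hour)
chunks = list(frame_chunks(total, chunk_size=1000))

# Each chunk only needs a bounded slice of the waveform in memory.
first_begin, first_end = samples_for(*chunks[0])
print(total, len(chunks), first_end - first_begin)
```

<p>Because consecutive frames overlap, each chunk's sample range extends slightly past the start of the next chunk; the peak memory use still depends on the chunk size rather than on the length of the recording.</p>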
|
|
<p>Currently implemented speech features and their counterparts in <a class="reference external" href="https://github.com/kaldi-asr/kaldi">Kaldi</a> are
|
|
listed in the following table.</p>
|
|
<table class="docutils align-default" id="id1">
|
|
<caption><span class="caption-number">Table 1 </span><span class="caption-text">Supported speech features</span><a class="headerlink" href="#id1" title="Permalink to this table"></a></caption>
|
|
<colgroup>
|
|
<col style="width: 50.0%" />
|
|
<col style="width: 50.0%" />
|
|
</colgroup>
|
|
<thead>
|
|
<tr class="row-odd"><th class="head"><p>Supported speech features</p></th>
|
|
<th class="head"><p>Counterpart in <a class="reference external" href="https://github.com/kaldi-asr/kaldi">Kaldi</a></p></th>
|
|
</tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr class="row-even"><td><p><a class="reference external" href="https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/fbank.py#L10">kaldifeat.Fbank</a></p></td>
|
|
<td><p><a class="reference external" href="https://github.com/kaldi-asr/kaldi/blob/master/src/featbin/compute-fbank-feats.cc">compute-fbank-feats</a></p></td>
|
|
</tr>
|
|
<tr class="row-odd"><td><p><a class="reference external" href="https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/mfcc.py#L10">kaldifeat.Mfcc</a></p></td>
|
|
<td><p><a class="reference external" href="https://github.com/kaldi-asr/kaldi/blob/master/src/featbin/compute-mfcc-feats.cc">compute-mfcc-feats</a></p></td>
|
|
</tr>
|
|
<tr class="row-even"><td><p><a class="reference external" href="https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/plp.py#L10">kaldifeat.Plp</a></p></td>
|
|
<td><p><a class="reference external" href="https://github.com/kaldi-asr/kaldi/blob/master/src/featbin/compute-plp-feats.cc">compute-plp-feats</a></p></td>
|
|
</tr>
|
|
<tr class="row-odd"><td><p><a class="reference external" href="https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/spectrogram.py#L9">kaldifeat.Spectrogram</a></p></td>
|
|
<td><p><a class="reference external" href="https://github.com/kaldi-asr/kaldi/blob/master/src/featbin/compute-spectrogram-feats.cc">compute-spectrogram-feats</a></p></td>
|
|
</tr>
|
|
<tr class="row-even"><td><p><a class="reference external" href="https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/fbank.py#L16">kaldifeat.OnlineFbank</a></p></td>
|
|
<td><p><a class="reference external" href="https://github.com/kaldi-asr/kaldi/blob/master/src/feat/online-feature.h#L160">kaldi::OnlineFbank</a></p></td>
|
|
</tr>
|
|
<tr class="row-odd"><td><p><a class="reference external" href="https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/mfcc.py#L16">kaldifeat.OnlineMfcc</a></p></td>
|
|
<td><p><a class="reference external" href="https://github.com/kaldi-asr/kaldi/blob/master/src/feat/online-feature.h#L158">kaldi::OnlineMfcc</a></p></td>
|
|
</tr>
|
|
<tr class="row-even"><td><p><a class="reference external" href="https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/python/kaldifeat/plp.py#L16">kaldifeat.OnlinePlp</a></p></td>
|
|
<td><p><a class="reference external" href="https://github.com/kaldi-asr/kaldi/blob/master/src/feat/online-feature.h#L159">kaldi::OnlinePlp</a></p></td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
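<p>The difference between the offline and online computers in the table above is how audio arrives: all at once, or piece by piece. The following is a minimal sketch of that idea only, assuming a 400-sample window and a 160-sample shift. The class and method names are hypothetical and do not reflect kaldifeat's API, and the sketch stops at framing rather than computing any features.</p>

```python
class StreamingFramer:
    """Hypothetical illustration only; not a kaldifeat class."""

    def __init__(self, frame_length=400, frame_shift=160):
        self.frame_length = frame_length
        self.frame_shift = frame_shift
        self.buffer = []   # samples received but not yet discarded
        self.frames = []   # every complete frame emitted so far

    def push_samples(self, samples):
        """Append new samples and emit each frame that is now complete."""
        self.buffer.extend(samples)
        while len(self.buffer) >= self.frame_length:
            self.frames.append(self.buffer[:self.frame_length])
            # Drop only one shift so the overlapping samples stay available.
            del self.buffer[:self.frame_shift]


wave = list(range(16000))  # one second of placeholder "audio" at 16 kHz

offline = StreamingFramer()
offline.push_samples(wave)                          # everything at once

online = StreamingFramer()
for start in range(0, len(wave), 1600):
    online.push_samples(wave[start:start + 1600])   # 100 ms at a time

print(len(offline.frames), offline.frames == online.frames)
```

<p>Feeding the audio in pieces produces the same frames as feeding it in one call; the streaming version simply makes each frame available as soon as enough samples have arrived.</p>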
|
|
<p>Each feature computer is configured through an options object. The following table lists the options
class for each computer and the corresponding options in <a class="reference external" href="https://github.com/kaldi-asr/kaldi">Kaldi</a>.</p>
|
|
<div class="admonition hint">
|
|
<p class="admonition-title">Hint</p>
|
|
<p>Note that we reuse the parameter names from <a class="reference external" href="https://github.com/kaldi-asr/kaldi">Kaldi</a>.</p>
|
|
<p>Also, the online and offline versions of a feature computer share the
same options class.</p>
|
|
</div>
|
|
<table class="docutils align-default" id="id2">
|
|
<caption><span class="caption-number">Table 2 </span><span class="caption-text">Feature computer options</span><a class="headerlink" href="#id2" title="Permalink to this table"></a></caption>
|
|
<colgroup>
|
|
<col style="width: 50.0%" />
|
|
<col style="width: 50.0%" />
|
|
</colgroup>
|
|
<thead>
|
|
<tr class="row-odd"><th class="head"><p>Options in <a class="reference external" href="https://github.com/csukuangfj/kaldifeat">kaldifeat</a></p></th>
|
|
<th class="head"><p>Corresponding options in <a class="reference external" href="https://github.com/kaldi-asr/kaldi">Kaldi</a></p></th>
|
|
</tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr class="row-even"><td><p><a class="reference external" href="https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/feature-fbank.h#L19">kaldifeat.FbankOptions</a></p></td>
|
|
<td><p><a class="reference external" href="https://github.com/kaldi-asr/kaldi/blob/master/src/feat/feature-fbank.h#L41">kaldi::FbankOptions</a></p></td>
|
|
</tr>
|
|
<tr class="row-odd"><td><p><a class="reference external" href="https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/feature-mfcc.h#L22">kaldifeat.MfccOptions</a></p></td>
|
|
<td><p><a class="reference external" href="https://github.com/kaldi-asr/kaldi/blob/master/src/feat/feature-mfcc.h#L38">kaldi::MfccOptions</a></p></td>
|
|
</tr>
|
|
<tr class="row-even"><td><p><a class="reference external" href="https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/feature-plp.h#L24">kaldifeat.PlpOptions</a></p></td>
|
|
<td><p><a class="reference external" href="https://github.com/kaldi-asr/kaldi/blob/master/src/feat/feature-plp.h#L42">kaldi::PlpOptions</a></p></td>
|
|
</tr>
|
|
<tr class="row-odd"><td><p><a class="reference external" href="https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/feature-spectrogram.h#L18">kaldifeat.SpectrogramOptions</a></p></td>
|
|
<td><p><a class="reference external" href="https://github.com/kaldi-asr/kaldi/blob/master/src/feat/feature-spectrogram.h#L38">kaldi::SpectrogramOptions</a></p></td>
|
|
</tr>
|
|
<tr class="row-even"><td><p><a class="reference external" href="https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/feature-window.h#L30">kaldifeat.FrameExtractionOptions</a></p></td>
|
|
<td><p><a class="reference external" href="https://github.com/kaldi-asr/kaldi/blob/master/src/feat/feature-window.h#L35">kaldi::FrameExtractionOptions</a></p></td>
|
|
</tr>
|
|
<tr class="row-odd"><td><p><a class="reference external" href="https://github.com/csukuangfj/kaldifeat/blob/master/kaldifeat/csrc/mel-computations.h#L17">kaldifeat.MelBanksOptions</a></p></td>
|
|
<td><p><a class="reference external" href="https://github.com/kaldi-asr/kaldi/blob/master/src/feat/mel-computations.h#L43">kaldi::MelBanksOptions</a></p></td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
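<p>To make the relationship between the option classes in the table above concrete, here is a rough sketch using plain Python dataclasses. These are stand-ins written for this illustration, not kaldifeat's classes. The field names <code class="docutils literal notranslate"><span class="pre">frame_opts</span></code>, <code class="docutils literal notranslate"><span class="pre">mel_opts</span></code>, <code class="docutils literal notranslate"><span class="pre">samp_freq</span></code>, <code class="docutils literal notranslate"><span class="pre">frame_shift_ms</span></code>, <code class="docutils literal notranslate"><span class="pre">frame_length_ms</span></code>, and <code class="docutils literal notranslate"><span class="pre">num_bins</span></code> follow Kaldi's parameter names; the values, the helper methods, and the two computer classes are illustrative assumptions.</p>

```python
from dataclasses import dataclass, field


@dataclass
class FrameOptions:
    """Stand-in for FrameExtractionOptions (values are illustrative)."""
    samp_freq: float = 16000.0
    frame_shift_ms: float = 10.0
    frame_length_ms: float = 25.0

    def window_shift(self):
        """Frame shift in samples."""
        return int(self.samp_freq * self.frame_shift_ms / 1000)

    def window_size(self):
        """Frame length in samples."""
        return int(self.samp_freq * self.frame_length_ms / 1000)


@dataclass
class MelOptions:
    """Stand-in for MelBanksOptions (value is illustrative)."""
    num_bins: int = 80


@dataclass
class FbankLikeOptions:
    """Stand-in for FbankOptions: it nests the two option groups above."""
    frame_opts: FrameOptions = field(default_factory=FrameOptions)
    mel_opts: MelOptions = field(default_factory=MelOptions)


class OfflineComputer:
    """Hypothetical offline computer; it only stores its options here."""
    def __init__(self, opts):
        self.opts = opts


class OnlineComputer:
    """Hypothetical online computer configured by the same options type."""
    def __init__(self, opts):
        self.opts = opts


opts = FbankLikeOptions()
opts.frame_opts.samp_freq = 8000.0   # e.g. telephone-band audio
opts.mel_opts.num_bins = 40

offline = OfflineComputer(opts)
online = OnlineComputer(opts)        # same options object, no conversion
print(opts.frame_opts.window_size(), opts.frame_opts.window_shift())
```

<p>Sharing one options type means a configuration tuned for offline extraction can be passed unchanged to the streaming counterpart.</p>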
|
|
<p>Read more to learn how to install <a class="reference external" href="https://github.com/csukuangfj/kaldifeat">kaldifeat</a> and how to use each feature
|
|
computer.</p>
|
|
</section>
|
|
|
|
|
|
</div>
|
|
</div>
|
|
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
|
|
<a href="index.html" class="btn btn-neutral float-left" title="kaldifeat" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
|
|
<a href="installation/index.html" class="btn btn-neutral float-right" title="Installation" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
|
|
</div>
|
|
|
|
<hr/>
|
|
|
|
<div role="contentinfo">
|
|
<p>© Copyright 2021, Fangjun Kuang.</p>
|
|
</div>
|
|
|
|
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
|
|
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
|
|
provided by <a href="https://readthedocs.org">Read the Docs</a>.
|
|
|
|
|
|
</footer>
|
|
</div>
|
|
</div>
|
|
</section>
|
|
</div>
|
|
<script>
|
|
jQuery(function () {
|
|
SphinxRtdTheme.Navigation.enable(true);
|
|
});
|
|
</script>
|
|
|
|
</body>
|
|
</html>