update faq for libpython3.10.so not found (#838)

Fangjun Kuang 2023-01-13 15:21:29 +08:00 committed by GitHub
parent 958dbb3a1d
commit 5c8e9628cc
4 changed files with 79 additions and 31 deletions


@@ -81,6 +81,9 @@ todo_include_todos = True
rst_epilog = """
.. _sherpa-ncnn: https://github.com/k2-fsa/sherpa-ncnn
.. _icefall: https://github.com/k2-fsa/icefall
.. _git-lfs: https://git-lfs.com/
.. _ncnn: https://github.com/tencent/ncnn
.. _LibriSpeech: https://www.openslr.org/12
.. _musan: http://www.openslr.org/17/
"""


@@ -65,3 +65,43 @@ The fix is:
pip uninstall setuptools
pip install setuptools==58.0.4
ImportError: libpython3.10.so.1.0: cannot open shared object file: No such file or directory
--------------------------------------------------------------------------------------------
If you are using ``conda`` and encounter the following issue:
.. code-block::
Traceback (most recent call last):
File "/k2-dev/yangyifan/anaconda3/envs/icefall/lib/python3.10/site-packages/k2-1.23.3.dev20230112+cuda11.6.torch1.13.1-py3.10-linux-x86_64.egg/k2/__init__.py", line 24, in <module>
from _k2 import DeterminizeWeightPushingType
ImportError: libpython3.10.so.1.0: cannot open shared object file: No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/k2-dev/yangyifan/icefall/egs/librispeech/ASR/./pruned_transducer_stateless7_ctc_bs/decode.py", line 104, in <module>
import k2
File "/k2-dev/yangyifan/anaconda3/envs/icefall/lib/python3.10/site-packages/k2-1.23.3.dev20230112+cuda11.6.torch1.13.1-py3.10-linux-x86_64.egg/k2/__init__.py", line 30, in <module>
raise ImportError(
ImportError: libpython3.10.so.1.0: cannot open shared object file: No such file or directory
Note: If you're using anaconda and importing k2 on MacOS,
you can probably fix this by setting the environment variable:
export DYLD_LIBRARY_PATH=$CONDA_PREFIX/lib/python3.10/site-packages:$DYLD_LIBRARY_PATH
Please first find out where ``libpython3.10.so.1.0`` is located.
For instance,
.. code-block:: bash
cd $CONDA_PREFIX/lib
find . -name "libpython*"
If you are able to find it inside ``$CONDA_PREFIX/lib``, please set the
following environment variable:
.. code-block:: bash
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
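The steps above can be combined into a short shell sketch. It assumes an active ``conda`` environment, and the final import check is only meaningful if ``k2`` is installed in that environment:

```shell
# Locate libpython inside the active conda environment.
libdir="$CONDA_PREFIX/lib"
find "$libdir" -maxdepth 1 -name "libpython*"

# Prepend the env's lib dir so the dynamic linker can
# resolve libpython3.10.so.1.0 at import time.
export LD_LIBRARY_PATH="$libdir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Check whether importing k2 now works (prints a status either way).
python3 -c "import k2" 2>/dev/null \
  && echo "k2 imports fine" \
  || echo "k2 still fails to import"
```

To make the change permanent, add the ``export`` line to your shell startup file or to the conda env's activation hooks.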


@@ -1,8 +1,8 @@
Distillation with HuBERT
========================
This totorial shows you how to perform knowledge distillation in ``icefall``
with the `LibriSpeech <https://www.openslr.org/12>`_ dataset. The distillation method
This tutorial shows you how to perform knowledge distillation in `icefall`_
with the `LibriSpeech`_ dataset. The distillation method
used here is called "Multi Vector Quantization Knowledge Distillation" (MVQ-KD).
Please have a look at our paper `Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation <https://arxiv.org/abs/2211.00508>`_
for more details about MVQ-KD.
@@ -18,7 +18,7 @@ for more details about MVQ-KD.
.. note::
We assume you have read the page :ref:`install icefall` and have setup
the environment for ``icefall``.
the environment for `icefall`_.
.. HINT::
@@ -27,13 +27,13 @@ for more details about MVQ-KD.
Data preparation
----------------
We first prepare necessary training data for ``LibriSpeech``.
This is the same as in `Pruned_transducer_statelessX <./pruned_transducer_stateless.rst>`_.
We first prepare necessary training data for `LibriSpeech`_.
This is the same as in :ref:`non_streaming_librispeech_pruned_transducer_stateless`.
.. hint::
The data preparation is the same as for other recipes on the LibriSpeech dataset;
if you have finished this step, you can skip to ``Codebook index preparation`` directly.
if you have finished this step, you can skip to :ref:`codebook_index_preparation` directly.
.. code-block:: bash
@@ -61,8 +61,8 @@ For example,
.. HINT::
If you have pre-downloaded the `LibriSpeech <https://www.openslr.org/12>`_
dataset and the `musan <http://www.openslr.org/17/>`_ dataset, say,
If you have pre-downloaded the `LibriSpeech`_
dataset and the `musan`_ dataset, say,
they are saved in ``/tmp/LibriSpeech`` and ``/tmp/musan``, you can modify
the ``dl_dir`` variable in ``./prepare.sh`` to point to ``/tmp`` so that
``./prepare.sh`` won't re-download them.
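As a concrete sketch of that modification, assuming ``dl_dir`` is a plain ``dl_dir=...`` assignment near the top of ``./prepare.sh`` (which is how the recipe scripts usually define it):

```shell
# Point prepare.sh at the pre-downloaded corpora instead of re-downloading.
# This rewrites the dl_dir assignment in place.
sed -i 's|^dl_dir=.*|dl_dir=/tmp|' ./prepare.sh

# Verify the change took effect; should print: dl_dir=/tmp
grep '^dl_dir=' ./prepare.sh
```

Editing the variable by hand in a text editor works just as well; the ``sed`` one-liner is only a convenience.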
@@ -84,6 +84,8 @@ We provide the following YouTube video showing how to run ``./prepare.sh``.
.. youtube:: ofEIoJL-mGM
.. _codebook_index_preparation:
Codebook index preparation
--------------------------
@@ -91,9 +93,10 @@ Here, we prepare necessary data for MVQ-KD. This requires the generation
of codebook indexes (please read our `paper <https://arxiv.org/abs/2211.00508>`_
if you are interested in the details). In this tutorial, we use the pre-computed
codebook indexes for convenience. The only thing you need to do is to
run ``./distillation_with_hubert.sh``.
run `./distillation_with_hubert.sh <https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/distillation_with_hubert.sh>`_.
.. note::
There are 5 stages in total; the first and second stages will be skipped automatically
when you choose to download the codebook indexes prepared by `icefall`_.
Of course, you can extract and compute the codebook indexes by yourself. This
@@ -115,7 +118,7 @@ For example,
$ ./distillation_with_hubert.sh --stage 0 --stop-stage 0 # run only stage 0
$ ./distillation_with_hubert.sh --stage 2 --stop-stage 4 # run from stage 2 to stage 4
Here are a few options in ``./distillation_with_hubert.sh``
Here are a few options in `./distillation_with_hubert.sh <https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/distillation_with_hubert.sh>`_
you need to know before you proceed.
- ``--full_libri`` If True, use the full 960h of data. Otherwise only ``train-clean-100`` will be used.
@@ -151,7 +154,7 @@ following screenshot for the output of an example execution.
set ``use_extracted_codebook=False`` and set ``embedding_layer`` and
``num_codebooks`` by yourself.
Now, you should see the following files under the direcory ``./data/vq_fbank_layer36_cb8``.
Now, you should see the following files under the directory ``./data/vq_fbank_layer36_cb8``.
.. figure:: ./images/distillation_directory.png
:width: 800
@@ -191,6 +194,7 @@ Here is the code snippet for training:
There are a few arguments in the following
training commands that you should pay attention to.
- ``--enable-distillation`` If True, knowledge distillation training is enabled.
- ``--codebook-loss-scale`` The scale of the knowledge distillation loss.
- ``--manifest-dir`` The path to the MVQ-augmented manifest.
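Put together, a distillation training command might look like the sketch below. The script path and the concrete values are assumptions for illustration (check the recipe for the actual script name and sensible defaults); only the three flags themselves come from the list above, and the manifest directory matches the ``./data/vq_fbank_layer36_cb8`` directory produced earlier.

```shell
# Hypothetical invocation; script path and values are illustrative only.
./pruned_transducer_stateless6/train.py \
  --enable-distillation True \
  --codebook-loss-scale 0.1 \
  --manifest-dir ./data/vq_fbank_layer36_cb8
```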
@@ -217,4 +221,3 @@ You should get similar results as `here <https://github.com/k2-fsa/icefall/blob/
That's all! Feel free to experiment with your own setups and report your results.
If you encounter any problems during training, please open up an issue `here <https://github.com/k2-fsa/icefall/issues>`_.


@@ -1,3 +1,5 @@
.. _non_streaming_librispeech_pruned_transducer_stateless:
Pruned transducer statelessX
============================