Add doc for using WeNet CTC models with sherpa-onnx (#507)
csukuangfj authored Nov 16, 2023
1 parent a34c2c8 commit 1c4fe27
Showing 10 changed files with 155 additions and 7 deletions.
2 changes: 1 addition & 1 deletion docs/source/conf.py
@@ -122,7 +122,7 @@ def get_version():
.. _PyTorch: https://pytorch.org/
.. _Huggingface: https://huggingface.co
.. _WenetSpeech: https://github.com/wenet-e2e/WenetSpeech
-.. _wenet: https://github.com/k2-fsa/sherpa
+.. _WeNet: https://github.com/wenet-e2e/wenet
.. _GigaSpeech: https://github.com/SpeechColab/GigaSpeech
.. _Kaldi: https://github.com/kaldi-asr/kaldi
.. _kaldifeat: https://csukuangfj.github.io/kaldifeat/installation.html
4 changes: 2 additions & 2 deletions docs/source/cpp/pretrained_models/index.rst
@@ -16,7 +16,7 @@ Two kinds of end-to-end (E2E) models are supported by `sherpa`_:

For CTC-based models, we support any type of models trained using CTC loss
as long as you can export the model via torchscript. Models from the following
-frameworks are currently supported: `icefall`_, `wenet`_, and `torchaudio`_ (Wav2Vec 2.0).
+frameworks are currently supported: `icefall`_, `WeNet`_, and `torchaudio`_ (Wav2Vec 2.0).
If you have a CTC model and want it to be supported in `sherpa`, please
create an issue at `<https://github.com/k2-fsa/sherpa/issues>`_.

@@ -46,7 +46,7 @@ This page lists all available pre-trained models that you can download.
for you to try offline recognition step by step.

It shows how to install sherpa and use it as offline recognizer,
-which supports the models from icefall, the wenet framework and torchaudio.
+which supports the models from icefall, the `WeNet`_ framework and torchaudio.

.. |Sherpa offline recognition python api colab notebook| image:: https://colab.research.google.com/assets/colab-badge.svg
   :target: https://colab.research.google.com/drive/1RdU06GcytTpI-r8vkQ7NkI0ugytnwJVB?usp=sharing
2 changes: 1 addition & 1 deletion docs/source/cpp/pretrained_models/offline_ctc/wenet.rst
@@ -1,7 +1,7 @@
WeNet
=====

-This section lists models from `wenet`_.
+This section lists models from `WeNet`_.

wenet-english-model (English)
-----------------------------
1 change: 1 addition & 0 deletions docs/source/onnx/pretrained_models/index.rst
@@ -21,4 +21,5 @@ available pre-trained models.
   offline-paraformer/index
   offline-ctc/index
   whisper/index
+   wenet/index
   small-online-models
36 changes: 36 additions & 0 deletions docs/source/onnx/pretrained_models/wenet/all-models.rst
@@ -0,0 +1,36 @@
All models from WeNet
=====================

`<https://github.com/wenet-e2e/wenet/blob/main/docs/pretrained_models.en.md>`_
lists all pre-trained models from `WeNet`_ and we have converted all of them
to `sherpa-onnx`_ using the following script:

`<https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/wenet/run.sh>`_.

We have uploaded the exported models to Hugging Face; the following figure
shows where to find them:

.. figure:: ./pic/wenet-models-onnx-list.jpg
   :alt: All pretrained models from `WeNet`
   :width: 600

   All pre-trained models from `WeNet`_.

To make it easier to copy the links, we list them below:

- `<https://huggingface.co/csukuangfj/sherpa-onnx-zh-wenet-aishell>`_
- `<https://huggingface.co/csukuangfj/sherpa-onnx-zh-wenet-aishell2>`_
- `<https://huggingface.co/csukuangfj/sherpa-onnx-en-wenet-gigaspeech>`_
- `<https://huggingface.co/csukuangfj/sherpa-onnx-en-wenet-librispeech>`_
- `<https://huggingface.co/csukuangfj/sherpa-onnx-zh-wenet-multi-cn>`_
- `<https://huggingface.co/csukuangfj/sherpa-onnx-zh-wenet-wenetspeech>`_

Colab
-----

We provide a colab notebook
|Sherpa-onnx wenet ctc colab notebook|
for you to try the exported `WeNet`_ models with `sherpa-onnx`_.

.. |Sherpa-onnx wenet ctc colab notebook| image:: https://colab.research.google.com/assets/colab-badge.svg
   :target: https://github.com/k2-fsa/colab/blob/master/sherpa-onnx/sherpa_onnx_with_models_from_wenet.ipynb
99 changes: 99 additions & 0 deletions docs/source/onnx/pretrained_models/wenet/how-to-export.rst
@@ -0,0 +1,99 @@
How to export models from WeNet to sherpa-onnx
==============================================

Suppose you have the following files from `WeNet`_:

- ``final.pt``
- ``train.yaml``
- ``global_cmvn``
- ``units.txt``

We describe below how to use scripts from `sherpa-onnx`_ to export your files.

.. hint::

   Both streaming and non-streaming models are supported.
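
Before running either export script, it can help to confirm that the four input files listed above are present. The sketch below is only an illustration: ``check_wenet_files`` is a hypothetical helper name, not part of `sherpa-onnx`_, and it assumes that all four files sit in one directory under exactly the names given above.

```shell
# Hypothetical helper (not part of sherpa-onnx): report which of the
# expected WeNet files are missing from a directory.
# Assumption: all four files live in the same directory.
check_wenet_files() {
  wenet_dir="${1:-.}"   # directory to inspect; defaults to the current one
  wenet_missing=0
  for wenet_file in final.pt train.yaml global_cmvn units.txt; do
    if [ ! -f "$wenet_dir/$wenet_file" ]; then
      echo "missing: $wenet_dir/$wenet_file" >&2
      wenet_missing=1
    fi
  done
  # Exit status 0 when every file is present, 1 otherwise.
  return "$wenet_missing"
}
```

For example, ``check_wenet_files /path/to/exp-dir`` prints one line per missing file to stderr and returns a non-zero status if anything is absent; the directory path here is a placeholder.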

Export for non-streaming inference
----------------------------------

You can use the following script

`<https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/wenet/export-onnx.py>`_

to export your model to `sherpa-onnx`_. After running it, you should get two files:

- ``model.onnx``
- ``model.int8.onnx``.

Next, we rename ``units.txt`` to ``tokens.txt`` to follow the convention used in `sherpa-onnx`_:

.. code-block:: bash

   mv units.txt tokens.txt

Now you can use the following commands for speech recognition with the exported models:

.. code-block:: bash

   # with float32 models
   ./build/bin/sherpa-onnx-offline \
     --wenet-ctc-model=./model.onnx \
     --tokens=./tokens.txt \
     /path/to/some.wav

   # with int8 models
   ./build/bin/sherpa-onnx-offline \
     --wenet-ctc-model=./model.int8.onnx \
     --tokens=./tokens.txt \
     /path/to/some.wav

Export for streaming inference
------------------------------

You can use the following script

`<https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/wenet/export-onnx-streaming.py>`_

to export your model to `sherpa-onnx`_. After running it, you should get two files:

- ``model-streaming.onnx``
- ``model-streaming.int8.onnx``.

Next, we rename ``units.txt`` to ``tokens.txt`` to follow the convention used in `sherpa-onnx`_:

.. code-block:: bash

   mv units.txt tokens.txt

Now you can use the following commands for speech recognition with the exported models:

.. code-block:: bash

   # with float32 models
   ./build/bin/sherpa-onnx \
     --wenet-ctc-model=./model-streaming.onnx \
     --tokens=./tokens.txt \
     /path/to/some.wav

   # with int8 models
   ./build/bin/sherpa-onnx \
     --wenet-ctc-model=./model-streaming.int8.onnx \
     --tokens=./tokens.txt \
     /path/to/some.wav

FAQs
----

sherpa-onnx/csrc/online-wenet-ctc-model.cc:Init:144 head does not exist in the metadata
---------------------------------------------------------------------------------------

.. code-block::

   /Users/fangjun/open-source/sherpa-onnx/sherpa-onnx/csrc/online-wenet-ctc-model.cc:Init:144 head does not exist in the metadata

To fix the above error, please check the following two items:

- Make sure you are using ``model-streaming.onnx`` or ``model-streaming.int8.onnx``. The executable
  you are running requires a streaming model as input.
- Make sure you use the script from `sherpa-onnx`_ to export your model.
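
The first item can be checked mechanically. The following sketch maps an exported file name to the executable used in the commands on this page. ``pick_sherpa_onnx_binary`` is a hypothetical helper, not something shipped with `sherpa-onnx`_, and it assumes the models keep the default file names listed in the export sections above.

```shell
# Hypothetical helper (not part of sherpa-onnx): choose the executable
# that matches an exported model, based only on its file name.
# Assumption: the default output names of the export scripts are unchanged.
pick_sherpa_onnx_binary() {
  case "$(basename "$1")" in
    model-streaming.onnx|model-streaming.int8.onnx)
      # Streaming models go with the streaming executable.
      echo "./build/bin/sherpa-onnx" ;;
    model.onnx|model.int8.onnx)
      # Non-streaming models go with the offline executable.
      echo "./build/bin/sherpa-onnx-offline" ;;
    *)
      echo "unknown model file: $1" >&2
      return 1 ;;
  esac
}
```

If the helper prints ``./build/bin/sherpa-onnx-offline`` for your model while you are running ``./build/bin/sherpa-onnx``, you are passing a non-streaming model to the streaming executable, which is the mismatch described in the first item.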
12 changes: 12 additions & 0 deletions docs/source/onnx/pretrained_models/wenet/index.rst
@@ -0,0 +1,12 @@
WeNet
=====

This page lists all CTC models from `WeNet`_.


.. toctree::
   :maxdepth: 5

   how-to-export
   all-models

4 changes: 2 additions & 2 deletions docs/source/sherpa/pretrained_models/index.rst
@@ -16,7 +16,7 @@ Two kinds of end-to-end (E2E) models are supported by `k2-fsa/sherpa`_:

For CTC-based models, we support any type of models trained using CTC loss
as long as you can export the model via torchscript. Models from the following
-frameworks are currently supported: `icefall`_, `wenet`_, and `torchaudio`_ (Wav2Vec 2.0).
+frameworks are currently supported: `icefall`_, `WeNet`_, and `torchaudio`_ (Wav2Vec 2.0).
If you have a CTC model and want it to be supported in `k2-fsa/sherpa`_, please
create an issue at `<https://github.com/k2-fsa/sherpa/issues>`_.

@@ -46,7 +46,7 @@ This page lists all available pre-trained models that you can download.
for you to try offline recognition step by step.

It shows how to install sherpa and use it as offline recognizer,
-which supports the models from icefall, the wenet framework and torchaudio.
+which supports the models from icefall, the `WeNet`_ framework and torchaudio.

.. |Sherpa offline recognition python api colab notebook| image:: https://colab.research.google.com/assets/colab-badge.svg
   :target: https://github.com/k2-fsa/colab/blob/master/sherpa/sherpa_offline_recognition_python_api_demo.ipynb
2 changes: 1 addition & 1 deletion docs/source/sherpa/pretrained_models/offline_ctc/wenet.rst
@@ -1,7 +1,7 @@
WeNet
=====

-This section lists models from `wenet`_.
+This section lists models from `WeNet`_.

wenet-english-model (English)
-----------------------------
