fix lint
yuekaizhang committed Jan 22, 2024
1 parent b623c3b commit 8d9ab30
Showing 10 changed files with 257 additions and 229 deletions.
7 changes: 7 additions & 0 deletions egs/aishell/ASR/README.md
@@ -24,3 +24,10 @@ The following table lists the differences among them.
The decoder in `transducer_stateless` is modified from the paper
[Rnn-Transducer with Stateless Prediction Network](https://ieeexplore.ieee.org/document/9054419/).
We place an additional Conv1d layer right after the input embedding layer.

+# Whisper
+
+Recipe to finetune large pretrained models
+| | Encoder | Decoder | Comment |
+|------------------------------------|-----------|--------------------|-----------------------------------------------------------------------------------|
+| `whisper` | Transformer | Transformer | support fine-tuning using deepspeed
8 changes: 4 additions & 4 deletions egs/aishell/ASR/RESULTS.md
@@ -77,7 +77,7 @@ It's reworked Zipformer with Pruned RNNT loss.

Command for training is:
```bash
-./prepare.sh
+./prepare.sh

export CUDA_VISIBLE_DEVICES="0,1"

@@ -142,7 +142,7 @@ export CUDA_VISIBLE_DEVICES="0,1"
--feedforward-dim 512,768,768,768,768,768 \
--encoder-dim 192,256,256,256,256,256 \
--encoder-unmasked-dim 192,192,192,192,192,192 \
---max-duration 1200
+--max-duration 1200
```

Command for decoding is:
@@ -192,7 +192,7 @@ export CUDA_VISIBLE_DEVICES="0,1"
--feedforward-dim 512,768,1536,2048,1536,768 \
--encoder-dim 192,256,512,768,512,256 \
--encoder-unmasked-dim 192,192,256,320,256,192 \
---max-duration 800
+--max-duration 800
```

Command for decoding is:
@@ -208,7 +208,7 @@ for m in greedy_search modified_beam_search fast_beam_search ; do
--num-encoder-layers 2,2,4,5,4,2 \
--feedforward-dim 512,768,1536,2048,1536,768 \
--encoder-dim 192,256,512,768,512,256 \
---encoder-unmasked-dim 192,192,256,320,256,192
+--encoder-unmasked-dim 192,192,256,320,256,192
done
```
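
The Zipformer options in these commands are comma-separated lists with one value per encoder stack (six stacks in this recipe). A minimal sketch of how such strings might be parsed and checked for consistent length follows. `parse_per_stack` and `check_same_length` are hypothetical helpers written for illustration, not functions from icefall.

```python
from typing import Dict, Tuple


def parse_per_stack(value: str) -> Tuple[int, ...]:
    """Turn a string such as "192,256,512" into a tuple of ints."""
    return tuple(int(v) for v in value.split(","))


def check_same_length(options: Dict[str, str]) -> int:
    """Return the shared number of stacks, or raise if the lists disagree."""
    lengths = {name: len(parse_per_stack(v)) for name, v in options.items()}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"per-stack options differ in length: {lengths}")
    return next(iter(lengths.values()))


# Values copied from the decoding command above.
options = {
    "num-encoder-layers": "2,2,4,5,4,2",
    "feedforward-dim": "512,768,1536,2048,1536,768",
    "encoder-dim": "192,256,512,768,512,256",
    "encoder-unmasked-dim": "192,192,256,320,256,192",
}
num_stacks = check_same_length(options)
```

A check like this catches a dropped or extra comma-separated value before a long training or decoding run starts.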

23 changes: 18 additions & 5 deletions egs/aishell/ASR/local/compute_fbank_aishell.py
@@ -29,7 +29,14 @@
from pathlib import Path

import torch
-from lhotse import CutSet, Fbank, FbankConfig, WhisperFbank, WhisperFbankConfig, LilcomChunkyWriter
+from lhotse import (
+    CutSet,
+    Fbank,
+    FbankConfig,
+    LilcomChunkyWriter,
+    WhisperFbank,
+    WhisperFbankConfig,
+)
from lhotse.recipes.utils import read_manifests_if_cached

from icefall.utils import get_executor, str2bool
@@ -42,7 +49,9 @@
torch.set_num_interop_threads(1)


-def compute_fbank_aishell(num_mel_bins: int = 80, perturb_speed: bool = False, whisper_fbank: bool = False):
+def compute_fbank_aishell(
+    num_mel_bins: int = 80, perturb_speed: bool = False, whisper_fbank: bool = False
+):
src_dir = Path("data/manifests")
output_dir = Path("data/fbank")
num_jobs = min(15, os.cpu_count())
@@ -69,7 +78,9 @@ def compute_fbank_aishell(num_mel_bins: int = 80, perturb_speed: bool = False, w
dataset_parts,
)
if whisper_fbank:
-extractor = WhisperFbank(WhisperFbankConfig(num_filters=num_mel_bins, device='cuda'))
+extractor = WhisperFbank(
+    WhisperFbankConfig(num_filters=num_mel_bins, device="cuda")
+)
else:
extractor = Fbank(FbankConfig(num_mel_bins=num_mel_bins))

@@ -84,7 +95,7 @@ def compute_fbank_aishell(num_mel_bins: int = 80, perturb_speed: bool = False, w
supervisions=m["supervisions"],
)
if "train" in partition and perturb_speed:
-logging.info(f"Doing speed perturb")
+logging.info("Doing speed perturb")
cut_set = (
cut_set + cut_set.perturb_speed(0.9) + cut_set.perturb_speed(1.1)
)
@@ -129,5 +140,7 @@ def get_args():

args = get_args()
compute_fbank_aishell(
-num_mel_bins=args.num_mel_bins, perturb_speed=args.perturb_speed, whisper_fbank=args.whisper_fbank
+num_mel_bins=args.num_mel_bins,
+perturb_speed=args.perturb_speed,
+whisper_fbank=args.whisper_fbank,
)
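
The speed-perturbation branch in this file concatenates the original cuts with copies at 0.9x and 1.1x speed, so the training set triples in size. The following is a dependency-free sketch of that bookkeeping only. `Cut` and `perturb_speed` here are hypothetical stand-ins, not the lhotse classes, and both the `_sp` id suffix and the duration scaling (duration divided by the speed factor) are assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Cut:
    """Hypothetical stand-in for a cut: an id and a duration in seconds."""

    id: str
    duration: float


def perturb_speed(cuts: List[Cut], factor: float) -> List[Cut]:
    """Return copies played `factor` times faster (assumed: duration / factor)."""
    return [
        replace(c, id=f"{c.id}_sp{factor}", duration=c.duration / factor)
        for c in cuts
    ]


cuts = [Cut("utt1", 9.9), Cut("utt2", 4.95)]
# Mirrors: cut_set + cut_set.perturb_speed(0.9) + cut_set.perturb_speed(1.1)
augmented = cuts + perturb_speed(cuts, 0.9) + perturb_speed(cuts, 1.1)
```

Under these assumptions the slowed copies are longer than the originals and the sped-up copies are shorter, which is worth keeping in mind when choosing `--max-duration` for training.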
2 changes: 1 addition & 1 deletion egs/aishell/ASR/prepare.sh
@@ -387,4 +387,4 @@ if [ $stage -le 30 ] && [ $stop_stage -ge 30 ]; then
./local/compute_fbank_musan.py --num-mel-bins ${whisper_mel_bins} --whisper-fbank true
touch data/fbank/.aishell.whisper.done
fi
-fi
+fi