[DAP/Whisper] Remove redundant code.
taiqzheng committed Jul 1, 2024
1 parent 00b6aa9 commit e433100
Showing 5 changed files with 22 additions and 51 deletions.
8 changes: 3 additions & 5 deletions examples/BuddyWhisper/README.md
@@ -1,7 +1,7 @@
 # Buddy Compiler WHISPER Example

 ## Introduction
-This example shows how to use Buddy Compiler to compile a WHISPER model to MLIR code then run it. The [model](openai/whisper-base) is a pre-trained model for automatic speech recognition (ASR) and speech translation.
+This example shows how to use Buddy Compiler to compile a WHISPER model to MLIR code then run it. The [model](openai/whisper-base) is a pre-trained model for automatic speech recognition (ASR) and speech translation (ST).


 ## How to run
@@ -13,15 +13,13 @@ This example shows how to use Buddy Compiler to compile a WHISPER model to MLIR
 $ export PYTHONPATH=/path-to-buddy-mlir/llvm/build/tools/mlir/python_packages/mlir_core:/path-to-buddy-mlir/build/python_packages:${PYTHONPATH}
 ```

-3. Set model and dataset environment variable.
+3. Set model environment variable.

 ```bash
 $ export WHISPER_MODEL_PATH=/path-to-whisper-model/
-$ export AUDIO_DATASET_PATH=/path-to-audio-dataset/

 # For example:
 $ export WHISPER_MODEL_PATH=/home/xxx/whisper-base
-$ export AUDIO_DATASET_PATH=/home/xxx/librispeech_asr_dummy
 ```

 4. Build and run the WHISPER example
@@ -33,4 +31,4 @@ $ cd bin
 $ ./buddy-whisper-run
 ```

-4. Enjoy it!
+5. Enjoy it!
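
With `AUDIO_DATASET_PATH` gone, `WHISPER_MODEL_PATH` is the only variable the README asks for. The diff of `import-whisper.py` shows a fallback to `openai/whisper-base` when no path is set. A minimal sketch of that lookup follows; it assumes the script reads the variable from the process environment (that line is not visible in the diff), and `resolve_model_path` is a hypothetical helper name used only for illustration.

```python
import os

# Hub identifier used as the fallback in the diff of import-whisper.py.
DEFAULT_MODEL = "openai/whisper-base"


def resolve_model_path(env=None):
    """Return a local model path from the environment, else the Hub identifier.

    Hypothetical helper: assumes the lookup key is WHISPER_MODEL_PATH, the
    variable name given in the README.
    """
    env = os.environ if env is None else env
    model_path = env.get("WHISPER_MODEL_PATH")
    if model_path is None:
        model_path = DEFAULT_MODEL
    return model_path


if __name__ == "__main__":
    print(resolve_model_path())
```

Passing a plain dictionary instead of `os.environ` makes the fallback easy to check without touching the real environment.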
19 changes: 7 additions & 12 deletions examples/BuddyWhisper/import-whisper.py
@@ -14,16 +14,15 @@
 #
 # ===---------------------------------------------------------------------------
 #
-# This is the example of whisper model.
+# This is an example for whisper model.
 #
 # ===---------------------------------------------------------------------------

 import os
 import torch
 import torch._dynamo as dynamo
 from torch._inductor.decomposition import decompositions as inductor_decomp
-from transformers import WhisperProcessor, WhisperForConditionalGeneration
-from datasets import load_dataset
+from transformers import WhisperForConditionalGeneration
 import numpy

 from buddy.compiler.frontend import DynamoCompiler
@@ -36,22 +35,18 @@
 if model_path is None:
     model_path = "openai/whisper-base"

-# Initialize the tokenizer and model from the specified model path.
-processor = WhisperProcessor.from_pretrained(model_path)
+# Initialize the model from the specified model path.
 model = WhisperForConditionalGeneration.from_pretrained(model_path)
 model.config.use_cache = False

-ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
-sample = ds[1]["audio"]
-input_features = processor(
-    sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt"
-).input_features
-
-decoder_input_ids = torch.tensor([[50258] * 448], dtype=torch.long)
+# Generate placeholder for inputs.
+input_features = torch.zeros(size=(1, 80, 3000), dtype=torch.float32)
+decoder_input_ids = torch.zeros(size=(1, 448), dtype=torch.long)
 inputs = {
     "input_features": input_features,
     "decoder_input_ids": decoder_input_ids,
 }
+
 # Initialize Dynamo Compiler with specific configurations as an importer.
 dynamo_compiler = DynamoCompiler(
     primary_registry=tosa.ops_registry,
Binary file removed examples/BuddyWhisper/input_features.data
Binary file not shown.
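
The placeholder shapes that replace the dataset sample in `import-whisper.py` match Whisper's fixed input geometry. The sketch below derives them from the usual whisper-base settings: 16 kHz audio, 30-second windows, a hop of 160 samples, 80 mel bins, and at most 448 decoder positions. These constants are assumptions about the model configuration rather than values read from this commit, and `placeholder_shapes` is a hypothetical name.

```python
# Assumed whisper-base preprocessing and decoder settings (not taken from the commit).
SAMPLE_RATE = 16_000          # audio samples per second
CHUNK_SECONDS = 30            # length of one padded audio window
HOP_LENGTH = 160              # samples between consecutive spectrogram frames
N_MELS = 80                   # mel-frequency bins per frame
MAX_TARGET_POSITIONS = 448    # longest decoder token sequence


def placeholder_shapes(batch_size=1):
    """Return the tensor shapes an importer would need for tracing."""
    n_frames = SAMPLE_RATE * CHUNK_SECONDS // HOP_LENGTH
    return {
        "input_features": (batch_size, N_MELS, n_frames),
        "decoder_input_ids": (batch_size, MAX_TARGET_POSITIONS),
    }


if __name__ == "__main__":
    for name, shape in placeholder_shapes().items():
        print(name, shape)
```

Under these assumptions the result agrees with the `(1, 80, 3000)` and `(1, 448)` sizes passed to `torch.zeros` in the diff. Presumably the importer needs only shapes and dtypes to trace the graph, which would explain why the real audio sample and the `input_features.data` file could be dropped.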
9 changes: 6 additions & 3 deletions examples/BuddyWhisper/whisper-main.cpp
@@ -13,6 +13,10 @@
 // limitations under the License.
 //
 //===----------------------------------------------------------------------===//
+//
+// This file implements an example for Whisper Model Inference.
+//
+// ------------------------------------------------------------------------===//

 #include "whisper-main.h"

@@ -29,8 +33,6 @@ int main() {
   /// Define directories of vocabulary and parameter file.
   const std::string vocabDir = "../../examples/BuddyWhisper/vocab.txt";
   const std::string paramsDir = "../../examples/BuddyWhisper/arg0.data";
-  const std::string input_featuresDir =
-      "../../examples/BuddyWhisper/input_features.data";

   /// Initialize data containers
   // - Result container
@@ -47,7 +49,8 @@ int main() {

   /// Fill data into containers
   // - Output: register vocabulary.
-  // - Parameters: generate audioInput from rawAudioData.
+  // - Parameters: load parameters from the `arg0` file into the container.
+  // - Input: generate audioInput from rawAudioData.
   outputContainer.loadVocab(vocabDir);
   loadParameters(paramsDir, paramsContainer);
   rawAudioData = std::move(MemRef<double, 1>(rawSpeech, inputShape));
37 changes: 6 additions & 31 deletions examples/BuddyWhisper/whisper-main.h

Large diffs are not rendered by default.
