
Triton ensemble model configuration for transducer models #371

Closed · pavankumar-ds opened this issue Apr 27, 2023 · 13 comments

@pavankumar-ds commented Apr 27, 2023

Hello, could you please provide a reference configuration for the ensemble transducer model in the example repository for pure Triton-based inference? Specifically, how do we feed the variable y from the scorer back into the decoder input? Also, the template is missing the joiner_encoder_proj and joiner_decoder_proj parts.

@uni-saurabh-vyas (Contributor)

I am working on it now and managed to follow the docs to get the Triton server up and running (all components, including the ensemble transducer, are running), but when I try to start the client script, it fails.

python3 decode_manifest_triton.py --manifest-filename /mnt/efs/dspavankumar/e/tamil_icefall/data/test_re/icefall_manifests/cuts_1.jsonl.gz --server-addr 0.0.0.0 --server-port 8001 --streaming  --model-name transducer --chunk_size 16 --context 2

tritonclient.utils.InferenceServerException: [StatusCode.INVALID_ARGUMENT] in ensemble 'transducer', inference request for sequence 10107 to model 'feature_extractor' must specify the START flag on the first request of the sequence

One weird thing I noticed was that when I start the server, I see these warnings/errors

Cleaning up...
free(): invalid pointer
free(): invalid pointer

Is this possibly related to the memory leak issue?
triton-inference-server/server#3777

@uni-saurabh-vyas (Contributor)

Also, I am trying the pretrained model from the section "Deploy onnx with arbitrary pruned_transducer_stateless_X(2,3,4,5) model for Chinese or English recipes"
at https://github.com/k2-fsa/sherpa/tree/master/triton

After downloading the model files, I am getting the following error:

./pruned_transducer_stateless3/export_onnx.py \
    --exp-dir ./icefall_librispeech_streaming_pruned_transducer_stateless3_giga_0.9_20220625/exp \
    --tokenizer-file ./icefall_librispeech_streaming_pruned_transducer_stateless3_giga_0.9_20220625/data/lang_bpe_500/bpe.model \
    --epoch 999 \
    --avg 1 \
    --streaming-model 1 \
    --causal-convolution 1 \
    --onnx 1 \
    --left-context 64 \
    --right-context 4 \
    --fp16

    sp.load(params.tokenizer_file)
File "/mnt/efs/dspavankumar/tools/miniconda3/envs/icefall_env/lib/python3.10/site-packages/sentencepiece/__init__.py", line 905, in Load
  return self.LoadFromFile(model_file)
File "/mnt/efs/dspavankumar/tools/miniconda3/envs/icefall_env/lib/python3.10/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
  return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
RuntimeError: Internal: src/sentencepiece_processor.cc(1101) [model_proto->ParseFromArray(serialized.data(), serialized.size())]

@uni-saurabh-vyas (Contributor)

@csukuangfj

I am getting this error when I try to run the default streaming example provided in the sherpa/triton folder (https://github.com/k2-fsa/sherpa/tree/master/triton/model_repo_streaming):

tritonclient.utils.InferenceServerException: [StatusCode.INTERNAL] in ensemble 'transducer', Failed to process the request(s) for model instance 'feature_extractor_0_0', message: Exception: ('Invalid first chunk size', 9360, 14880)

https://github.com/k2-fsa/sherpa/blob/master/triton/model_repo_streaming/feature_extractor/1/model.py#L46

Did you encounter this issue as well? What is the current status of your transducer setup for Triton, and is it stable for you?
I would appreciate any pointers to address this issue. I can spend some time fixing it if these are known issues, or I can try something else to make it work.

Also, as mentioned in my previous comment (#371 (comment)), I suspect it is related to the kaldifeat library memory leak issue. If this is a known issue, do you suggest trying a different library for feature extraction?

@csukuangfj (Collaborator)

@yuekaizhang

Could you help take a look at this issue?

@yuekaizhang (Collaborator) commented May 4, 2023

(Quoting the earlier comment from @uni-saurabh-vyas: the server and all components, including the ensemble transducer, are up and running, but the client fails with "inference request for sequence 10107 to model 'feature_extractor' must specify the START flag on the first request of the sequence", and the server prints free(): invalid pointer warnings at startup; possibly related to the memory leak issue triton-inference-server/server#3777.)

Hi, thanks for trying this triton recipe.

  1. "inference request for sequence 10107 to model 'feature_extractor' must specify the START flag on the first request of the sequence"
    This error may be caused by an outdated request. At the very beginning of service startup, before the service is sufficiently warmed up, a request may be cleared due to timeout; the chunks that arrive later then lose their START flag. You may first try to warm up the service with a small batch size and concurrency (see the sketch after this list).
  2. free(): invalid pointer: This warning (which I cannot explain yet) should be fine.
  3. tritonclient.utils.InferenceServerException: [StatusCode.INTERNAL] in ensemble 'transducer', Failed to process the request(s) for model instance 'feature_extractor_0_0', message: Exception: ('Invalid first chunk size', 9360, 14880): This is caused by --context 2. You should use --encoder_right_context, which is for icefall models; --context (https://github.com/k2-fsa/sherpa/blob/master/triton/client/decode_manifest_triton.py#L161) is for wenet models.
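
For point 1, here is a minimal warm-up sketch of what I mean. It is not part of the sherpa client; the tensor names "WAV"/"WAV_LENS", their dtypes, and the 14880-sample first-chunk size are assumptions based on this thread and may need adjusting for your model_repo:

import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient("0.0.0.0:8001")
first_chunk_samples = 14880  # expected first-chunk size from the errors above

for seq_id in range(1, 5):  # a few low-concurrency dummy sequences
    wav = np.zeros((1, first_chunk_samples), dtype=np.float32)
    wav_lens = np.array([[first_chunk_samples]], dtype=np.int32)

    inputs = [
        grpcclient.InferInput("WAV", list(wav.shape), "FP32"),
        grpcclient.InferInput("WAV_LENS", list(wav_lens.shape), "INT32"),
    ]
    inputs[0].set_data_from_numpy(wav)
    inputs[1].set_data_from_numpy(wav_lens)

    # Single-chunk sequence: START and END flags on the same request, so the
    # sequence batcher never sees a chunk without its START flag.
    client.infer(
        "transducer",
        inputs,
        request_id=str(seq_id),
        sequence_id=seq_id,
        sequence_start=True,
        sequence_end=True,
    )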

@uni-saurabh-vyas (Contributor) commented May 4, 2023

Hi @yuekaizhang, thanks for your response.

I ensured that the config parameters in $model_repo_path/*/config.pbtxt match the properties in the ONNX export log file
icefall_librispeech_streaming_pruned_transducer_stateless3_giga_0.9_20220625/exp/onnx_export.log

For reference:

ENCODER_LEFT_CONTEXT: 64
ENCODER_RIGHT_CONTEXT: 4
ENCODER_DIM: 512
DECODER_DIM: 512
VOCAB_SIZE: 500
DECODER_CONTEXT_SIZE: 2
CNN_MODULE_KERNEL: 31
ENCODER_LAYERS: 12
All params:{'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'encoder_dim': 512, 'nhead': 8, 'dim_feedforward': 2048, 'num_encoder_layers': 12, 'decoder_dim': 512, 'joiner_dim': 512, 'model_warm_step': 3000, 'env_info': {'k2-version': '1.23.4', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '62e404dd3f3a811d73e424199b3408e309c06e1a', 'k2-git-date': 'Mon Jan 30 02:26:16 2023', 'lhotse-version': '1.12.0', 'torch-version': '1.13.0', 'torch-cuda-available': True, 'torch-cuda-version': '11.6', 'python-version': '3.1', 'icefall-git-branch': None, 'icefall-git-sha1': None, 'icefall-git-date': None, 'icefall-path': '/mnt/efs/dspavankumar/tools/icefall', 'k2-path': '/mnt/efs/dspavankumar/tools/miniconda3/envs/icefall_env/lib/python3.10/site-packages/k2/__init__.py', 'lhotse-path': '/mnt/efs/dspavankumar/tools/miniconda3/envs/icefall_env/lib/python3.10/site-packages/lhotse/__init__.py', 'hostname': 'ip-10-40-5-20', 'IP address': '127.0.0.1'}, 'epoch': 1111, 'iter': 0, 'avg': 1, 'exp_dir': PosixPath('icefall_librispeech_streaming_pruned_transducer_stateless3_giga_0.9_20220625/exp'), 'tokenizer_file': './icefall_librispeech_streaming_pruned_transducer_stateless3_giga_0.9_20220625/data/lang_bpe_500/bpe.model', 'onnx': True, 'context_size': 2, 'left_context': 64, 'right_context': 4, 'streaming_model': True, 'fp16': True, 'dynamic_chunk_training': False, 'causal_convolution': True, 'short_chunk_size': 25, 'num_left_chunks': 4, 'blank_id': 0, 'vocab_size': 500}

Then I ran the client again: python3 decode_manifest_triton.py --encoder_right_context 4 --chunk_size 16 --manifest-filename /mnt/efs/dspavankumar/e/tamil_icefall/data/test_re/icefall_manifests/cuts.jsonl.gz --server-addr 0.0.0.0 --server-port 8001 --streaming --model-name transducer

I am still getting the same error:

task-48: 0/221
task-49: 0/221
Traceback (most recent call last):
  File "/mnt/efs/dspavankumar/tools/sherpa/triton/client/decode_manifest_triton.py", line 485, in <module>
    asyncio.run(main())
  File "/mnt/efs/dspavankumar/tools/miniconda3/envs/icefall_env/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/mnt/efs/dspavankumar/tools/miniconda3/envs/icefall_env/lib/python3.10/asyncio/base_events.py", line 649, in run_until_c
omplete
    return future.result()
  File "/mnt/efs/dspavankumar/tools/sherpa/triton/client/decode_manifest_triton.py", line 433, in main
    ans_list = await asyncio.gather(*tasks)
  File "/mnt/efs/dspavankumar/tools/sherpa/triton/client/decode_manifest_triton.py", line 316, in send_streaming
    response = await triton_client.infer(model_name,
  File "/mnt/efs/dspavankumar/tools/miniconda3/envs/icefall_env/lib/python3.10/site-packages/tritonclient/grpc/aio/__init__.py",
 line 727, in infer
    raise_error_grpc(rpc_error)
  File "/mnt/efs/dspavankumar/tools/miniconda3/envs/icefall_env/lib/python3.10/site-packages/tritonclient/grpc/__init__.py", lin
e 62, in raise_error_grpc
    raise get_error_grpc(rpc_error) from None
tritonclient.utils.InferenceServerException: [StatusCode.INTERNAL] in ensemble 'transducer', Failed to process the request(s) fo
r model instance 'feature_extractor_0_1', message: Exception: ('Invalid first chunk size', 12640, 14880)

At:
  /mnt/efs/dspavankumar/tools/sherpa/triton/model_repo_streaming_pretrained/feature_extractor/1/model.py(47): add_wavs
  /mnt/efs/dspavankumar/tools/sherpa/triton/model_repo_streaming_pretrained/feature_extractor/1/model.py(221): execute

" This error may be caused by outdated request. At the beginning of the service startup, due to insufficient warming up, if a request is cleared due to timeout, it will cause later arriving chunks to lose their start flag. You may first try to warmup service with small batch size and concurrency."

I also tried with the --num-tasks 1 argument, but it still fails.

/mnt/efs/dspavankumar/tools/sherpa/triton/client$ python3 decode_manifest_triton.py --num-tasks 1 --encoder_right_context 4 --chunk_size 16 --manifest-filename /mnt/efs/dspavankumar/e/tamil_icefall/data/test_re/icefall_manifests/cuts.jsonl.gz --server-addr 0.0.0.0 --server-port 8001 --streaming --model-name transducer
task-0: 0/11077
/mnt/efs/dspavankumar/tools/miniconda3/envs/icefall_env/lib/python3.10/site-packages/lhotse/audio.py:164: UserWarning: You requested a subset of a recording that is read from disk via a bash command. Expect large I/O overhead if you are going to read many chunks like these, since every time we will read the whole file rather than its subset.
  warnings.warn(
Traceback (most recent call last):
  File "/mnt/efs/dspavankumar/tools/sherpa/triton/client/decode_manifest_triton.py", line 485, in <module>
    asyncio.run(main())
  File "/mnt/efs/dspavankumar/tools/miniconda3/envs/icefall_env/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/mnt/efs/dspavankumar/tools/miniconda3/envs/icefall_env/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/mnt/efs/dspavankumar/tools/sherpa/triton/client/decode_manifest_triton.py", line 433, in main
    ans_list = await asyncio.gather(*tasks)
  File "/mnt/efs/dspavankumar/tools/sherpa/triton/client/decode_manifest_triton.py", line 316, in send_streaming
    response = await triton_client.infer(model_name,
  File "/mnt/efs/dspavankumar/tools/miniconda3/envs/icefall_env/lib/python3.10/site-packages/tritonclient/grpc/aio/__init__.py", line 727, in infer
    raise_error_grpc(rpc_error)
  File "/mnt/efs/dspavankumar/tools/miniconda3/envs/icefall_env/lib/python3.10/site-packages/tritonclient/grpc/__init__.py", line 62, in raise_error_grpc
    raise get_error_grpc(rpc_error) from None
tritonclient.utils.InferenceServerException: [StatusCode.INVALID_ARGUMENT] in ensemble 'transducer', inference request for sequence 10086 to model 'feature_extractor' must specify the START flag on the first request of the sequence

@yuekaizhang (Collaborator)

(Quoting the previous comment from @uni-saurabh-vyas in full: the config.pbtxt parameters match the ONNX export log, but the client still fails with Exception: ('Invalid first chunk size', 12640, 14880), and with --num-tasks 1 it fails with the START-flag error for sequence 10086.)

https://github.com/k2-fsa/sherpa/blob/master/triton/client/decode_manifest_triton.py#L381-L383

Here, please check your first_chunk_ms and decode_window_length:
decode_window_length = (args.chunk_size + 2 + args.encoder_right_context) * args.subsampling + 3, so decode_window_length should be (16 + 2 + 4) * 4 + 3 = 91.
first_chunk_ms = (decode_window_length + add_frames) * frame_shift_ms, where add_frames should be 2.
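
As a quick sanity check, the same arithmetic in a few lines of Python (a minimal sketch; the 16 kHz sample rate is an assumption on my side, but it reproduces the 14880-sample figure from the error messages above):

chunk_size = 16
encoder_right_context = 4
subsampling = 4
add_frames = 2
frame_shift_ms = 10
sample_rate = 16000  # assumed 16 kHz audio

decode_window_length = (chunk_size + 2 + encoder_right_context) * subsampling + 3
first_chunk_ms = (decode_window_length + add_frames) * frame_shift_ms
first_chunk_samples = first_chunk_ms * sample_rate // 1000

print(decode_window_length)   # 91 frames
print(first_chunk_ms)         # 930 ms
print(first_chunk_samples)    # 14880, the expected size in 'Invalid first chunk size'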

@uni-saurabh-vyas (Contributor)

I have checked; these values are correct:

ipdb> decode_window_length
91
ipdb> print(args.chunk_size )
16
ipdb> print(args.encoder_right_context)
4
ipdb> print(args.subsampling)
4
ipdb> print(add_frames)
2
ipdb> print(frame_shift_ms)
10

@yuekaizhang (Collaborator)

Failed to process the request(s) for model instance 'feature_extractor_0_1', message: Exception: ('Invalid first chunk size', 12640, 14880)

If the values are correct, could you trace back to figure out where this 12640 number comes from?

@uni-saurabh-vyas (Contributor)

Hi @yuekaizhang

I noticed that in wav_segs (https://github.com/k2-fsa/sherpa/blob/master/triton/client/decode_manifest_triton.py#L269), the last segment has a different number of samples (length) from all the other segments, which causes the issue.

So after adding

del wav_segs[-1]

at https://github.com/k2-fsa/sherpa/blob/master/triton/client/decode_manifest_triton.py#L282, the problem is fixed.

Do you think this is a bug?

@yuekaizhang (Collaborator) commented May 9, 2023

I am not sure. If it is a bug, it would be in feature_extractor/1/model.py rather than in the client. Could you make sure that the assertion assert len(self.wav) > 0 at https://github.com/k2-fsa/sherpa/blob/master/triton/model_repo_streaming/feature_extractor/1/model.py#L52 always holds? Otherwise, there is a problem somewhere.

If you keep that last segment, I don't understand why len(self.wav) would become 0 for anything other than the first chunk (https://github.com/k2-fsa/sherpa/blob/master/triton/model_repo_streaming/feature_extractor/1/model.py#L45).

Also, how did you fix the previous issue, "inference request for sequence 10107 to model 'feature_extractor' must specify the START flag on the first request of the sequence"? I think it may be related to outdated requests.

@uni-saurabh-vyas (Contributor)

Good observation. That error was caused by a few very short cuts present in the jsonl file. I used a different cuts file (one without very short cuts, i.e. shorter than 0.3 seconds), and I think that fixed that particular issue.
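
For anyone hitting the same thing, a minimal sketch of filtering such cuts with lhotse instead of editing the manifest by hand (the 0.93 s threshold is my assumption, matching the 930 ms first chunk discussed above; adjust it to your configuration):

from lhotse import CutSet

cuts = CutSet.from_file("cuts.jsonl.gz")
min_duration = 0.93  # seconds; roughly first_chunk_ms / 1000
cuts = cuts.filter(lambda c: c.duration >= min_duration)  # drop cuts shorter than the first chunk
cuts.to_file("cuts_filtered.jsonl.gz")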

@yuekaizhang (Collaborator)

Okay, closing the issue since it is fixed.
