forked from microsoft/onnxruntime-extensions
Showing 41 changed files with 2,824 additions and 1,525 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
# How to write custom ops | ||
|
||
Custom ops are built on the ONNXRuntime-extensions API, in particular the **OrtLiteCustomOp** and **Tensor** classes. C++ template metaprogramming is used heavily under the hood to give custom op authors flexibility in the number, type, and order of the parameters. | ||
|
||
## Basic scenario | ||
|
||
There are two ways to write a custom op: as a function or as a structure. | ||
|
||
### Custom op in the form of function | ||
|
||
If your kernel is simple, you only need to provide a function that computes the custom kernel. The function can take an arbitrary number of inputs and outputs. Mandatory inputs have one of the following types: | ||
|
||
```C++ | ||
const Ort::Custom::Tensor<T>& | ||
// or | ||
const Ort::Custom::Tensor<T>* | ||
``` | ||
|
||
Optional inputs have the following type: | ||
|
||
```C++ | ||
std::optional<const Ort::Custom::Tensor<T>*> | ||
``` | ||
|
||
If the kernel must run on a CUDA GPU, the function can also accept a pointer to **CUDAKernelContext**, from which you can retrieve the CUDA stream and other CUDA resources. | ||
|
||
The function returns **OrtStatusPtr**. | ||
|
||
See [negpos_def.h](https://github.com/microsoft/onnxruntime-extensions/blob/main/operators/math/cuda/negpos_def.h) for an example and [tensor_tuple.inc](https://github.com/microsoft/onnxruntime-extensions/blob/main/include/custom_op/tensor_tuple.inc) for the other supported parameter types. | ||
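
The function form can be sketched as follows. This is only an illustration under stated assumptions: `sketch::Tensor<T>` and `sketch::StatusPtr` are hypothetical stand-ins for `Ort::Custom::Tensor<T>` and `OrtStatusPtr` so that the snippet compiles on its own, and the member names `Data`, `Shape`, `NumberOfElement` and `Allocate`, as well as the non-const reference used for the output, should be checked against the real headers. `ScaleKernel` is an invented example kernel with one mandatory input, one optional input, and one output.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-in for Ort::Custom::Tensor<T>; the real class is
// provided by onnxruntime-extensions and its interface may differ.
template <typename T>
class Tensor {
 public:
  Tensor() = default;
  Tensor(std::vector<std::int64_t> shape, std::vector<T> data)
      : shape_(std::move(shape)), data_(std::move(data)) {}

  const std::vector<std::int64_t>& Shape() const { return shape_; }
  const T* Data() const { return data_.data(); }
  std::int64_t NumberOfElement() const {
    return static_cast<std::int64_t>(data_.size());
  }

  // Sizes the buffer for `shape` and returns a writable pointer to it.
  T* Allocate(const std::vector<std::int64_t>& shape) {
    std::int64_t count = 1;
    for (std::int64_t dim : shape) {
      assert(dim >= 0);
      count *= dim;
    }
    shape_ = shape;
    data_.assign(static_cast<std::size_t>(count), T{});
    return data_.data();
  }

 private:
  std::vector<std::int64_t> shape_;
  std::vector<T> data_;
};

// Hypothetical stand-in for OrtStatusPtr: nullptr means success.
using StatusPtr = void*;

}  // namespace sketch

// Function-form kernel: out = x * scale. `scale` is an optional one-element
// input; when it is absent the factor defaults to 1.
sketch::StatusPtr ScaleKernel(const sketch::Tensor<float>& x,
                              std::optional<const sketch::Tensor<float>*> scale,
                              sketch::Tensor<float>& out) {
  const bool has_scale = scale.has_value() && *scale != nullptr;
  const float factor = has_scale ? (*scale)->Data()[0] : 1.0f;
  const float* src = x.Data();
  float* dst = out.Allocate(x.Shape());
  const std::int64_t count = x.NumberOfElement();
  for (std::int64_t i = 0; i < count; ++i) dst[i] = src[i] * factor;
  return nullptr;  // success
}
```

Passing `std::nullopt` for `scale` models a model graph in which the optional input is not connected.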
|
||
### Custom op in the form of structure | ||
|
||
If the kernel is complex and the custom op has extra properties, provide a C++ structure and store those properties as its member variables. The structure must also provide the following member functions: | ||
|
||
```C++ | ||
OrtStatusPtr OnModelAttach(const OrtApi& api, const OrtKernelInfo& info) // This function initializes the properties of the custom op. | ||
|
||
OrtStatusPtr Compute(...) const // This function computes the customized kernel. | ||
``` | ||
The parameters of the Compute function follow the same rules as in the function form described above.
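
A minimal sketch of the structure form, assuming stand-in types: `sketch::Api`, `sketch::KernelInfo` and `sketch::StatusPtr` are hypothetical placeholders for `OrtApi`, `OrtKernelInfo` and `OrtStatusPtr`, and the attribute map inside `KernelInfo` is invented for illustration (the real library reads attributes through its own API). `ClipKernel` and its `max` attribute are also invented. The point is the split of responsibilities: `OnModelAttach` fills the member variables once, and the `const` `Compute` only reads them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

namespace sketch {

struct Api {};  // hypothetical stand-in for OrtApi

// Hypothetical stand-in for OrtKernelInfo: just a bag of float attributes.
struct KernelInfo {
  std::map<std::string, float> float_attrs;
};

// Hypothetical stand-in for OrtStatusPtr: nullptr means success.
using StatusPtr = void*;

}  // namespace sketch

// Structure-form custom op that clips every element to an upper bound.
struct ClipKernel {
  float max_value_ = 1.0f;  // property of the op, set in OnModelAttach

  // Reads the (invented) "max" attribute once and stores it as a member.
  sketch::StatusPtr OnModelAttach(const sketch::Api& /*api*/,
                                  const sketch::KernelInfo& info) {
    auto it = info.float_attrs.find("max");
    if (it != info.float_attrs.end()) max_value_ = it->second;
    return nullptr;  // success
  }

  // Plain vectors stand in for the tensor parameters in this sketch.
  sketch::StatusPtr Compute(const std::vector<float>& input,
                            std::vector<float>& output) const {
    output.resize(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
      output[i] = std::min(input[i], max_value_);
    }
    return nullptr;  // success
  }
};
```

Keeping `Compute` `const` matches the signature listed above and makes it explicit that the kernel state is fixed after `OnModelAttach`.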
## Advanced scenario | ||
In some cases you need more control over the parameters. You then have to use the structure form and implement member functions such as the following:
```C++ | ||
// By default the function returns OrtMemType::OrtMemTypeDefault for all inputs.
// Provide your own implementation to specify whether the i-th input is on CPU or GPU.
static OrtMemType GetInputMemoryType(size_t input_index)
// You can specify that input i shares the same memory with output j if possible, by allocating
// two arrays of the same length for the pointers input_index and output_index separately, and
// then setting (*input_index)[k] = i and (*output_index)[k] = j.
// The return value is the length of the allocated arrays.
static size_t GetMayInplace(int** input_index, int** output_index)
// Release the arrays allocated by the GetMayInplace() function.
static void ReleaseMayInplace(int* input_index, int* output_index)
``` |
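
The three hooks above could be implemented along these lines. This is a sketch under assumptions, not the library's reference implementation: `MemType` is a hypothetical stand-in for `OrtMemType`, `InplaceAddKernel` is an invented op, and pairing `std::malloc` with `std::free` is one possible allocation scheme. It declares a single may-inplace pair (input 0 with output 0) and keeps input 1 on the CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the OrtMemType enum of the ONNX Runtime C API.
enum class MemType { kDefault, kCpuInput };

struct InplaceAddKernel {
  // Assumption for this sketch: input 1 is a small tensor that must stay
  // on the CPU; every other input uses the default memory type.
  static MemType GetInputMemoryType(std::size_t input_index) {
    return input_index == 1 ? MemType::kCpuInput : MemType::kDefault;
  }

  // Declares one pair: input 0 may share its buffer with output 0.
  // Returns the number of pairs, which is the length of both arrays.
  static std::size_t GetMayInplace(int** input_index, int** output_index) {
    assert(input_index != nullptr && output_index != nullptr);
    const std::size_t count = 1;
    *input_index = static_cast<int*>(std::malloc(count * sizeof(int)));
    *output_index = static_cast<int*>(std::malloc(count * sizeof(int)));
    if (*input_index == nullptr || *output_index == nullptr) {
      // Allocation failed: report no pairs and leave nothing to release.
      std::free(*input_index);
      std::free(*output_index);
      *input_index = nullptr;
      *output_index = nullptr;
      return 0;
    }
    (*input_index)[0] = 0;   // i: index of the input
    (*output_index)[0] = 0;  // j: index of the output that may alias it
    return count;
  }

  // Frees the arrays handed out by GetMayInplace().
  static void ReleaseMayInplace(int* input_index, int* output_index) {
    std::free(input_index);
    std::free(output_index);
  }
};
```

Whatever allocator `GetMayInplace` uses, `ReleaseMayInplace` has to use the matching deallocator, since the caller only passes the raw pointers back.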
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
# ONNXRuntime Extensions C ABI | ||
|
||
ONNXRuntime Extensions provides a C-style ABI for pre-processing. It offers support for tokenization, image processing, speech feature extraction, and more. You can compile the ONNXRuntime Extensions as either a static library or a dynamic library to access these APIs. | ||
|
||
The C ABI header files are named `ortx_*.h` and can be found in the include folder. There are three types of data processing APIs available: | ||
|
||
- [`ortx_tokenizer.h`](../include/ortx_tokenizer.h): Provides tokenization for LLM models. | ||
- [`ortx_processor.h`](../include/ortx_processor.h): Offers image processing APIs for multimodal models. | ||
- [`ortx_extractor.h`](../include/ortx_extractor.h): Provides speech feature extraction for audio data processing to support the Whisper model. | ||
|
||
## ABI QuickStart | ||
|
||
Most APIs accept raw data inputs, such as audio or images in compressed binary formats, or UTF-8 encoded text for tokenization. | ||
|
||
**Tokenization:** You can create a tokenizer object using `OrtxCreateTokenizer` and then use the object to tokenize text or decode token IDs back into text. A C-style code snippet is available [here](../test/pp_api_test/c_only_test.c). | ||
|
||
**Image processing:** `OrtxCreateProcessor` can create an image processor object from a pre-defined workflow in JSON format to process image files into a tensor-like data type. An example code snippet can be found [here](../test/pp_api_test/test_processor.cc#L75). | ||
|
||
**Audio feature extraction:** `OrtxCreateSpeechFeatureExtractor` creates a speech feature extractor to obtain log mel spectrogram data as input for the Whisper model. An example code snippet can be found [here](../test/pp_api_test/test_feature_extractor.cc#L16). |
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
// Copyright (c) Microsoft Corporation. All rights reserved. | ||
// Licensed under the MIT License. | ||
|
||
// C ABI header file for the onnxruntime-extensions feature extraction module | ||
|
||
#pragma once | ||
|
||
#include "ortx_utils.h" | ||
|
||
typedef OrtxObject OrtxFeatureExtractor; | ||
typedef OrtxObject OrtxRawAudios; | ||
typedef OrtxObject OrtxTensorResult; | ||
|
||
#ifdef __cplusplus | ||
extern "C" { | ||
#endif | ||
|
||
/** | ||
* @brief Creates a feature extractor object. | ||
* | ||
* This function creates a feature extractor object based on the provided feature definition. | ||
* | ||
* @param[out] extractor Pointer to a pointer to the created feature extractor object. | ||
* @param[in] fe_def The feature definition used to create the feature extractor. | ||
* | ||
* @return An error code indicating the result of the operation. | ||
*/ | ||
extError_t ORTX_API_CALL OrtxCreateSpeechFeatureExtractor(OrtxFeatureExtractor** extractor, const char* fe_def); | ||
|
||
/** | ||
* Loads a collection of audio files into memory. | ||
* | ||
* This function loads a collection of audio files specified by the `audio_paths` array | ||
* into memory and returns a pointer to the loaded audio data in the `audios` parameter. | ||
* | ||
* @param audios A pointer to a pointer that will be updated with the loaded audio data. | ||
* The caller is responsible for freeing the memory allocated for the audio data. | ||
* @param audio_paths An array of strings representing the paths to the audio files to be loaded. | ||
* @param num_audios The number of audio files to be loaded. | ||
* | ||
* @return An `extError_t` value indicating the success or failure of the operation. | ||
*/ | ||
extError_t ORTX_API_CALL OrtxLoadAudios(OrtxRawAudios** audios, const char* const* audio_paths, size_t num_audios); | ||
|
||
/** | ||
* @brief Creates an array of raw audio objects. | ||
* | ||
* This function creates an array of raw audio objects based on the provided data and sizes. | ||
* | ||
* @param audios Pointer to the variable that will hold the created raw audio objects. | ||
* @param data Array of pointers to the audio data. | ||
* @param sizes Array of pointers to the sizes of the audio data. | ||
* @param num_audios Number of audio objects to create. | ||
* | ||
* @return extError_t Error code indicating the success or failure of the operation. | ||
*/ | ||
extError_t ORTX_API_CALL OrtxCreateRawAudios(OrtxRawAudios** audios, const void* data[], const int64_t* sizes[], size_t num_audios); | ||
|
||
/** | ||
* @brief Calculates the log mel spectrogram for a given audio using the specified feature extractor. | ||
* | ||
* This function takes an instance of the OrtxFeatureExtractor struct, an instance of the OrtxRawAudios struct, | ||
 * and a pointer to an OrtxTensorResult pointer. It calculates the log mel spectrogram for the given audio using | ||
* the specified feature extractor and stores the result in the provided log_mel pointer. | ||
* | ||
* @param extractor The feature extractor to use for calculating the log mel spectrogram. | ||
* @param audio The raw audio data to process. | ||
 * @param log_mel A pointer to an OrtxTensorResult pointer where the result will be stored. | ||
* @return An extError_t value indicating the success or failure of the operation. | ||
*/ | ||
extError_t ORTX_API_CALL OrtxSpeechLogMel(OrtxFeatureExtractor* extractor, OrtxRawAudios* audio, OrtxTensorResult** log_mel); | ||
|
||
#ifdef __cplusplus | ||
} | ||
#endif |