diff --git a/README.md b/README.md index 81c181b9..16208e8e 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ # ESP-DL [[中文]](./README_cn.md) -[![Documentation Status](./docs/_static/doc_latest.svg)](https://docs.espressif.com/projects/esp-dl/en/latest/index.html) +[![Documentation Status](./docs/_static/doc_latest.svg)](https://docs.espressif.com/projects/esp-dl/en/latest/index.html) [![Component Registry](https://components.espressif.com/components/espressif/esp-dl/badge.svg)](https://components.espressif.com/components/espressif/esp-dl) ESP-DL is a lightweight and efficient neural network inference framework designed specifically for ESP series chips. With ESP-DL, you can easily and quickly develop AI applications using Espressif's System on Chips (SoCs). @@ -40,7 +40,7 @@ pip install git+https://github.com/espressif/esp-ppq.git First, please refer to the [ESP-DL Operator Support State](./operator_support_state.md) to ensure that the operators in your model are already supported. ESP-PPQ can directly read ONNX models for quantization. PyTorch and TensorFlow models need to be converted to ONNX first, so make sure your model can be converted to an ONNX model. -We provide the following Python script templates. Please select the appropriate template to quantize your models. For more details about quantization, please refer to [tutorial/how_to_quantize_model](./tutorial/how_to_quantize_model_en.md). +We provide the following Python script templates. Please select the appropriate template to quantize your models. For more details about quantization, please refer to [Using ESP-PPQ for Model Quantization](https://docs.espressif.com/projects/esp-dl/en/latest/tutorials/how_to_quantize_model.html). [quantize_onnx_model.py](./tools/quantization/quantize_onnx_model.py): Quantize ONNX models @@ -61,7 +61,7 @@ Model *model = new Model((const char *)espdl_model, fbs::MODEL_LOCATION_IN_FLASH model->run(inputs); // inputs is a tensor or a map of tensors ``` -For more details, please refer to [tutorial/how_to_load_model](./tutorial/how_to_load_model_en.md) and [mobilenet_v2 examples](./examples/mobilenet_v2/) +For more details, please refer to [Loading Models with ESP-DL](https://docs.espressif.com/projects/esp-dl/en/latest/tutorials/how_to_load_model.html) and [mobilenet_v2 examples](./examples/mobilenet_v2/) ## Support Models @@ -74,6 +74,6 @@ For more details, please refer to [tutorial/how_to_load_model](./tutorial/how_to ## Support Operators -If you encounter unsupported operators, please point them out in the [issues](https://github.com/espressif/esp-dl/issues), and we will support them as soon as possible. Contributions to this ESPDL are also welcomed. +If you encounter unsupported operators, please point them out in the [issues](https://github.com/espressif/esp-dl/issues), and we will support them as soon as possible. Contributions to ESP-DL are also welcome; please refer to [Creating a New Module (Operator)](https://docs.espressif.com/projects/esp-dl/en/latest/tutorials/how_to_add_a_new_module%28operator%29.html) for more details.
[ESP-DL Operator Support State](./operator_support_state.md) \ No newline at end of file diff --git a/README_cn.md b/README_cn.md index b311bb17..d78439d5 100644 --- a/README_cn.md +++ b/README_cn.md @@ -1,6 +1,6 @@ # ESP-DL [[English]](./README.md) -[![Documentation Status](./docs/_static/doc_latest.svg)](https://docs.espressif.com/projects/esp-dl/zh_CN/latest/index.html) +[![Documentation Status](./docs/_static/doc_latest.svg)](https://docs.espressif.com/projects/esp-dl/zh_CN/latest/index.html) [![Component Registry](https://components.espressif.com/components/espressif/esp-dl/badge.svg)](https://components.espressif.com/components/espressif/esp-dl) ESP-DL 是一个专为 ESP 系列芯片设计的轻量级且高效的神经网络推理框架。通过 ESP-DL,您可以轻松快速地使用乐鑫的系统级芯片 (SoC) 开发 AI 应用。 @@ -40,7 +40,7 @@ pip install git+https://github.com/espressif/esp-ppq.git ESP-PPQ 可以直接读取 ONNX 模型进行量化。PyTorch 和 TensorFlow 需要先转换为 ONNX 模型,因此请确保你的模型可以转换为 ONNX 模型。 -我们提供了以下 Python 脚本模板。你可以根据你自己的模型选择合适的模板进行修改。更多详细信息请参阅 [tutorial/how_to_quantize_model](./tutorial/how_to_quantize_model_cn.md)。 +我们提供了以下 Python 脚本模板。你可以根据你自己的模型选择合适的模板进行修改。更多详细信息请参阅 [使用 ESP-PPQ 量化模型](https://docs.espressif.com/projects/esp-dl/zh_CN/latest/tutorials/how_to_quantize_model.html)。 [quantize_onnx_model.py](./tools/quantization/quantize_onnx_model.py): 量化 ONNX 模型 @@ -60,7 +60,7 @@ Model *model = new Model((const char *)espdl_model, fbs::MODEL_LOCATION_IN_FLASH model->run(inputs); // inputs 是一个张量或张量映射 ``` -更多详细信息,请参阅 [tutorial/how_to_load_model](./tutorial/how_to_load_model_cn.md) 和 [mobilenet_v2 示例](./examples/mobilenet_v2/)。 +更多详细信息,请参阅 [使用 ESP-DL 加载模型](https://docs.espressif.com/projects/esp-dl/zh_CN/latest/tutorials/how_to_load_model.html) 和 [mobilenet_v2 示例](./examples/mobilenet_v2/)。 ## Support models @@ -73,6 +73,6 @@ model->run(inputs); // inputs 是一个张量或张量映射 ## Support Operators 如果你有遇到不支持的算子,请将问题在[issues](https://github.com/espressif/esp-dl/issues)中反馈给我们,我们会尽快支持。 -也欢迎大家贡献新的算子。 +也欢迎大家贡献新的算子,具体方法请参考[创建新模块(算子)](https://docs.espressif.com/projects/esp-dl/zh_CN/latest/tutorials/how_to_add_a_new_module%28operator%29.html)。 [算子支持状态](./operator_support_state.md) \ No newline at end of file diff --git a/esp-dl/idf_component.yml b/esp-dl/idf_component.yml index ae641398..0da189d7 100644 --- a/esp-dl/idf_component.yml +++ b/esp-dl/idf_component.yml @@ -1,4 +1,4 @@ -version: "3.0.0~1-rc.2" +version: "3.0.0" license: "MIT" targets: - esp32s3 diff --git a/examples/mobilenet_v2/README.md b/examples/mobilenet_v2/README.md index edaf8dff..e484be57 100644 --- a/examples/mobilenet_v2/README.md +++ b/examples/mobilenet_v2/README.md @@ -6,7 +6,7 @@ Deploy [MobileNet_v2](https://arxiv.org/abs/1801.04381) model from [torchvision](https://pytorch.org/vision/0.18/models/generated/torchvision.models.mobilenet_v2.html). -See [tutotial/how_to_deploy_mobilenet_v2](../../tutorial/how_to_deploy_mobilenet_v2_en.md) for more information. +See [Deploying MobileNet_v2 Using ESP-DL](https://docs.espressif.com/projects/esp-dl/en/latest/tutorials/how_to_deploy_mobilenet.html) for more information. # Example Output After flashing, you should see the following output in the idf monitor: diff --git a/operator_support_state.md b/operator_support_state.md index f5a03b0f..28cf5942 100644 --- a/operator_support_state.md +++ b/operator_support_state.md @@ -11,8 +11,8 @@ The rounding for ESP32-P4 is [rounding half to even](https://simple.wikipedia.or ## Support Operators -The ESP-DL operator interface is aligned with ONNX. Opset 13 is recommended when exporting ONNX models. -Currently, the following 30 operators have been implemented and tested.
Some operators do not implement all functionalities and attributes. Please refer to the description of each operator or [test cases](./tools/ops_test/config/op_cfg.toml) for details. +The ESP-DL operator interface is aligned with ONNX. Opset 13 is recommended when exporting ONNX models. +Currently, the following 31 operators have been implemented and tested. Some operators do not implement all functionalities and attributes. Please refer to the description of each operator or [test cases](./tools/ops_test/config/op_cfg.toml) for details. | Operator | int8 | int16 | Description | |--------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|----------|---------------------------------------------| | Add[(ESP-DL)](esp-dl/dl/module/include/dl_module_add.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__Add.html) | ✔ | ✔ | Support up to 4D | @@ -29,16 +29,17 @@ Currently, the following 30 operators have been implemented and tested. Some ope | HardSwish[(ESP-DL)](esp-dl/dl/module/include/dl_module_hard_swish.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__HardSwish.html) | ✔ | ✔ | | | LeakyRelu[(ESP-DL)](esp-dl/dl/module/include/dl_module_leaky_relu.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__LeakyRelu.html) | ✔ | ✔ | | | Log[(ESP-DL)](esp-dl/dl/module/include/dl_module_log.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__Log.html) | ✔ | ✔ | | -| MatMul[(ESP-DL)](esp-dl/dl/module/include/dl_module_matmul.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__MatMul.html) | ✔ | ✔ | | +| MatMul[(ESP-DL)](esp-dl/dl/module/include/dl_module_matmul.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__MatMul.html) | ✔ | ✔ | Support up to 4D | | MaxPool[(ESP-DL)](esp-dl/dl/module/include/dl_module_max_pool.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__MaxPool.html) | ✔ | ✔ | | | Mul[(ESP-DL)](esp-dl/dl/module/include/dl_module_mul.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__Mul.html) | ✔ | ✔ | Support up to 4D | | Pad[(ESP-DL)](esp-dl/dl/module/include/dl_module_pad.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__Pad.html) | ✔ | ✔ | Do not support wrap mode | | PRelu[(ESP-DL)](esp-dl/dl/module/include/dl_module_prelu.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__PRelu.html) | ✔ | ✔ | | | Reshape[(ESP-DL)](esp-dl/dl/module/include/dl_module_reshape.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__Reshape.html) | ✔ | ✔ | | -| Resize[(ESP-DL)](esp-dl/dl/module/include/dl_module_resize.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__Resize.html) | ✔ | ✔ | Only support nearest and do not support roi | +| Resize[(ESP-DL)](esp-dl/dl/module/include/dl_module_resize.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__Resize.html) | ✔ | ✖ | Only support nearest and do not support roi | | Sigmoid[(ESP-DL)](esp-dl/dl/module/include/dl_module_sigmoid.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__Sigmoid.html) | ✔ | ✔ | | | Slice[(ESP-DL)](esp-dl/dl/module/include/dl_module_slice.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__Slice.html) | ✔ | ✔ | | | Softmax[(ESP-DL)](esp-dl/dl/module/include/dl_module_softmax.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__Softmax.html) | ✔ | ✔ | Dtype of output is float32 | +| Split[(ESP-DL)](esp-dl/dl/module/include/dl_module_split.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__Split.html) | ✔ | ✔ | | | Sqrt[(ESP-DL)](esp-dl/dl/module/include/dl_module_sqrt.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__Sqrt.html) | ✔ | ✔ | | |
Squeeze[(ESP-DL)](esp-dl/dl/module/include/dl_module_squeeze.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__Squeeze.html) | ✔ | ✔ | | | Sub[(ESP-DL)](esp-dl/dl/module/include/dl_module_sub.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__Sub.html) | ✔ | ✔ | Support up to 4D | @@ -46,4 +47,4 @@ Currently, the following 30 operators have been implemented and tested. Some ope | Transpose[(ESP-DL)](esp-dl/dl/module/include/dl_module_transpose.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__Transpose.html) | ✔ | ✔ | | | Unsqueeze[(ESP-DL)](esp-dl/dl/module/include/dl_module_unsqueeze.hpp)[(ONNX)](https://onnx.ai/onnx/operators/onnx__Unsqueeze.html) | ✔ | ✔ | | -Generation Time: 2024-12-13 15:55:57 \ No newline at end of file +Generation Time: 2024-12-20 17:28:29 \ No newline at end of file diff --git a/tutorial/how_to_add_a_new_module(operator)_cn.md b/tutorial/how_to_add_a_new_module(operator)_cn.md deleted file mode 100644 index ee76ea56..00000000 --- a/tutorial/how_to_add_a_new_module(operator)_cn.md +++ /dev/null @@ -1,100 +0,0 @@ -# 教程:创建新模块(算子) - -本教程将指导您在 `dl::module` 命名空间中创建一个新模块的过程。`Module` 类是所有模块的基类,您将扩展这个基类来创建您的自定义模块。 - -> 注意:ESP-DL 中的模块接口应与 ONNX 对齐。 - -## 理解基类 `Module` - -基类提供了几个必须在派生类中重写的虚方法。 - -- **方法:** - - `Module(const char *name, module_inplace_t inplace, quant_type_t quant_type)`:构造函数,用于初始化模块。 - - `~Module()`:析构函数,用于释放资源。 - - `get_output_shape(std::vector> &input_shapes)`:根据输入形状计算输出形状。 - - `forward(std::vector &tensors, runtime_mode_t mode)`:运行模块,高级接口。 - - `forward_args(void *args)`:运行模块,低级接口。 - - `deserialize(fbs::FbsModel *fbs_model, std::string node_name)`:从序列化信息创建模块实例。 - - `print()`:打印模块信息。 - -更多信息,请参考 [Module 类参考](../esp-dl/dl/module/include/dl_module_base.hpp)。 - -## 创建新模块类 - -要创建一个新模块,您需要从 `Module` 基类派生一个新类并重写必要的方法。 - -### 示例:创建 `MyCustomModule` 类 - -更多示例,请参考 [esp-dl/dl/module](../esp-dl/dl/module/include)。 - -```cpp -#include "module.h" // 包含定义 Module 类的头文件 - -namespace dl { -namespace module { - -class MyCustomModule : public Module { -public: - // 构造函数 - MyCustomModule(const char *name = "MyCustomModule", - module_inplace_t inplace = MODULE_NON_INPLACE, - quant_type_t quant_type = QUANT_TYPE_NONE) - : Module(name, inplace, quant_type) {} - - // 析构函数 - virtual ~MyCustomModule() {} - - // 重写 get_output_shape 方法 - std::vector> get_output_shape(std::vector> &input_shapes) override { - // 实现根据输入形状计算输出形状的逻辑 - std::vector> output_shapes; - // 示例:假设输出形状与输入形状相同 - output_shapes.push_back(input_shapes[0]); - return output_shapes; - } - - // 重写 forward 方法 - void forward(std::vector &tensors, runtime_mode_t mode = RUNTIME_MODE_AUTO) override { - // 实现运行模块的逻辑 - // 示例:对张量执行某些操作 - for (auto &tensor : tensors) { - // 对每个张量执行某些操作 - } - } - - // 重写 forward_args 方法 - void forward_args(void *args) override { - // 实现低级接口的逻辑 - // 示例:根据参数执行某些操作 - } - - // 从序列化信息反序列化模块实例 - static Module *deserialize(fbs::FbsModel *fbs_model, std::string node_name){ - // 实现反序列化模块实例的逻辑 - // 接口应与 ONNX 对齐 - } - - // 重写 print 方法 - void print() override { - // 打印模块信息 - ESP_LOGI("MyCustomModule", "Module Name: %s, Quant type: %d", name.c_str(), quant_type); - } -}; - -} // namespace module -} // namespace dl -``` - -### 注册 `MyCustomModule` 类 - -当您实现了 `MyCustomModule` 类后,请在 [dl_module_creator](../esp-dl/dl/module/include/dl_module_creator.hpp) 中注册您的模块,使其全局可用。 - -```cpp -void register_dl_modules() -{ - if (creators.empty()) { - ... 
- this->register_module("MyCustomModule", MyCustomModule::deserialize); - } -} -``` \ No newline at end of file diff --git a/tutorial/how_to_add_a_new_module(operator)_en.md b/tutorial/how_to_add_a_new_module(operator)_en.md deleted file mode 100644 index 77d0e23c..00000000 --- a/tutorial/how_to_add_a_new_module(operator)_en.md +++ /dev/null @@ -1,100 +0,0 @@ -# Tutorial: Creating a New Module(Operator) - -This tutorial will guide you through the process of creating a new module in the `dl::module` namespace. The `Module` class serves as the base class for all modules, and you will be extending this base class to create your custom module. - -> ATTENTION: The interface of modules in ESP-DL should be aligned with ONNX. - -## Understand the Base `Module` Class - -The base class provides several virtual methods that must be overridden in your derived class. - -- **Methods:** - - `Module(const char *name, module_inplace_t inplace, quant_type_t quant_type)`: Constructor to initialize the module. - - `~Module()`: Destructor to release resources. - - `get_output_shape(std::vector> &input_shapes)`: Calculates the output shape based on the input shape. - - `forward(std::vector &tensors, runtime_mode_t mode)`: Runs the module, high-level interface. - - `forward_args(void *args)`: Runs the module, low-level interface. - - `deserialize(fbs::FbsModel *fbs_model, std::string node_name)`: Creates a module instance from serialized information. - - `print()`: Prints module information. - -For more information, please refer to the [Module Class Reference](../esp-dl/dl/module/include/dl_module_base.hpp). - -## Create a New Module Class - -To create a new module, you need to derive a new class from the `Module` base class and override the necessary methods. - -### Example: Creating a `MyCustomModule` Class - -For more examples, please refer to [esp-dl/dl/module](../esp-dl/dl/module/include). 
- -```cpp -#include "module.h" // Include the header file where the Module class is defined - -namespace dl { -namespace module { - -class MyCustomModule : public Module { -public: - // Constructor - MyCustomModule(const char *name = "MyCustomModule", - module_inplace_t inplace = MODULE_NON_INPLACE, - quant_type_t quant_type = QUANT_TYPE_NONE) - : Module(name, inplace, quant_type) {} - - // Destructor - virtual ~MyCustomModule() {} - - // Override the get_output_shape method - std::vector> get_output_shape(std::vector> &input_shapes) override { - // Implement the logic to calculate the output shape based on input shapes - std::vector> output_shapes; - // Example: Assume the output shape is the same as the input shape - output_shapes.push_back(input_shapes[0]); - return output_shapes; - } - - // Override the forward method - void forward(std::vector &tensors, runtime_mode_t mode = RUNTIME_MODE_AUTO) override { - // Implement the logic to run the module - // Example: Perform some operation on the tensors - for (auto &tensor : tensors) { - // Perform some operation on each tensor - } - } - - // Override the forward_args method - void forward_args(void *args) override { - // Implement the low-level interface logic - // Example: Perform some operation based on the arguments - } - - // Deserialize module instance by serialization information - static Module *deserialize(fbs::FbsModel *fbs_model, std::string node_name){ - // Implement the logic to deserialize the module instance - // The interface shoud be align with ONNX - } - - // Override the print method - void print() override { - // Print module information - ESP_LOGI("MyCustomModule", "Module Name: %s, Quant type: %d", name.c_str(), quant_type); - } -}; - -} // namespace module -} // namespace dl -``` - -### Register `MyCustomModule` Class - -When you have implemented `MyCustomModule` Class, please register your module in [dl_module_creator](../esp-dl/dl/module/include/dl_module_creator.hpp) as a globally available module. - -``` - void register_dl_modules() - { - if (creators.empty()) { - ... 
- this->register_module("MyCustomModule", MyCustomModule::deserialize); - } - } -``` diff --git a/tutorial/how_to_deploy_mobilenet_v2_cn.md b/tutorial/how_to_deploy_mobilenet_v2_cn.md deleted file mode 100644 index 3fa706a7..00000000 --- a/tutorial/how_to_deploy_mobilenet_v2_cn.md +++ /dev/null @@ -1,93 +0,0 @@ -# 使用 ESP-DL 部署模型的教程 - -在本教程中,我们将介绍如何量化模型,如何将 ESP-DL 前向推理结果与电脑端 esp-ppq 前向推理结果做比较。 -其重在介绍如何量化模型,如何部署模型,不涉及模型输入数据的获取和处理,模型输入为随机数。 - -## 准备工作 - -在开始之前,请确保您已经安装了 ESP-IDF 开发环境,并且已经配置好了您的开发板。 - -除此之外,您还需要安装量化工具 [esp-ppq](https://github.com/espressif/esp-ppq),该工具基于优秀的开源量化工具 [ppq](https://github.com/OpenPPL/ppq),并添加适合 ESPRESSIF 芯片平台的客制化配置形成。 - -```bash -pip uninstall ppq -pip install git+https://github.com/espressif/esp-ppq.git -``` - -## 模型量化 - -MobileNet_v2模型量化请参考 [how_to_quantize_model_cn.md](./how_to_quantize_model_cn.md)。 - - -## 模型部署及推理精度测试 - -示例工程见 `examples/mobilenet_v2`,其目录结构如下: - - ```bash - $ tree examples/mobilenet_v2 - examples/mobilenet_v2 - ├── CMakeLists.txt - ├── main - │   ├── app_main.cpp - │   ├── CMakeLists.txt - │   └── Kconfig.projbuild - ├── models - │   ├── mobilenet_v2.espdl - │   ├── mobilenet_v2.info - │   ├── mobilenet_v2.json - │   └── mobilenet_v2.onnx - ├── pack_model.py - ├── partitions_esp32p4.csv - ├── sdkconfig.defaults - └── sdkconfig.defaults.esp32p4 - - 2 directories, 12 files - ``` - -主要文件介绍如下: -- `main/app_main.cpp` 展示了如何调用 ESP-DL 接口加载、运行模型。 -- `models` 目录存放模型相关文件,其中只有 `mobilenet_v2.espdl` 文件是必须的,将会被烧录到 flash 分区中。 -- `pack_model.py` 模型打包脚本,会被 `main/CMakeLists.txt` 调用执行。 -- `partitions_esp32p4.csv` 是分区表,在该工程中,模型文件 `models/mobilenet_v2.espdl` 将会被烧录到其中的 `model` 分区。 -- `sdkconfig.defaults.esp32p4` 是项目配置,其中 `CONFIG_MODEL_FILE_PATH` 配置了模型文件路径,是基于该项目的相对路径。 - - -### 模型加载运行 - -ESP-DL 支持自动构图及内存规划,目前支持的算子见 [esp-dl/dl/module/include](../esp-dl/dl/module/include)。 -对于模型的加载运行,只需要参照下面,简单调用几个接口即可。该示例采用构造函数,以系统分区的形式加载模型。 -更多加载方式请参考 [how_to_load_model](how_to_load_model_cn.md) - - ```cpp - Model *model = new Model("model", fbs::MODEL_LOCATION_IN_FLASH_PARTITION); - ...... 
- model->run(graph_test_inputs); - ``` - -模型输入 `graph_test_inputs`,在该示例中,通过 `get_graph_test_inputs` 函数获得。 -如下所示,该函数实现主要是构建 `TensorBase` 对象,传参 `input_data` 为模型输入数据 buffer 的首地址,buffer 中的数据需要是已经量化后的数据。 -由于该示例展示的是如何测试 ESP-DL 推理精度,所以这里 `input_data` 获取的是已经被 esp-ppq 打包进 `mobilenet_v2.espdl` 文件中的测试输入值。 -**input_data 需要是首地址16字节对齐的内存块,可通过 IDF 接口 `heap_caps_aligned_alloc` 分配** - - ```cpp - const void *input_data = parser_instance->get_test_input_tensor_raw_data(input_name); - if (input_data) { - TensorBase *test_input = - new TensorBase(input->shape, input_data, input->exponent, input->dtype, false, MALLOC_CAP_SPIRAM); - test_inputs.emplace(input_name, test_input); - } - ``` - -> 对于输入数据的量化处理,ESP-DL P4 采用的 round 策略为 "Rounding half to even",可参考 [bool TensorBase::assign(TensorBase *tensor)](../esp-dl/dl/tensor/src/dl_tensor_base.cpp) 中相关实现。量化所需的 exponent 等信息,可在 "*.info" 相关模型文件中查找。 - - -### 推理结果获取及测试 - -在 `model->run(graph_test_inputs)` 运行完之后,我们就可以通过 `model->get_outputs()` 获取 ESP-DL 的推理结果了,返回的是 std::map 对象。之后,就可以参考 `compare_test_outputs` 函数实现,与模型文件中的 esp-ppq 推理结果做比较。 -如果需要在 ESP-DL 中获取模型推理的中间结果,则需额外构建中间层对应的 `TensorBase` 对象,与其名字组成 `std::map` 对象传给 `user_outputs` 入参。`TensorBase` 对象的构造参照前面 `inputs TensorBase` 对象的构造。 - ```cpp - void Model::run(std::map &user_inputs, - runtime_mode_t mode, - std::map user_outputs); - ``` - diff --git a/tutorial/how_to_deploy_mobilenet_v2_en.md b/tutorial/how_to_deploy_mobilenet_v2_en.md deleted file mode 100644 index 8c9b96c3..00000000 --- a/tutorial/how_to_deploy_mobilenet_v2_en.md +++ /dev/null @@ -1,90 +0,0 @@ -# Tutorial on Deploying Models Using ESP-DL - -In this tutorial, we will introduce how to compare the forward inference results of ESP-DL with those of esp-ppq on a computer. The focus is on how to align ESP-DL and ESP-PPQ, without involving the acquisition and processing of model input data, which is random numbers in this case. - -## Prerequisites - -Before you begin, ensure that you have installed the ESP-IDF development environment and configured your development board. - -Additionally, you need to install the quantization tool [esp-ppq](https://github.com/espressif/esp-ppq). This tool is based on the excellent open-source quantization tool [ppq](https://github.com/OpenPPL/ppq) and includes custom configurations suitable for ESPRESSIF chip platforms. - -```bash -pip uninstall ppq -pip install git+https://github.com/espressif/esp-ppq.git -``` - -## Model Quantization - -For MobileNet_v2 model quantization, please refer to [how_to_quantize_model_cn.md](./how_to_quantize_model_cn.md). - -## Model Deployment and Inference Accuracy Testing - -The example project can be found in `examples/mobilenet_v2`, with the following directory structure: - -```bash -$ tree examples/mobilenet_v2 -examples/mobilenet_v2 -├── CMakeLists.txt -├── main -│ ├── app_main.cpp -│ ├── CMakeLists.txt -│ └── Kconfig.projbuild -├── models -│ ├── mobilenet_v2.espdl -│ ├── mobilenet_v2.info -│ ├── mobilenet_v2.json -│ └── mobilenet_v2.onnx -├── pack_model.py -├── partitions_esp32p4.csv -├── sdkconfig.defaults -└── sdkconfig.defaults.esp32p4 - -2 directories, 12 files -``` - -The main files are described as follows: -- `main/app_main.cpp` demonstrates how to load and run the model using ESP-DL interfaces. -- The `models` directory stores model-related files, with only the `mobilenet_v2.espdl` file being essential and will be flashed to the flash partition. -- `pack_model.py` is the model packaging script, which is invoked by `main/CMakeLists.txt`. -- `partitions_esp32p4.csv` is the partition table. 
In this project, the model file `models/mobilenet_v2.espdl` will be flashed to the `model` partition. -- `sdkconfig.defaults.esp32p4` is the project configuration, where `CONFIG_MODEL_FILE_PATH` configures the model file path, which is relative to the project. - -### Model Loading and Running - -ESP-DL supports automatic graph construction and memory planning. The currently supported operators can be found in `esp-dl/dl/module/include`. -For loading and running the model, you only need to call a few interfaces as shown below. This example uses the constructor to load the model in the form of a system partition. -For more loading methods, please refer to [how_to_load_model](how_to_load_model_cn.md). - -```cpp -Model *model = new Model("model", fbs::MODEL_LOCATION_IN_FLASH_PARTITION); -...... -model->run(graph_test_inputs); -``` - -The model input `graph_test_inputs` is obtained in this example through the `get_graph_test_inputs` function. - -As shown below, this function mainly constructs `TensorBase` objects. The parameter `input_data` is the starting address of the model input data buffer, and the data in the buffer needs to be already quantized. -Since this example demonstrates how to test the inference accuracy of ESP-DL, the `input_data` here is obtained from the test input values already packaged into the `mobilenet_v2.espdl` file by esp-ppq. -**The `input_data` needs to be a memory block aligned to a 16-byte boundary, which can be allocated using the IDF interface `heap_caps_aligned_alloc`.** - -```cpp -const void *input_data = parser_instance->get_test_input_tensor_raw_data(input_name); -if (input_data) { - TensorBase *test_input = - new TensorBase(input->shape, input_data, input->exponent, input->dtype, false, MALLOC_CAP_SPIRAM); - test_inputs.emplace(input_name, test_input); -} -``` - -> For the quantization processing of input data, ESP-DL P4 uses the "Rounding half to even" strategy. You can refer to the relevant implementation in [bool TensorBase::assign(TensorBase *tensor)](../esp-dl/dl/tensor/src/dl_tensor_base.cpp). The required exponent and other information for quantization can be found in the "*.info" related model files. - -### Inference Result Testing - -After running `model->run(graph_test_inputs)`, we can obtain the inference results of ESP-DL through `model->get_outputs()`, which returns an `std::map` object. Afterwards, you can refer to the `compare_test_outputs` function implementation to compare with the esp-ppq inference results in the model file. -If you need to obtain intermediate results of model inference in ESP-DL, you need to additionally construct `TensorBase` objects corresponding to the intermediate layers and form an `std::map` object with their names passed to the `user_outputs` parameter. The construction of `TensorBase` objects should refer to the construction of `inputs TensorBase` objects as mentioned earlier. - -```cpp -void Model::run(std::map &user_inputs, - runtime_mode_t mode, - std::map user_outputs); -``` \ No newline at end of file diff --git a/tutorial/how_to_load_model_cn.md b/tutorial/how_to_load_model_cn.md deleted file mode 100644 index f8a8d894..00000000 --- a/tutorial/how_to_load_model_cn.md +++ /dev/null @@ -1,71 +0,0 @@ -# 使用 ESP-DL 加载模型的教程 - -在本教程中,我们将介绍如何加载一个espdl的模型。 - -## 准备工作 - -在开始之前,请确保您已经安装了 ESP-IDF 开发环境,并且已经配置好了您的开发板。此外,您需要有一个预训练的模型文件,并且已经使用 `esp-ppq` 量化完成并导出为 `espdl` 模型格式。 - -## 方法1: 从 `rodata` 中加载模型 - -### 1. 
**在 `CMakeLists.txt` 中添加模型文件**: - 参考文档 [ESP-IDF 构建系统](https://docs.espressif.com/projects/esp-idf/zh_CN/stable/esp32/api-guides/build-system.html#cmake-embed-data),将 `espdl` 模型文件添加到芯片 flash 的 `.rodata` 段。 - - ```cmake - set(embed_files your_model_path/model_name.espdl) - idf_component_register(... - EMBED_FILES ${embed_files}) - ``` - -### 2. **在程序中加载模型**: - 使用以下方法加载模型: - - ```cpp - // "_binary_model_name_espdl_start" is composed of three parts: the prefix "binary", the filename "model_name_espdl", and the suffix "_start". - extern const uint8_t espdl_model[] asm("_binary_model_name_espdl_start"); - - Model *model = new Model((const char *)espdl_model, fbs::MODEL_LOCATION_IN_FLASH_RODATA); - ``` - -## 方法2: 从 `partition` 中加载模型 - -### 1. **在 `partition.csv` 中添加模型信息**: - 在 `partition.csv` 文件中添加模型的 `offset`、`size` 等信息。 - - ```csv - # Name, Type, SubType, Offset, Size, Flags - factory, app, factory, 0x010000, 4000K, - model, data, spiffs, , 4000K, - ``` - -### 2. **在 `CMakeLists.txt` 中添加自动加载程序**: - 如果选择手动烧写,可以跳过此步骤。 - - ```cmake - set(image_file your_model_path) - partition_table_get_partition_info(size "--partition-name model" "size") - if("${size}") - esptool_py_flash_to_partition(flash "model" "${image_file}") - else() - ``` - -### 3. **在程序中加载模型**: - 有两种方法可以加载模型。 - - - 使用构造函数加载模型: - - ```cpp - // method1: - Model *model = new Model("model", fbs::MODEL_LOCATION_IN_FLASH_PARTITION); - ``` - - - 首先加载 `fbs_model`,然后使用 `fbs_model` 指针创建模型: - - ```cpp - // method2: - fbs::FbsLoader *fbs_loader = new fbs::FbsLoader("model", fbs::MODEL_LOCATION_IN_FLASH_PARTITION); - fbs::FbsModel *fbs_model = fbs_loader->load(); - Model *model2 = new Model(fbs_model); - ``` - -通过以上步骤,您可以使用 ESP-DL 库成功加载一个预训练的模型。希望本教程对您有所帮助! 更多信息请参考代码 [fbs_loader.cpp](../esp-dl/fbs_loader/src/fbs_loader.cpp) 和 [fbs_loader.hpp](../esp-dl/fbs_loader/include/fbs_loader.hpp)。 diff --git a/tutorial/how_to_load_model_en.md b/tutorial/how_to_load_model_en.md deleted file mode 100644 index ea82a930..00000000 --- a/tutorial/how_to_load_model_en.md +++ /dev/null @@ -1,71 +0,0 @@ -# Tutorial on Loading Models with ESP-DL - -In this tutorial, we will guide you through the process of loading an ESP-DL model. - -## Prerequisites - -Before you begin, ensure that you have the ESP-IDF development environment installed and your development board properly configured. Additionally, you need to have a pre-trained model file that has been quantized using `esp-ppq` and exported in the `espdl` model format. - -## Method 1: Loading the Model from `rodata` - -### 1. **Add the Model File in `CMakeLists.txt`**: - Refer to the documentation [ESP-IDF Build System](https://docs.espressif.com/projects/esp-idf/zh_CN/stable/esp32/api-guides/build-system.html#cmake-embed-data) to add the `espdl` model file to the `.rodata` section of the chip flash. - - ```cmake - set(embed_files your_model_path/model_name.espdl) - idf_component_register(... - EMBED_FILES ${embed_files}) - ``` - -### 2. **Load the Model in Your Program**: - Use the following method to load the model: - - ```cpp - // "_binary_model_name_espdl_start" is composed of three parts: the prefix "binary", the filename "model_name_espdl", and the suffix "_start". - extern const uint8_t espdl_model[] asm("_binary_model_name_espdl_start"); - - Model *model = new Model((const char *)espdl_model, fbs::MODEL_LOCATION_IN_FLASH_RODATA); - ``` - -## Method 2: Loading the Model from `partition` - -### 1. 
**Add Model Information in `partition.csv`**: - Add the model's `offset`, `size`, and other information in the `partition.csv` file. - - ```csv - # Name, Type, SubType, Offset, Size, Flags - factory, app, factory, 0x010000, 4000K, - model, data, spiffs, , 4000K, - ``` - -### 2. **Add Automatic Loading Program in `CMakeLists.txt`**: - Skip this step if you choose to manually flash. - - ```cmake - set(image_file your_model_path) - partition_table_get_partition_info(size "--partition-name model" "size") - if("${size}") - esptool_py_flash_to_partition(flash "model" "${image_file}") - else() - ``` - -### 3. **Load the Model in Your Program**: - There are two methods to load the model. - - - Load the model using the constructor: - - ```cpp - // method1: - Model *model = new Model("model", fbs::MODEL_LOCATION_IN_FLASH_PARTITION); - ``` - - - First load the `fbs_model`, then create the model using the `fbs_model` pointer: - - ```cpp - // method2: - fbs::FbsLoader *fbs_loader = new fbs::FbsLoader("model", fbs::MODEL_LOCATION_IN_FLASH_PARTITION); - fbs::FbsModel *fbs_model = fbs_loader->load(); - Model *model2 = new Model(fbs_model); - ``` - -By following the steps above, you can successfully load a pre-trained model using the ESP-DL library. We hope this tutorial is helpful to you! For more information, please refer to the code in [fbs_loader.cpp](../esp-dl/fbs_loader/src/fbs_loader.cpp) and [fbs_loader.hpp](../esp-dl/fbs_loader/include/fbs_loader.hpp). \ No newline at end of file diff --git a/tutorial/how_to_quantize_model_cn.md b/tutorial/how_to_quantize_model_cn.md deleted file mode 100644 index 0de2219d..00000000 --- a/tutorial/how_to_quantize_model_cn.md +++ /dev/null @@ -1,551 +0,0 @@ -# 使用 ESP-PPQ 量化模型 (PTQ) - -在本教程中,我们将介绍如何使用esp-ppq量化预训练模型,并分析量化误差,量化方法为 Post Training Quantization (PTQ) 。 -esp-ppq在[ppq](https://github.com/OpenPPL/ppq)的基础上添加了乐鑫定制的quantizer和exporter,方便用户根据不同的芯片选择和esp-dl匹配的量化规则,并导出为esp-dl可以直接加载的标准模型文件。esp-ppq兼容ppq所有的API和量化脚本。更多细节请参考[ppq文档和视频](https://github.com/OpenPPL/ppq)。 - -## 准备工作 - -### 1. 安装esp-ppq,注意在安装esp-ppq之前需要先卸载ppq,否则可能会引发冲突: - -```bash -pip uninstall ppq -pip install git+https://github.com/espressif/esp-ppq.git -``` - -### 2. 模型文件 -目前支持ONNX, Pytorch, TensorFlow模型。在量化过程中,pytorch和TensorFlow会先转化为ONNX模型,因此请确保你的模型可以转化为ONNX模型。 -我们提供了以下量化脚本模板,方便用户根据自己的模型选择合适的模板进行修改: - -ONNX 模型请参考脚本 [quantize_onnx_model.py](../tools/quantization/quantize_onnx_model.py) -pytorch 模型请参考脚本 [quantize_pytorch_model.py](../tools/quantization/quantize_torch_model.py) -TensorFlow 模型请参考脚本 [quantize_tf_model.py](../tools/quantization/quantize_tf_model.py) - -## 模型量化示例 - -我们将以[MobileNet_v2](https://arxiv.org/abs/1801.04381)模型为例,介绍如何使用[quantize_torch_model.py](../tools/quantization/quantize_torch_model.py)脚本量化模型。 - -### 1. 准备预训练模型 -从torchvision加载MobileNet_v2的预训练模型,你也可以从 [ONNX models](https://github.com/onnx/models) 或 [TensorFlow models](https://github.com/tensorflow/models) 下载: -```python -import torchvision -from torchvision.models.mobilenetv2 import MobileNet_V2_Weights - -model = torchvision.models.mobilenet.mobilenet_v2(weights=MobileNet_V2_Weights.IMAGENET1K_V1) -``` - - -### 2. 
准备校准数据集 - -校准数据集需要和你的模型输入格式一致,校准数据集需要尽可能覆盖你的模型输入的所有可能情况,以便更好地量化模型。这里以imagenet数据集为例,演示如何准备校准数据集。 -- 使用torchvision加载imagenet数据集: - -```python -from torchvision import datasets, transforms -from torch.utils.data import DataLoader - -transform = transforms.Compose([ - transforms.Resize(256), - transforms.CenterCrop(224), - transforms.ToTensor(), - transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]), -]) - -calib_dataset = datasets.ImageNet(root=CALIB_DIR, split='val', transform=transform) -dataloader = DataLoader(calib_dataset, batch_size=BATCH_SIZE, shuffle=False) -``` -- 使用我们提供的[imagenet_util.py](../tools/quantization/datasets/imagenet_util.py)脚本和[imagenet校准数据集](https://dl.espressif.com/public/imagenet_calib.zip),快速下载和测试。 -``` -# Load -from datasets.imagenet_util import load_imagenet_from_directory -dataloader = load_imagenet_from_directory( - directory=CALIB_DIR, - batchsize=BATCH_SIZE, - shuffle=False, - subset=1024, - require_label=False, - num_of_workers=4, - ) - -``` - - -### 3. 量化模型并导出espdl模型 - -使用 `espdl_quantize_torch` API量化模型并导出espdl模型文件,量化后会导出三个文件,分别是 -``` -**.espdl: espdl模型二进制文件,可以直接用于芯片的推理 -**.info: espdl模型文本文件,用于调试和确定espdl模型是否被正确导出 -**.json: 量化信息文件,用于量化信息的保存和加载 -``` - -函数的参数说明如下: -``` -from ppq.api import espdl_quantize_torch - -def espdl_quantize_torch( - model: torch.nn.Module, - espdl_export_file: str, - calib_dataloader: DataLoader, - calib_steps: int, - input_shape: List[Any], - inputs: Union[dict, list, torch.Tensor, None] = None, - target:str = "esp32p4", - num_of_bits:int = 8, - collate_fn: Callable = None, - setting: QuantizationSetting = None, - device: str = "cpu", - error_report: bool = True, - test_output_names: List[str] = None, - skip_export: bool = False, - export_config: bool = True, - verbose: int = 0, -) -> BaseGraph: - - """Quantize ONNX model and return quantized ppq graph and executor. - - Args: - model (torch.nn.Module): torch model - calib_dataloader (DataLoader): calibration data loader - calib_steps (int): calibration steps - input_shape (List[int]): a list of ints indicating size of inputs and batch size must be 1 - inputs (List[str]): a list of Tensor and batch size must be 1 - target: target chip, support "esp32p4" and "esp32s3" - num_of_bits: the number of quantizer bits, 8 or 16 - collate_fn (Callable): batch collate func for preprocessing - setting (QuantizationSetting): Quantization setting, default espdl setting will be used when set None - device (str, optional): execution device, defaults to 'cpu'. - error_report (bool, optional): whether to print error report, defaults to True. - test_output_names (List[str], optional): tensor names of the model want to test, defaults to None. - skip_export (bool, optional): whether to export the quantized model, defaults to False. - export_config (bool, optional): whether to export the quantization configuration, defaults to True. - verbose (int, optional): whether to print details, defaults to 0.
- - Returns: - BaseGraph: The Quantized Graph, containing all information needed for backend execution - """ -``` - -#### 8-bit量化测试 - -- **量化设置:** -``` -target="esp32p4" -num_of_bits=8 -batch_size=32 -setting=None -``` - -- **量化结果:** - -``` -Analysing Graphwise Quantization Error:: -Layer | NOISE:SIGNAL POWER RATIO -/features/features.16/conv/conv.2/Conv: | ████████████████████ | 48.831% -/features/features.15/conv/conv.2/Conv: | ███████████████████ | 45.268% -/features/features.17/conv/conv.2/Conv: | ██████████████████ | 43.112% -/features/features.18/features.18.0/Conv: | █████████████████ | 41.586% -/features/features.14/conv/conv.2/Conv: | █████████████████ | 41.135% -/features/features.13/conv/conv.2/Conv: | ██████████████ | 35.090% -/features/features.17/conv/conv.0/conv.0.0/Conv: | █████████████ | 32.895% -/features/features.16/conv/conv.1/conv.1.0/Conv: | ████████████ | 29.226% -/features/features.12/conv/conv.2/Conv: | ████████████ | 28.895% -/features/features.16/conv/conv.0/conv.0.0/Conv: | ███████████ | 27.808% -/features/features.7/conv/conv.2/Conv: | ███████████ | 27.675% -/features/features.10/conv/conv.2/Conv: | ███████████ | 26.292% -/features/features.11/conv/conv.2/Conv: | ███████████ | 26.085% -/features/features.6/conv/conv.2/Conv: | ███████████ | 25.892% -/classifier/classifier.1/Gemm: | ██████████ | 25.591% -/features/features.15/conv/conv.0/conv.0.0/Conv: | ██████████ | 25.323% -/features/features.4/conv/conv.2/Conv: | ██████████ | 24.787% -/features/features.15/conv/conv.1/conv.1.0/Conv: | ██████████ | 24.354% -/features/features.14/conv/conv.1/conv.1.0/Conv: | ████████ | 20.207% -/features/features.9/conv/conv.2/Conv: | ████████ | 19.808% -/features/features.14/conv/conv.0/conv.0.0/Conv: | ████████ | 18.465% -/features/features.5/conv/conv.2/Conv: | ███████ | 17.868% -/features/features.12/conv/conv.1/conv.1.0/Conv: | ███████ | 16.589% -/features/features.13/conv/conv.1/conv.1.0/Conv: | ███████ | 16.143% -/features/features.11/conv/conv.1/conv.1.0/Conv: | ██████ | 15.382% -/features/features.3/conv/conv.2/Conv: | ██████ | 15.105% -/features/features.13/conv/conv.0/conv.0.0/Conv: | ██████ | 15.029% -/features/features.10/conv/conv.1/conv.1.0/Conv: | ██████ | 14.875% -/features/features.2/conv/conv.2/Conv: | ██████ | 14.869% -/features/features.11/conv/conv.0/conv.0.0/Conv: | ██████ | 14.552% -/features/features.9/conv/conv.1/conv.1.0/Conv: | ██████ | 14.050% -/features/features.8/conv/conv.1/conv.1.0/Conv: | ██████ | 13.929% -/features/features.8/conv/conv.2/Conv: | ██████ | 13.833% -/features/features.12/conv/conv.0/conv.0.0/Conv: | ██████ | 13.684% -/features/features.7/conv/conv.0/conv.0.0/Conv: | █████ | 12.942% -/features/features.6/conv/conv.1/conv.1.0/Conv: | █████ | 12.765% -/features/features.10/conv/conv.0/conv.0.0/Conv: | █████ | 12.251% -/features/features.5/conv/conv.1/conv.1.0/Conv: | █████ | 11.186% -/features/features.17/conv/conv.1/conv.1.0/Conv: | ████ | 11.070% -/features/features.9/conv/conv.0/conv.0.0/Conv: | ████ | 10.371% -/features/features.4/conv/conv.1/conv.1.0/Conv: | ████ | 10.356% -/features/features.6/conv/conv.0/conv.0.0/Conv: | ████ | 10.149% -/features/features.4/conv/conv.0/conv.0.0/Conv: | ████ | 9.472% -/features/features.8/conv/conv.0/conv.0.0/Conv: | ████ | 9.232% -/features/features.3/conv/conv.1/conv.1.0/Conv: | ████ | 9.187% -/features/features.1/conv/conv.1/Conv: | ████ | 8.770% -/features/features.5/conv/conv.0/conv.0.0/Conv: | ███ | 8.408% -/features/features.7/conv/conv.1/conv.1.0/Conv: | ███ | 8.151% 
-/features/features.2/conv/conv.1/conv.1.0/Conv: | ███ | 7.156% -/features/features.3/conv/conv.0/conv.0.0/Conv: | ███ | 6.328% -/features/features.2/conv/conv.0/conv.0.0/Conv: | ██ | 5.392% -/features/features.1/conv/conv.0/conv.0.0/Conv: | | 0.875% -/features/features.0/features.0.0/Conv: | | 0.119% -Analysing Layerwise quantization error:: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 53/53 [08:44<00:00, 9.91s/it] -Layer | NOISE:SIGNAL POWER RATIO -/features/features.1/conv/conv.0/conv.0.0/Conv: | ████████████████████ | 14.303% -/features/features.0/features.0.0/Conv: | █ | 0.844% -/features/features.1/conv/conv.1/Conv: | █ | 0.667% -/features/features.2/conv/conv.1/conv.1.0/Conv: | █ | 0.574% -/features/features.3/conv/conv.1/conv.1.0/Conv: | █ | 0.419% -/features/features.15/conv/conv.1/conv.1.0/Conv: | | 0.272% -/features/features.9/conv/conv.1/conv.1.0/Conv: | | 0.238% -/features/features.17/conv/conv.1/conv.1.0/Conv: | | 0.214% -/features/features.4/conv/conv.1/conv.1.0/Conv: | | 0.180% -/features/features.11/conv/conv.1/conv.1.0/Conv: | | 0.151% -/features/features.12/conv/conv.1/conv.1.0/Conv: | | 0.148% -/features/features.16/conv/conv.1/conv.1.0/Conv: | | 0.146% -/features/features.14/conv/conv.2/Conv: | | 0.136% -/features/features.13/conv/conv.1/conv.1.0/Conv: | | 0.105% -/features/features.6/conv/conv.1/conv.1.0/Conv: | | 0.105% -/features/features.8/conv/conv.1/conv.1.0/Conv: | | 0.083% -/features/features.7/conv/conv.2/Conv: | | 0.076% -/features/features.5/conv/conv.1/conv.1.0/Conv: | | 0.076% -/features/features.3/conv/conv.2/Conv: | | 0.075% -/features/features.16/conv/conv.2/Conv: | | 0.074% -/features/features.13/conv/conv.0/conv.0.0/Conv: | | 0.072% -/features/features.15/conv/conv.2/Conv: | | 0.066% -/features/features.4/conv/conv.2/Conv: | | 0.065% -/features/features.11/conv/conv.2/Conv: | | 0.063% -/classifier/classifier.1/Gemm: | | 0.063% -/features/features.2/conv/conv.0/conv.0.0/Conv: | | 0.054% -/features/features.13/conv/conv.2/Conv: | | 0.050% -/features/features.10/conv/conv.1/conv.1.0/Conv: | | 0.042% -/features/features.17/conv/conv.0/conv.0.0/Conv: | | 0.040% -/features/features.2/conv/conv.2/Conv: | | 0.038% -/features/features.4/conv/conv.0/conv.0.0/Conv: | | 0.034% -/features/features.17/conv/conv.2/Conv: | | 0.030% -/features/features.14/conv/conv.0/conv.0.0/Conv: | | 0.025% -/features/features.16/conv/conv.0/conv.0.0/Conv: | | 0.024% -/features/features.10/conv/conv.2/Conv: | | 0.022% -/features/features.11/conv/conv.0/conv.0.0/Conv: | | 0.021% -/features/features.9/conv/conv.2/Conv: | | 0.021% -/features/features.14/conv/conv.1/conv.1.0/Conv: | | 0.020% -/features/features.7/conv/conv.1/conv.1.0/Conv: | | 0.020% -/features/features.5/conv/conv.2/Conv: | | 0.019% -/features/features.8/conv/conv.2/Conv: | | 0.018% -/features/features.12/conv/conv.2/Conv: | | 0.017% -/features/features.6/conv/conv.2/Conv: | | 0.014% -/features/features.7/conv/conv.0/conv.0.0/Conv: | | 0.014% -/features/features.3/conv/conv.0/conv.0.0/Conv: | | 0.013% -/features/features.12/conv/conv.0/conv.0.0/Conv: | | 0.009% -/features/features.15/conv/conv.0/conv.0.0/Conv: | | 0.008% -/features/features.5/conv/conv.0/conv.0.0/Conv: | | 0.006% -/features/features.6/conv/conv.0/conv.0.0/Conv: | | 0.005% -/features/features.9/conv/conv.0/conv.0.0/Conv: | | 0.003% -/features/features.18/features.18.0/Conv: | | 0.002% -/features/features.10/conv/conv.0/conv.0.0/Conv: | | 0.002% 
-/features/features.8/conv/conv.0/conv.0.0/Conv: | | 0.002% - -* Prec@1 60.500 Prec@5 83.275* -``` - -- **量化误差分析**:** - -量化后的top1准确率只有60.5%,和float模型的准确率(71.878)相差较远,量化模型精度损失较大,其中: - -**累计误差(Graphwise Error):** -该模型的最后一层为/classifier/classifier.1/Gemm, 该层的累计误差为25.591%。经验来说最后一层的累计误差小于10%,量化模型的精度损失较小。 - -**逐层误差Layerwise error:** -观察Layerwise error,发现大部分层的误差都在1%以下,说明大部分层的量化误差较小,只有少数几层误差较大,我们可以选择将误差较大的层使用int16进行量化。 -具体请看混合精度量化。 - -#### 混合精度量化测试 - -- **量化设置:** -``` -from ppq.api import get_target_platform -target="esp32p4" -num_of_bits=8 -batch_size=32 - -# 以下层使用int16进行量化 -quant_setting = QuantizationSettingFactory.espdl_setting() -quant_setting.dispatching_table.append("/features/features.1/conv/conv.0/conv.0.0/Conv", get_target_platform(TARGET, 16)) -quant_setting.dispatching_table.append("/features/features.1/conv/conv.0/conv.0.2/Clip", get_target_platform(TARGET, 16)) -``` - -- **量化结果:** - -``` -Layer | NOISE:SIGNAL POWER RATIO -/features/features.16/conv/conv.2/Conv: | ████████████████████ | 31.585% -/features/features.15/conv/conv.2/Conv: | ███████████████████ | 29.253% -/features/features.17/conv/conv.0/conv.0.0/Conv: | ████████████████ | 25.077% -/features/features.14/conv/conv.2/Conv: | ████████████████ | 24.819% -/features/features.17/conv/conv.2/Conv: | ████████████ | 19.546% -/features/features.13/conv/conv.2/Conv: | ████████████ | 19.283% -/features/features.16/conv/conv.0/conv.0.0/Conv: | ████████████ | 18.764% -/features/features.16/conv/conv.1/conv.1.0/Conv: | ████████████ | 18.596% -/features/features.18/features.18.0/Conv: | ████████████ | 18.541% -/features/features.15/conv/conv.0/conv.0.0/Conv: | ██████████ | 15.633% -/features/features.12/conv/conv.2/Conv: | █████████ | 14.784% -/features/features.15/conv/conv.1/conv.1.0/Conv: | █████████ | 14.773% -/features/features.14/conv/conv.1/conv.1.0/Conv: | █████████ | 13.700% -/features/features.6/conv/conv.2/Conv: | ████████ | 12.824% -/features/features.10/conv/conv.2/Conv: | ███████ | 11.727% -/features/features.14/conv/conv.0/conv.0.0/Conv: | ███████ | 10.612% -/features/features.11/conv/conv.2/Conv: | ██████ | 10.262% -/features/features.9/conv/conv.2/Conv: | ██████ | 9.967% -/classifier/classifier.1/Gemm: | ██████ | 9.117% -/features/features.5/conv/conv.2/Conv: | ██████ | 8.915% -/features/features.7/conv/conv.2/Conv: | █████ | 8.690% -/features/features.3/conv/conv.2/Conv: | █████ | 8.586% -/features/features.4/conv/conv.2/Conv: | █████ | 7.525% -/features/features.13/conv/conv.1/conv.1.0/Conv: | █████ | 7.432% -/features/features.12/conv/conv.1/conv.1.0/Conv: | █████ | 7.317% -/features/features.13/conv/conv.0/conv.0.0/Conv: | ████ | 6.848% -/features/features.8/conv/conv.2/Conv: | ████ | 6.711% -/features/features.10/conv/conv.1/conv.1.0/Conv: | ████ | 6.100% -/features/features.8/conv/conv.1/conv.1.0/Conv: | ████ | 6.043% -/features/features.11/conv/conv.1/conv.1.0/Conv: | ████ | 5.962% -/features/features.9/conv/conv.1/conv.1.0/Conv: | ████ | 5.873% -/features/features.12/conv/conv.0/conv.0.0/Conv: | ████ | 5.833% -/features/features.7/conv/conv.0/conv.0.0/Conv: | ████ | 5.832% -/features/features.11/conv/conv.0/conv.0.0/Conv: | ████ | 5.736% -/features/features.6/conv/conv.1/conv.1.0/Conv: | ████ | 5.639% -/features/features.5/conv/conv.1/conv.1.0/Conv: | ███ | 5.017% -/features/features.10/conv/conv.0/conv.0.0/Conv: | ███ | 4.963% -/features/features.17/conv/conv.1/conv.1.0/Conv: | ███ | 4.870% -/features/features.3/conv/conv.1/conv.1.0/Conv: | ███ | 4.655% -/features/features.2/conv/conv.2/Conv: | ███ | 4.650% 
-/features/features.4/conv/conv.0/conv.0.0/Conv: | ███ | 4.648% -/features/features.1/conv/conv.1/Conv: | ███ | 4.318% -/features/features.9/conv/conv.0/conv.0.0/Conv: | ██ | 3.849% -/features/features.6/conv/conv.0/conv.0.0/Conv: | ██ | 3.712% -/features/features.4/conv/conv.1/conv.1.0/Conv: | ██ | 3.394% -/features/features.8/conv/conv.0/conv.0.0/Conv: | ██ | 3.391% -/features/features.7/conv/conv.1/conv.1.0/Conv: | ██ | 2.713% -/features/features.2/conv/conv.1/conv.1.0/Conv: | ██ | 2.637% -/features/features.2/conv/conv.0/conv.0.0/Conv: | ██ | 2.602% -/features/features.5/conv/conv.0/conv.0.0/Conv: | █ | 2.397% -/features/features.3/conv/conv.0/conv.0.0/Conv: | █ | 1.759% -/features/features.1/conv/conv.0/conv.0.0/Conv: | | 0.433% -/features/features.0/features.0.0/Conv: | | 0.119% -Analysing Layerwise quantization error:: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 53/53 [08:27<00:00, 9.58s/it] -* -Layer | NOISE:SIGNAL POWER RATIO -/features/features.1/conv/conv.1/Conv: | ████████████████████ | 1.096% -/features/features.0/features.0.0/Conv: | ███████████████ | 0.844% -/features/features.2/conv/conv.1/conv.1.0/Conv: | ██████████ | 0.574% -/features/features.3/conv/conv.1/conv.1.0/Conv: | ████████ | 0.425% -/features/features.15/conv/conv.1/conv.1.0/Conv: | █████ | 0.272% -/features/features.9/conv/conv.1/conv.1.0/Conv: | ████ | 0.238% -/features/features.17/conv/conv.1/conv.1.0/Conv: | ████ | 0.214% -/features/features.4/conv/conv.1/conv.1.0/Conv: | ███ | 0.180% -/features/features.11/conv/conv.1/conv.1.0/Conv: | ███ | 0.151% -/features/features.12/conv/conv.1/conv.1.0/Conv: | ███ | 0.148% -/features/features.16/conv/conv.1/conv.1.0/Conv: | ███ | 0.146% -/features/features.14/conv/conv.2/Conv: | ██ | 0.136% -/features/features.13/conv/conv.1/conv.1.0/Conv: | ██ | 0.105% -/features/features.6/conv/conv.1/conv.1.0/Conv: | ██ | 0.105% -/features/features.8/conv/conv.1/conv.1.0/Conv: | █ | 0.083% -/features/features.5/conv/conv.1/conv.1.0/Conv: | █ | 0.076% -/features/features.3/conv/conv.2/Conv: | █ | 0.075% -/features/features.16/conv/conv.2/Conv: | █ | 0.074% -/features/features.13/conv/conv.0/conv.0.0/Conv: | █ | 0.072% -/features/features.7/conv/conv.2/Conv: | █ | 0.071% -/features/features.15/conv/conv.2/Conv: | █ | 0.066% -/features/features.4/conv/conv.2/Conv: | █ | 0.065% -/features/features.11/conv/conv.2/Conv: | █ | 0.063% -/classifier/classifier.1/Gemm: | █ | 0.063% -/features/features.13/conv/conv.2/Conv: | █ | 0.059% -/features/features.2/conv/conv.0/conv.0.0/Conv: | █ | 0.054% -/features/features.10/conv/conv.1/conv.1.0/Conv: | █ | 0.042% -/features/features.17/conv/conv.0/conv.0.0/Conv: | █ | 0.040% -/features/features.2/conv/conv.2/Conv: | █ | 0.038% -/features/features.4/conv/conv.0/conv.0.0/Conv: | █ | 0.034% -/features/features.17/conv/conv.2/Conv: | █ | 0.030% -/features/features.14/conv/conv.0/conv.0.0/Conv: | | 0.025% -/features/features.16/conv/conv.0/conv.0.0/Conv: | | 0.024% -/features/features.10/conv/conv.2/Conv: | | 0.022% -/features/features.11/conv/conv.0/conv.0.0/Conv: | | 0.021% -/features/features.9/conv/conv.2/Conv: | | 0.021% -/features/features.14/conv/conv.1/conv.1.0/Conv: | | 0.020% -/features/features.7/conv/conv.1/conv.1.0/Conv: | | 0.020% -/features/features.5/conv/conv.2/Conv: | | 0.019% -/features/features.8/conv/conv.2/Conv: | | 0.018% -/features/features.12/conv/conv.2/Conv: | | 0.017% -/features/features.1/conv/conv.0/conv.0.0/Conv: | | 0.017% 
-/features/features.6/conv/conv.2/Conv: | | 0.014% -/features/features.7/conv/conv.0/conv.0.0/Conv: | | 0.014% -/features/features.3/conv/conv.0/conv.0.0/Conv: | | 0.013% -/features/features.12/conv/conv.0/conv.0.0/Conv: | | 0.009% -/features/features.15/conv/conv.0/conv.0.0/Conv: | | 0.008% -/features/features.5/conv/conv.0/conv.0.0/Conv: | | 0.006% -/features/features.6/conv/conv.0/conv.0.0/Conv: | | 0.005% -/features/features.9/conv/conv.0/conv.0.0/Conv: | | 0.003% -/features/features.18/features.18.0/Conv: | | 0.002% -/features/features.10/conv/conv.0/conv.0.0/Conv: | | 0.002% -/features/features.8/conv/conv.0/conv.0.0/Conv: | | 0.002% - -* Prec@1 69.550 Prec@5 88.450* -``` - -- **量化误差分析:** - -将之前误差最大的层替换为16-bits量化后,可以观察到模型准确度明显提升,量化后的top1准确率为69.550%,和float模型的准确率(71.878%0)比较接近。 - -该模型的最后一层/classifier/classifier.1/Gemm 的累计误差为9.117%。 - -#### 层间均衡量化测试 - -- **量化设置:** - -``` -import torch.nn as nn -def convert_relu6_to_relu(model): - for child_name, child in model.named_children(): - if isinstance(child, nn.ReLU6): - setattr(model, child_name, nn.ReLU()) - else: - convert_relu6_to_relu(child) - return model -# 将ReLU6 替换为 ReLU -model = convert_relu6_to_relu(model) -# 使用层间均衡 -quant_setting = QuantizationSettingFactory.espdl_setting() -quant_setting.equalization = True -quant_setting.equalization_setting.iterations = 4 -quant_setting.equalization_setting.value_threshold = .4 -quant_setting.equalization_setting.opt_level = 2 -quant_setting.equalization_setting.interested_layers = None -``` - -- **量化结果:** - -``` -Layer | NOISE:SIGNAL POWER RATIO -/features/features.16/conv/conv.2/Conv: | ████████████████████ | 34.497% -/features/features.15/conv/conv.2/Conv: | ██████████████████ | 30.813% -/features/features.14/conv/conv.2/Conv: | ███████████████ | 25.876% -/features/features.17/conv/conv.0/conv.0.0/Conv: | ██████████████ | 24.498% -/features/features.17/conv/conv.2/Conv: | ████████████ | 20.290% -/features/features.13/conv/conv.2/Conv: | ████████████ | 20.177% -/features/features.16/conv/conv.0/conv.0.0/Conv: | ████████████ | 19.993% -/features/features.18/features.18.0/Conv: | ███████████ | 19.536% -/features/features.16/conv/conv.1/conv.1.0/Conv: | ██████████ | 17.879% -/features/features.12/conv/conv.2/Conv: | ██████████ | 17.150% -/features/features.15/conv/conv.0/conv.0.0/Conv: | █████████ | 15.970% -/features/features.15/conv/conv.1/conv.1.0/Conv: | █████████ | 15.254% -/features/features.1/conv/conv.1/Conv: | █████████ | 15.122% -/features/features.10/conv/conv.2/Conv: | █████████ | 14.917% -/features/features.6/conv/conv.2/Conv: | ████████ | 13.446% -/features/features.11/conv/conv.2/Conv: | ███████ | 12.533% -/features/features.9/conv/conv.2/Conv: | ███████ | 11.479% -/features/features.14/conv/conv.1/conv.1.0/Conv: | ███████ | 11.470% -/features/features.5/conv/conv.2/Conv: | ██████ | 10.669% -/features/features.3/conv/conv.2/Conv: | ██████ | 10.526% -/features/features.14/conv/conv.0/conv.0.0/Conv: | ██████ | 9.529% -/features/features.7/conv/conv.2/Conv: | █████ | 9.500% -/classifier/classifier.1/Gemm: | █████ | 8.965% -/features/features.4/conv/conv.2/Conv: | █████ | 8.674% -/features/features.12/conv/conv.1/conv.1.0/Conv: | █████ | 8.349% -/features/features.13/conv/conv.1/conv.1.0/Conv: | █████ | 8.068% -/features/features.8/conv/conv.2/Conv: | █████ | 7.961% -/features/features.13/conv/conv.0/conv.0.0/Conv: | ████ | 7.451% -/features/features.10/conv/conv.1/conv.1.0/Conv: | ████ | 6.714% -/features/features.9/conv/conv.1/conv.1.0/Conv: | ████ | 6.399% 
-/features/features.8/conv/conv.1/conv.1.0/Conv: | ████ | 6.369%
-/features/features.11/conv/conv.1/conv.1.0/Conv: | ████ | 6.222%
-/features/features.2/conv/conv.2/Conv: | ███ | 5.867%
-/features/features.5/conv/conv.1/conv.1.0/Conv: | ███ | 5.719%
-/features/features.12/conv/conv.0/conv.0.0/Conv: | ███ | 5.546%
-/features/features.6/conv/conv.1/conv.1.0/Conv: | ███ | 5.414%
-/features/features.10/conv/conv.0/conv.0.0/Conv: | ███ | 5.093%
-/features/features.17/conv/conv.1/conv.1.0/Conv: | ███ | 4.951%
-/features/features.11/conv/conv.0/conv.0.0/Conv: | ███ | 4.941%
-/features/features.2/conv/conv.1/conv.1.0/Conv: | ███ | 4.825%
-/features/features.7/conv/conv.0/conv.0.0/Conv: | ██ | 4.330%
-/features/features.2/conv/conv.0/conv.0.0/Conv: | ██ | 4.299%
-/features/features.3/conv/conv.1/conv.1.0/Conv: | ██ | 4.283%
-/features/features.4/conv/conv.0/conv.0.0/Conv: | ██ | 3.477%
-/features/features.4/conv/conv.1/conv.1.0/Conv: | ██ | 3.287%
-/features/features.8/conv/conv.0/conv.0.0/Conv: | ██ | 2.787%
-/features/features.9/conv/conv.0/conv.0.0/Conv: | ██ | 2.774%
-/features/features.6/conv/conv.0/conv.0.0/Conv: | ██ | 2.705%
-/features/features.7/conv/conv.1/conv.1.0/Conv: | ██ | 2.636%
-/features/features.5/conv/conv.0/conv.0.0/Conv: | █ | 1.846%
-/features/features.3/conv/conv.0/conv.0.0/Conv: | █ | 1.170%
-/features/features.1/conv/conv.0/conv.0.0/Conv: | | 0.389%
-/features/features.0/features.0.0/Conv: | | 0.025%
-Analysing Layerwise quantization error:: 100%|██████████| 53/53 [07:46<00:00, 8.80s/it]
-Layer | NOISE:SIGNAL POWER RATIO
-/features/features.1/conv/conv.0/conv.0.0/Conv: | ████████████████████ | 0.989%
-/features/features.0/features.0.0/Conv: | █████████████████ | 0.845%
-/features/features.16/conv/conv.2/Conv: | █████ | 0.238%
-/features/features.17/conv/conv.2/Conv: | ████ | 0.202%
-/features/features.14/conv/conv.2/Conv: | ████ | 0.198%
-/features/features.1/conv/conv.1/Conv: | ████ | 0.192%
-/features/features.15/conv/conv.2/Conv: | ███ | 0.145%
-/features/features.4/conv/conv.2/Conv: | ██ | 0.120%
-/features/features.2/conv/conv.2/Conv: | ██ | 0.111%
-/features/features.2/conv/conv.1/conv.1.0/Conv: | ██ | 0.079%
-/classifier/classifier.1/Gemm: | █ | 0.062%
-/features/features.13/conv/conv.2/Conv: | █ | 0.050%
-/features/features.3/conv/conv.2/Conv: | █ | 0.050%
-/features/features.12/conv/conv.2/Conv: | █ | 0.050%
-/features/features.5/conv/conv.1/conv.1.0/Conv: | █ | 0.047%
-/features/features.3/conv/conv.1/conv.1.0/Conv: | █ | 0.046%
-/features/features.7/conv/conv.2/Conv: | █ | 0.045%
-/features/features.5/conv/conv.2/Conv: | █ | 0.030%
-/features/features.11/conv/conv.2/Conv: | █ | 0.028%
-/features/features.6/conv/conv.2/Conv: | █ | 0.027%
-/features/features.6/conv/conv.1/conv.1.0/Conv: | █ | 0.026%
-/features/features.4/conv/conv.0/conv.0.0/Conv: | | 0.025%
-/features/features.15/conv/conv.1/conv.1.0/Conv: | | 0.023%
-/features/features.8/conv/conv.1/conv.1.0/Conv: | | 0.021%
-/features/features.10/conv/conv.2/Conv: | | 0.020%
-/features/features.11/conv/conv.1/conv.1.0/Conv: | | 0.020%
-/features/features.16/conv/conv.1/conv.1.0/Conv: | | 0.017%
-/features/features.14/conv/conv.0/conv.0.0/Conv: | | 0.016%
-/features/features.4/conv/conv.1/conv.1.0/Conv: | | 0.012%
-/features/features.13/conv/conv.1/conv.1.0/Conv: | | 0.012%
-/features/features.13/conv/conv.0/conv.0.0/Conv: | | 0.012%
-/features/features.12/conv/conv.1/conv.1.0/Conv: | | 0.012%
-/features/features.17/conv/conv.0/conv.0.0/Conv: | | 0.011%
-/features/features.12/conv/conv.0/conv.0.0/Conv: | | 0.011%
-/features/features.2/conv/conv.0/conv.0.0/Conv: | | 0.010%
-/features/features.9/conv/conv.2/Conv: | | 0.008%
-/features/features.8/conv/conv.2/Conv: | | 0.008%
-/features/features.10/conv/conv.1/conv.1.0/Conv: | | 0.008%
-/features/features.16/conv/conv.0/conv.0.0/Conv: | | 0.008%
-/features/features.7/conv/conv.0/conv.0.0/Conv: | | 0.008%
-/features/features.10/conv/conv.0/conv.0.0/Conv: | | 0.006%
-/features/features.15/conv/conv.0/conv.0.0/Conv: | | 0.005%
-/features/features.3/conv/conv.0/conv.0.0/Conv: | | 0.004%
-/features/features.11/conv/conv.0/conv.0.0/Conv: | | 0.004%
-/features/features.18/features.18.0/Conv: | | 0.003%
-/features/features.5/conv/conv.0/conv.0.0/Conv: | | 0.003%
-/features/features.9/conv/conv.1/conv.1.0/Conv: | | 0.003%
-/features/features.6/conv/conv.0/conv.0.0/Conv: | | 0.003%
-/features/features.7/conv/conv.1/conv.1.0/Conv: | | 0.003%
-/features/features.17/conv/conv.1/conv.1.0/Conv: | | 0.002%
-/features/features.14/conv/conv.1/conv.1.0/Conv: | | 0.002%
-/features/features.8/conv/conv.0/conv.0.0/Conv: | | 0.001%
-/features/features.9/conv/conv.0/conv.0.0/Conv: | | 0.001%
-
- * Prec@1 69.800 Prec@5 88.550
-```
-
-- **Quantization Error Analysis:**
-
-Note that applying layerwise equalization to 8-bit quantization helps reduce the quantization loss. The cumulative error of the model's last layer, /classifier/classifier.1/Gemm, is 8.965%. The top-1 accuracy after quantization is 69.800%, even closer to the float model's accuracy (71.878%) and higher than that of mixed-precision quantization.
-
-To further reduce the quantization error, you can try Quantization Aware Training (QAT). For details, please refer to the [ppq QAT example](https://github.com/OpenPPL/ppq/blob/master/ppq/samples/TensorRT/Example_QAT.py).
-
-
- > Note: the model in [examples/mobilenet_v2](../examples/mobilenet_v2/) comes from the 8-bit quantization test. The 16-bit convolution operator is still under development; once it is complete, mixed-precision quantized models can be deployed.
\ No newline at end of file
diff --git a/tutorial/how_to_quantize_model_en.md b/tutorial/how_to_quantize_model_en.md
deleted file mode 100644
index b6a7b288..00000000
--- a/tutorial/how_to_quantize_model_en.md
+++ /dev/null
@@ -1,549 +0,0 @@
-# Using ESP-PPQ for Model Quantization (PTQ)
-
-In this tutorial, we will guide you through the process of quantizing a pre-trained model using ESP-PPQ and analyzing the quantization error. The quantization method used is Post Training Quantization (PTQ). ESP-PPQ builds upon [PPQ](https://github.com/OpenPPL/ppq) and adds Espressif-customized quantizers and exporters, allowing users to select quantization rules that match different chips and export them as standard model files that ESP-DL can directly load. ESP-PPQ is compatible with all PPQ APIs and quantization scripts. For more details, please refer to the [PPQ documentation and videos](https://github.com/OpenPPL/ppq).
-
-## Prerequisites
-
-### 1. Install ESP-PPQ. Note that PPQ needs to be uninstalled before installing ESP-PPQ to avoid conflicts:
-
-```bash
-pip uninstall ppq
-pip install git+https://github.com/espressif/esp-ppq.git
-```
-
-### 2. Model Files
-Currently, ESP-PPQ supports ONNX, PyTorch, and TensorFlow models. During the quantization process, PyTorch and TensorFlow models are first converted to ONNX models, so ensure that your model can be converted to an ONNX model.
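-
-For example, you can quickly verify that your model exports cleanly to ONNX before quantizing. Below is a minimal sketch for the torchvision MobileNet_v2 used later in this tutorial; the output file name and input shape are illustrative placeholders:
-
-```python
-import torch
-import torchvision
-from torchvision.models.mobilenetv2 import MobileNet_V2_Weights
-
-model = torchvision.models.mobilenet_v2(weights=MobileNet_V2_Weights.IMAGENET1K_V1).eval()
-dummy_input = torch.randn(1, 3, 224, 224)  # placeholder input shape
-# opset 13 is the recommended opset for exporting ONNX models
-torch.onnx.export(model, dummy_input, "mobilenet_v2.onnx", opset_version=13)
-```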
-
-We provide the following quantization script templates, which you can adapt to your own model:
-
-For ONNX models, refer to the script [quantize_onnx_model.py](../tools/quantization/quantize_onnx_model.py)
-
-For PyTorch models, refer to the script [quantize_pytorch_model.py](../tools/quantization/quantize_torch_model.py)
-
-For TensorFlow models, refer to the script [quantize_tf_model.py](../tools/quantization/quantize_tf_model.py)
-
-## Model Quantization Example
-
-We will use the [MobileNet_v2](https://arxiv.org/abs/1801.04381) model as an example to demonstrate how to use the [quantize_torch_model.py](../tools/quantization/quantize_torch_model.py) script to quantize the model.
-
-### 1. Prepare the Pre-trained Model
-Load the pre-trained MobileNet_v2 model from torchvision. You can also download it from [ONNX models](https://github.com/onnx/models) or [TensorFlow models](https://github.com/tensorflow/models):
-```python
-import torchvision
-from torchvision.models.mobilenetv2 import MobileNet_V2_Weights
-
-model = torchvision.models.mobilenet.mobilenet_v2(weights=MobileNet_V2_Weights.IMAGENET1K_V1)
-```
-
-### 2. Prepare the Calibration Dataset
-
-The calibration dataset needs to match the input format of your model. The calibration dataset should cover all possible input scenarios to better quantize the model. Here, we use the ImageNet dataset as an example to demonstrate how to prepare the calibration dataset.
-
-- Load the ImageNet dataset using torchvision:
-
-```python
-from torchvision import datasets, transforms
-from torch.utils.data import DataLoader
-
-transform = transforms.Compose([
-    transforms.Resize(256),
-    transforms.CenterCrop(224),
-    transforms.ToTensor(),
-    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
-])
-
-calib_dataset = datasets.ImageNet(root=CALIB_DIR, split='val', transform=transform)
-dataloader = DataLoader(calib_dataset, batch_size=BATCH_SIZE, shuffle=False)
-```
-
-- Use the provided [imagenet_util.py](../tools/quantization/datasets/imagenet_util.py) script and the [ImageNet calibration dataset](https://dl.espressif.com/public/imagenet_calib.zip) to quickly download and test.
-
-```python
-# Load the calibration dataset with the helper script
-from datasets.imagenet_util import load_imagenet_from_directory
-dataloader = load_imagenet_from_directory(
-    directory=CALIB_DIR,
-    batchsize=BATCH_SIZE,
-    shuffle=False,
-    subset=1024,
-    require_label=False,
-    num_of_workers=4,
-    )
-```
-
-### 3. Quantize the Model and Export the ESPDL Model
-
-Use the `espdl_quantize_torch` API to quantize the model and export the ESPDL model file. After quantization, three files will be exported:
-```
-**.espdl: The ESPDL model binary file, which can be directly used for inference on the chip.
-**.info: The ESPDL model text file, used for debugging and verifying that the ESPDL model was correctly exported.
-**.json: The quantization information file, used for saving and loading quantization information.
-```
-
-The function parameters are described as follows:
-```
-from ppq.api import espdl_quantize_torch
-
-def espdl_quantize_torch(
-    model: torch.nn.Module,
-    espdl_export_file: str,
-    calib_dataloader: DataLoader,
-    calib_steps: int,
-    input_shape: List[Any],
-    inputs: Union[dict, list, torch.Tensor, None] = None,
-    target: str = "esp32p4",
-    num_of_bits: int = 8,
-    collate_fn: Callable = None,
-    setting: QuantizationSetting = None,
-    device: str = "cpu",
-    error_report: bool = True,
-    test_output_names: List[str] = None,
-    skip_export: bool = False,
-    export_config: bool = True,
-    verbose: int = 0,
-) -> BaseGraph:
-
-    """Quantize a torch model and return the quantized PPQ graph.
-
-    Args:
-        model (torch.nn.Module): torch model
-        calib_dataloader (DataLoader): calibration data loader
-        calib_steps (int): calibration steps
-        input_shape (List[Any]): a list of ints indicating the size of the inputs; batch size must be 1
-        inputs (Union[dict, list, torch.Tensor, None]): input tensors; batch size must be 1
-        target (str): target chip, supports "esp32p4" and "esp32s3"
-        num_of_bits (int): the number of quantizer bits, 8 or 16
-        collate_fn (Callable): batch collate func for preprocessing
-        setting (QuantizationSetting): quantization setting; the default espdl setting is used when set to None
-        device (str, optional): execution device, defaults to 'cpu'.
-        error_report (bool, optional): whether to print error report, defaults to True.
-        test_output_names (List[str], optional): names of the model tensors to test, defaults to None.
-        skip_export (bool, optional): whether to export the quantized model, defaults to False.
-        export_config (bool, optional): whether to export the quantization configuration, defaults to True.
-        verbose (int, optional): whether to print details, defaults to 0.
-
-    Returns:
-        BaseGraph: The Quantized Graph, containing all information needed for backend execution
-    """
-```
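-
-A minimal call might look like the sketch below. The values are illustrative only: the export path, calibration step count, and input shape are assumptions that must match your own model and dataloader:
-
-```python
-graph = espdl_quantize_torch(
-    model=model,                             # the torch.nn.Module prepared above
-    espdl_export_file="mobilenet_v2.espdl",  # also writes the .info and .json files alongside
-    calib_dataloader=dataloader,
-    calib_steps=32,                          # illustrative number of calibration batches
-    input_shape=[1, 3, 224, 224],            # batch size must be 1
-    target="esp32p4",
-    num_of_bits=8,
-    setting=None,                            # None -> default espdl setting
-    device="cpu",                            # a collate_fn can be passed if batches need preprocessing
-)
-```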
-
-
-#### 8-bit Quantization Test
-
-- **Quantization Settings:**
-```
-target="esp32p4"
-num_of_bits=8
-batch_size=32
-setting=None
-```
-
-- **Quantization Results:**
-
-```
-Analysing Graphwise Quantization Error::
-Layer | NOISE:SIGNAL POWER RATIO
-/features/features.16/conv/conv.2/Conv: | ████████████████████ | 48.831%
-/features/features.15/conv/conv.2/Conv: | ███████████████████ | 45.268%
-/features/features.17/conv/conv.2/Conv: | ██████████████████ | 43.112%
-/features/features.18/features.18.0/Conv: | █████████████████ | 41.586%
-/features/features.14/conv/conv.2/Conv: | █████████████████ | 41.135%
-/features/features.13/conv/conv.2/Conv: | ██████████████ | 35.090%
-/features/features.17/conv/conv.0/conv.0.0/Conv: | █████████████ | 32.895%
-/features/features.16/conv/conv.1/conv.1.0/Conv: | ████████████ | 29.226%
-/features/features.12/conv/conv.2/Conv: | ████████████ | 28.895%
-/features/features.16/conv/conv.0/conv.0.0/Conv: | ███████████ | 27.808%
-/features/features.7/conv/conv.2/Conv: | ███████████ | 27.675%
-/features/features.10/conv/conv.2/Conv: | ███████████ | 26.292%
-/features/features.11/conv/conv.2/Conv: | ███████████ | 26.085%
-/features/features.6/conv/conv.2/Conv: | ███████████ | 25.892%
-/classifier/classifier.1/Gemm: | ██████████ | 25.591%
-/features/features.15/conv/conv.0/conv.0.0/Conv: | ██████████ | 25.323%
-/features/features.4/conv/conv.2/Conv: | ██████████ | 24.787%
-/features/features.15/conv/conv.1/conv.1.0/Conv: | ██████████ | 24.354%
-/features/features.14/conv/conv.1/conv.1.0/Conv: | ████████ | 20.207%
-/features/features.9/conv/conv.2/Conv: | ████████ | 19.808%
-/features/features.14/conv/conv.0/conv.0.0/Conv: | ████████ | 18.465%
-/features/features.5/conv/conv.2/Conv: | ███████ | 17.868%
-/features/features.12/conv/conv.1/conv.1.0/Conv: | ███████ | 16.589%
-/features/features.13/conv/conv.1/conv.1.0/Conv: | ███████ | 16.143%
-/features/features.11/conv/conv.1/conv.1.0/Conv: | ██████ | 15.382%
-/features/features.3/conv/conv.2/Conv: | ██████ | 15.105%
-/features/features.13/conv/conv.0/conv.0.0/Conv: | ██████ | 15.029%
-/features/features.10/conv/conv.1/conv.1.0/Conv: | ██████ | 14.875%
-/features/features.2/conv/conv.2/Conv: | ██████ | 14.869%
-/features/features.11/conv/conv.0/conv.0.0/Conv: | ██████ | 14.552%
-/features/features.9/conv/conv.1/conv.1.0/Conv: | ██████ | 14.050%
-/features/features.8/conv/conv.1/conv.1.0/Conv: | ██████ | 13.929%
-/features/features.8/conv/conv.2/Conv: | ██████ | 13.833%
-/features/features.12/conv/conv.0/conv.0.0/Conv: | ██████ | 13.684%
-/features/features.7/conv/conv.0/conv.0.0/Conv: | █████ | 12.942%
-/features/features.6/conv/conv.1/conv.1.0/Conv: | █████ | 12.765%
-/features/features.10/conv/conv.0/conv.0.0/Conv: | █████ | 12.251%
-/features/features.5/conv/conv.1/conv.1.0/Conv: | █████ | 11.186%
-/features/features.17/conv/conv.1/conv.1.0/Conv: | ████ | 11.070%
-/features/features.9/conv/conv.0/conv.0.0/Conv: | ████ | 10.371%
-/features/features.4/conv/conv.1/conv.1.0/Conv: | ████ | 10.356%
-/features/features.6/conv/conv.0/conv.0.0/Conv: | ████ | 10.149%
-/features/features.4/conv/conv.0/conv.0.0/Conv: | ████ | 9.472%
-/features/features.8/conv/conv.0/conv.0.0/Conv: | ████ | 9.232%
-/features/features.3/conv/conv.1/conv.1.0/Conv: | ████ | 9.187%
-/features/features.1/conv/conv.1/Conv: | ████ | 8.770%
-/features/features.5/conv/conv.0/conv.0.0/Conv: | ███ | 8.408%
-/features/features.7/conv/conv.1/conv.1.0/Conv: | ███ | 8.151% -/features/features.2/conv/conv.1/conv.1.0/Conv: | ███ | 7.156% -/features/features.3/conv/conv.0/conv.0.0/Conv: | ███ | 6.328% -/features/features.2/conv/conv.0/conv.0.0/Conv: | ██ | 5.392% -/features/features.1/conv/conv.0/conv.0.0/Conv: | | 0.875% -/features/features.0/features.0.0/Conv: | | 0.119% -Analysing Layerwise quantization error:: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 53/53 [08:44<00:00, 9.91s/it] -Layer | NOISE:SIGNAL POWER RATIO -/features/features.1/conv/conv.0/conv.0.0/Conv: | ████████████████████ | 14.303% -/features/features.0/features.0.0/Conv: | █ | 0.844% -/features/features.1/conv/conv.1/Conv: | █ | 0.667% -/features/features.2/conv/conv.1/conv.1.0/Conv: | █ | 0.574% -/features/features.3/conv/conv.1/conv.1.0/Conv: | █ | 0.419% -/features/features.15/conv/conv.1/conv.1.0/Conv: | | 0.272% -/features/features.9/conv/conv.1/conv.1.0/Conv: | | 0.238% -/features/features.17/conv/conv.1/conv.1.0/Conv: | | 0.214% -/features/features.4/conv/conv.1/conv.1.0/Conv: | | 0.180% -/features/features.11/conv/conv.1/conv.1.0/Conv: | | 0.151% -/features/features.12/conv/conv.1/conv.1.0/Conv: | | 0.148% -/features/features.16/conv/conv.1/conv.1.0/Conv: | | 0.146% -/features/features.14/conv/conv.2/Conv: | | 0.136% -/features/features.13/conv/conv.1/conv.1.0/Conv: | | 0.105% -/features/features.6/conv/conv.1/conv.1.0/Conv: | | 0.105% -/features/features.8/conv/conv.1/conv.1.0/Conv: | | 0.083% -/features/features.7/conv/conv.2/Conv: | | 0.076% -/features/features.5/conv/conv.1/conv.1.0/Conv: | | 0.076% -/features/features.3/conv/conv.2/Conv: | | 0.075% -/features/features.16/conv/conv.2/Conv: | | 0.074% -/features/features.13/conv/conv.0/conv.0.0/Conv: | | 0.072% -/features/features.15/conv/conv.2/Conv: | | 0.066% -/features/features.4/conv/conv.2/Conv: | | 0.065% -/features/features.11/conv/conv.2/Conv: | | 0.063% -/classifier/classifier.1/Gemm: | | 0.063% -/features/features.2/conv/conv.0/conv.0.0/Conv: | | 0.054% -/features/features.13/conv/conv.2/Conv: | | 0.050% -/features/features.10/conv/conv.1/conv.1.0/Conv: | | 0.042% -/features/features.17/conv/conv.0/conv.0.0/Conv: | | 0.040% -/features/features.2/conv/conv.2/Conv: | | 0.038% -/features/features.4/conv/conv.0/conv.0.0/Conv: | | 0.034% -/features/features.17/conv/conv.2/Conv: | | 0.030% -/features/features.14/conv/conv.0/conv.0.0/Conv: | | 0.025% -/features/features.16/conv/conv.0/conv.0.0/Conv: | | 0.024% -/features/features.10/conv/conv.2/Conv: | | 0.022% -/features/features.11/conv/conv.0/conv.0.0/Conv: | | 0.021% -/features/features.9/conv/conv.2/Conv: | | 0.021% -/features/features.14/conv/conv.1/conv.1.0/Conv: | | 0.020% -/features/features.7/conv/conv.1/conv.1.0/Conv: | | 0.020% -/features/features.5/conv/conv.2/Conv: | | 0.019% -/features/features.8/conv/conv.2/Conv: | | 0.018% -/features/features.12/conv/conv.2/Conv: | | 0.017% -/features/features.6/conv/conv.2/Conv: | | 0.014% -/features/features.7/conv/conv.0/conv.0.0/Conv: | | 0.014% -/features/features.3/conv/conv.0/conv.0.0/Conv: | | 0.013% -/features/features.12/conv/conv.0/conv.0.0/Conv: | | 0.009% -/features/features.15/conv/conv.0/conv.0.0/Conv: | | 0.008% -/features/features.5/conv/conv.0/conv.0.0/Conv: | | 0.006% -/features/features.6/conv/conv.0/conv.0.0/Conv: | | 0.005% -/features/features.9/conv/conv.0/conv.0.0/Conv: | | 0.003% -/features/features.18/features.18.0/Conv: | | 0.002% 
-/features/features.10/conv/conv.0/conv.0.0/Conv: | | 0.002%
-/features/features.8/conv/conv.0/conv.0.0/Conv: | | 0.002%
-
-* Prec@1 60.500 Prec@5 83.275*
-```
-
-- **Quantization Error Analysis:**
-
-The top-1 accuracy after quantization is only 60.5%, significantly lower than the float model's accuracy (71.878%). The quantized model thus suffers a substantial loss in accuracy. Two error metrics help locate the problem:
-
-**Graphwise Error:**
-The last layer of the model is /classifier/classifier.1/Gemm, and the cumulative error for this layer is 25.591%. Generally, if the cumulative error of the last layer is less than 10%, the loss in accuracy of the quantized model is minimal.
-
-**Layerwise Error:**
-The layerwise errors of most layers are below 1%, indicating that their quantization errors are small. Only a few layers show larger errors, and those layers can be quantized with int16 instead.
-Please refer to the Mixed-Precision Quantization Test below for details.
-
-#### Mixed-Precision Quantization Test
-
-- **Quantization Settings:**
-```
-from ppq import QuantizationSettingFactory
-from ppq.api import get_target_platform
-
-target="esp32p4"
-num_of_bits=8
-batch_size=32
-
-# Quantize the following layers with 16-bits
-quant_setting = QuantizationSettingFactory.espdl_setting()
-quant_setting.dispatching_table.append("/features/features.1/conv/conv.0/conv.0.0/Conv", get_target_platform(target, 16))
-quant_setting.dispatching_table.append("/features/features.1/conv/conv.0/conv.0.2/Clip", get_target_platform(target, 16))
-```
-
-- **Quantization Results:**
-
-```
-Layer | NOISE:SIGNAL POWER RATIO
-/features/features.16/conv/conv.2/Conv: | ████████████████████ | 31.585%
-/features/features.15/conv/conv.2/Conv: | ███████████████████ | 29.253%
-/features/features.17/conv/conv.0/conv.0.0/Conv: | ████████████████ | 25.077%
-/features/features.14/conv/conv.2/Conv: | ████████████████ | 24.819%
-/features/features.17/conv/conv.2/Conv: | ████████████ | 19.546%
-/features/features.13/conv/conv.2/Conv: | ████████████ | 19.283%
-/features/features.16/conv/conv.0/conv.0.0/Conv: | ████████████ | 18.764%
-/features/features.16/conv/conv.1/conv.1.0/Conv: | ████████████ | 18.596%
-/features/features.18/features.18.0/Conv: | ████████████ | 18.541%
-/features/features.15/conv/conv.0/conv.0.0/Conv: | ██████████ | 15.633%
-/features/features.12/conv/conv.2/Conv: | █████████ | 14.784%
-/features/features.15/conv/conv.1/conv.1.0/Conv: | █████████ | 14.773%
-/features/features.14/conv/conv.1/conv.1.0/Conv: | █████████ | 13.700%
-/features/features.6/conv/conv.2/Conv: | ████████ | 12.824%
-/features/features.10/conv/conv.2/Conv: | ███████ | 11.727%
-/features/features.14/conv/conv.0/conv.0.0/Conv: | ███████ | 10.612%
-/features/features.11/conv/conv.2/Conv: | ██████ | 10.262%
-/features/features.9/conv/conv.2/Conv: | ██████ | 9.967%
-/classifier/classifier.1/Gemm: | ██████ | 9.117%
-/features/features.5/conv/conv.2/Conv: | ██████ | 8.915%
-/features/features.7/conv/conv.2/Conv: | █████ | 8.690%
-/features/features.3/conv/conv.2/Conv: | █████ | 8.586%
-/features/features.4/conv/conv.2/Conv: | █████ | 7.525%
-/features/features.13/conv/conv.1/conv.1.0/Conv: | █████ | 7.432%
-/features/features.12/conv/conv.1/conv.1.0/Conv: | █████ | 7.317%
-/features/features.13/conv/conv.0/conv.0.0/Conv: | ████ | 6.848%
-/features/features.8/conv/conv.2/Conv: | ████ | 6.711%
-/features/features.10/conv/conv.1/conv.1.0/Conv: | ████ | 6.100%
-/features/features.8/conv/conv.1/conv.1.0/Conv: | ████ | 6.043%
-/features/features.11/conv/conv.1/conv.1.0/Conv: | ████ | 5.962%
-/features/features.9/conv/conv.1/conv.1.0/Conv: | ████ | 5.873%
-/features/features.12/conv/conv.0/conv.0.0/Conv: | ████ | 5.833%
-/features/features.7/conv/conv.0/conv.0.0/Conv: | ████ | 5.832%
-/features/features.11/conv/conv.0/conv.0.0/Conv: | ████ | 5.736%
-/features/features.6/conv/conv.1/conv.1.0/Conv: | ████ | 5.639%
-/features/features.5/conv/conv.1/conv.1.0/Conv: | ███ | 5.017%
-/features/features.10/conv/conv.0/conv.0.0/Conv: | ███ | 4.963%
-/features/features.17/conv/conv.1/conv.1.0/Conv: | ███ | 4.870%
-/features/features.3/conv/conv.1/conv.1.0/Conv: | ███ | 4.655%
-/features/features.2/conv/conv.2/Conv: | ███ | 4.650%
-/features/features.4/conv/conv.0/conv.0.0/Conv: | ███ | 4.648%
-/features/features.1/conv/conv.1/Conv: | ███ | 4.318%
-/features/features.9/conv/conv.0/conv.0.0/Conv: | ██ | 3.849%
-/features/features.6/conv/conv.0/conv.0.0/Conv: | ██ | 3.712%
-/features/features.4/conv/conv.1/conv.1.0/Conv: | ██ | 3.394%
-/features/features.8/conv/conv.0/conv.0.0/Conv: | ██ | 3.391%
-/features/features.7/conv/conv.1/conv.1.0/Conv: | ██ | 2.713%
-/features/features.2/conv/conv.1/conv.1.0/Conv: | ██ | 2.637%
-/features/features.2/conv/conv.0/conv.0.0/Conv: | ██ | 2.602%
-/features/features.5/conv/conv.0/conv.0.0/Conv: | █ | 2.397%
-/features/features.3/conv/conv.0/conv.0.0/Conv: | █ | 1.759%
-/features/features.1/conv/conv.0/conv.0.0/Conv: | | 0.433%
-/features/features.0/features.0.0/Conv: | | 0.119%
-Analysing Layerwise quantization error:: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 53/53 [08:27<00:00, 9.58s/it]
-*
-Layer | NOISE:SIGNAL POWER RATIO
-/features/features.1/conv/conv.1/Conv: | ████████████████████ | 1.096%
-/features/features.0/features.0.0/Conv: | ███████████████ | 0.844%
-/features/features.2/conv/conv.1/conv.1.0/Conv: | ██████████ | 0.574%
-/features/features.3/conv/conv.1/conv.1.0/Conv: | ████████ | 0.425%
-/features/features.15/conv/conv.1/conv.1.0/Conv: | █████ | 0.272%
-/features/features.9/conv/conv.1/conv.1.0/Conv: | ████ | 0.238%
-/features/features.17/conv/conv.1/conv.1.0/Conv: | ████ | 0.214%
-/features/features.4/conv/conv.1/conv.1.0/Conv: | ███ | 0.180%
-/features/features.11/conv/conv.1/conv.1.0/Conv: | ███ | 0.151%
-/features/features.12/conv/conv.1/conv.1.0/Conv: | ███ | 0.148%
-/features/features.16/conv/conv.1/conv.1.0/Conv: | ███ | 0.146%
-/features/features.14/conv/conv.2/Conv: | ██ | 0.136%
-/features/features.13/conv/conv.1/conv.1.0/Conv: | ██ | 0.105%
-/features/features.6/conv/conv.1/conv.1.0/Conv: | ██ | 0.105%
-/features/features.8/conv/conv.1/conv.1.0/Conv: | █ | 0.083%
-/features/features.5/conv/conv.1/conv.1.0/Conv: | █ | 0.076%
-/features/features.3/conv/conv.2/Conv: | █ | 0.075%
-/features/features.16/conv/conv.2/Conv: | █ | 0.074%
-/features/features.13/conv/conv.0/conv.0.0/Conv: | █ | 0.072%
-/features/features.7/conv/conv.2/Conv: | █ | 0.071%
-/features/features.15/conv/conv.2/Conv: | █ | 0.066%
-/features/features.4/conv/conv.2/Conv: | █ | 0.065%
-/features/features.11/conv/conv.2/Conv: | █ | 0.063%
-/classifier/classifier.1/Gemm: | █ | 0.063%
-/features/features.13/conv/conv.2/Conv: | █ | 0.059%
-/features/features.2/conv/conv.0/conv.0.0/Conv: | █ | 0.054%
-/features/features.10/conv/conv.1/conv.1.0/Conv: | █ | 0.042%
-/features/features.17/conv/conv.0/conv.0.0/Conv: | █ | 0.040%
-/features/features.2/conv/conv.2/Conv: | █ | 0.038%
-/features/features.4/conv/conv.0/conv.0.0/Conv: | █ | 0.034%
-/features/features.17/conv/conv.2/Conv: | █ | 0.030%
-/features/features.14/conv/conv.0/conv.0.0/Conv: | | 0.025%
-/features/features.16/conv/conv.0/conv.0.0/Conv: | | 0.024%
-/features/features.10/conv/conv.2/Conv: | | 0.022%
-/features/features.11/conv/conv.0/conv.0.0/Conv: | | 0.021%
-/features/features.9/conv/conv.2/Conv: | | 0.021%
-/features/features.14/conv/conv.1/conv.1.0/Conv: | | 0.020%
-/features/features.7/conv/conv.1/conv.1.0/Conv: | | 0.020%
-/features/features.5/conv/conv.2/Conv: | | 0.019%
-/features/features.8/conv/conv.2/Conv: | | 0.018%
-/features/features.12/conv/conv.2/Conv: | | 0.017%
-/features/features.1/conv/conv.0/conv.0.0/Conv: | | 0.017%
-/features/features.6/conv/conv.2/Conv: | | 0.014%
-/features/features.7/conv/conv.0/conv.0.0/Conv: | | 0.014%
-/features/features.3/conv/conv.0/conv.0.0/Conv: | | 0.013%
-/features/features.12/conv/conv.0/conv.0.0/Conv: | | 0.009%
-/features/features.15/conv/conv.0/conv.0.0/Conv: | | 0.008%
-/features/features.5/conv/conv.0/conv.0.0/Conv: | | 0.006%
-/features/features.6/conv/conv.0/conv.0.0/Conv: | | 0.005%
-/features/features.9/conv/conv.0/conv.0.0/Conv: | | 0.003%
-/features/features.18/features.18.0/Conv: | | 0.002%
-/features/features.10/conv/conv.0/conv.0.0/Conv: | | 0.002%
-/features/features.8/conv/conv.0/conv.0.0/Conv: | | 0.002%
-
-* Prec@1 69.550 Prec@5 88.450*
-```
-
-- **Quantization Error Analysis:**
-
-After requantizing the layer with the largest error to 16-bit quantization, a noticeable improvement in model accuracy can be observed. The top-1 accuracy after quantization is 69.550%, which is quite close to the float model's accuracy (71.878%).
-
-The graphwise error for the last layer of the model, /classifier/classifier.1/Gemm, is 9.117%.
-
-#### Layerwise Equalization Quantization Test
-
-- **Quantization Settings:**
-
-```
-import torch.nn as nn
-from ppq import QuantizationSettingFactory
-
-def convert_relu6_to_relu(model):
-    for child_name, child in model.named_children():
-        if isinstance(child, nn.ReLU6):
-            setattr(model, child_name, nn.ReLU())
-        else:
-            convert_relu6_to_relu(child)
-    return model
-
-# replace ReLU6 with ReLU
-model = convert_relu6_to_relu(model)
-# adopt layerwise equalization
-quant_setting = QuantizationSettingFactory.espdl_setting()
-quant_setting.equalization = True
-quant_setting.equalization_setting.iterations = 4
-quant_setting.equalization_setting.value_threshold = .4
-quant_setting.equalization_setting.opt_level = 2
-quant_setting.equalization_setting.interested_layers = None
-```
-
-- **Quantization Results:**
-
-```
-Layer | NOISE:SIGNAL POWER RATIO
-/features/features.16/conv/conv.2/Conv: | ████████████████████ | 34.497%
-/features/features.15/conv/conv.2/Conv: | ██████████████████ | 30.813%
-/features/features.14/conv/conv.2/Conv: | ███████████████ | 25.876%
-/features/features.17/conv/conv.0/conv.0.0/Conv: | ██████████████ | 24.498%
-/features/features.17/conv/conv.2/Conv: | ████████████ | 20.290%
-/features/features.13/conv/conv.2/Conv: | ████████████ | 20.177%
-/features/features.16/conv/conv.0/conv.0.0/Conv: | ████████████ | 19.993%
-/features/features.18/features.18.0/Conv: | ███████████ | 19.536%
-/features/features.16/conv/conv.1/conv.1.0/Conv: | ██████████ | 17.879%
-/features/features.12/conv/conv.2/Conv: | ██████████ | 17.150%
-/features/features.15/conv/conv.0/conv.0.0/Conv: | █████████ | 15.970%
-/features/features.15/conv/conv.1/conv.1.0/Conv: | █████████ | 15.254%
-/features/features.1/conv/conv.1/Conv: | █████████ | 15.122%
-/features/features.10/conv/conv.2/Conv: | █████████ | 14.917%
-/features/features.6/conv/conv.2/Conv: | ████████ | 13.446%
-/features/features.11/conv/conv.2/Conv: | ███████ | 12.533% -/features/features.9/conv/conv.2/Conv: | ███████ | 11.479% -/features/features.14/conv/conv.1/conv.1.0/Conv: | ███████ | 11.470% -/features/features.5/conv/conv.2/Conv: | ██████ | 10.669% -/features/features.3/conv/conv.2/Conv: | ██████ | 10.526% -/features/features.14/conv/conv.0/conv.0.0/Conv: | ██████ | 9.529% -/features/features.7/conv/conv.2/Conv: | █████ | 9.500% -/classifier/classifier.1/Gemm: | █████ | 8.965% -/features/features.4/conv/conv.2/Conv: | █████ | 8.674% -/features/features.12/conv/conv.1/conv.1.0/Conv: | █████ | 8.349% -/features/features.13/conv/conv.1/conv.1.0/Conv: | █████ | 8.068% -/features/features.8/conv/conv.2/Conv: | █████ | 7.961% -/features/features.13/conv/conv.0/conv.0.0/Conv: | ████ | 7.451% -/features/features.10/conv/conv.1/conv.1.0/Conv: | ████ | 6.714% -/features/features.9/conv/conv.1/conv.1.0/Conv: | ████ | 6.399% -/features/features.8/conv/conv.1/conv.1.0/Conv: | ████ | 6.369% -/features/features.11/conv/conv.1/conv.1.0/Conv: | ████ | 6.222% -/features/features.2/conv/conv.2/Conv: | ███ | 5.867% -/features/features.5/conv/conv.1/conv.1.0/Conv: | ███ | 5.719% -/features/features.12/conv/conv.0/conv.0.0/Conv: | ███ | 5.546% -/features/features.6/conv/conv.1/conv.1.0/Conv: | ███ | 5.414% -/features/features.10/conv/conv.0/conv.0.0/Conv: | ███ | 5.093% -/features/features.17/conv/conv.1/conv.1.0/Conv: | ███ | 4.951% -/features/features.11/conv/conv.0/conv.0.0/Conv: | ███ | 4.941% -/features/features.2/conv/conv.1/conv.1.0/Conv: | ███ | 4.825% -/features/features.7/conv/conv.0/conv.0.0/Conv: | ██ | 4.330% -/features/features.2/conv/conv.0/conv.0.0/Conv: | ██ | 4.299% -/features/features.3/conv/conv.1/conv.1.0/Conv: | ██ | 4.283% -/features/features.4/conv/conv.0/conv.0.0/Conv: | ██ | 3.477% -/features/features.4/conv/conv.1/conv.1.0/Conv: | ██ | 3.287% -/features/features.8/conv/conv.0/conv.0.0/Conv: | ██ | 2.787% -/features/features.9/conv/conv.0/conv.0.0/Conv: | ██ | 2.774% -/features/features.6/conv/conv.0/conv.0.0/Conv: | ██ | 2.705% -/features/features.7/conv/conv.1/conv.1.0/Conv: | ██ | 2.636% -/features/features.5/conv/conv.0/conv.0.0/Conv: | █ | 1.846% -/features/features.3/conv/conv.0/conv.0.0/Conv: | █ | 1.170% -/features/features.1/conv/conv.0/conv.0.0/Conv: | | 0.389% -/features/features.0/features.0.0/Conv: | | 0.025% -Analysing Layerwise quantization error:: 100%|██████████| 53/53 [07:46<00:00, 8.80s/it] -Layer | NOISE:SIGNAL POWER RATIO -/features/features.1/conv/conv.0/conv.0.0/Conv: | ████████████████████ | 0.989% -/features/features.0/features.0.0/Conv: | █████████████████ | 0.845% -/features/features.16/conv/conv.2/Conv: | █████ | 0.238% -/features/features.17/conv/conv.2/Conv: | ████ | 0.202% -/features/features.14/conv/conv.2/Conv: | ████ | 0.198% -/features/features.1/conv/conv.1/Conv: | ████ | 0.192% -/features/features.15/conv/conv.2/Conv: | ███ | 0.145% -/features/features.4/conv/conv.2/Conv: | ██ | 0.120% -/features/features.2/conv/conv.2/Conv: | ██ | 0.111% -/features/features.2/conv/conv.1/conv.1.0/Conv: | ██ | 0.079% -/classifier/classifier.1/Gemm: | █ | 0.062% -/features/features.13/conv/conv.2/Conv: | █ | 0.050% -/features/features.3/conv/conv.2/Conv: | █ | 0.050% -/features/features.12/conv/conv.2/Conv: | █ | 0.050% -/features/features.5/conv/conv.1/conv.1.0/Conv: | █ | 0.047% -/features/features.3/conv/conv.1/conv.1.0/Conv: | █ | 0.046% -/features/features.7/conv/conv.2/Conv: | █ | 0.045% -/features/features.5/conv/conv.2/Conv: | █ | 0.030% 
-/features/features.11/conv/conv.2/Conv: | █ | 0.028%
-/features/features.6/conv/conv.2/Conv: | █ | 0.027%
-/features/features.6/conv/conv.1/conv.1.0/Conv: | █ | 0.026%
-/features/features.4/conv/conv.0/conv.0.0/Conv: | | 0.025%
-/features/features.15/conv/conv.1/conv.1.0/Conv: | | 0.023%
-/features/features.8/conv/conv.1/conv.1.0/Conv: | | 0.021%
-/features/features.10/conv/conv.2/Conv: | | 0.020%
-/features/features.11/conv/conv.1/conv.1.0/Conv: | | 0.020%
-/features/features.16/conv/conv.1/conv.1.0/Conv: | | 0.017%
-/features/features.14/conv/conv.0/conv.0.0/Conv: | | 0.016%
-/features/features.4/conv/conv.1/conv.1.0/Conv: | | 0.012%
-/features/features.13/conv/conv.1/conv.1.0/Conv: | | 0.012%
-/features/features.13/conv/conv.0/conv.0.0/Conv: | | 0.012%
-/features/features.12/conv/conv.1/conv.1.0/Conv: | | 0.012%
-/features/features.17/conv/conv.0/conv.0.0/Conv: | | 0.011%
-/features/features.12/conv/conv.0/conv.0.0/Conv: | | 0.011%
-/features/features.2/conv/conv.0/conv.0.0/Conv: | | 0.010%
-/features/features.9/conv/conv.2/Conv: | | 0.008%
-/features/features.8/conv/conv.2/Conv: | | 0.008%
-/features/features.10/conv/conv.1/conv.1.0/Conv: | | 0.008%
-/features/features.16/conv/conv.0/conv.0.0/Conv: | | 0.008%
-/features/features.7/conv/conv.0/conv.0.0/Conv: | | 0.008%
-/features/features.10/conv/conv.0/conv.0.0/Conv: | | 0.006%
-/features/features.15/conv/conv.0/conv.0.0/Conv: | | 0.005%
-/features/features.3/conv/conv.0/conv.0.0/Conv: | | 0.004%
-/features/features.11/conv/conv.0/conv.0.0/Conv: | | 0.004%
-/features/features.18/features.18.0/Conv: | | 0.003%
-/features/features.5/conv/conv.0/conv.0.0/Conv: | | 0.003%
-/features/features.9/conv/conv.1/conv.1.0/Conv: | | 0.003%
-/features/features.6/conv/conv.0/conv.0.0/Conv: | | 0.003%
-/features/features.7/conv/conv.1/conv.1.0/Conv: | | 0.003%
-/features/features.17/conv/conv.1/conv.1.0/Conv: | | 0.002%
-/features/features.14/conv/conv.1/conv.1.0/Conv: | | 0.002%
-/features/features.8/conv/conv.0/conv.0.0/Conv: | | 0.001%
-/features/features.9/conv/conv.0/conv.0.0/Conv: | | 0.001%
-
- * Prec@1 69.800 Prec@5 88.550
-```
-
-- **Quantization Error Analysis:**
-
-Note that applying layerwise equalization to 8-bit quantization helps achieve a smaller quantization error. The graphwise error of the model's last layer, /classifier/classifier.1/Gemm, is 8.965%. The top-1 accuracy after quantization is 69.800%, which is even closer to the float model's accuracy (71.878%) than the mixed-precision result.
-
-If you wish to further reduce the quantization error, you can try using Quantization Aware Training (QAT). For specific methods, please refer to the [ppq QAT example](https://github.com/OpenPPL/ppq/blob/master/ppq/samples/TensorRT/Example_QAT.py).
-
- > Note: the model in [examples/mobilenet_v2](../examples/mobilenet_v2/) comes from the 8-bit quantization test. The 16-bit convolution operator is still under development; once it is complete, mixed-precision quantized models can be deployed.
\ No newline at end of file