
WIP ModelExportInfra #3703

Closed · wants to merge 10 commits

Conversation

@skanjila (Collaborator) commented Oct 8, 2023

Code Pull Requests

Please provide the following:

  • a set of classes to perform model export in ONNX and Core ML

Documentation Pull Requests

Note that the documentation HTML files are in docs/ while the Markdown sources are in mkdocs/docs.

If you are proposing a modification to the documentation, you should change only the Markdown files.

api.md is automatically generated from the docstrings in the code, so if you want to change something in that file, first modify the docstring in ludwig/api.py, then run mkdocs/code_docs_autogen.py, which will regenerate mkdocs/docs/api.md.

@skanjila changed the title from ModelExportInfra to WIP ModelExportInfra on Oct 8, 2023
Review threads were opened on ludwig/model_export/coreml_exporter.py and ludwig/model_export/onnx_exporter.py; see the conversation below.
@github-actions bot commented Oct 8, 2023

Unit Test Results

6 files ±0    6 suites ±0    1h 4m 51s ⏱️ +44m 7s
2,794 tests +2,782    2,770 ✔️ +2,761    23 💤 +20    1 ❌ +1
8,382 runs +8,322    8,310 ✔️ +8,268    69 💤 +51    3 ❌ +3

For more details on these failures, see this check.

Results for commit 48f0c55. ± Comparison against base commit f98b8f6.

This pull request removes 4 and adds 2786 tests. Note that renamed tests count towards both.
tests.regression_tests.model.test_old_models ‑ test_model_loaded_from_old_config_prediction_works
tests.regression_tests.model.test_old_models ‑ test_predict_deprecated_model[respiratory]
tests.regression_tests.model.test_old_models ‑ test_predict_deprecated_model[titanic]
tests.regression_tests.model.test_old_models ‑ test_predict_deprecated_model[twitter_bots]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_image_augmentation[augmentation_pipeline_ops0]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_image_augmentation[augmentation_pipeline_ops1]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_image_augmentation[augmentation_pipeline_ops2]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_invalid_augmentation_parameters[None]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_invalid_augmentation_parameters[augmentation_pipeline_ops1]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_invalid_augmentation_parameters[augmentation_pipeline_ops2]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_invalid_augmentation_parameters[augmentation_pipeline_ops4]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_invalid_augmentation_parameters[random_horizontal_flip]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_load_model_with_augmentation_pipeline
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_local_model_training_with_augmentation_pipeline[preprocessing0-encoder0-False]
…
This pull request skips 2 tests.
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[ames_housing.gbm.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[mercedes_benz_greener.gbm.yaml]

♻️ This comment has been updated with latest results.

```python
        pass

    @abstractmethod
    def quantize(path_fp32, path_int8):
```

@saad-palapa (Contributor) commented Oct 9, 2023:
I think it's a better design to call this:

```python
def quantize(model_path, quantized_path):
```

The reason is that not all unquantized models are fp32, and not all quantized models are int8.
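A minimal sketch of how the dtype-neutral signature could sit on the abstract base; the class name mirrors the PR's BaseModelExporter, but the method bodies here are illustrative, not the PR's code:

```python
from abc import ABC, abstractmethod


class BaseModelExporter(ABC):
    """Base class for model exporters (ONNX, Core ML, ...)."""

    @abstractmethod
    def export_classifier(self, model_path, export_path):
        """Export the trained model at model_path to export_path."""

    @abstractmethod
    def quantize(self, model_path, quantized_model_path):
        """Quantize the exported model.

        Dtype-neutral names: the input may be fp32 or fp16, and the
        output may be int8, fp8, etc., so the paths don't encode dtypes.
        """
```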


```python
width = ludwig_model.config["input_features"][0]["preprocessing"]["width"]
height = ludwig_model.config["input_features"][0]["preprocessing"]["height"]
example_input = torch.randn(1, 3, width, height, requires_grad=True)
```

@saad-palapa (Contributor) commented Oct 9, 2023:
Should we have logic to check if this is channels_first or channels_last?
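A hedged sketch of what such a check might look like. The "channel_order" preprocessing key is hypothetical (the real config field may differ or not exist), the model path is illustrative, and note that PyTorch's default NCHW layout puts height before width:

```python
import torch
from ludwig.api import LudwigModel

model_path = "results/experiment_run/model"  # illustrative path
ludwig_model = LudwigModel.load(model_path)

preprocessing = ludwig_model.config["input_features"][0]["preprocessing"]
width, height = preprocessing["width"], preprocessing["height"]

# "channel_order" is a hypothetical key used here for illustration.
if preprocessing.get("channel_order") == "channels_last":
    example_input = torch.randn(1, height, width, 3, requires_grad=True)  # NHWC
else:
    example_input = torch.randn(1, 3, height, width, requires_grad=True)  # NCHW
```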

@saad-palapa (Contributor) left a review:
Looks good. Left a few comments to consider.

```python
        pass

    @abstractmethod
    def quantize(self, path_fp32, path_int8):
```

@saad-palapa (Contributor) commented:
The variable naming assumes all non-quantized models are fp32 and all quantized models are int8. This is not always the case; for example, we can have fp16 => fp4 (https://youtu.be/MK4k64vY3xo?si=ULlFZQa_2DSvcgpX&t=1970).

Perhaps the function signature should be:

```python
def quantize(self, model_path, quantized_model_path):
```
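For illustration, a sketch of the renamed method wrapping ONNX Runtime's dynamic quantizer (which this PR already uses elsewhere); the class wiring is illustrative, and the names stay dtype-neutral even though quantize_dynamic defaults to int8 weights:

```python
from onnxruntime.quantization import quantize_dynamic

from ludwig.model_export.base_model_exporter import BaseModelExporter


class OnnxExporter(BaseModelExporter):
    def quantize(self, model_path, quantized_model_path):
        # Neither parameter name commits to a dtype: the input could be
        # fp16, and another quantizer could emit fp8 or fp4 just as well.
        quantize_dynamic(model_path, quantized_model_path)
```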

```python
    def quantize(self, path_fp32, path_int8):
        import coremltools.optimize.coreml as cto

        ludwig_model = LudwigModel.load(path_fp32)
```

@saad-palapa (Contributor) commented:
For now this looks good, but can we add a @todo to add more quantization options?
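As a starting point for that @todo, coremltools (assuming version ≥ 7) exposes more than one weight-compression option; a hedged sketch with an illustrative model path:

```python
import coremltools as ct
import coremltools.optimize.coreml as cto

mlmodel = ct.models.MLModel("model.mlpackage")  # illustrative path

# Option 1: 8-bit linear weight quantization.
linear_config = cto.OptimizationConfig(
    global_config=cto.OpLinearQuantizerConfig(mode="linear_symmetric")
)
quantized = cto.linear_quantize_weights(mlmodel, linear_config)

# Option 2: 6-bit weight palettization (k-means lookup tables).
palettize_config = cto.OptimizationConfig(
    global_config=cto.OpPalettizerConfig(nbits=6, mode="kmeans")
)
palettized = cto.palettize_weights(mlmodel, palettize_config)
```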

```python
    def forward(self, x):
        return self.model({"image_path": x})

    def export_classifier(self, model_path, export_path):
```

@saad-palapa (Contributor) commented:
This is missing export_args_override
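One way the missing parameter could be threaded through. This is a sketch: export_args_override comes from the review thread, while the default arguments and the self.model / self.example_input attributes are assumptions:

```python
import torch


def export_classifier(self, model_path, export_path, export_args_override=None):
    # Defaults below are assumptions, not the PR's actual values.
    export_args = {
        "input_names": ["input"],
        "output_names": ["output"],
        "opset_version": 17,
    }
    export_args.update(export_args_override or {})  # caller-supplied overrides win
    torch.onnx.export(self.model, self.example_input, export_path, **export_args)
```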

```python
        pass

    @abstractmethod
    def export_classifier(self, model_path, export_path, export_args_override):
```

@saad-palapa (Contributor) commented:
Can we remove this and only have export_args_override in OnnxExporter?

skanjila and others added 2 commits October 10, 2023 21:21
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justin Zhao <[email protected]>


```python
class OnnxExporter(ABC):
    def export_classifier(self, model_id, model_path, export_path, input_model_name, output_model_name):
```

@saad-palapa (Contributor) commented:
Some of these params are not being used. I think the function signature should look like:

```python
def export(self, model_path, export_path):
```

I don't think it's necessary to include _classifier in the function name.
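A minimal sketch of the suggested shape; the loading and tracing details (including the dummy image input) are illustrative, not the PR's final implementation:

```python
import torch
from ludwig.api import LudwigModel


class OnnxExporter:
    def export(self, model_path, export_path):
        ludwig_model = LudwigModel.load(model_path)
        model = ludwig_model.model.eval()
        # Dummy image input for illustration; a real exporter would derive
        # the shape from the model's input feature config.
        example_input = torch.randn(1, 3, 224, 224)
        torch.onnx.export(model, example_input, export_path)
```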

```python
from ludwig.model_export.base_model_exporter import BaseModelExporter


class CoreMLExporter(BaseModelExporter):
```

@saad-palapa (Contributor) commented:
Can we leave the CoreML stuff to another PR? Let's keep the task here small.

```python
        pass

    @abstractmethod
    def quantize(self, path_fp32, path_int8):
```

@saad-palapa (Contributor) commented:
Can we defer the quantize function to another PR? Let's keep this one small.

```python
            output_names=["combiner_hidden_1", "output", "combiner_hidden_2"],
        )

    def quantize_onnx(self, model_id, path_fp32, path_int8):
```

@saad-palapa (Contributor) commented:
Can we leave quantization to another PR?


```python
        quantize_dynamic(path_fp32, path_int8)  # type: ignore

    def check_model_export(self, model_id, export_path, output_model_name):
```

@saad-palapa (Contributor) commented:
Can we change this function signature to:

```python
def check_model_export(self, path):
```

I don't think it's necessary to separate folder and filename.
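A sketch of the single-path variant, assuming the check is standard onnx.checker validation (the PR's actual method body isn't shown in this thread):

```python
import onnx


def check_model_export(self, path):
    onnx_model = onnx.load(path)  # path carries both folder and filename
    onnx.checker.check_model(onnx_model)
```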

@skanjila closed this by deleting the head repository on Nov 2, 2023