vhaldemar/serialzy

serialzy

Serialzy is a library for serializing Python objects into portable and interoperable data formats where possible.

Example

Suppose you have a CatBoost model:

from catboost import CatBoostClassifier

model = CatBoostClassifier()
model.fit(...)

First, find a suitable serializer, either for the CatBoost model type or for the corresponding data format:

from serialzy.registry import DefaultSerializerRegistry

registry = DefaultSerializerRegistry()
serializer = registry.find_serializer_by_type(type(model))  # alternatively: registry.find_serializer_by_data_format("cbm")
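
To illustrate what the two lookups mean, here is a toy model of a registry that indexes serializers both by type and by data format. It is a sketch for intuition only and is not serialzy's implementation: `ToySerializer`, `ToyRegistry`, `Model`, `TunedModel`, and the `"toy_model"` format name are all hypothetical, and resolving subclasses through the MRO is an assumption made for this example.

```python
from typing import Dict, Optional


class ToySerializer:
    """Hypothetical stand-in for a serializer; not a serialzy class."""

    def __init__(self, supported_type: type, data_format: str) -> None:
        self.supported_type = supported_type
        self._data_format = data_format

    def data_format(self) -> str:
        return self._data_format


class ToyRegistry:
    """Toy registry that indexes serializers by supported type and by data format."""

    def __init__(self) -> None:
        self._by_type: Dict[type, ToySerializer] = {}
        self._by_format: Dict[str, ToySerializer] = {}

    def register(self, serializer: ToySerializer) -> None:
        self._by_type[serializer.supported_type] = serializer
        self._by_format[serializer.data_format()] = serializer

    def find_serializer_by_type(self, typ: type) -> Optional[ToySerializer]:
        # Assumption of this sketch: walk the MRO so that a subclass
        # resolves to the serializer registered for its base class.
        for base in typ.__mro__:
            if base in self._by_type:
                return self._by_type[base]
        return None

    def find_serializer_by_data_format(self, data_format: str) -> Optional[ToySerializer]:
        return self._by_format.get(data_format)


class Model:
    """Hypothetical model type."""


class TunedModel(Model):
    """Hypothetical subclass of the model type."""


toy_registry = ToyRegistry()
model_serializer = ToySerializer(Model, "toy_model")
toy_registry.register(model_serializer)

# Both lookups lead to the same serializer object.
by_type = toy_registry.find_serializer_by_type(TunedModel)
by_format = toy_registry.find_serializer_by_data_format("toy_model")
```

In this toy version an unknown type simply yields `None`; how the real registry reports a missing serializer is not stated here.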

Serializers have several properties:

serializer.available()      # whether the serializer can be used in the current environment
serializer.requirements()   # libraries that must be installed to use this serializer
serializer.stable()         # whether the serializer has a portable data format
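
These properties can be checked before serializing anything. The helper below is a minimal sketch of such a check, written against a structural type so that it runs without serialzy installed. `SerializerLike`, `describe`, and `StubSerializer` are hypothetical names, and treating the result of `requirements()` as an iterable of library names is an assumption, since its actual return type is not shown here.

```python
from typing import Iterable, Protocol


class SerializerLike(Protocol):
    """Hypothetical structural type covering only the three methods listed above."""

    def available(self) -> bool: ...

    def requirements(self) -> Iterable[str]: ...

    def stable(self) -> bool: ...


def describe(serializer: SerializerLike) -> str:
    """Summarize whether a serializer is usable and whether its format is portable."""
    if not serializer.available():
        # Assumption: iterating over requirements() yields library names.
        needed = ", ".join(sorted(serializer.requirements()))
        return f"unavailable, needs: {needed}"
    if serializer.stable():
        return "available, portable format"
    return "available, format not guaranteed portable"


class StubSerializer:
    """Hypothetical stub used only to exercise describe()."""

    def __init__(self, available: bool, stable: bool, requirements: Iterable[str]) -> None:
        self._available = available
        self._stable = stable
        self._requirements = list(requirements)

    def available(self) -> bool:
        return self._available

    def requirements(self) -> Iterable[str]:
        return self._requirements

    def stable(self) -> bool:
        return self._stable


status = describe(StubSerializer(available=False, stable=True, requirements=["catboost"]))
```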

Serializers can report their data format and the schema for a given type:

serializer.data_format()
serializer.schema(type(model))

Serialization:

with open('model.cbm', 'wb') as file:
    serializer.serialize(model, file)

Deserialization:

with open('model.cbm', 'rb') as file:
    deserialized_obj = serializer.deserialize(file)
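
The write-then-read pattern above can be tried without touching the disk by using an in-memory binary buffer. The sketch below uses a hypothetical stand-in serializer, `IntTextSerializer`, which is not part of serialzy and only mimics the `serialize(obj, file)` and `deserialize(file)` call shapes shown above. Whether a given serialzy serializer accepts an `io.BytesIO` in place of a real file is an assumption that should be checked for that serializer.

```python
import io
from typing import BinaryIO


class IntTextSerializer:
    """Hypothetical stand-in, not part of serialzy: stores an int as decimal text."""

    def serialize(self, obj: int, dest: BinaryIO) -> None:
        dest.write(str(obj).encode("utf-8"))

    def deserialize(self, source: BinaryIO) -> int:
        return int(source.read().decode("utf-8"))


def round_trip(serializer, obj):
    """Serialize obj into a memory buffer and read it back with the same serializer."""
    buffer = io.BytesIO()
    serializer.serialize(obj, buffer)
    buffer.seek(0)  # rewind so that deserialize() reads from the start
    return serializer.deserialize(buffer)


restored = round_trip(IntTextSerializer(), 42)
```

A round trip like this is a convenient unit test: the restored object should compare equal to the original.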

List of supported libraries for stable serialization:

| Library | Types | Data format |
| --- | --- | --- |
| Python std lib | int, str, float, bool, None | string representation |
| Python std lib | List, Tuple | custom format |
| CatBoost | CatBoostRegressor, CatBoostClassifier, CatBoostRanker | cbm |
| CatBoost | Pool | quantized pool |
| Tensorflow.Keras | Sequential, Model with subclasses | tf_keras |
| Tensorflow | Checkpoint, Module with subclasses | tf_pure |
| LightGBM | LGBMClassifier, LGBMRegressor, LGBMRanker | lgbm |
| XGBoost | XGBClassifier, XGBRegressor, XGBRanker | xgb |
| Torch | Module with subclasses | pt |
