add rubicon schema support (#393)
* port `rubicon_schema` source over
* formatting
* add tests
* add notebooks
* update docs
* add recent XGB changes
* add recent LGBM changes
* reset versions
* linting & formatting
ryanSoley authored Oct 12, 2023
1 parent ff9e9e2 commit 484ba18
Showing 28 changed files with 2,921 additions and 7 deletions.
3 changes: 3 additions & 0 deletions MANIFEST.in
@@ -1,4 +1,7 @@
graft rubicon_ml/viz/assets
graft rubicon_ml/viz/assets/css

include versioneer.py
include rubicon_ml/_version.py

recursive-include rubicon_ml/schema *.yaml
9 changes: 9 additions & 0 deletions docs/source/api_reference.rst
@@ -87,6 +87,15 @@ RubiconJSON

.. _library-reference-sklearn:

schema
======

.. automodule:: rubicon_ml.schema.logger
   :members:

.. automodule:: rubicon_ml.schema.registry
   :members:

sklearn
=======
``rubicon_ml`` offers direct integration with **Scikit-learn** via our
67 changes: 67 additions & 0 deletions docs/source/contribute-schema.rst
@@ -0,0 +1,67 @@
.. _contribute-schema:

Contribute a schema
*******************

Consider the following schema that was created in the "Register a custom schema" section:

.. code-block:: python

    extended_schema = {
        "name": "sklearn__RandomForestClassifier__ext",
        "extends": "sklearn__RandomForestClassifier",
        "parameters": [
            {"name": "runtime_environment", "value_env": "RUNTIME_ENV"},
        ],
    }

To contribute "sklearn__RandomForestClassifier__ext" to the ``rubicon_ml.schema`` registry,
first write the dictionary out to a YAML file.

.. code-block:: python

    import yaml

    schema_filename = "sklearn__RandomForestClassifier__ext.yaml"

    with open(schema_filename, "w") as file:
        file.write(yaml.dump(extended_schema))

Once "sklearn__RandomForestClassifier__ext.yaml" is created, follow the "Developer
instructions" to fork the rubicon-ml GitHub repository and prepare to make a contribution.

From the root of the forked repository, copy the new schema into the library's schema directory:

.. code-block:: bash

    cp [PATH_TO]/sklearn__RandomForestClassifier__ext.yaml rubicon_ml/schema/schema/

Then update **rubicon_ml/schema/registry.py**, adding the new schema to the
``RUBICON_SCHEMA_REGISTRY``:

.. code-block:: python

    RUBICON_SCHEMA_REGISTRY = {
        # other schema entries...
        "sklearn__RandomForestClassifier__ext": lambda: _load_schema(
            os.path.join("schema", "sklearn__RandomForestClassifier__ext.yaml")
        ),
    }

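One way to read the lambda-valued registry entries above is that each schema file is loaded only when its name is looked up. The following is a minimal, self-contained sketch of that pattern, not the library's implementation: ``load_schema_stub``, ``SCHEMA_REGISTRY_SKETCH``, ``get_schema_sketch``, and ``load_calls`` are hypothetical names, and the stub returns a fake schema instead of reading YAML.

```python
import os

# Records every path the stand-in loader is asked to read.
load_calls = []


def load_schema_stub(relative_path):
    """Hypothetical stand-in for a schema loader; does not touch the disk."""
    load_calls.append(relative_path)
    name, _ = os.path.splitext(os.path.basename(relative_path))
    return {"name": name}


# Each value is a zero-argument callable, so building the registry loads nothing.
SCHEMA_REGISTRY_SKETCH = {
    "sklearn__RandomForestClassifier__ext": lambda: load_schema_stub(
        os.path.join("schema", "sklearn__RandomForestClassifier__ext.yaml")
    ),
}


def get_schema_sketch(name):
    """Look up the entry and call it, which triggers the deferred load."""
    return SCHEMA_REGISTRY_SKETCH[name]()
```

Under these assumptions, defining the registry leaves ``load_calls`` empty, and the first call to ``get_schema_sketch`` is what invokes the loader.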
Finally, refer back to the "Contribute" section of the "Developer instructions" to push your
changes to GitHub and open a pull request. Once the pull request is merged,
"sklearn__RandomForestClassifier__ext" will be available in the next release of
``rubicon_ml``.

Schema naming conventions
=========================

When naming a schema that extends a schema already made available by ``rubicon_ml.schema``, simply
append a double-underscore and a unique identifier. The "sklearn__RandomForestClassifier__ext"
above is named following this convention.

When naming a schema for an object that is not yet represented in the schema registry,
leverage the ``registry.get_schema_name`` function to generate a name. For example, if
you are making a schema for an object ``my_obj`` of class ``Model`` from a module ``my_model``,
``registry.get_schema_name(my_obj)`` will return the name "my_model__Model".
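
The naming convention in the example above can be sketched without the library. The helper below is hypothetical (``sketch_schema_name`` is not part of ``rubicon_ml``) and assumes the name is built from the top-level module of the object's class and the class name, joined by a double underscore, which is consistent with the "my_model__Model" example. The real ``registry.get_schema_name`` may handle additional cases.

```python
def sketch_schema_name(obj):
    """Hypothetical illustration of the ``<module>__<class>`` naming convention."""
    cls = obj.__class__
    # Assumption: only the top-level package of the defining module is used.
    top_level_module = cls.__module__.split(".")[0]
    return f"{top_level_module}__{cls.__name__}"


class Model:
    # Pretend this class was defined in a module named ``my_model``.
    __module__ = "my_model"


my_obj = Model()
schema_name = sketch_schema_name(my_obj)
```

An extension of that schema would then append a double underscore and a unique identifier to the base name, as described above.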
5 changes: 5 additions & 0 deletions docs/source/index.rst
@@ -121,6 +121,7 @@ To install all extra modules, use the ``all`` extra.
   logging-examples/logging-training-metadata
   logging-examples/logging-plots
   logging-examples/logging-concurrently
   logging-examples/log-with-schema
   logging-examples/tagging
   logging-examples/rubiconJSON-querying
   visualizations.rst
@@ -136,6 +137,8 @@ To install all extra modules, use the ``all`` extra.
   integrations/integration-sklearn
   logging-examples/logging-feature-plots
   logging-examples/multiple-backend
   logging-examples/register-custom-schema
   logging-examples/set-schema
   logging-examples/visualizing-logged-dataframes

.. toctree::
@@ -152,13 +155,15 @@ To install all extra modules, use the ``all`` extra.
   :caption: Reference

   api_reference.rst
   schema-representation.rst

.. toctree::
   :maxdepth: 2
   :hidden:
   :caption: Community

   contributing.rst
   contribute-schema.rst
   Changelog<https://github.com/capitalone/rubicon-ml/releases>
   Feedback<https://github.com/capitalone/rubicon-ml/issues/new/choose>
   GitHub<https://github.com/capitalone/rubicon-ml>