Integrating mlflow #9497

Open
wants to merge 1 commit into
base: release/2.6
10 changes: 9 additions & 1 deletion doc/doc_en/config_en.md
@@ -37,6 +37,7 @@ Take rec_chinese_lite_train_v2.0.yml as an example
| checkpoints | set model parameter path | None | Used to load parameters after interruption to continue training|
| use_visualdl | Set whether to enable visualdl for visual log display | False | [Tutorial](https://www.paddlepaddle.org.cn/paddle/visualdl) |
| use_wandb | Set whether to enable W&B for visual log display | False | [Documentation](https://docs.wandb.ai/) |
| use_mlflow | Set whether to enable MLflow for visual log display | False | [Documentation](https://mlflow.org/docs/latest) |
| infer_img | Set inference image path or folder path | ./infer_img | \||
| character_dict_path | Set dictionary path | ./ppocr/utils/ppocr_keys_v1.txt | If the character_dict_path is None, model can only recognize number and lower letters |
| max_text_length | Set the maximum length of text | 25 | \ |
@@ -141,6 +142,13 @@ In PaddleOCR, the network is divided into four stages: Transform, Backbone, Neck
| save_dir | local directory in which all the models and other data is saved | wandb | \
| config | model configuration | None | \

### mlflow ([mlflow](../../ppocr/utils/loggers/mlflow_logger.py))
| Parameter | Use | Defaults | Note |
| :---------------------: | :---------------------: | :--------------: | :--------------------: |
| tracking_uri | URI of the MLflow tracking server, or a local directory path | None (if not set, all logs are saved in the local directory `save_model_dir/mlflow`) | \ |
| exp_name | Name of the experiment | None | \ |
| run_name | Name of the run | Randomly generated by MLflow | \ |

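The fallback described for `tracking_uri` can be sketched as below. This is a minimal illustration that mirrors the path handling in the `MlflowLogger` constructor from this PR; `resolve_tracking_uri` is a hypothetical helper name used only here and is not part of PaddleOCR or MLflow.

```python
import os
from typing import Optional


def resolve_tracking_uri(save_model_dir: str,
                         tracking_uri: Optional[str] = None) -> str:
    """Return the configured URI, or fall back to <save_model_dir>/mlflow."""
    if tracking_uri is not None:
        # An explicit server URI or directory always wins.
        return tracking_uri
    # Hypothetical helper: same absolute-path fallback as the logger uses.
    return os.path.abspath(os.path.join(save_model_dir, "mlflow"))
```

With a remote server configured, the URI is passed through untouched; otherwise each training output directory gets its own local store.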

<a name="3-multilingual-config-file-generation"></a>

@@ -245,4 +253,4 @@ For more supported languages, please refer to : [Multi-language model](https://g

The multi-language model training method is the same as the Chinese model. The training data set consists of 1 million (100w) synthetic samples. A small amount of fonts and test data can be downloaded using either of the following two methods.
* [Baidu Netdisk](https://pan.baidu.com/s/1bS_u207Rm7YbY33wOECKDA), extraction code: frgi.
* [Google drive](https://drive.google.com/file/d/18cSWX7wXSy4G0tbKJ0d9PuIaiwRLHpjA/view)
41 changes: 41 additions & 0 deletions doc/doc_en/logging_en.md
@@ -57,5 +57,46 @@ For more advanced usage to log images, audios, videos or any other form of data,

To view the dashboard, the link to the dashboard is printed to the console at the beginning and end of every training job and you can also access it by logging into your W&B account on your browser.

### mlflow
MLflow is an MLOps tool that can be used for experiment tracking, dataset/model versioning, visualizing results, and collaborating with colleagues. An MLflow logger is integrated directly into PaddleOCR. To use it, first install the `mlflow` package.

```shell
# Install MLflow
pip install mlflow

# Install MLflow with extra ML libraries and 3rd-party tools
pip install "mlflow[extras]"

# Start a local server
mlflow server --backend-store-uri /datadrive/mlflow --default-artifact-root /datadrive/mlflow --host 0.0.0.0
```

To visualize and track your model training, add the following flag to your config yaml file under the `Global` section:

```
Global:
  use_mlflow: True
```

To pass more of the `MlflowLogger` arguments listed [here](./config_en.md), add the header `mlflow` to the yaml file and put the arguments under it:

```
mlflow:
  tracking_uri: http://localhost:5000
  exp_name: paddleocr
  run_name: icdar2015.det_mv3_db
```

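How these two yaml entries turn into logger arguments can be sketched as follows. This is a simplified illustration of the selection logic added to `tools/program.py` in this PR, not the code itself: `build_mlflow_params` is a hypothetical helper name, and unlike the PR it copies the `mlflow` section instead of updating it in place.

```python
from typing import Any, Dict, Optional


def build_mlflow_params(config: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return keyword arguments for the MLflow logger, or None if disabled."""
    global_cfg = config.get("Global", {})
    # MLflow is enabled by the Global flag or by the presence of an mlflow section.
    enabled = bool(global_cfg.get("use_mlflow")) or "mlflow" in config
    if not enabled:
        return None
    # Copy the section (assumption of this sketch) so the config is not mutated.
    params = dict(config.get("mlflow") or {})
    # The logger always receives the model output directory as save_dir.
    params["save_dir"] = global_cfg["save_model_dir"]
    return params
```

Note that an `mlflow` section on its own is enough to enable the logger, even when `use_mlflow` is absent.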


These config variables from the yaml file are used to instantiate the `MlflowLogger` object with the experiment and run names. During training, the `log_metrics` function is called, from the rank 0 process only, to log training metrics at training steps and evaluation metrics at evaluation steps.
At every model saving step, the `MlflowLogger` logs the model using the `log_model` function, along with relevant metadata such as the epoch in which the model was saved and whether it is the best model so far.

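The metric naming used by `log_metrics` can be sketched as below. It is a minimal illustration with a hypothetical helper name (`prefix_metrics`); the prefix values `TRAIN` and `EVAL` in the usage note are assumed for illustration, and leaving keys unchanged when no prefix is given is a choice made in this sketch.

```python
from typing import Dict, Optional


def prefix_metrics(metrics: Dict[str, float],
                   prefix: Optional[str] = None) -> Dict[str, float]:
    """Namespace metric keys as '<prefix>/<name>', lower-casing the prefix."""
    if not prefix:
        # No prefix: return a copy with the original keys.
        return dict(metrics)
    return {"{}/{}".format(prefix.lower(), k): v for k, v in metrics.items()}
```

For example, `prefix_metrics({"loss": 0.5}, "TRAIN")` yields the key `train/loss`, so training and evaluation curves land in separate groups in the MLflow UI.
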
All the logging mentioned above is integrated into the `program.train` function and will generate dashboards like this:

![mlflow Dashboard](../imgs_en/mlflow_metrics.png)

![mlflow Models](../imgs_en/mlflow_models.png)

### Using Multiple Loggers
Both VisualDL and W&B can also be used simultaneously by just setting both the aforementioned flags to True.
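
One way to picture several loggers running side by side is a thin wrapper that forwards each call to every enabled backend. The sketch below assumes that pattern; `FanOutLogger` and `ListLogger` are hypothetical names for illustration and are not the `Loggers` class from `ppocr/utils/loggers`.

```python
from typing import Dict, List, Optional


class ListLogger:
    """Hypothetical in-memory backend that records what it is asked to log."""

    def __init__(self) -> None:
        self.records: List[tuple] = []

    def log_metrics(self, metrics: Dict[str, float],
                    prefix: Optional[str] = None,
                    step: Optional[int] = None) -> None:
        self.records.append((prefix, step, dict(metrics)))


class FanOutLogger:
    """Forward every call to each enabled backend, in order."""

    def __init__(self, loggers: List[ListLogger]) -> None:
        self.loggers = loggers

    def log_metrics(self, metrics, prefix=None, step=None) -> None:
        for logger in self.loggers:
            logger.log_metrics(metrics, prefix=prefix, step=step)
```

Because the training loop talks only to the wrapper, enabling another backend does not change the training code.
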
Binary file added doc/imgs_en/mlflow_metrics.png
Binary file added doc/imgs_en/mlflow_models.png
1 change: 1 addition & 0 deletions ppocr/utils/loggers/__init__.py
@@ -1,3 +1,4 @@
from .vdl_logger import VDLLogger
from .wandb_logger import WandbLogger
from .mlflow_logger import MlflowLogger
from .loggers import Loggers
2 changes: 1 addition & 1 deletion ppocr/utils/loggers/loggers.py
@@ -1,5 +1,5 @@
from .wandb_logger import WandbLogger

from .mlflow_logger import MlflowLogger
class Loggers(object):
    def __init__(self, loggers):
        super().__init__()
84 changes: 84 additions & 0 deletions ppocr/utils/loggers/mlflow_logger.py
@@ -0,0 +1,84 @@
import os
from .base_logger import BaseLogger
from typing import Dict, Optional


class MlflowLogger(BaseLogger):
    def __init__(self,
                 save_dir: str,
                 exp_name: Optional[str] = None,
                 run_name: Optional[str] = None,
                 tags: Optional[dict] = None,
                 params: Optional[dict] = None,
                 tracking_uri: Optional[str] = None,
                 **kwargs):

        self.import_mlflow()
        self.save_dir = save_dir
        self.exp_name = exp_name
        self.run_name = run_name
        self.tags = tags
        self.params = params
        self.tracking_uri = tracking_uri

        if self.tracking_uri is not None:
            self.mlflow.set_tracking_uri(self.tracking_uri)
        else:
            # Fall back to a local file store under <save_dir>/mlflow.
            mlflow_writer_path = "{}/mlflow".format(self.save_dir)
            file_url = os.path.abspath(mlflow_writer_path)
            self.mlflow.set_tracking_uri(file_url)

        # Select an experiment only when a name is given; otherwise MLflow
        # uses its default experiment.
        if self.exp_name is not None:
            if self.mlflow.get_experiment_by_name(self.exp_name) is None:
                self.mlflow.create_experiment(self.exp_name)
            self.mlflow.set_experiment(self.exp_name)

        self._run = None
        if self.run_name is not None:
            self._run = self.mlflow.start_run(run_name=self.run_name)
            self.mlflow.set_tag('mlflow.runName', self.run_name)
        if self.tags is not None:
            self.mlflow.set_tags(self.tags)
        if self.params is not None:
            self.mlflow.log_params(self.params)

    def import_mlflow(self) -> None:
        try:
            import mlflow
            import mlflow.paddle as mlflow_paddle
        except ImportError:
            raise ImportError(
                'Please run "pip install mlflow" to install mlflow')
        self.mlflow = mlflow
        self.mlflow_paddle = mlflow_paddle

    @property
    def run(self):
        # The run started in __init__, or None if MLflow starts one lazily.
        return self._run

    def log_metrics(self, metrics, prefix=None, step=None):
        # Namespace keys as "<prefix>/<name>"; keep them unchanged without a prefix.
        if prefix:
            metrics = {prefix.lower() + "/" + k: v for k, v in metrics.items()}
        self.mlflow.log_metrics(metrics, step=step)

    def log_model(self, is_best, prefix, metadata=None):
        model_path = os.path.join(self.save_dir, prefix + '.pdparams')
        config_path = os.path.join(self.save_dir, "config.yml")
        # Best and regular checkpoints both go under a directory named after the prefix.
        artifact_path = prefix

        self.mlflow.log_artifact(model_path, artifact_path=artifact_path)
        self.mlflow.log_artifact(config_path, artifact_path=artifact_path)

        if metadata is not None:
            metadata_path = os.path.join(artifact_path, "epoch_" + str(metadata["best_epoch"]) + ".json")
            self.mlflow.log_dict(metadata, artifact_file=metadata_path)

    def close(self):
        self.mlflow.end_run()
12 changes: 11 additions & 1 deletion tools/program.py
@@ -33,7 +33,7 @@
from ppocr.utils.save_load import save_model
from ppocr.utils.utility import print_dict, AverageMeter
from ppocr.utils.logging import get_logger
from ppocr.utils.loggers import VDLLogger, WandbLogger, Loggers
from ppocr.utils.loggers import VDLLogger, WandbLogger, Loggers, MlflowLogger
from ppocr.utils import profiler
from ppocr.data import build_dataloader

@@ -679,6 +679,16 @@ def preprocess(is_train=False):
        wandb_params.update({'save_dir': save_model_dir})
        log_writer = WandbLogger(**wandb_params, config=config)
        loggers.append(log_writer)
    if ('use_mlflow' in config['Global'] and
            config['Global']['use_mlflow']) or 'mlflow' in config:
        save_model_dir = config['Global']['save_model_dir']
        if "mlflow" in config:
            mlflow_params = config['mlflow']
        else:
            mlflow_params = dict()
        mlflow_params.update({'save_dir': save_model_dir})
        log_writer = MlflowLogger(**mlflow_params, config=config)
        loggers.append(log_writer)
    else:
        log_writer = None
    print_dict(config, logger)