Training a machine learning model is not hard; scikit-learn can handle it.
Deploying a model is not hard either; an API server can do it.
However, when it comes to managing and integrating the whole life cycle of a machine learning model, there is no simple solution in production. MLflow is a handy tool for this kind of problem.
We will go through how to set up an MLflow server with a database and an artifact store, log training hyper-parameters and metrics, register a model, and serve it.
1. Setup
Use docker-compose to set up a remote tracking server, with FTP as the artifact location and PostgreSQL as the backend store.
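A minimal docker-compose.yml sketch could look like the following; the images, credentials, and ports are illustrative assumptions, not part of the original setup:

services:
  postgres:
    image: postgres:13
    environment:
      POSTGRES_USER: mlflow        # assumed credentials, change for real use
      POSTGRES_PASSWORD: mlflow
      POSTGRES_DB: mlflow
    ports:
      - "5432:5432"
  ftp:
    image: stilliard/pure-ftpd     # any FTP server image works; this one is an assumption
    environment:
      PUBLICHOST: ftp
      FTP_USER_NAME: ftp_user
      FTP_USER_PASS: ftp_pass
      FTP_USER_HOME: /home/runs    # matches the artifact path used in the serving step
    ports:
      - "21:21"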
Then start the tracking server:
mlflow server \
--backend-store-uri <database> \
--default-artifact-root <ftp> \
-h 0.0.0.0
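Filled in with the assumed services from the compose sketch above, the command might look like this (the URIs are illustrative):

mlflow server \
--backend-store-uri postgresql://mlflow:mlflow@postgres:5432/mlflow \
--default-artifact-root ftp://ftp_user:ftp_pass@ftp/home/runs \
-h 0.0.0.0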
2. Log training
Add logging calls to your training code through the MLflow API:
- mlflow.log_param: record the hyper-parameters of an experiment.
- mlflow.log_metric: record the results (metrics) of an experiment.
- mlflow.log_artifact: anything else you want to save and version with the experiment, like visualizations, data, prediction outputs, etc.
- mlflow.sklearn.log_model: of course, the most important one, the trained model itself.
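Put together, a minimal training sketch using these calls might look like this (the dataset, model, and tracking URI are illustrative assumptions):

import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

mlflow.set_tracking_uri("http://localhost:5000")  # the tracking server from step 1

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    C = 1.0
    model = LogisticRegression(C=C, max_iter=200).fit(X_train, y_train)
    mlflow.log_param("C", C)  # hyper-parameter
    mlflow.log_metric("accuracy", accuracy_score(y_test, model.predict(X_test)))  # result
    mlflow.sklearn.log_model(model, "model")  # the trained model artifact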
Model logging depends on which flavor (framework) you use; some flavors support automatic logging.
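For example, with the scikit-learn flavor a single call turns on autologging (a sketch, assuming a recent MLflow version):

import mlflow.sklearn

mlflow.sklearn.autolog()  # subsequent fit() calls log params, metrics, and the model automatically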
[Figure: the MLflow Tracking UI]
3. Model Registry
Register the model you are interested in and label its stage as Staging, Production, or Archived.
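This can be done in the Tracking UI, or through the Python API; a sketch with a hypothetical model name:

import mlflow
from mlflow.tracking import MlflowClient

# "my-classifier" is a hypothetical name; <run_id> comes from the training run
result = mlflow.register_model("runs:/<run_id>/model", "my-classifier")

client = MlflowClient()
client.transition_model_version_stage(
    name="my-classifier",
    version=result.version,
    stage="Staging",  # or "Production" / "Archived"
)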
4. Serving
Serve the model directly from its artifact location in the FTP store:
mlflow models serve -m ftp://<ftp_user>:<ftp_pass>@ftp/home/runs/<uuid>/model -p <port>
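Alternatively, a registered model can be served by name and stage (a sketch using the hypothetical name from step 3):

mlflow models serve -m models:/my-classifier/Staging -p <port>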
The request body can be a pandas DataFrame in CSV, JSON (orient='split'), or JSON (orient='records') format.
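A scoring request sketch against the /invocations endpoint (the column names and values are illustrative, and the exact content types vary across MLflow versions):

curl -X POST http://localhost:<port>/invocations \
-H 'Content-Type: application/json; format=pandas-split' \
-d '{"columns": ["sepal_length", "sepal_width", "petal_length", "petal_width"], "data": [[5.1, 3.5, 1.4, 0.2]]}'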
Advantages:
- A team can follow a standard project structure.
- Experiments are reproducible and tracked automatically.
- Model registry and versioning work out of the box.
- Each model is packaged with its own Conda environment.
Drawbacks:
- No UI for managing model deployments.
- No support for deploying multiple models at once; each served model occupies its own port.
Conclusions
MLflow provides a solution for managing the whole life cycle of a machine learning model; it has many features, and some of them are really handy.
Other tools try to solve specific problems at specific stages of the life cycle:
- inference: BentoML, Clipper
- training logging: TRAINS
- model versioning: ModelDB
- data versioning: DVC, Pachyderm
- data science pipeline: Kedro
Each has its own merits and disadvantages in the situations we deal with in the real world. I would suggest choosing the most suitable combination of tools depending on the application you are developing.