Manage your machine learning life cycle with MLflow in Python

Rodrigo Arenas · Published in Analytics Vidhya · 9 min read · Jan 29, 2021


MLflow main components. Image by Databricks¹.

In this post, we'll go through the central aspects of MLflow, an open-source platform to manage the life cycle of machine learning models.

We'll cover:

  • The basic setup of MLflow
  • Experiments
  • Logging metrics and parameters
  • Custom artifacts
  • Model registry
  • Model predictions

MLOps is a methodology for enabling collaboration among data scientists; it helps gain control over different model versions, multiple experiments within the same problem, and model management and deployment. There are several open-source and commercial solutions that approach this problem; here we will take a look at MLflow.

According to MLflow's site:

MLflow is a platform to streamline machine learning development, including tracking experiments, packaging code into reproducible runs, and sharing and deploying models. MLflow offers a set of lightweight APIs that can be used with any existing machine learning application or library (TensorFlow, PyTorch, XGBoost, etc), wherever you currently run ML code (e.g. in notebooks, standalone applications or the cloud).²

The Setup:

First, let's install MLflow; it's recommended to do this inside a virtual environment.

pip install mlflow

To run MLflow, you are going to need:

  • A tracking server, which will give us a UI to follow our models' life cycle.
  • Backend storage, where we record our metrics, parameters, and other metadata.
  • An artifact root, where we store our models and any custom objects we choose.

For demonstration purposes, we will use local storage and a personal computer for the tracking server.

Warning: In a production environment, you should use backend storage such as a Postgres or MySQL database.

You can use a bucket-like system such as Amazon S3, Azure Blob Storage, or Google Cloud Storage for artifact storage.

And for the tracking server, you can use Kubernetes, a VM, or any compute service that you can expose.

To start the server locally, run:

mlflow server --backend-store-uri mlflow_db \
--default-artifact-root ./mlflowruns \
--host 0.0.0.0 \
--port 5000

You should see a folder named mlflow_db created and a message like this in your console.

Tracking server log. Image by the author.

Now, go to http://localhost:5000/, and you should see MLflow's UI.

MLflow UI. Image by the author.

There are several options here that we are going to take a look at.

In the left panel, we see our experiments, which will help us group runs of the same problem. There is an experiment called "Default"; let's edit it to rename it "Wine Regression," and we can create another one called "Iris."
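If you prefer code over clicks, the same thing can be done programmatically through the tracking client. Here is a minimal sketch, assuming the local server started above is still running at http://localhost:5000:

from mlflow.tracking import MlflowClient

# Manage experiments programmatically instead of through the UI.
client = MlflowClient(tracking_uri="http://localhost:5000")

# Create the "Iris" experiment.
client.create_experiment("Iris")

# Rename the default experiment to "Wine Regression".
default = client.get_experiment_by_name("Default")
if default is not None:
    client.rename_experiment(default.experiment_id, "Wine Regression")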

Training and Logging:

Now, create a file named train.py and paste this code, which is a modified version of MLflow's tutorial³:

import warnings
import sys
import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet
import mlflow.sklearn

import logging

logging.basicConfig(level=logging.WARN)
logger = logging.getLogger(__name__)

mlflow.set_tracking_uri('http://localhost:5000')
mlflow.set_experiment(experiment_name='Wine Regression')

tags = {"team": "Analytics Principal",
        "dataset": "Wine",
        "release.version": "2.2.2"}


def eval_metrics(actual, pred):
    rmse = np.sqrt(mean_squared_error(actual, pred))
    mae = mean_absolute_error(actual, pred)
    r2 = r2_score(actual, pred)
    return rmse, mae, r2


if __name__ == "__main__":
    warnings.filterwarnings("ignore")

    # Read the wine-quality csv file from the URL
    csv_url = (
        "http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
    )
    try:
        data = pd.read_csv(csv_url, sep=";")
    except Exception as e:
        logger.exception(
            "Unable to download training & test CSV, check your internet connection. Error: %s", e
        )

    # Split the data into training and test sets. (0.75, 0.25) split.
    train, test = train_test_split(data)

    # The predicted column is "quality" which is a scalar from [3, 9]
    train_x = train.drop(["quality"], axis=1)
    test_x = test.drop(["quality"], axis=1)
    train_y = train[["quality"]]
    test_y = test[["quality"]]

    alpha = float(sys.argv[1]) if len(sys.argv) > 1 else 0.5
    l1_ratio = float(sys.argv[2]) if len(sys.argv) > 2 else 0.5

    with mlflow.start_run(run_name='Sk_Elasticnet'):

        mlflow.set_tags(tags)

        lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
        lr.fit(train_x, train_y)

        predicted_qualities = lr.predict(test_x)

        (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)

        print("Elasticnet model (alpha=%f, l1_ratio=%f):" % (alpha, l1_ratio))
        print("  RMSE: %s" % rmse)
        print("  MAE: %s" % mae)
        print("  R2: %s" % r2)

        mlflow.log_param("alpha", alpha)
        mlflow.log_param("l1_ratio", l1_ratio)
        mlflow.log_metric("rmse", rmse)
        mlflow.log_metric("r2", r2)
        mlflow.log_metric("mae", mae)

        mlflow.sklearn.log_model(lr, "model")
        mlflow.log_artifact(local_path='./train.py', artifact_path='code')

I'll assume that you are already familiar with machine learning and sklearn, so I'll only explain the MLflow-related code; first, you see these lines:

mlflow.set_tracking_uri('http://localhost:5000')
mlflow.set_experiment(experiment_name='Wine Regression')

tags = {"team": "Analytics Principal",
        "dataset": "Wine",
        "release.version": "2.2.2"}

We are telling MLflow that the tracking server is up and running at localhost on port 5000 and that we want to use an experiment named Wine Regression. We also define a tags dict, which works as extra metadata that we want to attach to the experiment run.
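If you want to double-check that the experiment was picked up by the server, a small sketch like this one (using the same tracking URI as above) prints its id and artifact location:

import mlflow

mlflow.set_tracking_uri('http://localhost:5000')

# Fetch the experiment we just set and inspect its metadata.
experiment = mlflow.get_experiment_by_name('Wine Regression')
print(experiment.experiment_id, experiment.artifact_location)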

Then we start our run as a context manager so we can use several methods inside it while the run is active; we name it Sk_Elasticnet so we can remember what we are running:

with mlflow.start_run(run_name='Sk_Elasticnet'):

    mlflow.set_tags(tags)
    # .....
    mlflow.log_param("alpha", alpha)
    mlflow.log_param("l1_ratio", l1_ratio)
    mlflow.log_metric("rmse", rmse)
    mlflow.log_metric("r2", r2)
    mlflow.log_metric("mae", mae)

    mlflow.sklearn.log_model(lr, "model")
    mlflow.log_artifact(local_path='./train.py',
                        artifact_path='code')

With the .set_tags method, we link the current run to the tags we defined; using .log_param, we pass each parameter as a (name, value) pair, and .log_metric does the same for our model's metric results.
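If you'd rather log everything at once, MLflow also has batch variants that take dictionaries; a small sketch (with placeholder values instead of the real results) could look like this:

import mlflow

mlflow.set_tracking_uri('http://localhost:5000')
mlflow.set_experiment(experiment_name='Wine Regression')

with mlflow.start_run(run_name='Sk_Elasticnet_batch'):
    # Same information as the one-by-one calls above, logged in bulk.
    mlflow.log_params({"alpha": 0.5, "l1_ratio": 0.5})
    mlflow.log_metrics({"rmse": 0.78, "mae": 0.62, "r2": 0.12})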

We also want to save the trained model so we can use it later; for this, we use .log_model, and MLflow creates some metadata around it and exports the model as a cloudpickle (or plain pickle) file.

And one of my favorite things about MLflow: we can log custom artifacts using .log_artifact. We can use it to store images related to the training phase, external resources, datasets, and even a copy of the code used to generate the run. We use it for that last case in the script above.
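As an example of the image use case, a sketch like the following could go inside the same with mlflow.start_run(...) block of train.py; it reuses test_y and predicted_qualities from the script and assumes matplotlib is installed:

import matplotlib.pyplot as plt

# Plot predicted vs. actual quality and save it to a local file.
fig, ax = plt.subplots()
ax.scatter(test_y, predicted_qualities)
ax.set_xlabel("actual quality")
ax.set_ylabel("predicted quality")
fig.savefig("predictions.png")

# Attach the image to the active run under an "images" folder.
mlflow.log_artifact(local_path="predictions.png", artifact_path="images")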

When you run the code and refresh the MLflow UI, you should see something like this.

Elasticnet running logs. Image by the author.

The UI lets us see the metrics, parameters, and tags we set; it also records other information automatically, such as the user that ran the code, the start time in local time, the type of model used, and the git commit under which the code was run (if you have a git repo set up).

If we click the run, we can see additional information such as the saved model and our custom artifacts; in our case, the code we used. It also shows a preview of those objects, and you can download them locally.

MLflow artifacts. Image by the author.

Model Registry:

As you can see, there is a button to register the model so we can serve it as an API. If you try it, you can choose a name for your model, but you will probably see an error saying that this is not a supported URI for model registry data storage. This is because you have to set up your MLflow backend with one of the supported databases: PostgreSQL, MySQL, MS SQL, or SQLite. If you have one of those, you only have to change the startup command to pass an SQLAlchemy database URI as the backend store.

export DB_URI=<dialect>+<driver>://<username>:<password>@<host>:<port>/<database>

mlflow server --backend-store-uri $DB_URI \
--default-artifact-root ./mlflowruns \
--host 0.0.0.0 \
--port 5000

So I'll show you how it would look once you change this setup (more on this at the end).

Once the model is registered under its name, we click on the model's version and set its stage to Production, so when we use the prediction API, we know we are requesting the current production model and not an arbitrary run. By default, this will be version 1 of the model, and it should look like this:

Model registry. Image by the author.
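The same registration and stage transition can also be done from code. Here is a sketch, assuming the SQL-backed server is running and using the registry name from this post; the run id is a placeholder you would take from the UI:

import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri('http://localhost:5000')
client = MlflowClient()

# Register a specific run's model under the registry name used in this post.
result = mlflow.register_model(
    model_uri="runs:/<run_id>/model",  # <run_id> is a placeholder
    name="Wine_Elasticnet_Sklearn_Regressor",
)

# Promote that version to the Production stage.
client.transition_model_version_stage(
    name="Wine_Elasticnet_Sklearn_Regressor",
    version=result.version,
    stage="Production",
)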

Now, I'm going to take the train.py file that we created and rerun it with some different hyperparameters to get several model versions for the same problem; after running them, the Wine Regression experiment should look like this.

Several runs under the same experiment. Image by the author.

Model Comparison:

We probably want to see whether any of the more recent runs has better model performance, using the logged metrics and perhaps the custom images we saved as artifacts.

MLflow comes with a comparison option that, at the time of writing, is still a little bare but can give us some straightforward comparisons. Let's select all the runs with the left-side checkboxes and click on Compare.

The first plot we should see is a Plotly scatter plot; we can select which metrics and parameters we want to plot and compare.

Metrics scatter plot. Image by the author.

If you hover over each dot, it will show you the run name and the parameters/metrics associated with that run.

You can also use a contour plot to see how a metric changes under different combinations of pairs of parameters.

Parameters contour plot. Image by the author.

And at the end, we have a parallel coordinates plot where we can put all our metrics and parameters.

Parallel coordinates plot. Image by the author.

After comparing the different models through the UI, we can choose which model we will move to production. Then go to the experiment tab and register the model you decided on under the same model name; let's move it to the Staging stage.

Note: You can get all this metadata with the MLflow API to automate and/or improve the comparison process if you want.
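For instance, a sketch of pulling the runs of the Wine Regression experiment into a pandas DataFrame for your own comparison could look like this:

import mlflow

mlflow.set_tracking_uri('http://localhost:5000')

# search_runs returns one row per run, with metrics.* and params.* columns.
experiment = mlflow.get_experiment_by_name("Wine Regression")
runs = mlflow.search_runs(experiment_ids=[experiment.experiment_id])

columns = ["run_id", "params.alpha", "params.l1_ratio",
           "metrics.rmse", "metrics.mae", "metrics.r2"]
print(runs[columns].sort_values("metrics.rmse"))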

Now, the Models tab should look like this:

Model registry with stages. Image by the author.

And if you click version 2, we can move it into production and set version 1 to Archived.

Model upgrade. Image by the author.

Making Predictions:

As our last step, we want to make some predictions; for this, MLflow already gives us an API. If you go to the model registry, you will notice that each version has a run id, and you can use it to load that particular model. Still, since we want to use our production model without constantly checking whether the run id has changed, we can use the MLflow URI syntax with the model name and stage; the code looks like this:

import mlflow
import pandas as pd

logged_model = 'models:/Wine_Elasticnet_Sklearn_Regressor/Production'

data = pd.read_csv('winequality-red.csv')

# Load model as a PyFuncModel.
loaded_model = mlflow.pyfunc.load_model(logged_model)

print(loaded_model.predict(data))

The winequality-red.csv file has two rows to predict; after running, we get an array with a prediction for each.

MLflow prediction response. Image by the author.

You can also spin up a new server to deploy the model as an API using MLflow's CLI and make HTTP requests against it; more on this here: https://www.mlflow.org/docs/latest/cli.html
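As a rough sketch of that workflow: if you serve the production model locally with something like mlflow models serve -m "models:/Wine_Elasticnet_Sklearn_Regressor/Production" -p 1234, you could query it from Python. The payload below uses the pandas-split format of MLflow 1.x; newer versions expect a different JSON envelope, so check the docs for your version.

import pandas as pd
import requests

# Same two-row file used above; the served model expects the feature columns.
data = pd.read_csv("winequality-red.csv")

response = requests.post(
    "http://127.0.0.1:1234/invocations",  # port chosen when serving the model
    json={"columns": data.columns.tolist(), "data": data.values.tolist()},
    headers={"Content-Type": "application/json; format=pandas-split"},
)
print(response.json())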

Conclusions: As you can see, MLflow has several easy, ready-to-use features to manage the whole life cycle of machine learning models. There are several topics we didn't cover in this post, such as deploying MLflow with a database system, remote storage, and an external server, as well as other advanced API use cases and options to automate the steps we did in the UI.

I also have to point out that even though MLflow is a great open-source project, it still has some drawbacks that you must take care of; I'd like to mention a few:

  • There is no built-in authentication mechanism, although you can set up an Nginx proxy with basic auth or OAuth 2.0.
  • There is no integrated role-based user management, so everyone has the same permissions over the models.
  • Once you set up a cloud-based deployment, you must store the connection secret for the storage provider on both the server and client side, which again can raise security problems.

Commercial versions of MLflow, such as the one offered by Databricks, take care of these kinds of issues.

Let me know in the comments if you'd like to know how to set up MLflow with Docker in a cloud environment such as Azure or Amazon or more advanced use cases.

The whole code is on my GitHub: https://github.com/rodrigo-arenas/mlflow-basics

References:

[1] Databricks, Introducing MLflow: https://databricks.com/blog/2018/06/05/introducing-mlflow-an-open-source-machine-learning-platform.html

[2] MLflow Github: https://github.com/mlflow/mlflow

[3] MLflow tutorial: https://www.mlflow.org/docs/latest/tutorials-and-examples/tutorial.html

Rodrigo Arenas

Data Scientist and open-source contributor working on machine learning, and optimization; for all my projects, check: https://rodrigo-arenas.github.io/portfolio