MLOps 10: MLflow Part 2
MLflow models, REST, MLflow registry, Deploying a Machine Learning Model
Let’s focus on wrapping up MLflow.
For those who are curious: depending on the complexity of the tasks involved in deploying your model with MLflow, this entire part 1 & part 2 could be assigned to you with a 2-month deadline, or you could be expected to do the whole thing in 1 sprint. It just depends on the company (i.e., how hard they want you to work) and the complexity (all of the things you need to account for).
If you’d like to look at the official MLflow documentation on what we are talking about, you can hop on to this link.
Table of Contents
MLflow models
REST
MLflow registry
Deploying a Machine Learning Model
1 - MLflow models
1.1 Intro to MLflow models
MLflow Models is a submodule of MLflow focused on streamlining the process of managing machine learning models. It supports a variety of machine learning frameworks, ensuring that no matter what tools you’ve used to build your model, it can be logged, versioned, and served consistently.
Logging a model in MLflow provides a standardized format for saving models, facilitating easier sharing, version control, and deployment. It ensures that all the necessary components of your model, including dependencies and environment specifications, are captured, making it straightforward to move from development to production.
1.2 How to log a model
Here is a simple piece of code that loads some dataset, trains a random forest model, and then most importantly… logs the model.
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_diabetes
import pandas as pd
# Loading dataset
diabetes = load_diabetes()
X = pd.DataFrame(diabetes.data, columns=diabetes.feature_names)
y = pd.Series(diabetes.target, name='target')
# Splitting dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Training a model
model = RandomForestRegressor()
with mlflow.start_run():
    model.fit(X_train, y_train)
    mlflow.sklearn.log_model(model, "model")
The most important line is the last one: mlflow.sklearn.log_model(model, "model").
Upon logging a model, MLflow captures and stores several key components:
conda.yaml: This YAML file specifies the conda environment, ensuring that the same dependencies and Python version are available when loading or serving the model later.
MLmodel: A metadata file that includes essential information like the model's flavor, which denotes the machine learning library used to create the model.
model.pkl: The serialized version of your machine learning model, ready to be loaded for predictions or further analysis.
requirements.txt: A list of Python packages required to run the model, ensuring all dependencies are met.
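To make the MLmodel file less abstract, here is a trimmed-down sketch of what it can look like for the random forest we just logged (the exact fields vary across MLflow versions, the version numbers are illustrative, and <RUN_ID> stands in for the real run ID):
artifact_path: model
flavors:
  python_function:
    loader_module: mlflow.sklearn
    model_path: model.pkl
    python_version: 3.10.12
  sklearn:
    pickled_model: model.pkl
    serialization_format: cloudpickle
    sklearn_version: 1.3.0
run_id: <RUN_ID>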
1.3 Making simple predictions with a logged model
Once a model is logged, you can load it and make predictions on new data. Here’s a quick code snippet that shows how:
loaded_model = mlflow.pyfunc.load_model(model_uri="runs:/<RUN_ID>/model")
predictions = loaded_model.predict(X_test)
Here we use the model’s unique run ID to load it, and then use it to make predictions.
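Rather than copying the run ID out of the MLflow UI, you can also capture it in code at logging time. A minimal sketch, continuing from the training snippet above:
import mlflow
import mlflow.pyfunc
import mlflow.sklearn

with mlflow.start_run() as run:
    model.fit(X_train, y_train)
    mlflow.sklearn.log_model(model, "model")
    run_id = run.info.run_id  # this run's unique ID

# Build the model URI from the captured run ID instead of pasting it in
loaded_model = mlflow.pyfunc.load_model(model_uri=f"runs:/{run_id}/model")
predictions = loaded_model.predict(X_test)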
2 - REST & MLflow models
Machine learning models are most valuable when they're accessible and usable in real-world applications. This is where deploying your models as a service comes into play, turning your trained algorithms into tools that can be easily integrated into various software environments. MLflow models, when coupled with REST APIs, offer a powerful solution to bring machine learning to the forefront of production environments.
2.1 REST servers & APIs
A REST server acts as a bridge between your machine learning model and the outside world. It exposes endpoints, which are specific URLs, that accept data, process it using your model, and return the predictions. REST, standing for Representational State Transfer, is a set of guidelines that developers follow to create these web services, ensuring they are scalable, stateless, and can be easily consumed by various clients.
Deploying your machine learning model as a REST API provides numerous benefits:
Interoperability: Your model can be accessed and used by any system that can make HTTP requests, regardless of the programming language or platform.
Scalability: As the demand for predictions grows, you can scale your deployment to handle more requests.
Ease of Integration: REST APIs can be easily integrated into web applications, mobile apps, and even other services.
Here is a link to a great video on REST APIs. Keep in mind that as Machine Learning Engineers, we don’t need to specialize in REST APIs; we just need to know the basics so we can use them.
2.2 Starting your local REST server with MLflow
MLflow simplifies the process of deploying your model as a REST API. With a single command, you can start a local server that serves your model:
mlflow models serve -m "runs:/<RUN_ID>/model" -p 1234
In this command:
mlflow models serve: This tells MLflow to serve a model.
-m "runs:/<RUN_ID>/model": Specifies the model to serve, using the run ID from your MLflow experiment.
-p 1234: Sets the port number for the server, allowing you to access it at http://localhost:1234.
After running this command, your model is live and waiting for data to process.
2.3 CURL
CURL is a command-line tool for making HTTP requests. It's incredibly versatile and supports a wide array of HTTP methods, making it a popular choice for interacting with REST APIs.
When your machine learning model is deployed as a REST API, you can use CURL to send data to the model and receive predictions back. This allows you to test and interact with your model directly from the command line, ensuring everything is working as expected before integrating it into a larger application.
Here is a nice video that introduces how to use REST APIs and CURL together.
2.4 Putting it all together
Now that we’ve got all of the introductions & basics out of the way, let’s put it all together:
Prepare Your Data: Before you can make a prediction, you need to prepare your input data in a format that your model expects. This might involve transforming raw data into a structured format, normalizing values, or encoding categorical variables.
Craft Your CURL Command: With your data ready, you can now create a CURL command to send a request to your model. The command will look something like this:
curl -X POST -H "Content-Type:application/json; format=pandas-split" --data '{"columns":["feature1", "feature2", "feature3"],"data":[[value1, value2, value3]]}' http://127.0.0.1:1234/invocations
Let’s break down the above to see what’s actually going on:
-X POST: Specifies that you are making a POST request, which is used to submit data to be processed.
-H "Content-Type:application/json; format=pandas-split": Sets the content type of your request, telling the server you are sending JSON data in a specific format.
--data '{"columns":["feature1", "feature2", "feature3"],"data":[[value1, value2, value3]]}': The actual data you are sending to the model.
http://127.0.0.1:1234/invocations: The URL of your model's endpoint, ready to receive and process data.
Receive and Interpret the Prediction: After sending your request, the model will process the data and return a prediction. This will be displayed in your command line, and you can interpret the results to make informed decisions or further analyze the model’s performance. Using CURL, you can also save the output of the model to a JSON file.
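If you would rather do all of this from Python than from the command line, the same request can be sent with the requests library, which also makes saving the output to a JSON file straightforward. A minimal sketch, assuming the server from section 2.2 is running on port 1234 and using placeholder feature names and values:
import json
import requests

payload = {
    "columns": ["feature1", "feature2", "feature3"],  # placeholder feature names
    "data": [[0.1, 0.2, 0.3]],                        # placeholder values, one row
}
headers = {"Content-Type": "application/json; format=pandas-split"}

response = requests.post("http://127.0.0.1:1234/invocations",
                         data=json.dumps(payload), headers=headers)

# Save the model's predictions to a JSON file
with open("predictions.json", "w") as f:
    json.dump(response.json(), f)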
3 - MLflow registry
MLflow Model Registry is an integral component of the MLflow ecosystem, designed to provide a centralized repository for storing, annotating, managing, and deploying machine learning models. It acts as a single source of truth for all your ML models, ensuring that teams can collaboratively work on models, maintain version control, and seamlessly transition from development to production.
3.1 Information tracked by MLflow registry
Registered Model: This represents a unique model in the registry, identified by its name. Each registered model can have multiple versions, capturing the different iterations and improvements made over time.
Model Version: For every iteration or update made to a model, a new version is created. This ensures that you can track changes, compare performance, and roll back if necessary.
Model Stage: This denotes the lifecycle stage of a model version, such as “Staging”, “Production”, or “Archived”. This helps in identifying which version of a model is currently in use, under testing, or deprecated.
Annotations and Descriptions: MLflow allows you to add rich context to your models by attaching annotations and descriptions. This could include details about the model’s purpose, performance metrics, or any other relevant information that aids in understanding and managing the model effectively.
3.2 Why store registration information in SQL?
When managing machine learning models and their lifecycle, the way in which data is stored plays a critical role in the efficiency and effectiveness of the entire process. Storing model registry information locally, although it might seem straightforward, presents several challenges, particularly as the scale and complexity of machine learning projects grow.
Lack of Collaboration: Local storage means that model information and metadata are stored on an individual’s computer or a specific server. This creates barriers for team collaboration as access to the data is restricted. Team members cannot easily share models, view changes, or collaborate on model improvement, leading to silos of information.
Data Consistency Issues: With multiple copies of model metadata and information scattered across different locations, ensuring data consistency becomes a formidable challenge. There’s a higher risk of working with outdated information, making mistakes in model deployment, or losing critical data.
Limited Accessibility: Accessibility is restricted to the network or machine where the data is stored. For remote teams or cloud-based deployment environments, this poses significant limitations.
Scalability Concerns: As the number of models, experiments, and runs grow, local storage solutions can quickly become overwhelmed, leading to performance issues and making data management cumbersome.
Security and Backup Complications: Ensuring that your model registry information is secure and regularly backed up is more complicated with a local storage approach. There’s a higher risk of data loss due to hardware failure, accidental deletion, or security breaches.
3.3 Initialize an MLflow server with a MySQL backend
With the clear advantages of using a SQL server for model registry, setting up MLflow to leverage this capability is the next step. Below is an example of how to start the MLflow server with a MySQL database as the backend:
mlflow server --backend-store-uri mysql+pymysql://<username>:<password>@<host>:<port>/<database> --default-artifact-root <artifact_location> --host 0.0.0.0
The most important part for us is mlflow server --backend-store-uri mysql+pymysql://<username>:<password>@<host>:<port>/<database>. It tells MLflow to start a server and store all of the backend information (experiments, runs, registry metadata) in a MySQL database, using the username/password and host details that our data engineer will give us :)
The --host 0.0.0.0 flag makes the MLflow server reachable from other machines on the network once it has come online.
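Once the server is up, your training scripts and notebooks need to point at it instead of the default local ./mlruns folder. A minimal sketch (the URL is a placeholder for wherever your server actually runs; MLflow’s default port is 5000):
import mlflow

# Point the MLflow client at the remote tracking server
mlflow.set_tracking_uri("http://<host>:5000")

# From here on, runs, metrics, and registered models land on the server
with mlflow.start_run():
    mlflow.log_metric("rmse", 0.25)  # example metric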
3.4 Registering Your ML model
Registering a machine learning model in MLflow is a straightforward yet crucial part of managing the lifecycle of your models. This process allows you to store, version, and manage models centrally, facilitating easier deployment, collaboration, and monitoring.
Registration: To register a model, navigate to the MLflow UI, find the specific run that produced the model, and use the "Register Model" button. You'll need to provide a unique name for the model or select an existing name to create a new version of that model.
Versioning: Every time you register a model under the same name, MLflow will automatically create a new version, making it easier to manage and roll back to previous versions if necessary.
Annotations and Descriptions: You can add annotations and descriptions to your registered models and their versions, providing context, details about changes, or any other relevant information.
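If you prefer code over clicking through the UI, the Python API can do the same registration. A minimal sketch, using the run ID from earlier and a made-up model name ("diabetes-rf"):
import mlflow
from mlflow.tracking import MlflowClient

# Register the model logged in a run under a name; this creates version 1,
# or the next version if the name already exists
result = mlflow.register_model("runs:/<RUN_ID>/model", "diabetes-rf")

# Attach a description to the freshly created version
client = MlflowClient()
client.update_model_version(
    name="diabetes-rf",
    version=result.version,
    description="Random forest trained on the diabetes dataset",
)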
3.5 Viewing all Registered models
Having a centralized repository for all your machine learning models is beneficial, but it’s equally important to be able to easily view and access these models. MLflow provides functionalities to list all registered models and view their details.
MLflow UI: The MLflow UI provides a “Models” tab, where you can see a list of all registered models, their versions, and current stages. This graphical interface makes it easy to navigate through your models, view their history, and manage them.
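The same list is available from code. A minimal sketch (search_registered_models is the MLflow 2.x name; older versions exposed list_registered_models instead):
from mlflow.tracking import MlflowClient

client = MlflowClient()
for rm in client.search_registered_models():
    print(rm.name)
    for mv in rm.latest_versions:
        print(f"  version {mv.version}: stage={mv.current_stage}")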
4 - Deploying a machine learning model
The MLflow model registry provides a robust and straightforward way to manage the lifecycle of your machine learning models, and deploying models from the registry is a critical aspect of this process. Let's delve into a step-by-step guide to understand how to efficiently deploy a model from MLflow's registry.
4.1 Identify the model you want to deploy
Before you can deploy a model, you need to determine which model and version you intend to use. MLflow makes this easier through its versioning system and stage transitions.
Navigate to MLflow UI: Access the MLflow UI and click on the “Models” tab. Here, you will find a list of all registered models.
Select Your Model: Click on the name of the model you intend to deploy. You will be taken to a page showing all the versions of that model.
Choose the Version: Look for the version of the model you want to deploy. You might want to choose the version in the “Production” stage if you are deploying for a live application, or “Staging” for pre-production testing.
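As a side note, once a model is registered you don’t have to go through run IDs anymore: the models:/ URI scheme lets you load a model by name and stage directly. For example:
import mlflow
import mlflow.pyfunc

# Loads whatever version currently sits in the "Production" stage
model = mlflow.pyfunc.load_model("models:/your_model_name/Production")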
4.2 Transitioning model stages
Model stages in MLflow help you to categorize models based on their readiness for deployment. Before deploying, make sure your model is in the appropriate stage.
Changing Model Stage: Use the MLflow UI or the MLflow Python API to transition your model to the “Production” stage.
from mlflow.tracking import MlflowClient

client = MlflowClient()
client.transition_model_version_stage(
    name="your_model_name",
    version="model_version",
    stage="Production"
)
Replace "your_model_name"
and "model_version"
with the actual model name and version you are working with.
4.3 Deploying the model
Once your model is in the right stage, you can proceed to deploy it.
MLflow REST API: MLflow allows you to serve registered models as a REST API, enabling you to make HTTP requests for predictions.
Starting the REST Server: Use the mlflow models serve command to start a REST server that serves your registered model.
mlflow models serve -m "models:/your_model_name/Production" -h 0.0.0.0 -p 5001
Making Predictions: With the REST server running, you can now make HTTP requests to your model for predictions.
Using curl: Make a prediction by sending a JSON payload with your input data using curl.
curl -X POST -H "Content-Type:application/json; format=pandas-split" --data '{"columns":[your_feature_names],"data":[[your_input_data]]}' http://127.0.0.1:5001/invocations
4.4 Utilizing the model
With the model deployed and accessible via REST API, it’s ready to be integrated into your application or workflow.
Integration: Update your application’s code to make HTTP requests to the model’s endpoint for predictions.
Monitoring: Keep an eye on the model’s performance and logs to ensure everything is running smoothly.
When it comes to utilizing the deployed model, I personally prefer to work with Postman. You can take a look at the video below on how to use Postman to call (and test) your ML model to see if it works.
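As a rough sketch of what the Integration and Monitoring bullets above can look like in application code (the endpoint URL and feature names are placeholders, not something MLflow generates for you):
import json
import logging
import requests

logging.basicConfig(level=logging.INFO)
MODEL_ENDPOINT = "http://127.0.0.1:5001/invocations"  # placeholder endpoint

def get_prediction(feature_row):
    """Send one row of features to the deployed model and return its prediction."""
    payload = {"columns": ["feature1", "feature2", "feature3"],  # placeholder names
               "data": [feature_row]}
    headers = {"Content-Type": "application/json; format=pandas-split"}
    response = requests.post(MODEL_ENDPOINT, data=json.dumps(payload), headers=headers)
    response.raise_for_status()            # surface HTTP errors instead of failing silently
    logging.info("prediction request ok")  # crude hook for the Monitoring step
    return response.json()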
Once we have confirmed the ML model works, our part is basically done, and the front end (web developers) and software engineers such as BowTiedCrocodile take over.