Deploy a Machine Learning Model with Docker

Deploying a Machine Learning model is the last-mile problem of AI, and solving it is one of the most valuable skills you can learn. You have probably heard of Docker. In this article, I’ll guide you through the process of taking a trained model, wrapping it in a robust API, and then containerizing and deploying it as a Machine Learning service with Docker.

Deploy a Machine Learning Model with Docker: Our Strategy

Before we dive into the code, let’s understand our toolkit and strategy:

  1. We will train a Machine Learning model using Scikit-learn. It will be our brain.
  2. Next, we will use FastAPI. It will create a web API so other applications can communicate with our model using standard HTTP requests, the language of the internet.
  3. Next, we will create a Dockerfile. It will package our model, our API, and all their dependencies into a single, isolated unit that can run anywhere.

In simple terms, we’re taking the model, giving it a voice with FastAPI, and then packaging it in a self-sufficient container with Docker, so anyone can use it.

Step 1: Train and Save Your Model

Create a file named model.py inside a directory (in my case, I am naming my directory “dock”).

This is the part you’re likely most familiar with. The goal here isn’t just to train a model, but to serialize it, saving its learned state to a file:

# model.py

from sklearn.ensemble import GradientBoostingRegressor
from sklearn.datasets import fetch_california_housing
import joblib

def train_and_save_model():    
    print("Loading data and training the model...")

    # 1. Load the California Housing dataset
    housing = fetch_california_housing()
    X, y = housing.data, housing.target

    # 2. Initialize and train the model
    # We're using Gradient Boosting, a powerful ensemble method for regression.
    # The parameters here are just examples; in a real project, you'd tune them.
    model = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, max_depth=3)
    model.fit(X, y)

    # 3. Save the trained model to a file
    joblib.dump(model, 'housing_model.pkl')

    print("Model trained and saved as housing_model.pkl!")
    print("Feature names:", housing.feature_names) # To help us with testing later

if __name__ == "__main__":
    train_and_save_model()

The key here is joblib.dump(model, 'housing_model.pkl'). We use joblib (or pickle) to convert our trained Python object into a byte stream that can be saved to a file. This file, housing_model.pkl, now contains all the knowledge our model gained during training. We only need to run this script once; the resulting .pkl file is the asset we’ll deploy.
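If you want to confirm that the serialized file works before building anything around it, a small script like the one below loads it back and makes a single prediction. This is just an optional sanity check; the filename check_model.py and the choice of predicting on the first row of the dataset are my own, not part of the project:

# check_model.py (optional sanity check, not part of the deployment)

from sklearn.datasets import fetch_california_housing
import joblib

# Load the serialized model back from disk
model = joblib.load('housing_model.pkl')

# Predict on the first row of the dataset as a quick smoke test
housing = fetch_california_housing()
sample = housing.data[:1]

print("Predicted median value (in $100,000s):", model.predict(sample)[0])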

Step 2: Build the API Endpoint

Next, create a file named main.py. We need a way for the outside world to send data to our model and get a prediction back.

This is where FastAPI comes in. It’s incredibly fast and creates interactive documentation for us automatically:

# main.py

from fastapi import FastAPI
import joblib
from pydantic import BaseModel
import numpy as np

# 1. Create a FastAPI app instance
app = FastAPI(title="California Housing Price Predictor")

# 2. Load the trained model
model = joblib.load('housing_model.pkl')

# 3. Define the request body structure using Pydantic
# These are the features our model was trained on.
class HousingFeatures(BaseModel):
    MedInc: float
    HouseAge: float
    AveRooms: float
    AveBedrms: float
    Population: float
    AveOccup: float
    Latitude: float
    Longitude: float

# 4. Define the prediction endpoint
@app.post("/predict", tags=["Predictions"])
async def predict_price(features: HousingFeatures):
    
    # Convert the input data from the request into a NumPy array
    # The order of features must be the same as during training!
    data = np.array([[
        features.MedInc,
        features.HouseAge,
        features.AveRooms,
        features.AveBedrms,
        features.Population,
        features.AveOccup,
        features.Latitude,
        features.Longitude
    ]])

    # Make a prediction
    # The output is a price in units of 100,000 USD
    prediction = model.predict(data)[0]

    # Return the prediction
    return {"predicted_median_value": f"${prediction * 100000:.2f}"}

# Root endpoint
@app.get("/", tags=["General"])
async def read_root():
    return {"message": "Welcome to the California Housing Price Predictor API!"}

Here’s what we are doing in this Python file:

  1. model = joblib.load(…): We load our saved model into memory the moment the API starts.
  2. class HousingFeatures(BaseModel): Here, we’re creating a strict data contract. Our API will automatically validate incoming requests.
  3. @app.post("/predict"): This decorator tells FastAPI that any POST request to the /predict URL should be handled by the predict_price function.

The function takes the validated features, converts them into the exact NumPy array format our Scikit-Learn model expects, calls model.predict(), and returns the result as a clean JSON object.
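For example, a valid request body for /predict is a JSON object with exactly these eight fields (the numbers below are illustrative placeholders, not real data):

{
  "MedInc": 8.3,
  "HouseAge": 41.0,
  "AveRooms": 6.9,
  "AveBedrms": 1.0,
  "Population": 322.0,
  "AveOccup": 2.5,
  "Latitude": 37.88,
  "Longitude": -122.23
}

If a field is missing or has the wrong type, FastAPI rejects the request with a 422 validation error before predict_price is ever called.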

Before moving forward, make sure to create a text file named requirements.txt and add the following lines to it:

fastapi
uvicorn[standard]
scikit-learn
joblib
numpy

Step 3: Containerize with Docker

Now, make sure you have Docker installed on your system. You can download Docker Desktop from the official Docker website.

Once you have installed Docker, create a file named "Dockerfile" (no extension needed). This is where the real deployment happens. A Dockerfile is a set of instructions for building a Docker image. It’s like a recipe for creating our self-contained application environment.

Write this in your Dockerfile:

# Dockerfile

# 1. Use an official Python runtime as a parent image
FROM python:3.9-slim

# 2. Set the working directory inside the container
WORKDIR /app

# 3. Copy the dependencies file and install them
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# 4. Copy the rest of the application code into the container
COPY . .

# 5. Expose the port the app runs on
EXPOSE 80

# 6. Define the command to run the application
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "80"]

Notice we copy requirements.txt and run pip install before copying the rest of our code. Docker builds in layers. If our Python code (main.py, model.py) changes but our dependencies (requirements.txt) don’t, Docker will reuse the already-installed layer, making our subsequent builds much faster!

Now, before moving forward, make sure your project directory (dock, in my case) contains everything we have created so far: model.py, main.py, requirements.txt, the Dockerfile, and the housing_model.pkl file produced by running model.py.

Final Step: Build and Run

With all of these files in a single directory, you are ready to launch. Open your terminal in that directory and run this command:

docker build -t dock .

This command reads your Dockerfile and builds the self-contained image, tagging it (-t) with the name dock. I used dock simply because it matches my directory name; the tag can be any name you like.
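Building the image does not start it, so the next step is to run a container from that image. Here I map port 8081 on my machine to port 80 inside the container; the host port is an arbitrary choice, so pick any free port you prefer:

docker run -d -p 8081:80 dock

The -d flag runs the container in the background, and -p 8081:80 forwards requests from localhost:8081 on your machine to the API listening on port 80 inside the container.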

That’s it! Your machine learning model is now a live, running API. You can also see the running container in the Docker Desktop app. It will look like this:

(Screenshot: the model running in a Docker container)

You can now open your browser and navigate to http://localhost:8081/docs. You’ll see FastAPI’s beautiful, interactive API documentation, where you can even test your prediction endpoint directly. It will look like this:

(Screenshots: deployment using Docker; output using Docker)
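If you prefer testing from code instead of the /docs page, here is a minimal client sketch. It assumes the container is running with the port mapping shown above and that you have the requests library installed (it is not in requirements.txt, since only the client needs it); the feature values are illustrative numbers, roughly the first row of the dataset:

# test_request.py (optional client-side test, not part of the deployment)

import requests

# Illustrative input; the field names must match the HousingFeatures model.
payload = {
    "MedInc": 8.3252,
    "HouseAge": 41.0,
    "AveRooms": 6.98,
    "AveBedrms": 1.02,
    "Population": 322.0,
    "AveOccup": 2.55,
    "Latitude": 37.88,
    "Longitude": -122.23,
}

response = requests.post("http://localhost:8081/predict", json=payload)
print(response.status_code)
print(response.json())  # e.g. {"predicted_median_value": "$..."}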

Final Words

You’ve just gone from being a Machine Learning Engineer to an architect. You’ve taken a static model and transformed it into a dynamic, reliable, and distributable service. This process (train, serialize, build the API, containerize) is the fundamental workflow of modern MLOps.

I hope you liked this article on how to containerize and deploy a Machine Learning model with Docker. Feel free to ask valuable questions in the comments section below. You can follow me on Instagram for many more resources.
