Deploy Your First ML Model as a REST API

The gap between an ML model sitting in your .ipynb file and a real, production-ready service is where your career as an ML Engineer begins. Deploying your ML model as a REST API is the standard way to make your model’s intelligence available to websites, apps, and other services. In this article, I will walk you through, step by step, how to take a trained model and deploy it as a REST API using Python.

First, What Exactly is an API?

Forget the technical definitions for a second. Imagine you’re at a restaurant. You (the client) want to order a burger. You don’t go into the kitchen (the backend server) and cook it yourself. Instead, you talk to a waiter (the API):

  1. You give the waiter a clear, structured request: “I’d like the cheeseburger, medium-rare, with no onions.”
  2. The waiter takes your request to the kitchen, where all the complexity is handled, including grilling the patty, melting the cheese, and assembling the burger.
  3. The waiter then brings the finished burger (the response) back to you.

An API (Application Programming Interface) works the same way. It’s a messenger that takes a request from a client application (like a website or mobile app), sends it to the server where your ML model lives, and returns the model’s prediction as a response.
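
For an ML API, that request and response are usually small JSON documents. To make this concrete, here is roughly what the exchange will look like for the API we build below (the field names come from the model we’ll train; the predicted number is purely illustrative):

Request (what the client sends):
{"MedInc": 8.3252, "HouseAge": 41.0, "AveRooms": 6.9841, "AveBedrms": 1.0238, "Population": 322.0}

Response (what the server sends back):
{"predicted_median_house_value": 4.15}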

Now, Let’s Get Started: From an ML Model to a REST API

We’ll build a simple API that predicts median house values in California. Here’s how we’ll do it:

  1. We’ll use the classic California Housing dataset to train a RandomForestRegressor model.
  2. We’ll save our trained model into a single file.
  3. We’ll use FastAPI to create a web server and an endpoint for our predictions.
  4. We’ll run the server and make live prediction requests.
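
By the end, your project folder will contain just three files (the folder name itself is up to you):

your-project/
├── train.py                         # trains and saves the model (Step 1)
├── california_housing_model.joblib  # created by train.py
└── main.py                          # the FastAPI app (Step 2)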

Let’s get our hands dirty.

Step 1: Training and Saving Our Model

The focus here isn’t on building the world’s best model, but on building a deployable one. We’ll keep it simple.

First, create a new project folder and install the necessary libraries:

pip install scikit-learn pandas joblib

Now, create a Python file named train.py. This script will handle loading the data, training the model, and saving it:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.datasets import fetch_california_housing
import joblib

print("Starting the training script...")

# Load the California Housing dataset; the target is the median house
# value of each block group, expressed in units of $100,000
housing = fetch_california_housing()
X = pd.DataFrame(housing.data, columns=housing.feature_names)
y = pd.Series(housing.target, name='MedHouseVal')

print("Dataset loaded successfully.")

# Keep a small, easy-to-supply subset of features for the API
features_to_use = ['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms', 'Population']
X = X[features_to_use]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print("Data split into training and testing sets.")

model = RandomForestRegressor(n_estimators=100, random_state=42)
print("Training the RandomForestRegressor model...")
model.fit(X_train, y_train)
print("Model training complete.")

# Serialize the trained model to disk so the API can load it later
model_filename = 'california_housing_model.joblib'
joblib.dump(model, model_filename)

print(f"Model saved as {model_filename}. Training script finished.")

Run this script from your terminal with this command:

python train.py

After it finishes, you’ll see a new file, california_housing_model.joblib, in your folder. That file is your trained model, serialized and ready to be used.
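
If you want to confirm the serialized file actually works before building the API around it, a minimal sketch like this loads it back and predicts on one made-up row (the feature values are just an example):

import joblib
import pandas as pd

# Load the serialized model back from disk
model = joblib.load('california_housing_model.joblib')

# Predict on a single illustrative row (same five features, same order)
sample = pd.DataFrame([{
    'MedInc': 8.3252, 'HouseAge': 41.0, 'AveRooms': 6.9841,
    'AveBedrms': 1.0238, 'Population': 322.0
}])
print(model.predict(sample))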

Step 2: Building the API with FastAPI

Now, we’re going to build the API. FastAPI is my go-to for this because it’s incredibly fast, easy to learn, and automatically generates interactive documentation for your API.

First, install FastAPI along with Uvicorn, the ASGI server that will run it (the "[all]" extra pulls in Uvicorn and other optional dependencies):

pip install "fastapi[all]"

Now, create a new file named main.py. This is where our API logic will be written:

from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import pandas as pd

app = FastAPI(title="California Housing Price Prediction API")

model = joblib.load('california_housing_model.joblib')

class HouseFeatures(BaseModel):
    MedInc: float      # Median income in block group
    HouseAge: float    # Median house age in block group
    AveRooms: float    # Average number of rooms per household
    AveBedrms: float   # Average number of bedrooms per household
    Population: float  # Block group population
    
    # Example payload shown in the interactive docs (Pydantic v2 style;
    # Pydantic v1 used `schema_extra` inside `class Config` instead)
    model_config = {
        "json_schema_extra": {
            "example": {
                "MedInc": 8.3252,
                "HouseAge": 41.0,
                "AveRooms": 6.9841,
                "AveBedrms": 1.0238,
                "Population": 322.0
            }
        }
    }

@app.post("/predict")
def predict_price(features: HouseFeatures):
    """
    Predicts the median house value based on input features.
    """
    # Pydantic v2: model_dump() replaces the deprecated .dict()
    input_data = pd.DataFrame([features.model_dump()])
    
    prediction = model.predict(input_data)
    
    predicted_value = prediction[0]
    
    return {"predicted_median_house_value": predicted_value}

@app.get("/")
def read_root():
    return {"message": "Welcome to the Housing Price Prediction API!"}

Let’s quickly break down what we did:

  1. app = FastAPI(): We created our main application object.
  2. model = joblib.load(…): We loaded our pre-trained model into memory.
  3. class HouseFeatures(BaseModel): This is the genius of FastAPI and Pydantic. We define the exact shape and data types of our input, and every incoming request is validated against it automatically; a missing field or a wrong type is rejected with a 422 error before our code even runs.
  4. @app.post: We defined our prediction endpoint. It listens for POST requests at the /predict URL.

Inside predict_price, we convert the incoming data to a pandas DataFrame, feed it to our model’s .predict() method, and return the result as a JSON object.
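
One detail worth knowing: the California Housing target is expressed in units of $100,000, so a prediction of 4.15 means roughly $415,000. If you would rather return dollars, you could change the return line to something like this (a sketch with a hypothetical key name):

return {"predicted_median_house_value_usd": predicted_value * 100_000}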

Step 3: Running and Testing Your API

Now, open your terminal in the same directory and run the following command (main refers to the main.py file, app is the FastAPI object inside it, and the --reload flag restarts the server whenever you edit the code):

uvicorn main:app --reload

Uvicorn will start up and log that it’s running on http://127.0.0.1:8000.

Now for the best part. Open your web browser and go to:

http://127.0.0.1:8000/docs

You will see an interactive API documentation page, generated for you automatically by FastAPI. No extra work required. It shows all your endpoints, their parameters, and expected responses.


Click on the /predict endpoint, then “Try it out.” You’ll see the example JSON we defined in our Pydantic model. You can change the values and click “Execute.”
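
You can also hit the endpoint from code instead of the browser. Here is a minimal sketch using the requests library (run pip install requests first if you don’t have it):

import requests

# The same example payload we defined in the Pydantic model
payload = {
    "MedInc": 8.3252,
    "HouseAge": 41.0,
    "AveRooms": 6.9841,
    "AveBedrms": 1.0238,
    "Population": 322.0
}

response = requests.post("http://127.0.0.1:8000/predict", json=payload)
print(response.json())  # e.g. {'predicted_median_house_value': ...}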

So, you just deployed a machine learning model as a fully functional, documented REST API. As a next step, you can containerize and deploy it using Docker.

Final Words

Now, you’re no longer just someone who trains models; you’re someone who can build AI-powered services. So, the next time you’ve built a model you’re proud of, don’t stop at the last cell of your notebook. Take that extra step. Wrap it in an API. Because a model that works on your machine is an experiment, but a model that works as an API is a solution.

I hope you liked this article on how to take a trained model and deploy it as a powerful REST API using Python. Feel free to ask valuable questions in the comments section below. You can follow me on Instagram for many more resources.
