Many beginners train a Machine Learning model, evaluate its performance, and consider the project finished. What they often skip is testing and using the model to make predictions in real time. So, in this article, I’ll take you through a step-by-step guide on how to test and use a Machine Learning model to make predictions using Python.
How to Test and Use a Machine Learning Model
In this guide, I’ll walk through how to test a machine learning model by making predictions in real time using the California Housing dataset from sklearn. This dataset contains information about California’s housing prices and related factors, which makes it a great choice for building a regression model.
Step 1: Load and Explore the Dataset
The first step in any machine learning project is to understand the dataset. The California Housing dataset includes features such as MedInc (median income) and AveRooms (average number of rooms), and a target variable MedHouseVal (median house value). Here’s how to load and explore the data:
from sklearn.datasets import fetch_california_housing
import pandas as pd
# load the California Housing dataset
housing = fetch_california_housing(as_frame=True)
# convert to DataFrame for exploration
df = housing.frame
# display the first few rows of the dataset
print(df.head())
MedInc HouseAge AveRooms AveBedrms Population AveOccup Latitude \
0 8.3252 41.0 6.984127 1.023810 322.0 2.555556 37.88
1 8.3014 21.0 6.238137 0.971880 2401.0 2.109842 37.86
2 7.2574 52.0 8.288136 1.073446 496.0 2.802260 37.85
3 5.6431 52.0 5.817352 1.073059 558.0 2.547945 37.85
4 3.8462 52.0 6.281853 1.081081 565.0 2.181467 37.85
Longitude MedHouseVal
0 -122.23 4.526
1 -122.22 3.585
2 -122.24 3.521
3 -122.25 3.413
4 -122.25 3.422
Step 2: Split the Dataset into Train and Test Sets
Next, we will divide the data into training and testing sets to ensure the model is trained on one part and evaluated on unseen data:
from sklearn.model_selection import train_test_split
# features (X) and target (y)
X = df.drop(columns=['MedHouseVal']) # drop the target column
y = df['MedHouseVal'] # target column
# split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(f"Training set size: {X_train.shape[0]}")
print(f"Testing set size: {X_test.shape[0]}")
Training set size: 16512
Testing set size: 4128
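To make the split less of a black box, here is a minimal pure-Python sketch of the idea behind a shuffled 80/20 hold-out split. It is an illustration only, not scikit-learn's implementation: the helper name `shuffled_split_indices` is hypothetical, and it will not pick the same rows that `train_test_split` selects. It does show where the 16,512 and 4,128 row counts come from for the 20,640 rows in this dataset.

```python
import math
import random


def shuffled_split_indices(n_samples, test_size=0.2, seed=42):
    """Return (train_indices, test_indices) for a shuffled hold-out split."""
    indices = list(range(n_samples))
    # a seeded generator makes the shuffle repeatable from run to run
    random.Random(seed).shuffle(indices)
    # round the test share up to a whole number of rows
    n_test = math.ceil(test_size * n_samples)
    # the first n_test shuffled indices form the test set, the rest the training set
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = shuffled_split_indices(20640)
print(len(train_idx), len(test_idx))
# no row index should end up in both sets
print(set(train_idx).isdisjoint(test_idx))
```

Fixing the seed plays the same role as `random_state=42` above: the same rows land in the same set every time the script runs, so results stay comparable between runs.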
Step 3: Train the Machine Learning Model
Next, we will use a Random Forest Regressor, a popular ensemble learning method, to build our regression model:
from sklearn.ensemble import RandomForestRegressor
# initialize the Random Forest Regressor
model = RandomForestRegressor(random_state=42)
# train the model
model.fit(X_train, y_train)
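If "ensemble" sounds abstract, the following sketch may help. It fits a small forest on synthetic data (an assumption made so the example runs without downloading anything; the names `X_demo`, `y_demo` and `forest` are illustrative) and checks that the forest's prediction matches the average of the predictions of its individual trees, which scikit-learn exposes through the `estimators_` attribute.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# synthetic regression data, used only for illustration
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = 2.0 * X_demo[:, 0] - X_demo[:, 1] + rng.normal(scale=0.1, size=200)

# a small forest keeps the example fast
forest = RandomForestRegressor(n_estimators=20, random_state=42)
forest.fit(X_demo, y_demo)

sample = X_demo[:5]
# one row of predictions per tree, then average across trees
per_tree = np.stack([tree.predict(sample) for tree in forest.estimators_])
manual_average = per_tree.mean(axis=0)
forest_prediction = forest.predict(sample)

# compare the hand-computed average with the forest's own prediction
print(np.allclose(manual_average, forest_prediction))
```

Averaging many trees trained on different bootstrap samples is what tends to make a Random Forest less sensitive to noise than a single decision tree.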
Step 4: Evaluate the Model
Before making real-time predictions, it’s essential to check how well the model performs on unseen data using metrics like Mean Squared Error (MSE):
from sklearn.metrics import mean_squared_error
# make predictions on the test set
y_pred = model.predict(X_test)
# evaluate the model
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error on the Test Set: {mse}")
Mean Squared Error on the Test Set: 0.2553684927247781
The MSE measures the average squared difference between actual and predicted values. A lower MSE indicates better model performance.
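Because MSE is in squared units, it can be hard to read on its own. A common companion is the Root Mean Squared Error (RMSE), the square root of the MSE, which is back in the units of the target. The sketch below computes both in plain Python on a few made-up values, then takes the square root of the test-set MSE reported above. The dollar conversion assumes the target is expressed in units of $100,000, as the scikit-learn dataset description states; the helper name `mean_squared_error_manual` is hypothetical.

```python
import math


def mean_squared_error_manual(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# made-up values, only to show the arithmetic
actual = [3.0, 1.5, 2.0]
predicted = [2.5, 1.5, 3.0]
toy_mse = mean_squared_error_manual(actual, predicted)
toy_rmse = math.sqrt(toy_mse)
print(f"Toy MSE: {toy_mse:.4f}, toy RMSE: {toy_rmse:.4f}")

# RMSE for the test-set MSE reported in this article
reported_mse = 0.2553684927247781
reported_rmse = math.sqrt(reported_mse)
# assumption: the target is in units of $100,000
print(f"RMSE: {reported_rmse:.4f} (roughly ${reported_rmse * 100_000:,.0f})")
```

For this run the RMSE comes out at about 0.51, so a typical prediction error is on the order of $50,000.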
Step 5: Make Real-Time Predictions
This is the step that most beginners skip. To simulate real-world usage, we will provide a new data point to the trained model for prediction. The model input must have the same features, in the same order, as the training data:
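import pandas as pd
# define a new data point (real-time input) with the same column names as the training features
# passing a bare NumPy array here would trigger a feature-names warning in recent scikit-learn versions
new_data = pd.DataFrame([[8.3252, 41.0, 6.9841, 1.0238, 322.0, 2.5556, 37.88, -122.23]], columns=X_train.columns) # example values
# make predictions for the new data point
new_prediction = model.predict(new_data)
print(f"Predicted Median House Value: {new_prediction[0]}")
Predicted Median House Value: 4.265793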
Always make sure your input data aligns with the model’s training features to make accurate predictions.
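In a real application, inputs often arrive as key-value records (for example from a form or an API request) rather than as a neatly ordered list. One way to enforce the alignment is a small validation helper like the sketch below. The function name `build_feature_row` and the `incoming` record are hypothetical; the column order is taken from the `df.head()` output shown earlier, with the target excluded.

```python
# feature order as shown in the df.head() output of this article (target excluded)
FEATURE_ORDER = [
    "MedInc", "HouseAge", "AveRooms", "AveBedrms",
    "Population", "AveOccup", "Latitude", "Longitude",
]


def build_feature_row(record, feature_order=FEATURE_ORDER):
    """Validate a dict of feature values and return them in training order."""
    missing = [name for name in feature_order if name not in record]
    extra = [name for name in record if name not in feature_order]
    if missing or extra:
        raise ValueError(
            f"missing features: {missing}; unexpected features: {extra}"
        )
    # convert every value to float and arrange it in the training column order
    return [float(record[name]) for name in feature_order]


# keys arrive in arbitrary order; the helper restores the training order
incoming = {
    "Latitude": 37.88, "Longitude": -122.23, "MedInc": 8.3252,
    "HouseAge": 41, "AveRooms": 6.9841, "AveBedrms": 1.0238,
    "Population": 322, "AveOccup": 2.5556,
}
row = build_feature_row(incoming)
print(row)
```

The resulting row could then be wrapped as `pd.DataFrame([row], columns=FEATURE_ORDER)` and passed to `model.predict`. Failing loudly on a missing or misspelled feature is usually safer than letting the model silently predict from misaligned values.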
By following these steps, you can effectively test and use a Machine Learning model to make reliable predictions. You can learn about packaging Machine Learning models for deployment from here.
Summary
To test and use a machine learning model, start by loading and exploring the dataset for understanding. Split the data into training and testing sets to evaluate the model effectively. Train the model using a suitable algorithm for accurate learning. Evaluate the model’s performance using appropriate metrics to ensure reliability. Provide new data points in the same format for real-time predictions and deployment readiness.
I hope you liked this article on how to test and use a Machine Learning model. Feel free to ask valuable questions in the comments section below. You can follow me on Instagram for many more resources.





