SQL aggregation functions allow you to perform calculations on data to summarise it in various ways. Data Analysts spend a lot of time using SQL aggregation functions while working with databases. So, if you want to learn about SQL aggregations, this article is for you. In this article, I’ll take you through a detailed practical guide on SQL aggregation functions you should know for Data Science.
SQL Aggregation Functions
Below are the essential SQL aggregation methods you should know for Data Science:
- COUNT()
- SUM()
- AVG()
- MIN()
- MAX()
- GROUP BY
- HAVING
Let’s understand all these practically one by one. I’ll be using the table shown below for the implementation of all these aggregation functions.
| OrderID | CustomerID | OrderDate | Amount |
|---|---|---|---|
| 1 | 101 | 2024-08-01 | 250 |
| 2 | 102 | 2024-08-02 | 450 |
| 3 | 101 | 2024-08-03 | 300 |
| 4 | 103 | 2024-08-04 | 500 |
| 5 | 102 | 2024-08-05 | 150 |
COUNT() – Counting Rows
The COUNT() function returns the number of rows that match a specified condition. Here’s an example:
SELECT COUNT(*) AS TotalOrders FROM Orders;
Output: 5
Here, the COUNT(*) function counts all rows in the Orders table.
SUM() – Summing Values
The SUM() function returns the total sum of a numeric column. Here’s an example:
SELECT SUM(Amount) AS TotalAmount FROM Orders;
Output: 1650
Here, the SUM(Amount) function calculates the total amount of all orders.
AVG() – Calculating the Average
The AVG() function returns the average value of a numeric column. Here’s an example:
SELECT AVG(Amount) AS AverageAmount FROM Orders;
Output: 330
Here, the AVG(Amount) function calculates the average order amount.
MIN() – Finding the Minimum Value
The MIN() function returns the smallest value in a column. Here’s an example:
SELECT MIN(Amount) AS MinAmount FROM Orders;
Output: 150
Here, the MIN(Amount) function returns the smallest order amount.
MAX() – Finding the Maximum Value
The MAX() function returns the largest value in a column. Here’s an example:
SELECT MAX(Amount) AS MaxAmount FROM Orders;
Output: 500
Here, the MAX(Amount) function returns the largest order amount.
GROUP BY – Grouping Data
The GROUP BY clause is used in conjunction with aggregation functions to group the result set by one or more columns. Here’s an example:
SELECT CustomerID, SUM(Amount) AS TotalAmountPerCustomer FROM Orders GROUP BY CustomerID;
| CustomerID | TotalAmountPerCustomer |
|---|---|
| 101 | 550 |
| 102 | 600 |
| 103 | 500 |
Here, the GROUP BY CustomerID groups the orders by each customer, and the SUM(Amount) calculates the total amount for each customer.
HAVING – Filtering Groups
The HAVING clause is used to filter groups based on a condition, typically used with GROUP BY. Here’s an example:
SELECT CustomerID, SUM(Amount) AS TotalAmountPerCustomer FROM Orders GROUP BY CustomerID HAVING SUM(Amount) > 500;
| CustomerID | TotalAmountPerCustomer |
|---|---|
| 101 | 550 |
| 102 | 600 |
Here, the HAVING clause filters customers whose total order amount is greater than 500.
Summary
So, below are the essential SQL aggregation methods you should know for Data Science:
- COUNT(): Counts the number of rows in a result set.
- SUM(): Calculates the total sum of a numeric column.
- AVG(): Computes the average value of a numeric column.
- MIN(): Finds the smallest value in a column.
- MAX(): Finds the largest value in a column.
- GROUP BY: Groups rows that have the same values in specified columns.
- HAVING: Filters groups after aggregation based on a specified condition.
I hope you liked this article on SQL aggregation functions. Feel free to ask valuable questions in the comments section below. You can follow me on Instagram for many more resources.





