SQL Aggregation Functions

SQL aggregation functions allow you to perform calculations on data to summarise it in various ways. Data Analysts spend a lot of time using SQL aggregation functions while working with databases. So, if you want to learn about SQL aggregations, this article is for you. In this article, I’ll take you through a detailed practical guide on SQL aggregation functions you should know for Data Science.

SQL Aggregation Functions

Below are the essential SQL aggregation methods you should know for Data Science:

  1. COUNT()
  2. SUM()
  3. AVG()
  4. MIN()
  5. MAX()
  6. GROUP BY
  7. HAVING

Let’s understand all these practically one by one. I’ll be using the table shown below for the implementation of all these aggregation functions.

OrderIDCustomerIDOrderDateAmount
11012024-08-01250
21022024-08-02450
31012024-08-03300
41032024-08-04500
51022024-08-05150

COUNT() – Counting Rows

The COUNT() function returns the number of rows that match a specified condition. Here’s an example:

SELECT COUNT(*) AS TotalOrders
FROM Orders;
Output: 5

Here, the COUNT(*) function counts all rows in the Orders table.

SUM() – Summing Values

The SUM() function returns the total sum of a numeric column. Here’s an example:

SELECT SUM(Amount) AS TotalAmount
FROM Orders;
Output: 1650

Here, the SUM(Amount) function calculates the total amount of all orders.

AVG() – Calculating the Average

The AVG() function returns the average value of a numeric column. Here’s an example:

SELECT AVG(Amount) AS AverageAmount
FROM Orders;
Output: 330

Here, the AVG(Amount) function calculates the average order amount.

MIN() – Finding the Minimum Value

The MIN() function returns the smallest value in a column. Here’s an example:

SELECT MIN(Amount) AS MinAmount
FROM Orders;
Output: 150

Here, the MIN(Amount) function returns the smallest order amount.

MAX() – Finding the Maximum Value

The MAX() function returns the largest value in a column. Here’s an example:

SELECT MAX(Amount) AS MaxAmount
FROM Orders;
Output: 500

Here, the MAX(Amount) function returns the largest order amount.

GROUP BY – Grouping Data

The GROUP BY clause is used in conjunction with aggregation functions to group the result set by one or more columns. Here’s an example:

SELECT CustomerID, SUM(Amount) AS TotalAmountPerCustomer
FROM Orders
GROUP BY CustomerID;
CustomerIDTotalAmountPerCustomer
101550
102600
103500

Here, the GROUP BY CustomerID groups the orders by each customer, and the SUM(Amount) calculates the total amount for each customer.

HAVING – Filtering Groups

The HAVING clause is used to filter groups based on a condition, typically used with GROUP BY. Here’s an example:

SELECT CustomerID, SUM(Amount) AS TotalAmountPerCustomer
FROM Orders
GROUP BY CustomerID
HAVING SUM(Amount) > 500;
CustomerIDTotalAmountPerCustomer
101550
102600

Here, the HAVING clause filters customers whose total order amount is greater than 500.

Summary

So, below are the essential SQL aggregation methods you should know for Data Science:

  1. COUNT(): Counts the number of rows in a result set.
  2. SUM(): Calculates the total sum of a numeric column.
  3. AVG(): Computes the average value of a numeric column.
  4. MIN(): Finds the smallest value in a column.
  5. MAX(): Finds the largest value in a column.
  6. GROUP BY: Groups rows that have the same values in specified columns.
  7. HAVING: Filters groups after aggregation based on a specified condition.

I hope you liked this article on SQL aggregation functions. Feel free to ask valuable questions in the comments section below. You can follow me on Instagram for many more resources.

Aman Kharwal
Aman Kharwal

AI/ML Engineer | Published Author. My aim is to decode data science for the real world in the most simple words.

Articles: 2074

Leave a Reply

Discover more from AmanXai by Aman Kharwal

Subscribe now to keep reading and get access to the full archive.

Continue reading