Consumer Price Index Analysis with Python

Consumer Price Index (CPI) Analysis involves tracking the average price change over time for a basket of goods and services typically consumed by households. It serves as a primary measure of inflation, which helps companies and governments understand purchasing power trends, inflationary pressures, and economic stability. So, if you want to understand how to analyze the Consumer Price Index, this article is for you. In this article, I’ll take you through the task of Consumer Price Index Analysis with Python.


Let’s get started with the task of Consumer Price Index Analysis by importing the necessary Python libraries and the dataset:

import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from statsmodels.tsa.seasonal import seasonal_decompose

cpi_data = pd.read_csv("/content/All India Consumer Price Index.csv")
print(cpi_data.head())
        Sector  Year     Month  Cereals and products  Meat and fish    Egg  \
0        Rural  2013   January                 107.5          106.3  108.1
1        Urban  2013   January                 110.5          109.1  113.0
2  Rural+Urban  2013   January                 108.4          107.3  110.0
3        Rural  2013  February                 109.2          108.7  110.2
4        Urban  2013  February                 112.9          112.9  116.9

   Milk and products  Oils and fats  Fruits  Vegetables  ...  Housing  \
0              104.9          106.1   103.9       101.9  ...      NaN
1              103.6          103.4   102.3       102.9  ...    100.3
2              104.4          105.1   103.2       102.2  ...    100.3
3              105.4          106.7   104.0       102.4  ...      NaN
4              104.0          103.5   103.1       104.9  ...    100.4

   Fuel and light  Household goods and services  Health  \
0           105.5                         104.8   104.0
1           105.4                         104.8   104.1
2           105.5                         104.8   104.0
3           106.2                         105.2   104.4
4           105.7                         105.2   104.7

   Transport and communication  Recreation and amusement  Education  \
0                        103.3                     103.4      103.8
1                        103.2                     102.9      103.5
2                        103.2                     103.1      103.6
3                        103.9                     104.0      104.1
4                        104.4                     103.3      103.7

   Personal care and effects  Miscellaneous  General index
0                      104.7          104.0          105.1
1                      104.3          103.7          104.0
2                      104.5          103.9          104.6
3                      104.6          104.4          105.8
4                      104.3          104.3          104.7

[5 rows x 30 columns]

During the initial analysis of this dataset, I found that some of the month values contain extra whitespace, which can cause errors in parsing. So, I’ll clean up the data before the data conversion to ensure smooth analysis. I also noticed a typo in the Month column, such as “Marcrh” instead of “March”. I’ll check for such inconsistencies, correct them, and then proceed with the analysis:

cpi_data['Month'] = cpi_data['Month'].str.strip()
cpi_data['Month'] = cpi_data['Month'].replace('Marcrh', 'March')
cpi_data['Date'] = pd.to_datetime(cpi_data['Year'].astype(str) + '-' + cpi_data['Month'], format='%Y-%B')
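
Before parsing, it can help to confirm that every cleaned month label is a valid month name, so that typos other than “Marcrh” are caught before `pd.to_datetime` raises an error. Here is a minimal sketch on a small made-up frame. The sample values and the `find_invalid_months` helper are illustrative assumptions, not part of the dataset or the code above:

```python
import calendar

import pandas as pd

# Full English month names, matching the '%B' directive in an English locale
VALID_MONTHS = set(calendar.month_name[1:])


def find_invalid_months(months: pd.Series) -> list:
    """Return the distinct month labels that are not valid month names."""
    cleaned = months.str.strip()
    return sorted(set(cleaned.dropna()) - VALID_MONTHS)


# Made-up sample with stray whitespace and one typo
sample = pd.DataFrame({
    "Year": [2013, 2013, 2013],
    "Month": ["January ", "Marcrh", "April"],
})

print(find_invalid_months(sample["Month"]))

# Apply the same fixes as above, then parse
sample["Month"] = sample["Month"].str.strip().replace("Marcrh", "March")
sample["Date"] = pd.to_datetime(
    sample["Year"].astype(str) + "-" + sample["Month"], format="%Y-%B"
)
print(sample["Date"].dt.month.tolist())
```

Running the check on the real `Month` column before the conversion would list any remaining labels that need a manual fix.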

Inflation Trend Analysis

Now, I will analyze the general CPI index over time for the Rural+Urban sector. This trend can help in identifying periods of inflationary spikes or stability:

# filter for "Rural+Urban" sector
rural_urban_cpi = cpi_data[cpi_data['Sector'] == 'Rural+Urban'].sort_values('Date')

# inflation trend analysis
fig = px.line(rural_urban_cpi, x='Date', y='General index', title='Inflation Trend Analysis (General CPI Index)')
fig.update_layout(xaxis_title='Date', yaxis_title='CPI - General Index')
fig.show()
[Figure: Inflation Trend Analysis (General CPI Index)]

From around 2013 to 2023, the CPI in India increases steadily, which means the price level kept rising and inflation stayed positive throughout. The general upward trend suggests that the cost of goods and services has gradually increased over this period, with occasional fluctuations. The curve steepens in the last few years, which points to faster inflation, especially around and after 2020.
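
The chart shows the index level, while inflation is usually quoted as the percentage change of that index over twelve months. Here is a minimal sketch of that calculation on made-up values (the numbers below are illustrative, not taken from the dataset):

```python
import pandas as pd

# Made-up monthly index values for illustration (not from the dataset)
dates = pd.date_range("2020-01-01", periods=24, freq="MS")
index_values = [100.0 + i for i in range(24)]  # rises by 1 point per month
cpi = pd.Series(index_values, index=dates, name="General index")

# Year-over-year inflation: percentage change versus the same month a year earlier
yoy_inflation = cpi.pct_change(periods=12) * 100

# The first 12 months have no year-ago value, so they are NaN
print(yoy_inflation.dropna().round(2).head(3))
```

In this toy series the index rises by a constant number of points each month, so the year-over-year rate slowly declines. A straight line on the index chart therefore does not imply a constant inflation rate.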

Seasonal and Cyclical Patterns

Now, I’ll decompose the CPI data into seasonal, trend, and residual components to identify patterns:

# seasonal and cyclical patterns
rural_urban_cpi.set_index('Date', inplace=True)
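# 'MS' labels each month by its first day, which matches the parsed dates
# (the 'M' alias is deprecated in recent pandas releases)
monthly_cpi = rural_urban_cpi['General index'].resample('MS').mean().interpolate(method='linear')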
decomposition = seasonal_decompose(monthly_cpi, model='multiplicative', period=12)

fig = go.Figure()
fig.add_trace(go.Scatter(x=decomposition.observed.index, y=decomposition.observed, mode='lines', name='Observed'))
fig.add_trace(go.Scatter(x=decomposition.trend.index, y=decomposition.trend, mode='lines', name='Trend'))
fig.add_trace(go.Scatter(x=decomposition.seasonal.index, y=decomposition.seasonal, mode='lines', name='Seasonal'))
fig.add_trace(go.Scatter(x=decomposition.resid.index, y=decomposition.resid, mode='lines', name='Residual'))
fig.update_layout(title='Seasonal Decomposition of CPI (Observed, Trend, Seasonal, Residual)', xaxis_title='Date')
fig.show()
[Figure: Seasonal Decomposition of CPI (Observed, Trend, Seasonal, Residual)]

The trend line (in red) closely follows the observed CPI values, which indicates a steady upward trend over time. The seasonal component (in green) and the residual component (in purple) appear as nearly flat lines at the bottom of the chart. Because the model is multiplicative, both are factors that hover around 1 rather than values near zero. They only look close to zero because they share an axis with index values above 100. Their flatness suggests little seasonal fluctuation and minimal random variation, which implies that the CPI is driven primarily by long-term factors rather than seasonal or irregular influences.
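
To make the multiplicative model more concrete, here is a rough sketch of the classical approach on a synthetic series: estimate the trend with a centered moving average, divide it out, and average the remaining ratios by calendar month. This is a textbook approximation under stated assumptions (synthetic data, a known 2% seasonal swing), and it is not claimed to reproduce the internals of `seasonal_decompose` exactly:

```python
import numpy as np
import pandas as pd

# Synthetic monthly series: linear trend times a known seasonal factor
dates = pd.date_range("2015-01-01", periods=60, freq="MS")
months = dates.month.to_numpy()
trend_true = np.linspace(100, 160, 60)
seasonal_true = 1 + 0.02 * np.sin(2 * np.pi * (months - 1) / 12)
series = pd.Series(trend_true * seasonal_true, index=dates)

# Trend: centered 2x12 moving average (12-month mean, then a 2-term mean to center it)
trend = series.rolling(12).mean().rolling(2).mean().shift(-6)

# In a multiplicative model, observed = trend * seasonal * residual
ratio = series / trend

# Seasonal factor: average ratio per calendar month, rescaled so the factors average 1
seasonal_by_month = ratio.groupby(months).mean()
seasonal_by_month = seasonal_by_month / seasonal_by_month.mean()

# Broadcast the 12 factors back onto the full date index
seasonal = pd.Series(seasonal_by_month.loc[months].to_numpy(), index=series.index)
residual = series / (trend * seasonal)

print(seasonal_by_month.round(3))
print(residual.dropna().describe().round(4))
```

The seasonal factors and residuals come out as ratios centered on 1, so plotting them on their own axis (or as percentage deviations from 1) makes small seasonal effects much easier to see than plotting them next to the index level.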

Comparison Across Sectors or Regions

Now, let’s compare the average CPI across different sectors (Rural, Urban, Rural+Urban):

# comparison across sectors or regions
sector_cpi_means = cpi_data.groupby(['Sector'])['General index'].mean().reset_index()
fig = px.bar(sector_cpi_means, x='Sector', y='General index', title='Average CPI Comparison Across Sectors (Rural, Urban, Rural+Urban)')
fig.update_layout(xaxis_title='Sector', yaxis_title='Average CPI - General Index')
fig.show()
[Figure: Average CPI Comparison Across Sectors (Rural, Urban, Rural+Urban)]

The average CPI values are relatively consistent across all sectors, with only slight differences, which indicates that price levels, as measured by the CPI, have moved similarly in rural and urban areas over the period. This suggests that price changes in goods and services are fairly uniform across these regions.
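
An average over the whole period can hide differences that appear only in some years. One way to check is to put the sectors side by side for each date and look at the gap between them. Here is a minimal sketch on made-up numbers (the values are illustrative, not taken from the dataset):

```python
import pandas as pd

# Made-up long-format data in the same shape as the dataset (Sector, Date, General index)
sample = pd.DataFrame({
    "Sector": ["Rural", "Urban", "Rural", "Urban", "Rural", "Urban"],
    "Date": pd.to_datetime(["2013-01-01", "2013-01-01",
                            "2018-01-01", "2018-01-01",
                            "2023-01-01", "2023-01-01"]),
    "General index": [105.0, 104.0, 137.0, 135.0, 178.0, 175.0],
})

# One column per sector, one row per date
wide = sample.pivot(index="Date", columns="Sector", values="General index")

# Gap between rural and urban index levels at each date
wide["Rural minus Urban"] = wide["Rural"] - wide["Urban"]
print(wide)
```

Applied to the real data, a gap that stays near zero would support the conclusion of uniform price changes, while a gap that widens over time would qualify it.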

Correlation with Economic Indicators

Now, let’s examine the correlation between various categories within the CPI (e.g., Food, Fuel, Health) and the overall General index:

# replace non-numeric values with NaN and ensure all columns are numeric
cpi_categories = cpi_data[['Cereals and products', 'Meat and fish', 'Egg', 'Milk and products', 'Oils and fats',
                           'Fruits', 'Vegetables', 'Fuel and light', 'Housing', 'Health', 'Transport and communication',
                           'Recreation and amusement', 'Education', 'Personal care and effects', 'Miscellaneous', 'General index']]
cpi_categories = cpi_categories.apply(pd.to_numeric, errors='coerce')  # convert to numeric

# calculate the correlation matrix
correlation_matrix = cpi_categories.corr()

# plot the correlation matrix as a heatmap
fig = px.imshow(correlation_matrix, text_auto=True, color_continuous_scale='RdBu_r', zmin=-1, zmax=1,
                title='Correlation between CPI Categories and General Index')
fig.update_layout(xaxis_title='CPI Category', yaxis_title='CPI Category')
fig.show()
[Figure: Correlation between CPI Categories and General Index]

Categories such as Housing, Transport and communication, and Miscellaneous show high positive correlations with each other and with the overall index, which suggests that changes in these categories have a significant impact on the general CPI. Conversely, categories like Egg and Vegetables show relatively lower correlations with other categories, which indicates more independent or variable price movements in these areas.
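
One caveat worth keeping in mind: index levels that all trend upward tend to be highly correlated simply because they share that trend. A common cross-check is to correlate month-over-month percentage changes instead. The sketch below uses two synthetic series with unrelated monthly shocks (the category names and parameters are assumptions for illustration only):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 120

# Two synthetic index series that both trend upward but have unrelated monthly shocks
a = pd.Series(100 + np.arange(n) * 0.5 + rng.normal(0, 0.3, n))
b = pd.Series(100 + np.arange(n) * 0.4 + rng.normal(0, 0.3, n))
levels = pd.DataFrame({"Category A": a, "Category B": b})

# Correlation of levels is dominated by the shared upward trend
level_corr = levels.corr().loc["Category A", "Category B"]

# Correlation of month-over-month percentage changes removes most of that trend
change_corr = levels.pct_change().dropna().corr().loc["Category A", "Category B"]

print(f"levels:  {level_corr:.2f}")
print(f"changes: {change_corr:.2f}")
```

Running `cpi_categories.pct_change().corr()` on the real data would show which categories actually move together from month to month, rather than which ones merely rise over the same decade.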

CPI and Specific Sector Analysis

Now, let’s analyze the inflation trends within specific sectors over time:

# CPI and specific sector analysis
sectors_to_analyze = ['Fuel and light', 'Health', 'Housing', 'Cereals and products']
# forward-fill gaps (fillna(method='ffill') is deprecated in recent pandas releases)
sector_data = rural_urban_cpi[sectors_to_analyze].ffill().reset_index()

fig = go.Figure()
for sector in sectors_to_analyze:
    fig.add_trace(go.Scatter(x=sector_data['Date'], y=sector_data[sector], mode='lines', name=sector))
fig.update_layout(title='CPI Trends for Selected Sectors', xaxis_title='Date', yaxis_title='CPI Value')
fig.show()
[Figure: CPI Trends for Selected Sectors]

Each sector shows a general upward trend over time, which indicates rising prices. Fuel and light has experienced the steepest increase, particularly after 2020, which reflects higher inflation in this category. Health and Housing have followed a more gradual, steady increase over the years, with Health showing a relatively consistent rise. Cereals and products, while generally increasing, shows more fluctuations, particularly around 2020, which indicates price volatility in this category.
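
A claim like “steepest increase” is easier to verify with numbers than by eye. One simple option is to rebase every category to 100 at the first observation and compare the cumulative percentage change. Here is a minimal sketch on made-up values (not taken from the dataset):

```python
import pandas as pd

# Made-up index values for two categories (not taken from the dataset)
sample = pd.DataFrame(
    {"Fuel and light": [105.0, 126.0, 189.0], "Health": [104.0, 130.0, 156.0]},
    index=pd.to_datetime(["2013-01-01", "2018-01-01", "2023-01-01"]),
)

# Rebase every column so its first observation equals 100
rebased = sample / sample.iloc[0] * 100

# Cumulative percentage change from the first to the last observation
total_change = (sample.iloc[-1] / sample.iloc[0] - 1) * 100

print(rebased.round(1))
print(total_change.round(1))
```

The same two lines applied to `sector_data` (after setting `Date` as the index) would rank the selected categories by how much their prices rose over the full period.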

Event-Based Analysis (COVID-19 Periods)

Now, let’s analyze CPI trends specifically during the COVID-19 period (2020-2021):

# event-based analysis (COVID-19 Period)
covid_mask = (rural_urban_cpi.index >= '2020-01-01') & (rural_urban_cpi.index <= '2021-12-31')
# forward-fill gaps (fillna(method='ffill') is deprecated in recent pandas releases)
covid_period = rural_urban_cpi.loc[covid_mask, sectors_to_analyze + ['General index']].ffill().reset_index()

fig = go.Figure()
fig.add_trace(go.Scatter(x=covid_period['Date'], y=covid_period['General index'], mode='lines', name='General CPI Index', line=dict(width=2, color='black')))
for sector in sectors_to_analyze:
    fig.add_trace(go.Scatter(x=covid_period['Date'], y=covid_period[sector], mode='lines', name=sector))
fig.update_layout(title='CPI Trends During COVID-19 Period (2020-2021)', xaxis_title='Date', yaxis_title='CPI Value')
fig.show()
[Figure: CPI Trends During COVID-19 Period (2020-2021)]

The Health and Housing sectors experienced notable increases, with Health showing a steady rise and Housing seeing a sharper increase from early 2021. Fuel and light saw a significant decline in early 2020, possibly due to reduced demand during lockdowns, followed by a steep rise in 2021 as economic activities resumed. Cereals and products remained relatively stable with minor fluctuations. Overall, the graph reflects the varied inflationary impacts of COVID-19 across these sectors, with essentials like health and housing showing resilience and growth.
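
To put a number on the change within an event window, the data can be sliced by date and the first and last months compared. The sketch below uses made-up values and label-based slicing on a sorted `DatetimeIndex`, as an alternative to the boolean mask used above:

```python
import pandas as pd

# Made-up monthly values indexed by date (not taken from the dataset)
dates = pd.date_range("2019-10-01", periods=30, freq="MS")
sample = pd.DataFrame(
    {
        "General index": [150.0 + 0.5 * i for i in range(30)],
        "Fuel and light": [145.0 + 0.8 * i for i in range(30)],
    },
    index=dates,
)

# Label-based slicing on a sorted DatetimeIndex includes both endpoints
covid_window = sample.loc["2020-01-01":"2021-12-31"]

# Percentage change from the first to the last month of the window
window_change = (covid_window.iloc[-1] / covid_window.iloc[0] - 1) * 100

print(covid_window.index.min().date(), covid_window.index.max().date())
print(window_change.round(2))
```

Comparing `window_change` across categories gives a compact summary of which ones rose most during the window, which complements the line chart.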

Conclusion

The key findings from the CPI analysis are as follows:

  1. Overall Inflation Trend: There has been a steady increase in the CPI over the past decade, with inflation particularly rising after 2020.
  2. Minimal Seasonal Effect: The seasonal decomposition shows minimal seasonal fluctuations, indicating that CPI trends are mainly driven by long-term factors.
  3. Rural vs. Urban Impact: Inflation levels are consistent across rural, urban, and combined sectors, suggesting uniform price changes in these regions.
  4. Sectoral Correlations: High correlations are observed between sectors like housing, transport, and miscellaneous, indicating their significant impact on overall inflation, while categories like eggs and vegetables show more independent price movements.
  5. Sector-Specific Trends: Fuel and light has experienced the steepest price increase, especially post-2020, while health and housing show steady inflation growth. Cereals and products displays more volatility.
  6. COVID-19 Impact (2020-2021): During the pandemic, fuel prices initially dropped due to lower demand, then surged in 2021. Health and housing sectors saw consistent price increases, reflecting inflationary pressures on essential services during this period.

I hope you liked this article on Consumer Price Index Analysis with Python. Feel free to ask valuable questions in the comments section below. You can follow me on Instagram for many more resources.

Aman Kharwal