Flipkart Reviews Sentiment Analysis using Python

Flipkart is one of the most popular Indian companies. It is an e-commerce platform that competes with platforms like Amazon. One of the most popular use cases of data science is sentiment analysis of the reviews that customers leave for products sold on e-commerce platforms. So, if you want to learn how to analyze the sentiment of Flipkart reviews, this article is for you. In this article, I will walk you through the task of Flipkart reviews sentiment analysis using Python.


The dataset I am using here for Flipkart reviews sentiment analysis was downloaded from Kaggle. Let’s start this task by importing the necessary Python libraries and the dataset:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator

data = pd.read_csv("https://raw.githubusercontent.com/amankharwal/Website-data/master/flipkart_reviews.csv")
print(data.head())

                                        Product_name  ... Rating
0  Lenovo Ideapad Gaming 3 Ryzen 5 Hexa Core 5600...  ...      5
1  Lenovo Ideapad Gaming 3 Ryzen 5 Hexa Core 5600...  ...      5
2  Lenovo Ideapad Gaming 3 Ryzen 5 Hexa Core 5600...  ...      5
3  DELL Inspiron Athlon Dual Core 3050U - (4 GB/2...  ...      5
4  DELL Inspiron Athlon Dual Core 3050U - (4 GB/2...  ...      5

[5 rows x 3 columns]

This dataset contains only three columns. Let’s check whether any of these columns contain missing values:

print(data.isnull().sum())

Product_name    0
Review          0
Rating          0
dtype: int64

So the dataset does not have any null values. As this is the task of sentiment analysis of Flipkart reviews, I will clean and prepare the column containing reviews before heading to sentiment analysis:

import re
import string

import nltk
from nltk.corpus import stopwords

nltk.download('stopwords')
stemmer = nltk.SnowballStemmer("english")
stopword = set(stopwords.words('english'))

def clean(text):
    text = str(text).lower()
    # Raw strings keep the regex backslashes from being read as string escapes
    text = re.sub(r'\[.*?\]', '', text)                # text in square brackets
    text = re.sub(r'https?://\S+|www\.\S+', '', text)  # URLs
    text = re.sub(r'<.*?>+', '', text)                 # HTML tags
    text = re.sub('[%s]' % re.escape(string.punctuation), '', text)  # punctuation
    text = re.sub(r'\n', '', text)                     # newlines
    text = re.sub(r'\w*\d\w*', '', text)               # words containing digits
    # Remove stopwords, then reduce each remaining word to its stem
    text = [word for word in text.split(' ') if word not in stopword]
    text = " ".join(text)
    text = [stemmer.stem(word) for word in text.split(' ')]
    text = " ".join(text)
    return text

data["Review"] = data["Review"].apply(clean)
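To see what the regular-expression steps in the cleaning function do, here is a minimal sketch that applies only those steps to a made-up review. The helper name `strip_noise` and the sample text are assumptions for illustration; stopword removal and stemming are left out so that the snippet runs without any NLTK downloads.

```python
import re
import string

def strip_noise(text):
    """Apply only the regex-based cleaning steps (no stopwords, no stemming)."""
    text = str(text).lower()
    text = re.sub(r'\[.*?\]', '', text)                # text in square brackets
    text = re.sub(r'https?://\S+|www\.\S+', '', text)  # URLs
    text = re.sub(r'<.*?>+', '', text)                 # HTML tags
    text = re.sub('[%s]' % re.escape(string.punctuation), '', text)  # punctuation
    text = re.sub(r'\n', '', text)                     # newlines
    text = re.sub(r'\w*\d\w*', '', text)               # words containing digits
    return text

# Hypothetical review text, not taken from the dataset
sample = "Great laptop!! Got 8GB RAM, see https://example.com <br> [edited]"

# split() with no argument discards the extra spaces left by the removals
print(strip_noise(sample).split())
```

Each removal leaves its surrounding spaces in place, so the cleaned text can contain runs of spaces. Splitting on a single space, as the cleaning function above does, keeps those gaps, which is why double spaces show up in some of the cleaned reviews printed later.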

Sentiment Analysis of Flipkart Reviews

The Rating column of the data contains the rating given by each reviewer. So let’s have a look at how people rate the products they buy from Flipkart:

ratings = data["Rating"].value_counts()
numbers = ratings.index
quantity = ratings.values

import plotly.express as px
figure = px.pie(data, 
             values=quantity, 
             names=numbers,hole = 0.5)
figure.show()
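The share of each rating can also be computed as a number instead of being read off the chart. The sketch below uses only the standard library and a short made-up list of ratings; the function name `rating_shares` and the sample values are assumptions for illustration, not part of the dataset.

```python
from collections import Counter

def rating_shares(ratings):
    """Return {rating: percentage of reviews}, highest rating first."""
    counts = Counter(ratings)
    total = len(ratings)
    return {rating: 100 * count / total
            for rating, count in sorted(counts.items(), reverse=True)}

# Hypothetical ratings, not taken from the Flipkart dataset
sample_ratings = [5, 5, 4, 5, 3, 1, 5, 4, 5, 5]
shares = rating_shares(sample_ratings)

for rating, share in shares.items():
    print(f"{rating} stars: {share:.1f}%")
```

With the DataFrame loaded earlier, `data["Rating"].value_counts(normalize=True)` returns the same shares as fractions between 0 and 1.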

So 60% of the reviewers have given 5 out of 5 ratings to the products they buy from Flipkart. Now let’s have a look at the kind of reviews people leave. For this, I will use a word cloud to visualize the most used words in the reviews column:

text = " ".join(i for i in data.Review)
# Use a separate name so the nltk stopwords import is not overwritten
wc_stopwords = set(STOPWORDS)
wordcloud = WordCloud(stopwords=wc_stopwords, 
                      background_color="white").generate(text)
plt.figure( figsize=(15,10))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
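A word cloud shows the most used words by size. If you want the actual counts, a plain frequency table is a useful companion. This is a minimal sketch using only the standard library; `top_words` and the sample reviews are assumptions for illustration.

```python
from collections import Counter

def top_words(reviews, n=3, ignore=frozenset()):
    """Return the n most frequent words across all reviews as (word, count) pairs."""
    words = [word
             for review in reviews
             for word in review.split()
             if word not in ignore]
    return Counter(words).most_common(n)

# Hypothetical cleaned reviews, not taken from the dataset
sample_reviews = ["good laptop good batteri", "best laptop", "good perform"]

print(top_words(sample_reviews, n=2))
```

On the real data, the same function could be called as `top_words(data["Review"], n=20, ignore=set(STOPWORDS))` to list the words that dominate the word cloud.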

Now I will analyze the sentiments of Flipkart reviews by calculating the sentiment scores of the reviews and adding them to the dataset as three new columns: Positive, Negative, and Neutral:

nltk.download('vader_lexicon')
sentiments = SentimentIntensityAnalyzer()
# Score each review once and reuse the result for all three columns
scores = [sentiments.polarity_scores(i) for i in data["Review"]]
data["Positive"] = [s["pos"] for s in scores]
data["Negative"] = [s["neg"] for s in scores]
data["Neutral"] = [s["neu"] for s in scores]
data = data[["Review", "Positive", "Negative", "Neutral"]]
print(data.head())

                                              Review  ...  Neutral
0  best  great performancei got around  backup bi...  ...    0.504
1                                        good perfom  ...    0.256
2  great perform usual also game laptop issu batt...  ...    0.723
3                        wife happi best product 👌🏻😘  ...    0.488
4  light weight laptop new amaz featur batteri li...  ...    1.000

[5 rows x 4 columns]
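Besides the three scores used above, `polarity_scores` also returns a `compound` value between -1 and 1 that summarises the whole text. The sketch below turns that value into one label per review. It is not the approach used in this article, and the 0.05 threshold is the commonly quoted convention for VADER, so treat it as an assumption. The function name `label_from_compound` and the example dictionaries are made up for illustration.

```python
def label_from_compound(scores, threshold=0.05):
    """Map a VADER-style score dictionary to a single sentiment label."""
    compound = scores["compound"]
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"

# Hypothetical score dictionaries in the shape returned by polarity_scores
examples = [
    {"neg": 0.0, "neu": 0.488, "pos": 0.512, "compound": 0.84},
    {"neg": 0.6, "neu": 0.4, "pos": 0.0, "compound": -0.57},
    {"neg": 0.0, "neu": 1.0, "pos": 0.0, "compound": 0.0},
]

labels = [label_from_compound(s) for s in examples]
print(labels)
```

Applied to every review, labels like these can be counted directly, which answers the question of how many reviews are positive, negative, or neutral.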

Now let’s see what most of the reviewers think about the products and services of Flipkart:

x = sum(data["Positive"])
y = sum(data["Negative"])
z = sum(data["Neutral"])

def sentiment_score(a, b, c):
    if (a>b) and (a>c):
        print("Positive 😊 ")
    elif (b>a) and (b>c):
        print("Negative 😠 ")
    else:
        print("Neutral 🙂 ")
sentiment_score(x, y, z)

Neutral 🙂

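So the summed Neutral score is the largest. Keep in mind that the Positive, Negative, and Neutral scores describe what proportion of each review’s text falls into each category, so this result means that most of the wording in the reviews carries no strong sentiment. It does not mean that most reviewers are indifferent. Let’s have a look at the total of Positive, Negative, and Neutral sentiment scores to find a conclusion about Flipkart reviews: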

print("Positive: ", x)
print("Negative: ", y)
print("Neutral: ", z)

Positive:  923.5529999999985
Negative:  96.77500000000013
Neutral:  1283.6880000000006
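The totals are easier to compare as shares of the overall sum. The sketch below uses the three totals printed above, rounded to three decimals; the function name `score_shares` is an assumption for illustration.

```python
def score_shares(totals):
    """Convert summed sentiment scores into percentages of their overall sum."""
    grand_total = sum(totals.values())
    return {name: 100 * value / grand_total for name, value in totals.items()}

# Totals copied from the output above, rounded to three decimals
totals = {"Positive": 923.553, "Negative": 96.775, "Neutral": 1283.688}
shares = score_shares(totals)

for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

This gives roughly 40% positive, 4% negative, and 56% neutral. The positive total is more than nine times the negative total, which is the comparison that matters most for judging satisfaction.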

Conclusion

So, the Neutral total is the largest, and the Negative total (about 97) is small compared with the Positive total (about 924). So we can say that most people are satisfied with Flipkart products and services. I hope you liked this article on Flipkart sentiment analysis using Python. Feel free to ask valuable questions in the comments section below.