Dataset
    Gender  KnownBrands
0   Man     NIVEA MEN;GATSBY;
1   Man     GATSBY;GARNIER MEN;L’OREAL MEN EXPERT;
2   Woman   CLINIQUE FOR MEN;SK-II MEN;Neutrogena MEN;
3   Man     NIVEA MEN;GARNIER MEN;L’OREAL MEN EXPERT;GATSBY;
4   Woman   NIVEA MEN;GATSBY;

I need to split the KnownBrands column into individual brands so that I can draw a count plot showing how often each brand appears.

import matplotlib.pyplot as plt

# Split the brands into one row per brand (the trailing ";" leaves an empty string)
brands = Men["KnownBrands"].str.split(";").explode().astype(object).reset_index()

# Pivot so each brand becomes a column, then drop the empty-string column
output = brands.pivot(index="index", columns="KnownBrands", values="KnownBrands").reset_index(drop=True).drop(columns="")
# count() gives the number of non-null entries in each brand column
ax = output.count().plot.bar()

# Rotate the x-axis labels by 45 degrees to prevent overlapping
plt.xticks(rotation=45, horizontalalignment="right")
plt.xlabel("Brands")
plt.ylabel("Frequency")
plt.title("Brands Known by Respondents")
# Data labels on the bars; bar_label requires matplotlib 3.4.0 or later
plt.bar_label(ax.containers[0])
plt.show()

Plot

1 Answer

Hi, and welcome to Stack Overflow. You mentioned a count plot; seaborn provides this as countplot(), so I am assuming that is what you plan to use. Note that countplot counts the entries it is given, so when it is passed the per-brand totals, the graph shows how many brands have a count of 1, how many have a count of 2, and so on. The updated code is below.

>>df
    Gender  KnownBrands
0   Man     NIVEA MEN;GATSBY;
1   Man     GATSBY;GARNIER MEN;L’OREAL MEN EXPERT;
2   Woman   CLINIQUE FOR MEN;SK-II MEN;Neutrogena MEN;
3   Man     NIVEA MEN;GARNIER MEN;L’OREAL MEN EXPERT;GATSBY;
4   Woman   NIVEA MEN;GATSBY;

brands = df["KnownBrands"].str.split(";").explode().astype(object).reset_index()
output = brands.pivot(index="index", columns="KnownBrands", values="KnownBrands").reset_index(drop=True).drop(columns="")

>>output.count()
KnownBrands
CLINIQUE FOR MEN      1
GARNIER MEN           2
GATSBY                4
L’OREAL MEN EXPERT    2
NIVEA MEN             3
Neutrogena MEN        1
SK-II MEN             1
dtype: int64
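
As a side note, the same per-brand totals can probably be obtained without the pivot, by counting the exploded values directly. This is only a minimal sketch: it rebuilds the sample df from above, and the names `exploded` and `brand_counts` are made up for illustration. The empty strings it filters out come from the trailing ";" in each row.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Gender": ["Man", "Man", "Woman", "Man", "Woman"],
    "KnownBrands": [
        "NIVEA MEN;GATSBY;",
        "GATSBY;GARNIER MEN;L’OREAL MEN EXPERT;",
        "CLINIQUE FOR MEN;SK-II MEN;Neutrogena MEN;",
        "NIVEA MEN;GARNIER MEN;L’OREAL MEN EXPERT;GATSBY;",
        "NIVEA MEN;GATSBY;",
    ],
})

# One row per brand; the trailing ";" produces an empty string per respondent
exploded = df["KnownBrands"].str.split(";").explode()

# Drop the empty strings, then count how often each brand occurs
brand_counts = exploded[exploded != ""].value_counts().sort_index()
print(brand_counts)
```

`brand_counts` is an ordinary Series, so `brand_counts.plot.bar()` should give one bar per brand in the same way as `output.count().plot.bar()`.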

import seaborn as sns
sns.countplot(x=output.count())

Output plot
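
If what you want from countplot is one bar per brand, one option is to give it the long-form (exploded) data rather than the aggregated counts, since countplot does the counting itself. The following is a sketch under assumptions, not a tested recipe for your real data: it rebuilds the sample df, `long_form` is a hypothetical name, and the empty strings left by the trailing ";" are dropped before plotting.

```python
import pandas as pd
import seaborn as sns

# Sample data copied from the question
df = pd.DataFrame({
    "Gender": ["Man", "Man", "Woman", "Man", "Woman"],
    "KnownBrands": [
        "NIVEA MEN;GATSBY;",
        "GATSBY;GARNIER MEN;L’OREAL MEN EXPERT;",
        "CLINIQUE FOR MEN;SK-II MEN;Neutrogena MEN;",
        "NIVEA MEN;GARNIER MEN;L’OREAL MEN EXPERT;GATSBY;",
        "NIVEA MEN;GATSBY;",
    ],
})

# One row per (respondent, brand); remove the empty strings from the trailing ";"
exploded = df["KnownBrands"].str.split(";").explode()
long_form = exploded[exploded != ""].to_frame().reset_index(drop=True)

# countplot counts the rows for each distinct brand and draws one bar per brand
ax = sns.countplot(
    data=long_form,
    x="KnownBrands",
    order=sorted(long_form["KnownBrands"].unique()),
)
ax.tick_params(axis="x", rotation=45)
ax.set_xlabel("Brands")
ax.set_ylabel("Frequency")
```

Call `plt.show()` from `matplotlib.pyplot` afterwards to display the figure when running this as a script.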


Redox
  • Hi @Redox, thanks so much for the help. I tried it but needed to tweak the code a little for the chart plotting: `sns.countplot(data=brandss,x='KnownBrands')`. However, I got the total in the chart as well, and I am not sure how to exclude it. – snoopychui Jun 14 '22 at 08:35
  • Please click on Edit below your question and add the updated code (where you are getting the total) and what the issue is/what you are expecting to change and let me know. Unable to really understand the issue from above – Redox Jun 14 '22 at 08:41
  • After further testing and tweaking of the code, I managed to exclude the total: `output.count().plot.bar()` – snoopychui Jun 14 '22 at 08:52