I'm new to Pandas and am struggling with a dataframe. It has 3 columns: Categorical_data1, Categorical_data2 and Output (2400 rows x 3 columns).

Both categorical columns (the inputs) are strings, and the output depends on the inputs.

Categorical_data1 = ['type1', 'type2', ... , 'type6']
Categorical_data2 = ['rain1', 'rain2', 'rain3', 'rain4']

So there are 24 possible pairs of categories.

I want to plot a heatmap (using seaborn, for instance) of the number of 0s in Output for each pair (Categorical_data1, Categorical_data2). I tried several things using boolean masks.

I tried to figure out how to compute the exact number of 0s for a single pair:

count = ((df['Output'] == 0) & (df(['Categorical_Data1'] == 'type1') & (df(['Categorical_Data2'] == 'rain1')))).sum()

but it failed. The output takes the values 0 or 1, with a large share of 0s (around 1200 out of 2400). My goal is to get something like this heatmap (source: jcdoming; I can't upload images), with Categorical_data1 in place of the months, Categorical_data2 in place of the years, and the number of 0s in the output as the cell values.

Thank you for your help.

MikeWa

1 Answer

Use a seaborn countplot. It shows how many times each category occurs in a column. Pass hue to split the counts by the second column:

import seaborn as sns
sns.countplot(data=dataframe, x='Categorical_Data1', hue='Categorical_Data2')
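
Note that the countplot above counts every row, not only the rows where Output is 0. The attempt in the question also fails because df(...) calls the dataframe like a function; column selection needs square brackets, df['...']. If a heatmap of zero counts per pair is what you are after, something along these lines might work. It is only a sketch: it assumes the columns are named exactly Categorical_Data1, Categorical_Data2 and Output (as in the question's code), the helper names zero_counts and plot_zero_heatmap are made up for illustration, and the small dataframe at the bottom is toy data, not the real 2400 rows.

```python
import pandas as pd


def zero_counts(df: pd.DataFrame) -> pd.DataFrame:
    """Count rows with Output == 0 for each pair of categories."""
    # Keep only the rows whose output is 0
    zeros = df[df["Output"] == 0]
    # Rows of the table: Categorical_Data1, columns: Categorical_Data2
    table = pd.crosstab(zeros["Categorical_Data1"], zeros["Categorical_Data2"])
    # Reindex so that pairs without any 0 still show up, with a count of 0
    return table.reindex(
        index=sorted(df["Categorical_Data1"].unique()),
        columns=sorted(df["Categorical_Data2"].unique()),
        fill_value=0,
    )


def plot_zero_heatmap(df: pd.DataFrame):
    """Draw the zero counts as an annotated seaborn heatmap."""
    # Imported here so the counting part works without the plotting libraries
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.heatmap(zero_counts(df), annot=True, fmt="d")
    ax.set_title("Number of rows with Output == 0")
    plt.show()
    return ax


# Toy data for illustration only
df = pd.DataFrame({
    "Categorical_Data1": ["type1", "type1", "type2", "type2", "type1"],
    "Categorical_Data2": ["rain1", "rain1", "rain1", "rain2", "rain2"],
    "Output": [0, 0, 1, 0, 1],
})

# Corrected single-pair count from the question: square brackets, not df(...)
count = (
    (df["Output"] == 0)
    & (df["Categorical_Data1"] == "type1")
    & (df["Categorical_Data2"] == "rain1")
).sum()

counts = zero_counts(df)
```

Calling plot_zero_heatmap(df) on the real dataframe should then give one cell per (type, rain) pair. The crosstab does the same job as looping over all 24 pairs with the boolean mask, just in one call.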
Nicolas Martinez
Sting_ZW