
I have a dataset that looks like this:

Feature A   Feature B   Feature C   labels
0.3         0.5         0.4         0
0.2         0.7         0.1         1
...         ...         ...         ...

Now I want to look at how each feature is distributed within each class. The features are scaled to the range 0 to 1, and every row belongs to one of two classes (label 0 or 1). I want to draw two boxplots per feature column, one showing the distribution for class 0 and the other for class 1. All boxplots should be drawn in one figure.

This means that in the end I have one figure showing six boxplots (Feature A class 1, Feature A class 0, Feature B class 1, ...).

This is what I am currently doing:

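import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

sns.set(rc={'figure.figsize': (14, 20)})
# Melt every column (including labels) to long format and draw one box per column
sns.boxplot(x="variable", y="value", data=pd.melt(df_here), palette='Set2')
plt.show()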

And I am getting results that look like this:

[Image: current boxplot output]

How can I split the boxplot for each feature into two, depending on which class the data belongs to? Thanks for the help!

denisRei

1 Answer


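I think you need to pass the label column as the "hue" parameter. For that to work, the labels column has to survive the melt, for example by passing it as id_vars to pd.melt, so that each melted row still knows which class it belongs to. Check the examples in the seaborn boxplot documentation: https://seaborn.pydata.org/generated/seaborn.boxplot.html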
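A minimal sketch of that idea, assuming the label column is called "labels" as in the question. The helper name plot_feature_boxplots, the names df_example and long_df, and the random data are made up for illustration and are not from the question:

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt


def plot_feature_boxplots(df, label_col="labels"):
    """Draw one boxplot per (feature, class) pair on a single Axes."""
    # Keep the label column as an identifier so it is not melted into the values.
    long_df = pd.melt(df, id_vars=label_col, var_name="feature", value_name="value")
    # Treat the class label as categorical so each class gets a discrete color.
    long_df[label_col] = long_df[label_col].astype("category")

    fig, ax = plt.subplots(figsize=(14, 8))
    # hue splits every feature's box into one box per class.
    sns.boxplot(data=long_df, x="feature", y="value", hue=label_col,
                palette="Set2", ax=ax)
    return long_df, ax


# Hypothetical stand-in data: three features in [0, 1) and a binary label.
rng = np.random.default_rng(0)
df_example = pd.DataFrame(rng.random((100, 3)),
                          columns=["Feature A", "Feature B", "Feature C"])
df_example["labels"] = rng.integers(0, 2, size=100)

long_df, ax = plot_feature_boxplots(df_example)
```

Calling plt.show() afterwards displays the figure. With your own data you would pass df_here instead of df_example; this should give six boxes, two side by side for each feature.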

PawelWL