I would like to plot three boxplots, one for each weight_cat value (1, 2 and 3 are the only distinct values it has). The boxplots should show how height depends on weight category (weight_cat).
I have a dataframe like this:
print data.head(5)
         Height    Weight  weight_cat
Index
1      65.78331  112.9925           1
2      71.51521  136.4873           2
3      69.39874  153.0269           3
4      68.21660  142.3354           2
5      67.78781  144.2971           2
The code below ends up eating all of my RAM, which I don't believe is normal:
seaborn.boxplot(x="Height", y="weight_cat", data=data)
What is wrong here? This is the link to the manual. The shape of the dataframe is (25000,4). This is the link to the CSV file.
This is how you can get the same data:
import pandas as pd

data = pd.read_csv('weights_heights.csv', index_col='Index')

def weight_category(weight):
    # 1: below 120, 3: 150 and above, 2: everything in between
    if weight < 120:
        return 1
    if weight >= 150:
        return 3
    return 2

data['weight_cat'] = data['Weight'].apply(weight_category)
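For anyone without the CSV, here is a minimal sketch that builds a synthetic stand-in and counts the distinct values in each column. The means, standard deviations and row count are my assumptions, not the real data; only the column names and the categorisation thresholds come from the code above. The counts may be relevant because a boxplot draws one box per distinct value on whichever axis it treats as categorical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for weights_heights.csv (assumed distributions, not the real data)
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "Height": rng.normal(68, 2, size=1000),
    "Weight": rng.normal(127, 12, size=1000),
})

def weight_category(weight):
    # Same thresholds as in the question
    if weight < 120:
        return 1
    if weight >= 150:
        return 3
    return 2

data["weight_cat"] = data["Weight"].apply(weight_category)

# Number of distinct values per column
unique_counts = data.nunique()
print(unique_counts)
```

On this synthetic data weight_cat has at most three distinct values, while Height has close to one distinct value per row.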