
Please find my dataset here at:

home_data.csv

I am reading the data file using Pandas. The dataset has several columns, two of which are of interest to me: "price" and "zipcode". I want to draw a boxplot using Pyplot or seaborn with the zip codes on the x-axis and the price on the y-axis. In other words, I want one box-and-whisker per zip code, so that I can see the price distribution for each zip code.

I have been able to plot this. However, the x-axis is too crowded and the zip code labels overlap so that I cannot read them. I have looked through the documentation but could not find an option that makes them easier to read.

GraphLab Create has a nice feature where the zip codes on the x-axis can be made draggable. Is there anything similar in Pyplot or seaborn?

My code is as follows:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%pylab inline

filename = "./home_data.csv"
sales_df = pd.read_csv(filename)

sns.boxplot(x='zipcode',y='price',data=sales_df,linewidth=1,fliersize=2 )
halfer
Kiran Hegde
  • There are a lot of similar questions already. It would be necessary to say to what extent they did not help. Or, if you really want to ask how to make the labels draggable, you should do so explicitly; note, however, that the inline backend creates PNG files, which of course cannot be dragged interactively. – ImportanceOfBeingErnest Mar 28 '18 at 09:08
  • Hello @ImportanceOfBeingErnest. While I agree that there could be a lot of similar questions, for a newbie it is sometimes difficult even to know whether a similar problem has already been addressed. The questions are worded differently, and a newbie like me might read one without realizing at the time that it is the same question. – Kiran Hegde Mar 29 '18 at 05:17
  • I just found out that it's impossible to drag the tick labels freely. It would be possible to drag the x tick labels along the y direction or the y tick labels along the x direction, but even that is rather complicated to implement. If you call yourself a "newbie", you will not be able to do that. The alternative is of course to create some `annotations` yourself at some positions. Those can easily be made draggable. In any case, since this question mainly asks about a crowded axis and not about the problem of dragging labels, it can stay as it is. You may of course ask a new question if you want. – ImportanceOfBeingErnest Mar 29 '18 at 15:12
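
The annotation idea from the last comment could look roughly like the sketch below. It is only an illustration under assumptions: the three zip codes and their positions are made up, and the built-in tick labels are blanked so that the annotations stand in for them. The `Agg` backend is selected only so the snippet runs without a display; dragging works only with an interactive backend (not with inline PNG output), in which case that line would be dropped and `plt.show()` called at the end.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive; remove this line to drag labels in a GUI window
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.set_xlim(0, 4)
ax.set_ylim(0, 1)

positions = [1, 2, 3]
zipcodes = ["98001", "98002", "98003"]  # hypothetical labels, not taken from home_data.csv

ax.set_xticks(positions)
ax.set_xticklabels([])  # hide the built-in tick labels; annotations replace them

draggable_labels = []
for x, zipcode in zip(positions, zipcodes):
    # x is in data coordinates, y is in axes-fraction coordinates (0 = bottom of the axes)
    annotation = ax.annotate(
        zipcode,
        xy=(x, 0),
        xycoords=("data", "axes fraction"),
        xytext=(0, -15),
        textcoords="offset points",
        ha="center",
        va="top",
        rotation=90,
    )
    # draggable() returns a helper object; keep a reference so it stays alive
    draggable_labels.append(annotation.draggable())
```

The helper objects are stored in a list because the dragging behaviour is tied to them; this replaces the tick labels with free-floating text rather than making the real tick labels movable.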

1 Answer


One solution would be to rotate the x-axis labels when you plot. That should help to uncrowd the axis. Since sns.boxplot returns a matplotlib Axes object, you can set the rotation directly on it.

Try

ax = sns.boxplot(x='zipcode',y='price',data=sales_df,linewidth=1,fliersize=2)
_ = ax.set_xticklabels(ax.get_xticklabels(), rotation=-80)

You can play around with the amount of rotation to see what looks best, but I looked at your data and -80 seems to make it easily readable.

I'd also suggest increasing the figure size if necessary. You can play around with the ratio, but this seems to create something decent.

plt.figure(figsize=(20,10))
ax = sns.boxplot(x='zipcode',y='price',data=sales_df,linewidth=1,fliersize=2)
_ = ax.set_xticklabels(ax.get_xticklabels(), rotation=-80)
plt.show()


ALollz
  • Thanks very much for the answer. This helped me to a great extent. Do you also happen to know if there is a way to make the x-axis draggable? Also, other than the official documentation, is there something that makes understanding pyplot and seaborn much easier? – Kiran Hegde Mar 29 '18 at 05:34