I am learning to use matplotlib with pandas and am having a little trouble with it. I have a DataFrame whose row labels are districts and whose column labels are coffee shops. Each cell holds the start date of that coffee shop in that district:
        starbucks    cafe-cool    barista      ... (60 shops)
dist1   2008-09-18   2010-05-04   2007-02-21   ...
dist2   2007-06-12   2011-02-17
dist3
.
.
(100 districts)
I want to draw a scatter plot with the dates on the x axis and the coffee shops on the y axis. Since I couldn't figure out a direct one-line way to plot this, I extracted the coffee shops as one list and the dates as another.
shops = list(df.columns.values)
dt = pd.DataFrame(df.ix['dist1'])
dates = dt.set_index('dist1')
First I tried plt.plot(dates, shops) and got "ZeroDivisionError: integer division or modulo by zero". I could not figure out the reason for it. I saw in some posts that the data should be numeric, so I replaced the shop names with numbers, planning to label them with the ytick functions.
y = [1, 2, 3, 4, 5, 6,...60]
Still, plt.plot(dates, y) threw the same ZeroDivisionError. If I could get past this, maybe I would be able to label the axis using the tick functions. Source:
http://matplotlib.org/examples/ticks_and_spines/ticklabels_demo_rotation.html
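For reference, the plot described above (dates on x, numeric positions on y, shop names as tick labels) might look roughly like the sketch below. It is only an illustration under stated assumptions: the small DataFrame is made-up sample data in the shape described earlier, the cells are assumed to be parseable date strings, and `df.loc` is used in place of `df.ix`. It does not reproduce or explain the ZeroDivisionError.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample data (assumption): districts as rows, coffee shops as columns.
df = pd.DataFrame(
    {
        "starbucks": ["2008-09-18", "2007-06-12"],
        "cafe-cool": ["2010-05-04", "2011-02-17"],
        "barista": ["2007-02-21", None],
    },
    index=["dist1", "dist2"],
).apply(pd.to_datetime)  # convert every column to datetime64; None becomes NaT

# One district: a Series indexed by shop name; drop shops without a start date.
row = df.loc["dist1"].dropna()
y = list(range(1, len(row) + 1))  # one numeric y position per shop

fig, ax = plt.subplots()
ax.scatter(row.values, y)            # x: dates, y: numeric positions
ax.set_yticks(y)
ax.set_yticklabels(list(row.index))  # show shop names instead of numbers
fig.autofmt_xdate()                  # rotate the date labels so they do not overlap
# Call plt.show() or fig.savefig(...) here to view the result.
```

The x and y sequences have the same length here because both are derived from the same row after dropping missing dates.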
I am trying to plot the graph for only the first row (dist1). For that I fetched the first row as a DataFrame with df1 = df.ix[1]
and then used the following:
for badges, dates in df.iteritems():
    date = dates
    ax.plot_date(date, yval)
    # Record the number and label of the coffee shop
    label_ticks.append(yval)
    label_list.append(badges)
    yval += 1
I got an error at the line ax.plot_date(date, yval)
saying that x and y should have the same first dimension. Since I am plotting one point at a time for each coffee shop in dist1, shouldn't the length always be one for both x and y? PS: date is a datetime.date object.
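One possible cause, offered as a guess rather than a diagnosis: the loop above iterates over `df` (the whole DataFrame), not over `df1`, so each `dates` would be an entire column with one entry per district while `yval` is a single number. The sketch below illustrates that difference on made-up sample data. It assumes a recent pandas, where `items()` replaces `iteritems()`, and uses `ax.plot` with an `"o"` marker instead of `plot_date`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample data (assumption): districts as rows, coffee shops as columns.
df = pd.DataFrame(
    {
        "starbucks": ["2008-09-18", "2007-06-12"],
        "cafe-cool": ["2010-05-04", "2011-02-17"],
        "barista": ["2007-02-21", None],
    },
    index=["dist1", "dist2"],
).apply(pd.to_datetime)

# Iterating over the DataFrame yields (shop, column) pairs, and every column
# has one entry per district, so its length is len(df), not 1.
lengths = {shop: len(dates) for shop, dates in df.items()}

# Iterating over a single row yields one (shop, date) pair at a time.
row = df.loc["dist1"].dropna()
fig, ax = plt.subplots()
label_ticks, label_list = [], []
for yval, (shop, date) in enumerate(row.items(), start=1):
    ax.plot([date], [yval], "o")  # one x value and one y value per call
    label_ticks.append(yval)      # numeric position of the coffee shop
    label_list.append(shop)       # label of the coffee shop
ax.set_yticks(label_ticks)
ax.set_yticklabels(label_list)
```

With the sample data, `lengths` maps every shop to 2 (the number of districts), whereas the loop over `row` passes exactly one date and one y value to each `ax.plot` call.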