I am following this section. I realize the code was written for Python 2, but in their chart the xticks on the 'Start Date' axis show dates and in mine they do not. My chart only shows the axis label 'Start Date', with no dates.

graph

# Set as_index=False to keep the 0,1,2,... index. Then we'll take the mean of the polls on that day.
poll_df = poll_df.groupby(['Start Date'],as_index=False).mean()

# Let's go ahead and see what this looks like
poll_df.head()
   Start Date  Number of Observations  Obama  Romney  Undecided  Difference
0  2009-03-13                    1403     44      44         12        0.00
1  2009-04-17                     686     50      39         11        0.11
2  2009-05-14                    1000     53      35         12        0.18
3  2009-06-12                     638     48      40         12        0.08
4  2009-07-15                     577     49      40         11        0.09
Great! Now plotting the Difference versus time should be straightforward.

# Plotting the difference in polls between Obama and Romney
fig = poll_df.plot('Start Date','Difference',figsize=(12,4),marker='o',linestyle='-',color='purple')

https://nbviewer.jupyter.org/github/jmportilla/Udemy-notes/blob/master/Data%20Project%20-%20Election%20Analysis.ipynb

  • You may need to download the graph to see it properly as it has no background. – Ags911 Mar 12 '19 at 22:14
  • I think you forgot to convert the start date column to actual dates. – ImportanceOfBeingErnest Mar 13 '19 at 09:21
  • @ImportanceOfBeingErnest Thanks for your reply but how would I do that in the context of this question and why didn't they have to convert the date in the example? Thanks. – Ags911 Mar 13 '19 at 22:02
  • [How](https://stackoverflow.com/questions/17134716/convert-dataframe-column-type-from-string-to-datetime). As to why, I have no idea, possibly some older versions did behave differently. – ImportanceOfBeingErnest Mar 13 '19 at 22:18
  • Thanks for your reply again. I tried your suggestion, but now it breaks the graph below it. It even affects this person's project of the same type: https://www.kaggle.com/kadser/analysis-of-romney-vs-obama/notebook. The dates come back, but changing 'Start Date' to datetime64[ns] breaks that graph completely. I am using Jupyter Notebook, btw. – Ags911 Mar 15 '19 at 20:21
  • Here:
    poll_df.plot('Start Date','Difference', figsize=(12,4), marker='o', linestyle='-',color='purple', xlim=(329,356))
    #debate oct 3rd 329 corresponds to October 1st
    plt.axvline(x=329+2,linewidth=4,color='blue')
    #debate oct 11th
    plt.axvline(x=329+10,linewidth=4,color='blue')
    #debate oct 22nd
    plt.axvline(x=329+21,linewidth=4,color='blue')
    – Ags911 Mar 15 '19 at 20:22
  • I suppose that other notebook is then also produced by the older pandas version. – ImportanceOfBeingErnest Mar 15 '19 at 20:34
  • I have pandas 0.23.4 and numpy 1.15.4. Python 3. The notebook was also created by Python 3 but the original I found was using Python 2. Do you have a fix? – Ags911 Mar 16 '19 at 02:31
  • The important bit is that the column needs to be datetimes (i.e. of type `datetime64`). I do not know what breaks when you do that, but best do not set the `xlim` at all. Then you can filter the data itself into the range you want to plot. – ImportanceOfBeingErnest Mar 16 '19 at 12:18
  • @ImportanceOfBeingErnest Thanks, how do I do that? – Ags911 Mar 17 '19 at 10:03
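
A minimal sketch of what the last comments suggest, not a tested fix for the notebook itself: convert 'Start Date' to datetimes, filter the rows to the range of interest instead of passing `xlim`, and give `axvline` a date rather than a row position. It assumes a recent pandas and matplotlib. The frame below only holds the five rows shown by `poll_df.head()` in the question, and the window and marker dates are placeholders chosen to fit those rows.

```python
import pandas as pd
import matplotlib.pyplot as plt

# The five rows from poll_df.head() in the question; 'Start Date' starts out as strings.
poll_df = pd.DataFrame({
    "Start Date": ["2009-03-13", "2009-04-17", "2009-05-14", "2009-06-12", "2009-07-15"],
    "Difference": [0.00, 0.11, 0.18, 0.08, 0.09],
})

# Convert the strings to real datetimes (a datetime64 dtype).
poll_df["Start Date"] = pd.to_datetime(poll_df["Start Date"])

# Filter the data itself instead of setting xlim on the plot.
# Placeholder window, picked only so that it matches the sample rows.
start, end = pd.Timestamp("2009-04-01"), pd.Timestamp("2009-06-30")
in_window = (poll_df["Start Date"] >= start) & (poll_df["Start Date"] <= end)
window = poll_df[in_window]

ax = window.plot(x="Start Date", y="Difference", figsize=(12, 4),
                 marker="o", linestyle="-", color="purple")

# The x axis now holds dates, so the marker is a date, not a row number like 329+2.
# Placeholder date inside the window above.
ax.axvline(x=pd.Timestamp("2009-05-01"), linewidth=4, color="blue")
```

For the debate plot in the question, the window would presumably cover October 2012, and the three markers would be the dates October 3rd, 11th and 22nd rather than 329+2, 329+10 and 329+21.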

0 Answers