I was able to plot the data using the code below:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

url = "http://real-chart.finance.yahoo.com/table.csv?s=YHOO&a=03&b=12&c=2006&d=01&e=9&f=2016&g=d&ignore=.csv"

df = pd.read_csv(url)
df.index = df["Date"]
df.sort_index(inplace=True)

df['Adj Close'].plot()
plt.show()

But now I want to calculate the rolling mean of the data and plot that. This is what I've tried:

pd.rolling_mean(df.resample("1D", fill_method="ffill"), window=3, min_periods=1)
plt.plot()

But this gives me the error:

Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex

All I want to do is plot the rolling mean of the data. Why is this happening?

mechanical_meat
template boy
  • Is pandas parsing the dates as dates correctly? It often parses them as strings unless you tell it otherwise. You can use `df.dtypes` to find out – Jezzamon Feb 11 '16 at 00:06
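
To make the check suggested in the comment concrete, here is a minimal sketch that reproduces the problem offline. The CSV rows are made-up stand-ins for the Yahoo download (an assumption, since the original URL may no longer resolve), and `pd.to_datetime` is shown as one possible way to repair a frame that is already loaded.

```python
import io
import pandas as pd

# Made-up rows standing in for the Yahoo CSV (hypothetical values).
csv_text = """Date,Adj Close
2016-02-09,27.0
2016-02-08,27.5
2016-02-05,28.0
"""

df = pd.read_csv(io.StringIO(csv_text))
df.index = df["Date"]            # same step as in the question
df.sort_index(inplace=True)

print(df.dtypes)                 # "Date" is not a datetime64 column here
index_is_datetime = isinstance(df.index, pd.DatetimeIndex)
print(index_is_datetime)

# resample() needs a datetime-like index, so this raises TypeError.
try:
    df.resample("1D").ffill()
    resample_failed = False
except TypeError as exc:
    resample_failed = True
    print(exc)

# One way to repair the already-loaded frame: convert the index.
df.index = pd.to_datetime(df.index)
print(type(df.index).__name__)
```

If the index type printed at the end is `DatetimeIndex`, `resample` should accept the frame.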

2 Answers

Why don't you just use the datareader?

import pandas as pd
import pandas.io.data as web

aapl = web.DataReader("aapl", 'yahoo', '2010-1-1')['Adj Close']
aapl.plot(title='AAPL Adj Close')
pd.rolling_mean(aapl, 50).plot()
pd.rolling_mean(aapl, 200).plot()

[Figure: AAPL adjusted close with 50-day and 200-day rolling means]

To give more control over the plotting:

aapl = web.DataReader("aapl", 'yahoo', '2010-1-1')['Adj Close']
aapl.name = 'Adj Close'
aapl_50ma = pd.rolling_mean(aapl, 50)
aapl_50ma.name = '50 day MA'
aapl_200ma = pd.rolling_mean(aapl, 200)
aapl_200ma.name = '200 day MA'
aapl.plot(title='AAPL', legend=True)
aapl_50ma.plot(legend=True)
aapl_200ma.plot(legend=True)

[Figure: AAPL adjusted close, 50 day MA and 200 day MA, with legend]
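
For readers on a recent pandas, where `pd.rolling_mean` and `pandas.io.data` are no longer available, the same idea can be sketched with the `.rolling()` method. This is only an illustration under stated assumptions: the price series is a synthetic random walk (not real AAPL data), and the names `close`, `ma50`, `ma200` and `lines` are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic random-walk prices on business days (an assumption; not AAPL data).
rng = np.random.default_rng(0)
idx = pd.bdate_range("2010-01-01", periods=500)
close = pd.Series(100 + rng.standard_normal(500).cumsum(),
                  index=idx, name="Adj Close")

# .rolling(window).mean() is the method-style counterpart of pd.rolling_mean.
ma50 = close.rolling(window=50).mean().rename("50 day MA")
ma200 = close.rolling(window=200).mean().rename("200 day MA")

# Putting the series side by side gives one column per line on a plot.
lines = pd.concat([close, ma50, ma200], axis=1)
print(lines.tail())
```

Calling `lines.plot(title="...")` and then `plt.show()` should draw all three lines with a legend taken from the column names. The first 49 and 199 values of the two averages are NaN because the window is not yet full.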

Alexander
  • What are you running it on? IPython Jupyter Console/Notebook, or something else? – Alexander Feb 11 '16 at 01:07
  • Never mind, I got it (I had to do `plt.show()`). Also, how do I put a legend on this chart that tells me what the lines mean? – template boy Feb 11 '16 at 01:08
  • This is a great answer! Here is what I had to use for Pandas 0.22.0 - `aapl.rolling(window=50, center=False).mean()` since `pd.rolling_mean(aapl, 50)` is deprecated. Also, as per `datareader` [documentation](http://pandas-datareader.readthedocs.io/en/latest/remote_data.html#remote-data-access), some other internet source is required since YAHOO finance is now deprecated. – edesz Feb 09 '18 at 05:00

Pandas isn't parsing your dates: by default, read_csv loads them as strings. parse_dates needs to be either True, to parse the index, or a list of column numbers to parse. Also, read_csv can set the index for you through index_col. The following will work:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

url = "http://real-chart.finance.yahoo.com/table.csv?s=YHOO&a=03&b=12&c=2006&d=01&e=9&f=2016&g=d&ignore=.csv"

df = pd.read_csv(url, parse_dates=True, index_col=0)
#df.index = df["Date"]
df.sort_index(inplace=True)

df['Adj Close'].plot()
plt.show()

And then

rm = pd.rolling_mean(df.resample("1D", fill_method="ffill"), window=3, min_periods=1)
rm['Adj Close'].plot()

However, for me this latter bit of code currently plots but also gives an odd error that I need to look into. Note that in jupyter/ipython notebooks with inline plotting, you may occasionally get an error if you don't run the matplotlib/pylab magic before importing matplotlib.
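
On newer pandas versions, `pd.rolling_mean` and the `fill_method` argument of `resample` are gone, so the resample-then-average step might be written with method chaining instead. The sketch below assumes a small made-up business-day frame (`prices` is a hypothetical name and the values are not real quotes) so that it runs without any download.

```python
import numpy as np
import pandas as pd

# Hypothetical business-day prices: Mon 2016-01-04 through Fri 2016-01-15.
idx = pd.bdate_range("2016-01-04", periods=10)
prices = pd.DataFrame({"Adj Close": np.arange(10, dtype=float)}, index=idx)

# Upsample to calendar days; weekends take the last known (Friday) value.
daily = prices.resample("1D").ffill()

# 3-day rolling mean; min_periods=1 avoids NaN at the start of the series.
rm = daily.rolling(window=3, min_periods=1).mean()
print(rm.head())
```

`rm['Adj Close'].plot()` followed by `plt.show()` then draws the smoothed series, as in the snippet above.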

Community
cge
  • After `rm['Adj Close'].plot()` I did `plt.show()` but it returns to me the same plot as the first one... – template boy Feb 11 '16 at 00:17
  • It's plotting a different plot. You're just using a window that's too small to see it. Try setting it to 30 instead of 3, for example. – cge Feb 11 '16 at 00:18
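
The point about window size can be checked numerically rather than by eye. The following is a rough sketch on a synthetic random walk (an assumption; `price`, `gap3` and `gap30` are hypothetical names) that measures how far each rolling mean sits from the raw series on average.

```python
import numpy as np
import pandas as pd

# Synthetic daily random walk standing in for a price series.
rng = np.random.default_rng(1)
idx = pd.date_range("2015-01-01", periods=365, freq="D")
price = pd.Series(50 + rng.standard_normal(365).cumsum(), index=idx)

rm3 = price.rolling(window=3, min_periods=1).mean()
rm30 = price.rolling(window=30, min_periods=1).mean()

# Average absolute distance between the raw series and each rolling mean.
gap3 = (price - rm3).abs().mean()
gap30 = (price - rm30).abs().mean()
print(gap3, gap30)
```

A 3-day mean hugs the raw data closely, which is why the two plots look alike; the 30-day mean deviates more and is easier to tell apart on a chart.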