22

I have a DataFrame with a few time series:

         divida    movav12       var  varmovav12
Date                                            
2004-01       0        NaN       NaN         NaN
2004-02       0        NaN       NaN         NaN
2004-03       0        NaN       NaN         NaN
2004-04      34        NaN       inf         NaN
2004-05      30        NaN -0.117647         NaN
2004-06      44        NaN  0.466667         NaN
2004-07      35        NaN -0.204545         NaN
2004-08      31        NaN -0.114286         NaN
2004-09      30        NaN -0.032258         NaN
2004-10      24        NaN -0.200000         NaN
2004-11      41        NaN  0.708333         NaN
2004-12      29  24.833333 -0.292683         NaN
2005-01      31  27.416667  0.068966    0.104027
2005-02      28  29.750000 -0.096774    0.085106
2005-03      27  32.000000 -0.035714    0.075630
2005-04      30  31.666667  0.111111   -0.010417
2005-05      31  31.750000  0.033333    0.002632
2005-06      39  31.333333  0.258065   -0.013123
2005-07      36  31.416667 -0.076923    0.002660

I want to decompose the first time series divida in a way that I can separate its trend from its seasonal and residual components.

I found an answer here, and am trying to use the following code:

import statsmodels.api as sm

s=sm.tsa.seasonal_decompose(divida.divida)

However I keep getting this error:

Traceback (most recent call last):
  File "/Users/Pred_UnBR_Mod2.py", line 78, in <module>
    s=sm.tsa.seasonal_decompose(divida.divida)
  File "/Library/Python/2.7/site-packages/statsmodels/tsa/seasonal.py", line 58, in seasonal_decompose
    _pandas_wrapper, pfreq = _maybe_get_pandas_wrapper_freq(x)
  File "/Library/Python/2.7/site-packages/statsmodels/tsa/filters/_utils.py", line 46, in _maybe_get_pandas_wrapper_freq
    freq = index.inferred_freq
AttributeError: 'Index' object has no attribute 'inferred_freq'

How can I proceed?

halfer
aabujamra

5 Answers

40

It works once you convert the index to a DatetimeIndex:

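import pandas as pd
import statsmodels.api as sm

df.reset_index(inplace=True)              # move 'Date' out of the index
df['Date'] = pd.to_datetime(df['Date'])   # parse the date strings
df = df.set_index('Date')                 # the index is now a DatetimeIndex
s = sm.tsa.seasonal_decompose(df.divida)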

<statsmodels.tsa.seasonal.DecomposeResult object at 0x110ec3710>

Access the components via:

s.resid
s.seasonal
s.trend
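To see why the conversion matters, here is a minimal sketch that checks the index before and after the same three steps. It uses only the first six divida values from the question and assumes the dates are stored as 'YYYY-MM' strings; the variable names are illustrative, not part of the original answer.

```python
import pandas as pd

# First six rows of the question's divida column, with string labels as the index.
labels = ["2004-01", "2004-02", "2004-03", "2004-04", "2004-05", "2004-06"]
df = pd.DataFrame({"divida": [0, 0, 0, 34, 30, 44]},
                  index=pd.Index(labels, name="Date"))

# A string index is a plain Index, which carries no frequency information.
before_is_datetime = isinstance(df.index, pd.DatetimeIndex)

# Same three steps as above: reset, parse, set the index again.
df = df.reset_index()
df["Date"] = pd.to_datetime(df["Date"], format="%Y-%m")
df = df.set_index("Date")

after_is_datetime = isinstance(df.index, pd.DatetimeIndex)
inferred = df.index.inferred_freq  # frequency pandas infers from the timestamps

print(before_is_datetime, after_is_datetime, inferred)
```

Once the index is a DatetimeIndex with a regular spacing, inferred_freq is available, which is the attribute the traceback in the question was missing.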
Stefan
  • 2
    Quick question: how do I access that result? I'm only getting the – aabujamra Dec 24 '15 at 22:42
  • Thank you @Stefan, saved my life! – Amy21 Sep 05 '17 at 16:13
  • Hi, when I try this code I get the following error: `AttributeError: 'RangeIndex' object has no attribute 'inferred_freq'` Any suggestion ?? – Leevo Jan 31 '19 at 09:35
  • The error says your Index is of type `RangeIndex` when it should be `DateTimeIndex` (see what happens to the `Date` column in the example). – Stefan Jan 31 '19 at 14:01
  • @Leevo you have to assign a frequency, for example you can resample your data: https://stackoverflow.com/questions/17001389/pandas-resample-documentation – PV8 Jun 12 '19 at 10:59
2

statsmodels can decompose the series only if the index carries a frequency. A time series index usually has one (daily, business-daily, weekly, and so on); here it does not, hence the error. There are two ways to fix it:

  1. Do what Stefan did and pass the index column to the pandas to_datetime function. The resulting index uses the internal function infer_freq to find the frequency.
  2. Otherwise, set the frequency on the index yourself with df.index.asfreq(freq='m'), where m stands for month. You can set the frequency if you have domain knowledge or by d.
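The second option can be sketched as follows. This is an assumption-laden illustration, not the answerer's exact code: it uses DataFrame.asfreq rather than a method on the index, the data is hypothetical (one month is deliberately missing), and "MS" (month start) is chosen because the timestamps fall on the first of each month.

```python
import pandas as pd

# Hypothetical monthly data with March missing, so no frequency can be inferred.
idx = pd.to_datetime(["2004-01-01", "2004-02-01", "2004-04-01", "2004-05-01"])
df = pd.DataFrame({"divida": [0, 0, 34, 30]}, index=idx)
no_freq = df.index.inferred_freq  # None, because the spacing is irregular

# DataFrame.asfreq reindexes onto a regular month-start ("MS") grid;
# the missing month shows up as a NaN row.
regular = df.asfreq("MS")

print(no_freq, regular.index.freqstr, len(regular))
```

The NaN rows introduced this way would likely still need to be filled (for example by interpolation) before decomposing.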
roschach
  • Thank you, the old problem is solved. But now it says: `ValueError: cannot insert level_0, already exists`. Any suggestion ?? – Leevo Jan 31 '19 at 09:37
  • Give some more detailed description of your problem. The code with traceback error will help to solve the problem – saravanan saminathan Feb 01 '19 at 11:53
0

Keep it simple and follow three steps:

1. If it isn't already, put the date column in yyyy-mm-dd or dd-mm-yyyy format (for example in Excel).
2. Convert it to a datetime type with pandas: df['Date'] = pd.to_datetime(df['Date'])
3. Decompose it:

from statsmodels.tsa.seasonal import seasonal_decompose
decomposition=seasonal_decompose(ts_log)

And finally: [image not preserved]

Nauman Naeem
Reeves
0

It depends on the index type: you can have a DatetimeIndex or a PeriodIndex. Stefan showed the DatetimeIndex case; here is an example for PeriodIndex. My original DataFrame has a MultiIndex with the year in the first level and the month in the second. This is how I convert it to a PeriodIndex:

df["date"] = pd.PeriodIndex(df.index.map(lambda x: "{0}{1:02d}".format(*x)), freq="M")
df = df.set_index("date")

Now it is ready to be used by seasonal_decompose.
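A self-contained sketch of the same idea, under stated assumptions: the (year, month) frame below is hypothetical, the labels are built as "YYYY-MM" strings for readability, and the final to_timestamp step is optional, shown only in case a DatetimeIndex is needed downstream.

```python
import pandas as pd

# Hypothetical frame indexed by (year, month), as described above.
mi = pd.MultiIndex.from_tuples([(2004, 1), (2004, 2), (2004, 3)],
                               names=["year", "month"])
df = pd.DataFrame({"divida": [0, 0, 0]}, index=mi)

# Build "YYYY-MM" labels from the tuples and parse them as monthly periods.
df.index = pd.PeriodIndex([f"{y}-{m:02d}" for y, m in df.index],
                          freq="M", name="date")

# Optional: convert the periods to month-start timestamps.
df_ts = df.to_timestamp()

print(df.index[0], df_ts.index[0])
```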

Matt Najarian
0

Try parsing the date column with parse_dates and setting it as the index with index_col when you read the file.

import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# airline is the path to the CSV file; column 0 holds the dates
data = pd.read_csv(airline, header=0, index_col=[0], parse_dates=[0]).squeeze("columns")
res = seasonal_decompose(data)
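To check what read_csv produces without needing the file, here is a small sketch that reads made-up CSV text from memory. The column names Month and Passengers and the values are hypothetical stand-ins, not the answerer's data.

```python
import io
import pandas as pd

# Made-up CSV text standing in for the file; column names are hypothetical.
csv_text = "Month,Passengers\n2004-01-01,10\n2004-02-01,12\n2004-03-01,11\n"

# index_col=0 makes the first column the index; parse_dates=True parses that index.
data = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)
series = data["Passengers"]

print(type(series.index).__name__, series.index.inferred_freq)
```

If the index comes back as a DatetimeIndex with a usable inferred frequency, the series should be in the shape the earlier answers describe.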
avin