I'm having difficulty understanding how to implement Panel OLS in pandas. I have received help on this topic and thought I understood the situation, but now that I am trying to implement it I am stuck. Below is my data:

import pandas as pd

url='https://raw.githubusercontent.com/108michael/ms_thesis/master/crsp.dime.mpl.df.1'

# read only the columns needed for the regression
df=pd.read_csv(url, usecols=['date', 'cid', 'log_diff_rgdp', 'billsum_support',
    'years_exp', 'leg_totalbills', 'unemployment', 'expendituresfor',
    'direct_expenditures', 'indirect_expenditures', 'Republican', 'sen'])
df.head(1)

         cid  date  log_diff_rgdp  unemployment  leg_totalbills  years_exp  Republican  sen  billsum_support  expendituresfor  direct_expenditures  indirect_expenditures
0  N00013870  2007       0.026069           4.6              44          5         1.0  1.0              1.0              4.0                  4.0                    0.0


df=df.T.to_panel()

df=df.transpose(2,0,1)

df

<class 'pandas.core.panel.Panel'>
Dimensions: 505 (items) x 10 (major_axis) x 72 (minor_axis)
Items axis: N00000010 to N00035686
Major_axis axis: 2005 to 2014
Minor_axis axis: index to indirect_expenditures

It is my understanding (though I could be wrong about this) that the Items axis contains all of the panels, that the Minor_axis contains all of the columns in each panel, and that the Major_axis is the time index. I have posted the first row of my data as it looked before I converted it to a Panel; billsum_support is the fourth column from the end. But when I try to regress with billsum_support as the Y variable, I get the following error:

reg=PanelOLS(y=df['billsum_support'],x=df[['years_exp', 'unemployment', 'dir_ind_expendituresfor']],time_effects=True)
reg
KeyError                                  Traceback (most recent call last)
/home/jayaramdas/anaconda3/lib/python3.5/site-packages/pandas/indexes/base.py in get_loc(self, key, method, tolerance)
   1875             try:
-> 1876                 return self._engine.get_loc(key)
   1877             except KeyError:

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4027)()

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:3891)()

pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12408)()

pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12359)()

KeyError: 'billsum_support'

I have seen the working example here, but that person seems to have their data in stacked format instead of a Panel. Does anyone with experience using PanelOLS understand what I am doing wrong here?

1 Answer

I got it. Following up on ptrj and doing some simple exploring, I found the solution and will post it in the question:

import numpy as np

# average each variable per (date, cid); missing combinations are filled with 0
df=df.pivot_table(index='date',columns='cid', fill_value=0,aggfunc=np.mean)

df=df.T.to_panel()

df=df.transpose(2,1,0)

df=df.to_frame()
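
For what it's worth, the KeyError in the question seems to come from the fact that indexing a Panel with df['...'] looks the label up on the Items axis, which holds the cid values here, while billsum_support sits on the Minor_axis. Panel was deprecated in pandas 0.20 and removed in later releases, so on a current install the steps above will not run. A minimal sketch of the stacked layout built directly with a MultiIndex is below. The sample rows are made up for illustration (only the column names come from the question), and I am assuming the entity is cid and the time index is date. As far as I know, third-party estimators such as the PanelOLS in the linearmodels package expect this entity/time layout, but check that package's documentation before relying on it.

```python
import pandas as pd

# Made-up rows in the same long layout as the CSV; values are illustrative only.
raw = pd.DataFrame({
    "cid": ["A", "A", "B", "B"],
    "date": [2007, 2008, 2007, 2008],
    "billsum_support": [1.0, 0.0, 2.0, 1.0],
    "years_exp": [5, 6, 3, 4],
    "unemployment": [4.6, 5.8, 4.6, 5.8],
})

# One row per (entity, year); the variables stay as ordinary columns.
stacked = raw.set_index(["cid", "date"]).sort_index()

# Selecting by variable name now works, because the names are column labels.
y = stacked["billsum_support"]
x = stacked[["years_exp", "unemployment"]]

print(stacked)
```

With the data in this shape, y and x line up row by row on the (cid, date) index, which is what a stacked-format panel regression needs.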