I have a data frame that looks like this:

        Name                            Permits_13  Score_13  Permits_14  Score_14  Permits_15  Score_15
    0   P.S. 015 ROBERTO CLEMENTE       12.0        284       22          279       32          283
    1   P.S. 019 ASHER LEVY             18.0        296       51          301       55          308
    2   P.S. 020 ANNA SILVER            9.0         294       9           290       10          293
    3   P.S. 034 FRANKLIN D. ROOSEVELT  3.0         294       4           292       1           296
    4   P.S. 064 ROBERT SIMON           3.0         287       15          288       17          291
    5   P.S. 110 FLORENCE NIGHTINGALE   0.0         313       3           306       4           308
    6   P.S. 134 HENRIETTA SZOLD        4.0         290       12          292       17          288
    7   P.S. 137 JOHN L. BERNSTEIN      4.0         276       12          273       17          274
    8   P.S. 140 NATHAN STRAUS          13.0        282       37          284       59          284
    9   P.S. 142 AMALIA CASTRO          7.0         290       15          285       25          284
    10  P.S. 184M SHUANG WEN            5.0         327       12          327       9           327

I would like to transform it into a panel data structure, as in the answer to the question "Fixed effect in Pandas or Statsmodels", so that I can use PanelOLS with fixed effects.

My first attempt was to do this transformation:

df1 = df.ix[:,['Permits_13', 'Score_13']].T
df2 = df.ix[:,['Permits_14', 'Score_14']].T
df3 = df.ix[:,['Permits_15', 'Score_15']].T
pf = pandas.Panel({'df1':df1,'df2':df2,'df3':df3})

However, this doesn't seem to be the correct approach, since the result carries no information about time. Columns ending in 13, 14 and 15 hold the observations for 2013, 2014 and 2015, respectively.

Do I have to create a data frame for each one of the rows in the original data?

This is my first time using pandas, and any help would be appreciated.

pceccon

1 Answer

The docstring of DataFrame.to_panel() says:

Transform long (stacked) format (DataFrame) into wide (3D, Panel) format.

Currently the index of the DataFrame must be a 2-level MultiIndex. This may be generalized later

So that means you need to do:

  1. Stack your dataframe (it's currently "wide", not "long")
  2. Pick two columns that together uniquely identify each row of your dataframe
  3. Set those columns as your index
  4. Call to_panel()

So that's:

df.stack().set_index(['first_col', 'other_col']).to_panel()
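
For readers on a recent pandas, where Panel and to_panel() no longer exist (they were removed in pandas 0.25), here is a minimal sketch of the same wide-to-long reshape. It swaps in pd.wide_to_long for the stack/to_panel chain above, so treat it as one possible approach and not as the method described in this answer. The sample rows are copied from the question; the index level name "year" and the 13 -> 2013 mapping are assumptions based on the question's description of the column suffixes.

```python
import pandas as pd

# Three rows copied from the question's table.
df = pd.DataFrame({
    "Name": ["P.S. 015 ROBERTO CLEMENTE", "P.S. 019 ASHER LEVY", "P.S. 020 ANNA SILVER"],
    "Permits_13": [12.0, 18.0, 9.0],
    "Score_13": [284, 296, 294],
    "Permits_14": [22, 51, 9],
    "Score_14": [279, 301, 290],
    "Permits_15": [32, 55, 10],
    "Score_15": [283, 308, 293],
})

# Wide -> long: "Permits_13" is split into the stub "Permits" and the suffix 13,
# and the suffix is stored in a new column named "year" (a name chosen here).
long_df = pd.wide_to_long(
    df, stubnames=["Permits", "Score"], i="Name", j="year", sep="_"
).reset_index()

# Assumption from the question: suffix 13 means 2013, 14 means 2014, and so on.
long_df["year"] = 2000 + long_df["year"].astype(int)

# Two-level (entity, time) index: one row per school per year.
long_df = long_df.set_index(["Name", "year"]).sort_index()
print(long_df)
```

The result has one row per school and year, with Permits and Score as ordinary columns. A two-level (entity, time) MultiIndex like this is the same long layout the to_panel() docstring refers to, and to my knowledge it is also what current panel estimators such as PanelOLS in the linearmodels package expect as input.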
Paul H