
I have a package that uses pandas Panels to generate MultiIndex pandas DataFrames. However, whenever I use pandas.Panel, I get the following DeprecationWarning:

DeprecationWarning: Panel is deprecated and will be removed in a future version. The recommended way to represent these types of 3-dimensional data are with a MultiIndex on a DataFrame, via the Panel.to_frame() method. Alternatively, you can use the xarray package http://xarray.pydata.org/en/stable/. Pandas provides a .to_xarray() method to help automate this conversion.

However, I don't understand what the first recommendation actually suggests I do to create MultiIndex DataFrames. If Panel is going to be removed, how will I be able to use Panel.to_frame?


To clarify: I am not asking what deprecation is, or how to convert my Panels to DataFrames. What I am asking is this: if a library uses pandas.Panel and then pandas.Panel.to_frame to create MultiIndex DataFrames from 3D ndarrays, and Panel is going to be removed, what is the best way to build those DataFrames without using the Panel API?

E.g., if I'm doing the following, with X an ndarray of shape (N, J, K):

p = pd.Panel(X, items=item_names, major_axis=names0, minor_axis=names1)
df = p.to_frame()

this is clearly no longer a viable future-proof option for DataFrame construction, though it was the recommended method in this question.

denfromufa
cge
  • If you `Panel.to_frame()` now, you will no longer have any panel data? – Stephen Rauch Jan 28 '18 at 01:23
    This is for a package, not a specific dataset. When Panel is deprecated, then the initial Panel generation isn't going to work. – cge Jan 28 '18 at 01:53
  • I'm asking why the recommended alternative here is to use a method that is going to be deprecated, though it appears I now have my answer from the comments here. – cge Jan 28 '18 at 01:59
  • Are you building the API that uses panel? Or are you calling some code that returns you a panel? – cs95 Jan 28 '18 at 02:05
  • I've clarified this in the question, sorry - I am both generating the Panel from a 3D ndarray and then converting it to DataFrame. – cge Jan 28 '18 at 02:07

1 Answer


Consider the following panel:

import numpy as np
import pandas as pd

# Axes of the 3D array: (items, major_axis, minor_axis)
data = np.random.randint(1, 10, (5, 3, 2))
pnl = pd.Panel(
    data,
    items=['item {}'.format(i) for i in range(1, 6)],
    major_axis=[2015, 2016, 2017],
    minor_axis=['US', 'UK']
)

If you convert this to a DataFrame with pnl.to_frame(), it becomes:

             item 1  item 2  item 3  item 4  item 5
major minor                                        
2015  US          9       6       3       2       5
      UK          8       3       7       7       9
2016  US          7       7       8       7       5
      UK          9       1       9       9       1
2017  US          1       8       1       3       1
      UK          6       8       8       1       6

So to_frame() uses the major and minor axes as the row MultiIndex and the items as columns. The shape goes from (5, 3, 2) to (6, 5). Which dimensions you put in the MultiIndex is up to you, but if you want exactly the same layout, you can do the following:

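# Flatten (major, minor) into one axis of length 3 * 2 = 6,
# then transpose so that the 5 items become the columns.
data = data.reshape(5, 6).T
df = pd.DataFrame(
    data=data,
    index=pd.MultiIndex.from_product([[2015, 2016, 2017], ['US', 'UK']]),
    columns=['item {}'.format(i) for i in range(1, 6)]
)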

which yields the same DataFrame (use the names parameter of pd.MultiIndex.from_product if you want to name the index levels):

         item 1  item 2  item 3  item 4  item 5
2015 US       9       6       3       2       5
     UK       8       3       7       7       9
2016 US       7       7       8       7       5
     UK       9       1       9       9       1
2017 US       1       8       1       3       1
     UK       6       8       8       1       6
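If you also want the `major`/`minor` header that to_frame() printed, the names parameter mentioned above can be used roughly like this. This is a minimal sketch that rebuilds the same kind of random array as in the example; the level names 'major' and 'minor' are simply chosen to mirror the Panel output.

```python
import numpy as np
import pandas as pd

# Same layout as the example above: (items, major_axis, minor_axis)
data = np.random.randint(1, 10, (5, 3, 2))

# Name the two index levels so they match the old to_frame() header.
index = pd.MultiIndex.from_product(
    [[2015, 2016, 2017], ['US', 'UK']],
    names=['major', 'minor'],
)
df = pd.DataFrame(
    data.reshape(5, 6).T,
    index=index,
    columns=['item {}'.format(i) for i in range(1, 6)],
)
print(df.index.names)
```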

Now instead of pnl['item 1'], you use df['item 1'] (optionally df['item 1'].unstack()); instead of pnl.xs(2015) you use df.xs(2015); and instead of pnl.xs('US', axis='minor'), you use df.xs('US', level=1).
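Here is a small sketch of those three selections on the DataFrame side only (no Panel needed). It uses np.arange instead of random data so the values are easy to trace back to the 3D array; the variable names item1, year_2015 and us are just illustrative.

```python
import numpy as np
import pandas as pd

# data[i, j, k] = 6*i + 2*j + k, axes are (items, major, minor)
data = np.arange(30).reshape(5, 3, 2)
df = pd.DataFrame(
    data.reshape(5, 6).T,
    index=pd.MultiIndex.from_product([[2015, 2016, 2017], ['US', 'UK']]),
    columns=['item {}'.format(i) for i in range(1, 6)],
)

# One item as a 2D table: years as rows, countries as columns.
# Note: unstack() may sort the column labels, so select by label.
item1 = df['item 1'].unstack()

# One year: countries as rows, items as columns.
year_2015 = df.xs(2015)

# One country: years as rows, items as columns.
us = df.xs('US', level=1)

print(item1.shape, year_2015.shape, us.shape)
```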

As you can see, this is just a matter of reshaping your initial 3D numpy array to 2D. You add the other (artificial) dimension with the help of a MultiIndex.
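For the general case in the question, an array X of shape (N, J, K), the same idea could be wrapped in a small helper. This is only a sketch: the function name array3d_to_frame and the default level names are made up for illustration and are not part of pandas. Also note that Panel.to_frame() by default dropped rows with missing observations (filter_observations=True), which this sketch does not do.

```python
import numpy as np
import pandas as pd

def array3d_to_frame(X, items, major_axis, minor_axis):
    """Hypothetical helper: mimic Panel(X, ...).to_frame() without Panel.

    X has shape (len(items), len(major_axis), len(minor_axis)).
    """
    X = np.asarray(X)
    n_items, n_major, n_minor = X.shape
    # Rows are every (major, minor) pair, with minor varying fastest,
    # which matches the C-order flattening done by reshape below.
    index = pd.MultiIndex.from_product(
        [major_axis, minor_axis], names=['major', 'minor']
    )
    # (N, J, K) -> (N, J*K) -> (J*K, N): items become the columns.
    values = X.reshape(n_items, n_major * n_minor).T
    return pd.DataFrame(values, index=index, columns=items)

X = np.arange(24).reshape(2, 3, 4)
df = array3d_to_frame(
    X,
    items=['a', 'b'],
    major_axis=['r0', 'r1', 'r2'],
    minor_axis=['c0', 'c1', 'c2', 'c3'],
)
print(df.shape)
```

The transpose matters: without it the result has items as rows, and forcing the (J*K, N) shape with a plain reshape instead would scramble the values.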

ayhan
  • Ah, nice, I just did almost exactly the same thing but was missing the transpose so numbers weren't lining up quite right! – JohnE Jan 28 '18 at 03:22
  • @JohnE Yeah axis=0 becomes axis=1 with `to_frame` which is not very intuitive. Yet again, nothing about panel was intuitive for me. :) – ayhan Jan 28 '18 at 03:26