
I have a pandas DataFrame whose cells are singleton Python matrices, and I want to convert it to a DataFrame of plain values. I can use apply to convert individual columns, but I was wondering whether I could do this over the entire DataFrame. Here is what I have so far:

DataFrame df:

+-----------+-----------+-----------+-----------+-----------+-----------+
|   Col1    |   Col2    |   Col3    |   Col4    |    Col5   |    Col6   |
+-----------+-----------+-----------+-----------+-----------+-----------+
| [[[[4]]]] | [[[[0]]]] | [[[[1]]]] | [[[[0]]]] | [[[[0]]]] | [[[[1]]]] |
| [[[[1]]]] | [[[[1]]]] | [[[[0]]]] | [[[[2]]]] | [[[[1]]]] | [[[[1]]]] |
| [[[[0]]]] | [[[[2]]]] | [[[[3]]]] | [[[[1]]]] | [[[[1]]]] | [[[[0]]]] |
+-----------+-----------+-----------+-----------+-----------+-----------+

Code to convert an individual column:

df.Col1.apply(lambda x: np.asarray(x).ravel()[0])
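For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame from the table above and applies the per-column conversion to every column in a loop. It assumes the cells are plain nested lists (the question does not show the actual cell type), and the names `wrap`, `values` and `flat` are made up for the illustration.

```python
import numpy as np
import pandas as pd

def wrap(v):
    # Hypothetical helper: nest a scalar four levels deep, like the cells in the table.
    return [[[[v]]]]

# Values copied from the table in the question, one inner list per row.
values = [[4, 0, 1, 0, 0, 1],
          [1, 1, 0, 2, 1, 1],
          [0, 2, 3, 1, 1, 0]]
columns = [f"Col{i}" for i in range(1, 7)]

# Build the frame column by column so each cell stays a nested list.
df = pd.DataFrame({name: [wrap(v) for v in col]
                   for name, col in zip(columns, zip(*values))})

# The per-column conversion from the question, repeated for every column.
flat = df.copy()
for name in flat.columns:
    flat[name] = flat[name].apply(lambda x: np.asarray(x).ravel()[0])

print(flat)
```

This works, but it is one `apply` call per column, which is what the question is trying to avoid.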
Leon Adams
  • One question is how did you get that DataFrame in the first place? It's probably going to be more efficient to create it properly from the beginning instead of using an `applymap` with a `lambda` as a band-aid fix. – ALollz Nov 21 '19 at 16:34
  • I do have access to the module that created the data, but would much prefer to not make any changes there. – Leon Adams Nov 21 '19 at 16:36

2 Answers


Use applymap instead of apply; it calls the function on every cell of the DataFrame rather than on a single column:

df.applymap(lambda x: np.asarray(x).ravel()[0])
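One caveat for newer installs: pandas 2.1 deprecated `DataFrame.applymap` in favour of `DataFrame.map`, which does the same element-wise job. A rough sketch that works on either side of that change could look like the following; the two-column frame and the names `wrap` and `first_scalar` are only there for illustration.

```python
import numpy as np
import pandas as pd

def wrap(v):
    # Hypothetical helper: nest a scalar four levels deep, like the cells in the question.
    return [[[[v]]]]

def first_scalar(cell):
    # Same logic as the lambda above: flatten the cell and take its only element.
    return np.asarray(cell).ravel()[0]

# Small stand-in frame (first two columns and rows of the question's table).
df = pd.DataFrame({"Col1": [wrap(4), wrap(1)],
                   "Col2": [wrap(0), wrap(1)]})

# DataFrame.map exists from pandas 2.1 onwards; fall back to applymap before that.
if hasattr(pd.DataFrame, "map"):
    result = df.map(first_scalar)
else:
    result = df.applymap(first_scalar)

print(result)
```

Checking for the method instead of parsing the version string keeps the snippet short and avoids the deprecation warning on recent pandas.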
mcsoini

You can flatten the arbitrarily nested cells and then re-create the DataFrame. This relies on each matrix having exactly one element. It will be faster than applymap for a small DataFrame, though the difference shrinks as the DataFrame grows.

import pandas as pd
import numpy as np

def flatten(container):
    # Recursively yield the leaf values of arbitrarily nested lists/tuples.
    for i in container:
        if isinstance(i, (list, tuple)):
            yield from flatten(i)
        else:
            yield i

pd.DataFrame(np.array(list(flatten(df.to_numpy().ravel()))).reshape(df.shape),
             index=df.index,
             columns=df.columns)

#   Col1  Col2  Col3  Col4  Col5  Col6
#0     4     0     1     0     0     1
#1     1     1     0     2     1     1
#2     0     2     3     1     1     0
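If you want to check the speed claim on your own machine, a rough timing sketch along these lines should do. It assumes the cells are plain nested lists, generates random test frames with a made-up `make_frame` helper, and deliberately prints whatever timings it measures instead of quoting numbers, since those depend on the pandas version and the hardware.

```python
import timeit

import numpy as np
import pandas as pd

def flatten(container):
    # Recursively yield the leaf values of nested lists/tuples.
    for item in container:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item

def via_flatten(df):
    # Flatten every cell, then rebuild the frame with the original labels.
    flat = np.array(list(flatten(df.to_numpy().ravel())))
    return pd.DataFrame(flat.reshape(df.shape), index=df.index, columns=df.columns)

def via_elementwise(df):
    # Element-wise alternative: DataFrame.map on pandas >= 2.1, applymap before that.
    func = lambda x: np.asarray(x).ravel()[0]
    return df.map(func) if hasattr(pd.DataFrame, "map") else df.applymap(func)

def make_frame(n_rows, n_cols=6, seed=0):
    # Hypothetical test data: random integers, each nested four levels deep.
    rng = np.random.default_rng(seed)
    data = rng.integers(0, 5, size=(n_rows, n_cols))
    return pd.DataFrame({f"Col{j + 1}": [[[[[int(v)]]]] for v in data[:, j]]
                         for j in range(n_cols)})

for n_rows in (3, 3000):
    df = make_frame(n_rows)
    # Both approaches should produce the same values before we compare speed.
    assert (via_flatten(df).to_numpy() == via_elementwise(df).to_numpy()).all()
    t_flat = timeit.timeit(lambda: via_flatten(df), number=5)
    t_elem = timeit.timeit(lambda: via_elementwise(df), number=5)
    print(f"{n_rows:>5} rows: flatten {t_flat:.4f}s, element-wise {t_elem:.4f}s")
```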
ALollz