I have a dataframe df:

                A    B
first second          
bar   one     0.0  0.0
      two     0.0  0.0
foo   one     0.0  0.0
      two     0.0  0.0

I transform it into another dataframe whose values are tuples:

                      A          B
first second                      
bar   one     (6, 1, 0)  (0, 9, 3)
      two     (9, 3, 4)  (6, 2, 1)
foo   one     (1, 9, 0)  (4, 0, 0)
      two     (6, 1, 5)  (8, 3, 5)

My question: how can I expand it as shown below, so that the tuple values become columns under a MultiIndex? Can I do this during the transform, or should it be an additional step after the transform?

              A        B
              m  n  k  m  n  k
first second
bar   one     6  1  0  0  9  3
      two     9  3  4  6  2  1
foo   one     1  9  0  4  0  0
      two     6  1  5  8  3  5

Code for the above:

import numpy as np
import pandas as pd

np.random.seed(123)


def expand(s):
    # complex logic of `result` has been replaced with `np.random`
    result = [tuple(np.random.randint(10, size=3)) for i in s]
    return result


index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']], names=['first', 'second'])
df = pd.DataFrame(np.zeros((4, 2)), index=index, columns=['A', 'B'])
print(df)

expanded = df.groupby(['second']).transform(expand)
print(expanded)
vogdb
  • Possible duplicate of [this question](https://stackoverflow.com/questions/53218931/how-to-unnest-explode-a-column-in-a-pandas-dataframe). See the **Generalizing to multiple columns** part. – Quang Hoang Oct 15 '19 at 15:38
  • It is close, but it does not consider a MultiIndex. The link is interesting anyway. Thank you. – vogdb Oct 15 '19 at 18:52

2 Answers

Try this:

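df_lst = []
for col in df.columns:
    # Turn each tuple in this column into a row of three values
    expanded_splt = expanded.apply(lambda x: pd.Series(x[col]), axis=1)
    # Put the original column name above the new sub-column names
    expanded_splt.columns = pd.MultiIndex.from_product([[col], ['m', 'n', 'k']])
    df_lst.append(expanded_splt)
# Join the per-column blocks side by side
result = pd.concat(df_lst, axis=1)
print(result)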

Output:

                A           B
                m   n   k   m   n   k
first   second                      
bar     one     6   1   0   0   9   3
        two     9   3   4   6   2   1
foo     one     1   9   0   4   0   0
        two     6   1   5   8   3   5
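
A variant of the same idea, as a minimal sketch rather than a definitive implementation: build one small frame per column from `tolist()` and let `pd.concat` create the outer column level from the dict keys. It assumes every cell holds a tuple of exactly three items. The `expanded` frame is hard-coded here as a stand-in for the one in the question, and `parts` is a name introduced only for illustration.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["bar", "foo"], ["one", "two"]], names=["first", "second"]
)
# Hard-coded stand-in for the `expanded` frame from the question (assumption).
expanded = pd.DataFrame(
    {
        "A": [(6, 1, 0), (9, 3, 4), (1, 9, 0), (6, 1, 5)],
        "B": [(0, 9, 3), (6, 2, 1), (4, 0, 0), (8, 3, 5)],
    },
    index=index,
)

# One 3-column frame per original column; `parts` is a hypothetical helper name.
parts = {
    col: pd.DataFrame(
        expanded[col].tolist(), index=expanded.index, columns=["m", "n", "k"]
    )
    for col in expanded.columns
}

# Passing a dict to concat uses its keys as the outer level of the columns.
result = pd.concat(parts, axis=1)
print(result)
```

This avoids a row-wise `apply`, which may matter on larger frames, but it has not been benchmarked here.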
Ian

I finally found time to work out an answer that suits me.

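# Concatenate the tuples of each row into one flat array per row
expanded_data = expanded.agg(lambda x: np.concatenate(x), axis=1).to_numpy()
# Stack the per-row arrays into a single 2-D array
expanded_data = np.stack(expanded_data)
# Outer level: original column names; inner level: names for the tuple items
column_index = pd.MultiIndex.from_product([expanded.columns, ['m', 'n', 'k']])
exploded = pd.DataFrame(expanded_data, index=expanded.index, columns=column_index)
print(exploded)

Output:
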
              A        B      
              m  n  k  m  n  k
first second                  
bar   one     6  1  0  0  9  3
      two     9  3  4  6  2  1
foo   one     1  9  0  4  0  0
      two     6  1  5  8  3  5
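
For reference, a short sketch of how the resulting MultiIndex columns can be addressed once the frame is built. The `exploded` frame is rebuilt from the printed values above so that the snippet runs on its own; the names `a_block`, `m_only`, and `single` are introduced only for illustration.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product(
    [["bar", "foo"], ["one", "two"]], names=["first", "second"]
)
column_index = pd.MultiIndex.from_product([["A", "B"], ["m", "n", "k"]])
# Values copied from the printed output above (stand-in for the real frame).
data = np.array(
    [
        [6, 1, 0, 0, 9, 3],
        [9, 3, 4, 6, 2, 1],
        [1, 9, 0, 4, 0, 0],
        [6, 1, 5, 8, 3, 5],
    ]
)
exploded = pd.DataFrame(data, index=index, columns=column_index)

# All sub-columns under the outer label "A" (columns m, n, k).
a_block = exploded["A"]

# The "m" sub-column of every outer label (columns A, B).
m_only = exploded.xs("m", axis=1, level=1)

# A single cell: row (bar, one), column (B, n).
single = exploded.loc[("bar", "one"), ("B", "n")]

print(a_block)
print(m_only)
print(single)
```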

vogdb