2

Let's say I have a DataFrame like this:

df = pd.DataFrame(data = [[1,2,3,4,5,6], [3,4,5,6,7,8]], 
                  columns = pd.MultiIndex.from_product([('A1', 'B1', 'A2'), (10,20)], names=['level_0','level_1']))

Here's how it looks like: DataFrame image

I want to add a new level in the columns which contains 1 where level_0 value contains "1" and and 2 where level_0 value contains "2". So, basically:

  • Where level_0 == "A1" --> new_level = 1
  • Where level_0 == "B1" --> new_level = 1
  • Where level_0 == "A2" --> new_level = 2

Any suggestions on how to do it?

gioarma
  • 418
  • 2
  • 15

2 Answers2

1

You could extract the values with a regex ((\d+)$ = last digits of the value) and rework the MultiIndex with MultiIndex.from_arrays:

values = df.columns.get_level_values('level_0').str.extract('(\d+)$', expand=False)
# ['1', '1', '1', '1', '2', '2']

df.columns = pd.MultiIndex.from_arrays([*zip(*df.columns.to_list()), values],
                                       names=[*df.columns.names, 'level_2']
                                      )

NB. this generalizes to any XXX00 value

output:

level_0 A1    B1    A2   
level_1 10 20 10 20 10 20
level_2  1  1  1  1  2  2
0        1  2  3  4  5  6
1        3  4  5  6  7  8
mozway
  • 194,879
  • 13
  • 39
  • 75
1

Use lsit comprehension for extract number from first level values and create new MultiIndex by MultiIndex.from_tuples:

import re

df.columns = pd.MultiIndex.from_tuples([(re.findall(r'(\d+)$', x[0])[0], *x) 
                                         for x in df.columns.tolist()], 
                                       names=('new_level',*df.columns.names))
print (df)

new_level  1           2   
level_0   A1    B1    A2   
level_1   10 20 10 20 10 20
0          1  2  3  4  5  6
1          3  4  5  6  7  8
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • Nice, thanks! Can I ask what `*x` is used for? – gioarma Feb 15 '22 at 08:55
  • 1
    @gioarma - it is for unpack tuples, check [this](https://stackoverflow.com/questions/36901/what-does-double-star-asterisk-and-star-asterisk-do-for-parameters) – jezrael Feb 15 '22 at 08:57