Pandas: add a column to a multiindex column dataframe

Question

I would like to add a column to the second level of a multiindex column dataframe.

In [151]: df
Out[151]: 
first        bar                 baz           
second       one       two       one       two 
A       0.487880 -0.487661 -1.030176  0.100813 
B       0.267913  1.918923  0.132791  0.178503
C       1.550526 -0.312235 -1.177689 -0.081596

The usual trick of direct assignment does not work:

In [152]: df['bar']['three'] = [0, 1, 2]

In [153]: df
Out[153]: 
first        bar                 baz           
second       one       two       one       two 
A       0.487880 -0.487661 -1.030176  0.100813
B       0.267913  1.918923  0.132791  0.178503
C       1.550526 -0.312235 -1.177689 -0.081596

How can I add the third row to under "bar"?

I guess the OP means to add the third *column*. – Qaswed Aug 12 '19 at 13:43 — Qaswed, Aug 12 '19 at 13:43

score 123 · Accepted Answer · answered Apr 18 '13 at 17:11

123

It's actually pretty simple (FWIW, I originally thought to do it your way):

df['bar', 'three'] = [0, 1, 2]
df = df.sort_index(axis=1)
print(df)

        bar                        baz          
        one       two  three       one       two
A -0.212901  0.503615      0 -1.660945  0.446778
B -0.803926 -0.417570      1 -0.336827  0.989343
C  3.400885 -0.214245      2  0.895745  1.011671

answered Apr 18 '13 at 17:11

spencerlyon2

9,476
4
30
39

Thanks. I must say it is totally not obvious (to me) why the new column shows up only after using sort_index. – Apr 19 '13 at 00:22
8

Oh sorry that's not part of the answer, just me being picky. It will actually show up as soon as you call `df['bar', 'three'] = [0, 1, 2]`. By default pandas will put it at the end of the DataFrame (after [baz, two]). I just wanted to see it with the other `bar`s. – spencerlyon2 Apr 19 '13 at 03:38
4

This appends the new column 'three' to the subtable 'bar'. But what if you want to insert (instead of append) this new column in subtable 'bar', e.g. insert 'three' in between 'one' and 'two'? – Joris Kinable Jul 07 '19 at 13:43
The order of columns doesn't really matter here. If you wanted to reorder them so they displayed "one, three, two" you could do that by using `df.loc[:, XX]` where `XX` has tuples ("bar", "one"), ("bar", "three"), etc. – spencerlyon2 Jul 16 '19 at 14:08
Doesn't work for me in my similar case; I'm getting `TypeError: cannot insert an item into a CategoricalIndex that is not already an existing category`. I don't know what might make a multiindex a `CategorialIndex` as opposed to a "regular" one. – ibeatty Nov 02 '19 at 16:32
2

Is it possible to generalize this to adding a third column to every sub-index? (i.e. in this case to have the `three` column added both for `bar` and for `baz`? – SO_tourist Nov 16 '19 at 13:38
@JorisKinable I posted the solution for using `insert` in a separate answer: https://stackoverflow.com/a/75791460/6383205 – dopexxx Mar 20 '23 at 14:22

score 25 · Answer 2 · edited Jul 12 '23 at 17:08

If we want to add a multi-level column:

Source DF:

In [221]: df
Out[221]:
first        bar                 baz
second       one       two       one       two
A      -1.089798  2.053026  0.470218  1.440740
B       0.488875  0.428836  1.413451 -0.683677
C      -0.243064 -0.069446 -0.911166  0.478370

Option 1: adding result of division: bar / baz as a new foo column

In [222]: df = df.join(
     ...:     df[['bar']].div(df['baz']).rename(columns={'bar':'foo'}))

In [223]: df
Out[223]:
first        bar                 baz                 foo
second       one       two       one       two       one       two
A      -1.089798  2.053026  0.470218  1.440740 -2.317647  1.424980
B       0.488875  0.428836  1.413451 -0.683677  0.345873 -0.627250
C      -0.243064 -0.069446 -0.911166  0.478370  0.266761 -0.145172

Option 2: adding multi-level column with three "sub-columns":

In [235]: df = df.join(pd.DataFrame(
     ...:     np.random.rand(3,3),
     ...:     columns=pd.MultiIndex.from_product([['new'], ['one','two','three']]),
     ...:     index=df.index))

In [236]: df
Out[236]:
first        bar                 baz                 new
second       one       two       one       two       one       two     three
A      -1.089798  2.053026  0.470218  1.440740  0.274291  0.636257  0.091048
B       0.488875  0.428836  1.413451 -0.683677  0.668157  0.456931  0.227568
C      -0.243064 -0.069446 -0.911166  0.478370  0.333824  0.363060  0.949672

And how to appen and independiente column? I tried with: `df = df.join(pd.DataFrame(np.random.rand(3, 1), columns = pd.MultiIndex.from_product([['new']]), index = df.index))` Is the right way? — Chacho Fuva, Dec 10 '19 at 17:02

Ynjxsjmh · Answer 3 · 2022-10-20T14:34:37.883

If you want to add multiple columns to a multiindex column dataframe, you can try

All same value for columns

df[[("foo", "bar1"), ("foo", "bar2")]] = 2

        bar                 baz            foo
        one       two       one       two bar1 bar2
0  0.487880 -0.487661 -1.030176  0.100813    2    2
1  0.267913  1.918923  0.132791  0.178503    2    2
2  1.550526 -0.312235 -1.177689 -0.081596    2    2

Same value for each column

df[[("foo", "bar1"), ("foo", "bar2")]] = [2, 3]

        bar                 baz            foo
        one       two       one       two bar1 bar2
0  0.487880 -0.487661 -1.030176  0.100813    2    3
1  0.267913  1.918923  0.132791  0.178503    2    3
2  1.550526 -0.312235 -1.177689 -0.081596    2    3

Different value for each cell

df[[("foo", "bar1"), ("foo", "bar2")]] = [[1,2], [3,4], [5,6]] # shape is (3, 2) where 3 is index length and 2 is new added column length

        bar                 baz            foo
        one       two       one       two bar1 bar2
0  0.487880 -0.487661 -1.030176  0.100813    1    2
1  0.267913  1.918923  0.132791  0.178503    3    4
2  1.550526 -0.312235 -1.177689 -0.081596    5    6

Another usecase is that we have a single index dataframe, and we want to concat it to the multi index dataframe

        bar                 baz
       one       two       one       two     concat to      bar1  bar2
0  0.487880 -0.487661 -1.030176  0.100813   <---------  0     1     2
1  0.267913  1.918923  0.132791  0.178503               1     3     4
2  1.550526 -0.312235 -1.177689 -0.081596               2     5     6

Generate a list of tuples for columns

df[[("foo", col) for col in single_index_df.columns]] = single_index_df

        bar                 baz            foo
        one       two       one       two bar1 bar2
0  0.487880 -0.487661 -1.030176  0.100813    1    2
1  0.267913  1.918923  0.132791  0.178503    3    4
2  1.550526 -0.312235 -1.177689 -0.081596    5    6

Create a new multi index columns dataframe from the single index dataframe as Option 2 of MaxU - stop genocide of UA

df = df.join(pd.DataFrame(single_index_df.values,
                          columns=pd.MultiIndex.from_product([['foo'], single_index_df.columns]),
                          index=single_index_df.index))

Create a multi index dataframe from single index dataframe with pd.concat({'foo': single_index_df}, axis=1)

df = pd.concat([df, pd.concat({'foo': single_index_df}, axis=1)], axis=1)

score 1 · Answer 4 · answered Mar 20 '23 at 14:21

1

If you want to insert (instead of append at the end of the DF) do this:

df.insert(0, ('bar', 'three'), [0, 1, 2])

The second item has to be hashable, so a list will not work.

answered Mar 20 '23 at 14:21

dopexxx

2,298
19
30

Pandas: add a column to a multiindex column dataframe

4 Answers4

Linked

Related