104

I would like to add a column to the second level of a multiindex column dataframe.

In [151]: df
Out[151]: 
first        bar                 baz           
second       one       two       one       two 
A       0.487880 -0.487661 -1.030176  0.100813 
B       0.267913  1.918923  0.132791  0.178503
C       1.550526 -0.312235 -1.177689 -0.081596 

The usual trick of direct assignment does not work:

In [152]: df['bar']['three'] = [0, 1, 2]

In [153]: df
Out[153]: 
first        bar                 baz           
second       one       two       one       two 
A       0.487880 -0.487661 -1.030176  0.100813
B       0.267913  1.918923  0.132791  0.178503
C       1.550526 -0.312235 -1.177689 -0.081596

How can I add the third row to under "bar"?

4 Answers4

123

It's actually pretty simple (FWIW, I originally thought to do it your way):

df['bar', 'three'] = [0, 1, 2]
df = df.sort_index(axis=1)
print(df)

        bar                        baz          
        one       two  three       one       two
A -0.212901  0.503615      0 -1.660945  0.446778
B -0.803926 -0.417570      1 -0.336827  0.989343
C  3.400885 -0.214245      2  0.895745  1.011671
spencerlyon2
  • 9,476
  • 4
  • 30
  • 39
  • Thanks. I must say it is totally not obvious (to me) why the new column shows up only after using sort_index. –  Apr 19 '13 at 00:22
  • 8
    Oh sorry that's not part of the answer, just me being picky. It will actually show up as soon as you call `df['bar', 'three'] = [0, 1, 2]`. By default pandas will put it at the end of the DataFrame (after [baz, two]). I just wanted to see it with the other `bar`s. – spencerlyon2 Apr 19 '13 at 03:38
  • 4
    This appends the new column 'three' to the subtable 'bar'. But what if you want to insert (instead of append) this new column in subtable 'bar', e.g. insert 'three' in between 'one' and 'two'? – Joris Kinable Jul 07 '19 at 13:43
  • The order of columns doesn't really matter here. If you wanted to reorder them so they displayed "one, three, two" you could do that by using `df.loc[:, XX]` where `XX` has tuples ("bar", "one"), ("bar", "three"), etc. – spencerlyon2 Jul 16 '19 at 14:08
  • Doesn't work for me in my similar case; I'm getting `TypeError: cannot insert an item into a CategoricalIndex that is not already an existing category`. I don't know what might make a multiindex a `CategorialIndex` as opposed to a "regular" one. – ibeatty Nov 02 '19 at 16:32
  • 2
    Is it possible to generalize this to adding a third column to every sub-index? (i.e. in this case to have the `three` column added both for `bar` and for `baz`? – SO_tourist Nov 16 '19 at 13:38
  • @JorisKinable I posted the solution for using `insert` in a separate answer: https://stackoverflow.com/a/75791460/6383205 – dopexxx Mar 20 '23 at 14:22
25

If we want to add a multi-level column:

Source DF:

In [221]: df
Out[221]:
first        bar                 baz
second       one       two       one       two
A      -1.089798  2.053026  0.470218  1.440740
B       0.488875  0.428836  1.413451 -0.683677
C      -0.243064 -0.069446 -0.911166  0.478370

Option 1: adding result of division: bar / baz as a new foo column

In [222]: df = df.join(
     ...:     df[['bar']].div(df['baz']).rename(columns={'bar':'foo'}))

In [223]: df
Out[223]:
first        bar                 baz                 foo
second       one       two       one       two       one       two
A      -1.089798  2.053026  0.470218  1.440740 -2.317647  1.424980
B       0.488875  0.428836  1.413451 -0.683677  0.345873 -0.627250
C      -0.243064 -0.069446 -0.911166  0.478370  0.266761 -0.145172

Option 2: adding multi-level column with three "sub-columns":

In [235]: df = df.join(pd.DataFrame(
     ...:     np.random.rand(3,3),
     ...:     columns=pd.MultiIndex.from_product([['new'], ['one','two','three']]),
     ...:     index=df.index))

In [236]: df
Out[236]:
first        bar                 baz                 new
second       one       two       one       two       one       two     three
A      -1.089798  2.053026  0.470218  1.440740  0.274291  0.636257  0.091048
B       0.488875  0.428836  1.413451 -0.683677  0.668157  0.456931  0.227568
C      -0.243064 -0.069446 -0.911166  0.478370  0.333824  0.363060  0.949672
wjandrea
  • 28,235
  • 9
  • 60
  • 81
MaxU - stand with Ukraine
  • 205,989
  • 36
  • 386
  • 419
  • And how to appen and independiente column? I tried with: `df = df.join(pd.DataFrame(np.random.rand(3, 1), columns = pd.MultiIndex.from_product([['new']]), index = df.index))` Is the right way? – Chacho Fuva Dec 10 '19 at 17:02
2

If you want to add multiple columns to a multiindex column dataframe, you can try

  1. All same value for columns
df[[("foo", "bar1"), ("foo", "bar2")]] = 2
        bar                 baz            foo
        one       two       one       two bar1 bar2
0  0.487880 -0.487661 -1.030176  0.100813    2    2
1  0.267913  1.918923  0.132791  0.178503    2    2
2  1.550526 -0.312235 -1.177689 -0.081596    2    2
  1. Same value for each column
df[[("foo", "bar1"), ("foo", "bar2")]] = [2, 3]
        bar                 baz            foo
        one       two       one       two bar1 bar2
0  0.487880 -0.487661 -1.030176  0.100813    2    3
1  0.267913  1.918923  0.132791  0.178503    2    3
2  1.550526 -0.312235 -1.177689 -0.081596    2    3
  1. Different value for each cell
df[[("foo", "bar1"), ("foo", "bar2")]] = [[1,2], [3,4], [5,6]] # shape is (3, 2) where 3 is index length and 2 is new added column length
        bar                 baz            foo
        one       two       one       two bar1 bar2
0  0.487880 -0.487661 -1.030176  0.100813    1    2
1  0.267913  1.918923  0.132791  0.178503    3    4
2  1.550526 -0.312235 -1.177689 -0.081596    5    6

Another usecase is that we have a single index dataframe, and we want to concat it to the multi index dataframe

        bar                 baz
       one       two       one       two     concat to      bar1  bar2
0  0.487880 -0.487661 -1.030176  0.100813   <---------  0     1     2
1  0.267913  1.918923  0.132791  0.178503               1     3     4
2  1.550526 -0.312235 -1.177689 -0.081596               2     5     6
  1. Generate a list of tuples for columns
df[[("foo", col) for col in single_index_df.columns]] = single_index_df
        bar                 baz            foo
        one       two       one       two bar1 bar2
0  0.487880 -0.487661 -1.030176  0.100813    1    2
1  0.267913  1.918923  0.132791  0.178503    3    4
2  1.550526 -0.312235 -1.177689 -0.081596    5    6
  1. Create a new multi index columns dataframe from the single index dataframe as Option 2 of MaxU - stop genocide of UA
df = df.join(pd.DataFrame(single_index_df.values,
                          columns=pd.MultiIndex.from_product([['foo'], single_index_df.columns]),
                          index=single_index_df.index))
  1. Create a multi index dataframe from single index dataframe with pd.concat({'foo': single_index_df}, axis=1)
df = pd.concat([df, pd.concat({'foo': single_index_df}, axis=1)], axis=1)
Ynjxsjmh
  • 28,441
  • 6
  • 34
  • 52
1

If you want to insert (instead of append at the end of the DF) do this:

df.insert(0, ('bar', 'three'), [0, 1, 2])

The second item has to be hashable, so a list will not work.

dopexxx
  • 2,298
  • 19
  • 30