Move column by name to front of table in pandas

Question

Here is my df:

                             Net   Upper   Lower  Mid  Zsore
Answer option                                                
More than once a day          0%   0.22%  -0.12%   2    65 
Once a day                    0%   0.32%  -0.19%   3    45
Several times a week          2%   2.45%   1.10%   4    78
Once a week                   1%   1.63%  -0.40%   6    65

How can I move a column by name ("Mid") to the front of the table, index 0. This is what the result should look like:

                             Mid   Upper   Lower  Net  Zsore
Answer option                                                
More than once a day          2   0.22%  -0.12%   0%    65 
Once a day                    3   0.32%  -0.19%   0%    45
Several times a week          4   2.45%   1.10%   2%    78
Once a week                   6   1.63%  -0.40%   1%    65

My current code moves the column by index using df.columns.tolist() but I'd like to shift it by name.

score 159 · Accepted Answer · edited Jul 08 '22 at 13:59

We can use loc to reorder by passing a list:

In [27]:
# get a list of columns
cols = list(df)
# move the column to head of list using index, pop and insert
cols.insert(0, cols.pop(cols.index('Mid')))
cols
Out[27]:
['Mid', 'Net', 'Upper', 'Lower', 'Zsore']
In [28]:
# use ix to reorder
df = df.loc[:, cols]
df
Out[28]:
                      Mid Net  Upper   Lower  Zsore
Answer_option                                      
More_than_once_a_day    2  0%  0.22%  -0.12%     65
Once_a_day              3  0%  0.32%  -0.19%     45
Several_times_a_week    4  2%  2.45%   1.10%     78
Once_a_week             6  1%  1.63%  -0.40%     65

Another method is to take a reference to the column and reinsert it at the front:

In [39]:
mid = df['Mid']
df.drop(labels=['Mid'], axis=1,inplace = True)
df.insert(0, 'Mid', mid)
df
Out[39]:
                      Mid Net  Upper   Lower  Zsore
Answer_option                                      
More_than_once_a_day    2  0%  0.22%  -0.12%     65
Once_a_day              3  0%  0.32%  -0.19%     45
Several_times_a_week    4  2%  2.45%   1.10%     78
Once_a_week             6  1%  1.63%  -0.40%     65

You can, with very early versions of Pandas, also use ix to achieve the same results:

df = df.ix[:, cols]

But ix was deprecated from pandas 0.20.0 onwards and was discontinued as of Pandas 1.0.

elPastor · Answer 2 · 2019-06-06T17:41:55.083

135

Maybe I'm missing something, but a lot of these answers seem overly complicated. You should be able to just set the columns within a single list:

Column to the front:

df = df[ ['Mid'] + [ col for col in df.columns if col != 'Mid' ] ]

Or if instead, you want to move it to the back:

df = df[ [ col for col in df.columns if col != 'Mid' ] + ['Mid'] ]

Or if you wanted to move more than one column:

cols_to_move = ['Mid', 'Zsore']
df           = df[ cols_to_move + [ col for col in df.columns if col not in cols_to_move ] ]

edited Jun 06 '19 at 17:41

answered Jun 06 '19 at 14:27

elPastor

8,435
11
53
81

1

For anyone else, make sure for multiple columns you use option 3. Option 1 with multiple columns won't delete `Mid` & `Zscore` from column from the original position. I found this out with a `Grouper` error trying to groupby when the same column was there twice. – kevin_theinfinityfund Sep 30 '20 at 18:55
@kevin_theinfinityfund I think that only would have happened if you did `if col != 'Mid' or col != 'Zscore'` -- with `if col != 'Mid' and col != 'Zscore'` it will not duplicate the columns. Or even better: `if col not in ('Mid', 'Zscore')`. – zmbc Oct 14 '22 at 17:37
Yes the 3rd piece you offered is the 3rd option I referenced. It is the cleanest way to go about multiple columns. – kevin_theinfinityfund Oct 16 '22 at 01:27

normanius · Answer 3 · 2022-10-17T20:00:09.677

I prefer this solution:

col = df.pop("Mid")
df.insert(0, col.name, col)

It's simpler to read and faster than other suggested answers.

def move_column_inplace(df, col, pos):
    col = df.pop(col)
    df.insert(pos, col.name, col)

Performance assessment:

For this test, the currently last column is moved to the front in each repetition. In-place methods generally perform better. While citynorman's solution can be made in-place, Ed Chum's method based on .loc and sachinnm's method based on reindex cannot.

While other methods are generic, citynorman's solution is limited to pos=0. I didn't observe any performance difference between df.loc[cols] and df[cols], which is why I didn't include some other suggestions.

Original system (2019): ~~Python 3.6.8 and pandas 0.24.2 on a MacBook Pro (Mid 2015).~~
Current system (2022): Python 3.10.5 and pandas 1.4.3 on a MacBook Pro (2021, Apple M1).

import numpy as np
import pandas as pd

n_cols = 11
df = pd.DataFrame(np.random.randn(200000, n_cols),
                  columns=range(n_cols))

def move_column_inplace(df, col, pos):
    col = df.pop(col)
    df.insert(pos, col.name, col)

def move_to_front_normanius_inplace(df, col):
    move_column_inplace(df, col, 0)
    return df

def move_to_front_chum(df, col):
    cols = list(df)
    cols.insert(0, cols.pop(cols.index(col)))
    return df.loc[:, cols]

def move_to_front_chum_inplace(df, col):
    col = df[col]
    df.drop(col.name, axis=1, inplace=True)
    df.insert(0, col.name, col)
    return df

def move_to_front_elpastor(df, col):
    cols = [col] + [ c for c in df.columns if c!=col ]
    return df[cols] # or df.loc[cols]

def move_to_front_sachinmm(df, col):
    cols = df.columns.tolist()
    cols.insert(0, cols.pop(cols.index(col)))
    df = df.reindex(columns=cols, copy=False)
    return df

def move_to_front_citynorman_inplace(df, col):
    # This approach exploits that reset_index() moves the index
    # at the first position of the data frame.
    df.set_index(col, inplace=True)
    df.reset_index(inplace=True)
    return df

def test(method, df):
    col = np.random.randint(0, n_cols)
    method(df, col)

col = np.random.randint(0, n_cols)
ret_mine = move_to_front_normanius_inplace(df.copy(), col)
ret_chum1 = move_to_front_chum(df.copy(), col)
ret_chum2 = move_to_front_chum_inplace(df.copy(), col)
ret_elpas = move_to_front_elpastor(df.copy(), col)
ret_sach = move_to_front_sachinmm(df.copy(), col)
ret_city = move_to_front_citynorman_inplace(df.copy(), col)

# Assert equivalence of solutions.
assert(ret_mine.equals(ret_chum1))
assert(ret_mine.equals(ret_chum2))
assert(ret_mine.equals(ret_elpas))
assert(ret_mine.equals(ret_sach))
assert(ret_mine.equals(ret_city))

Results:

# For n_cols = 11:
%timeit test(move_to_front_normanius_inplace, df)
# 137 µs ± 692 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%timeit test(move_to_front_citynorman_inplace, df)
# 177 µs ± 10.7 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%timeit test(move_to_front_sachinmm, df)
# 821 µs ± 11.8 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%timeit test(move_to_front_chum, df)
# 926 µs ± 10.7 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%timeit test(move_to_front_elpastor, df)
# 901 µs ± 6.44 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%timeit test(move_to_front_chum_inplace, df)
# 3.25 ms ± 32.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


# For n_cols = 31:
%timeit test(move_to_front_normanius_inplace, df)
# 188 µs ± 3.46 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%timeit test(move_to_front_citynorman_inplace, df)
# 214 µs ± 649 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%timeit test(move_to_front_sachinmm, df)
# 5.17 ms ± 68.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum, df)
# 5.52 ms ± 82.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_elpastor, df)
# 5.48 ms ± 198 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum_inplace, df)
# 14.7 ms ± 317 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Above are the updated values for a rerun in 2022. The rankings have remained stable over the past years for different systems, although the absolute numbers have decreased by factors between 2 and 10, from which citynorman's solution and my solution (normanius) have benefited the most.

Great solution. However, we should not assign the modified df with the inserted column to the original df explicitly. So instead of `df = df.insert(0, col.name, col)`, we need to do `df.insert(0, col.name, col)`. You have it correctly in the function ```move_column_inplace()``` though. — mellifluous, Mar 23 '20 at 18:27
Thanks @normanius. Been working hard in Dexter's Laboratory I see. :-) Great solution. Occam's Razor. Simple and elegant. — brohjoe, Jul 16 '20 at 02:40

Sachinmm · Answer 4 · 2015-09-22T05:28:41.857

You can use the df.reindex() function in pandas. df is

                      Net  Upper   Lower  Mid  Zsore
Answer option                                      
More than once a day  0%  0.22%  -0.12%    2     65
Once a day            0%  0.32%  -0.19%    3     45
Several times a week  2%  2.45%   1.10%    4     78
Once a week           1%  1.63%  -0.40%    6     65

define an list of column names

cols = df.columns.tolist()
cols
Out[13]: ['Net', 'Upper', 'Lower', 'Mid', 'Zsore']

move the column name to wherever you want

cols.insert(0, cols.pop(cols.index('Mid')))
cols
Out[16]: ['Mid', 'Net', 'Upper', 'Lower', 'Zsore']

then use df.reindex() function to reorder

df = df.reindex(columns= cols)

out put is: df

                      Mid  Upper   Lower Net  Zsore
Answer option                                      
More than once a day    2  0.22%  -0.12%  0%     65
Once a day              3  0.32%  -0.19%  0%     45
Several times a week    4  2.45%   1.10%  2%     78
Once a week             6  1.63%  -0.40%  1%     65

citynorman · Answer 5 · 2022-03-03T21:21:38.657

25

I didn't like how I had to explicitly specify all the other column in the other solutions.

cfg_col_sel = ['Mid', 'Zscore']
cfg_col_sel = cfg_col_sel+[s for s in df.columns if not s in cfg_col_sel]
df = df[cfg_col_sel]

This is another way of doing it but less flexible

df = df.set_index('Mid').reset_index()

edited Mar 03 '22 at 21:21

answered Dec 01 '17 at 17:14

citynorman

4,918
3
38
39

This exploits that current versions of `reset_index()` insert the dropped index at the first position. Note, however, that this behavior is not specified in the [docs](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.reset_index.html). – normanius Nov 04 '19 at 03:27
1

As for the performance, see my answer. It's advantageous to use `inplace=True` for both `set_index()` and `reset_index()`. – normanius Nov 04 '19 at 03:29

Bhagabat Behera · Answer 6 · 2018-06-24T12:35:57.933

11

Here is a generic set of code that I frequently use to rearrange the position of columns. You may find it useful.

cols = df.columns.tolist()
n = int(cols.index('Mid'))
cols = [cols[n]] + cols[:n] + cols[n+1:]
df = df[cols]

edited Jun 24 '18 at 12:35

answered Jun 24 '18 at 12:00

Bhagabat Behera

853
7
7

3

ideally, explain your answer and what makes it a good solution, and not only post a piece of code. You are risking downvoting – tjebo Jun 24 '18 at 12:24

Dustin Helliwell · Answer 7 · 2019-11-09T18:18:12.643

5

To reorder the rows of a DataFrame just use a list as follows.

df = df[['Mid', 'Net', 'Upper', 'Lower', 'Zsore']]

This makes it very obvious what was done when reading the code later. Also use:

df.columns
Out[1]: Index(['Net', 'Upper', 'Lower', 'Mid', 'Zsore'], dtype='object')

Then cut and paste to reorder.

For a DataFrame with many columns, store the list of columns in a variable and pop the desired column to the front of the list. Here is an example:

cols = [str(col_name) for col_name in range(1001)]
data = np.random.rand(10,1001)
df = pd.DataFrame(data=data, columns=cols)

mv_col = cols.pop(cols.index('77'))
df = df[[mv_col] + cols]

Now df.columns has.

Index(['77', '0', '1', '2', '3', '4', '5', '6', '7', '8',
       ...
       '991', '992', '993', '994', '995', '996', '997', '998', '999', '1000'],
      dtype='object', length=1001)

edited Nov 09 '19 at 18:18

answered Jun 26 '19 at 23:50

Dustin Helliwell

630
1
8
12

What if you work with a DataFrame consisting of 1001 columns? – normanius Nov 04 '19 at 00:30
The concept is the same, however with many columns the columns should be stored in a list and the list should be manipulated. See my edits above for an example. My example is effectively the same as https://stackoverflow.com/a/51009742/5827921. – Dustin Helliwell Nov 09 '19 at 18:20

score 3 · Answer 8 · answered Feb 17 '21 at 17:24

3

df.set_index('Mid').reset_index()

seems to be a pretty easy way about this.

answered Feb 17 '21 at 17:24

baxx

3,956
6
37
75

Giorgos Myrianthous · Answer 9 · 2021-08-17T22:54:45.930

3

If you want to move a column to the front of the dataframe, you can use use set_index().

df.set_index(df.pop('column_name'), inplace=True)
df.reset_index(inplace=True)

You first need to set the column you want to bring to the front as the index of the dataframe (and we do a pop in order to drop the column from the original dataframe before setting it as the index in order to avoid name collision in the next action) and finally call reset_index() to make the old index the first column of the dataframe.

For more details see How to change the order of dataframe columns in pandas.

edited Aug 17 '21 at 22:54

answered Aug 15 '21 at 22:00

Giorgos Myrianthous

36,235
20
134
156

This one resets your index, which may cause problems elsewhere. – Jelmer Wind Mar 13 '23 at 13:23

score 2 · Answer 10 · answered Jun 18 '20 at 19:11

Here is a very simple answer to this.

Don't forget the two (()) 'brackets' around columns names.Otherwise, it'll give you an error.


# here you can add below line and it should work 
df = df[list(('Mid','Upper', 'Lower', 'Net','Zsore'))]
df

                             Mid   Upper   Lower  Net  Zsore
Answer option                                                
More than once a day          2   0.22%  -0.12%   0%    65 
Once a day                    3   0.32%  -0.19%   0%    45
Several times a week          4   2.45%   1.10%   2%    78
Once a week                   6   1.63%  -0.40%   1%    65

clearly OP doesn't want to explicitly spell out the column names. In some cases with very wide dataframes it might not even be possible. — data-monkey, Nov 05 '20 at 10:48

score 1 · Answer 11 · answered Aug 09 '22 at 20:31

1

You can use movecolumn package in Python to move columns:

pip install movecolumn

Then you can write your code as:

import movecolumn as mc
mc.MoveTo1(df,'mid')

Hope that helps.

P.S : The package can be found here. https://pypi.org/project/movecolumn/

answered Aug 09 '22 at 20:31

Saad Bin Munir

64
3

score 1 · Answer 12 · answered Oct 12 '22 at 20:56

1

Easy and quick solution::

For pandas >= 1.3 (Edited in 2022):

df.insert(0, 'mean', df.pop('mean'))

How about (for Pandas < 1.3, the original answer)

df.insert(0, 'mean', df['mean'])

answered Oct 12 '22 at 20:56

Sachin

1,460
17
24

score 0 · Answer 13 · answered Jul 08 '20 at 20:12

0

The most simplist thing you can try is:

df=df[[ 'Mid',   'Upper',   'Lower', 'Net'  , 'Zsore']]

answered Jul 08 '20 at 20:12

NeelMani Shrivastava

1
1

Move column by name to front of table in pandas

13 Answers13

Here is a very simple answer to this.

Linked

Related