How to rearrange Pandas column sequence?

Question

>>> from pandas import DataFrame
>>> df = DataFrame({'a': [1, 2, 3, 4], 'b': [2, 4, 6, 8]})
>>> df['x'] = df.a + df.b
>>> df['y'] = df.a - df.b
>>> df
   a  b   x  y
0  1  2   3 -1
1  2  4   6 -2
2  3  6   9 -3
3  4  8  12 -4

Now I want to rearrange the column sequence so that 'x' and 'y' become the first and second columns, which I can do with:

>>> df = df[['x','y','a','b']]
>>> df
    x  y  a  b
0   3 -1  1  2
1   6 -2  2  4
2   9 -3  3  6
3  12 -4  4  8

But if I have many columns ('a', 'b', 'c', 'd', ...), I don't want to list them all explicitly. How can I do that?

Or does Pandas provide a function like set_column_sequence(dataframe, col_name, seq), so that I can call set_column_sequence(df, 'x', 0) and set_column_sequence(df, 'y', 1)?

score 50 · Answer 1 · answered May 19 '14 at 15:32

You could also do something like this:

df = df[['x', 'y', 'a', 'b']]

You can get the list of columns with:

cols = list(df.columns.values)

This produces something like:

['a', 'b', 'x', 'y']

...which is then easy to rearrange manually before passing it back to the indexing expression at the top, e.g. df = df[cols].
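If the goal is only to avoid typing every name, one option is to build the list from df.columns instead of editing it by hand. A minimal sketch, assuming the columns to move all exist in the frame; `front` and `rest` are illustrative names, not pandas features:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [2, 4, 6, 8]})
df['x'] = df.a + df.b
df['y'] = df.a - df.b

# columns to move to the front (illustrative name)
front = ['x', 'y']
# every other column, kept in its original order
rest = [c for c in df.columns if c not in front]

df = df[front + rest]
print(list(df.columns))
```

This scales to any number of columns, because only the ones being moved are named.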

answered May 19 '14 at 15:32

freddygv


For newbies like me: re-arrange the `list` you get from `cols`, then `df=df[cols]`, i.e. the re-arranged list gets dropped into the first expression with only one set of brackets. – Sid Mar 20 '18 at 15:17
Could you explain the first example? – Delrius Euphoria Nov 05 '20 at 07:17

score 12 · Answer 2 · edited Nov 03 '15 at 17:53

There may be an elegant built-in function (but I haven't found it yet). You could write one:

# reorder columns
def set_column_sequence(dataframe, seq, front=True):
    '''Takes a dataframe and a subsequence of its columns,
       returns dataframe with seq as first columns if "front" is True,
       and seq as last columns if "front" is False.
    '''
    seq = list(seq)  # copy so we don't mutate the caller's seq
    # the remaining columns, kept in their original order
    rest = [x for x in dataframe.columns if x not in seq]
    # "seq" goes in front of the remaining columns, or after them
    cols = seq + rest if front else rest + seq
    return dataframe[cols]

For your example: set_column_sequence(df, ['x','y']) would return the desired output.

If you want seq at the end of the DataFrame instead, simply pass in "front=False".

edited Nov 03 '15 at 17:53

Jesper - jtk.eth

answered Sep 08 '12 at 10:41

Andy Hayden

Hopefully I can find a Pandas built-in function like 'set_column_sequence(df, col_list, assign_col_seq)', so that I can use "set_column_sequence(df,['x','y'],[0,1])" to get the job done. – bigbug Sep 08 '12 at 10:57
@bigbug Hopefully! If I find one I'll update my answer... 'til you do, this should work. – Andy Hayden Sep 08 '12 at 11:03
How about including a list comprehension version to speed it up? – pylang Aug 04 '16 at 15:48
@pylang I don't think that'll speed it up (that's not the main perf issue, I shouldn't think) but a neat way to write it is: `s = {col: i for i, col in enumerate(first_cols)}; sorted(df.columns, key=lambda c: s.get(c, len(s)))`. This is still (4 years later) kinda awkward to write. I thought perhaps there was a trick with sort_index, but there doesn't seem to be. hmmm – Andy Hayden Aug 05 '16 at 08:39
@pylang potentially you can do: `df[df.columns[df.columns.map(lambda col: s.get(col, len(s))).argsort()]]` but that's pretty ugly. Still better than this old answer, so I may edit that in... – Andy Hayden Aug 05 '16 at 08:41
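The sorted-key idea from the comments can be written out as follows. This is a sketch only; `first_cols` is not defined in the comment, so it is assumed here to be the list of columns to move to the front:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'x': [5, 6], 'y': [7, 8]})

first_cols = ['x', 'y']  # assumed: the columns to move to the front
# rank of each leading column; every other column shares the rank len(s)
s = {col: i for i, col in enumerate(first_cols)}
# sorted() is stable, so columns with equal rank keep their original order
new_order = sorted(df.columns, key=lambda c: s.get(c, len(s)))
result = df[new_order]
print(list(result.columns))
```

Because Python's sort is stable, the columns that are not in first_cols stay in the order they had in df.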

score 7 · Answer 3 · answered Nov 20 '17 at 14:53

You can do the following:

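from pandas import DataFrame

df = DataFrame({'a': [1, 2, 3, 4], 'b': [2, 4, 6, 8]})

df['x'] = df.a + df.b
df['y'] = df.a - df.b

List the column titles in whatever order you want:

column_titles = ['x', 'y', 'a', 'b']

df = df.reindex(columns=column_titles)

reindex returns a new DataFrame rather than modifying df in place, so assign the result back. This gives the desired output.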

answered Nov 20 '17 at 14:53

Okroshiashvili

My preferred way because this can be chained with other methods on the dataframe – Rabeez Riaz Dec 26 '19 at 06:14
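The chaining mentioned in the comment might look like this. A sketch, assuming the derived columns are created with assign so the whole pipeline is a single expression:

```python
import pandas as pd

result = (
    pd.DataFrame({'a': [1, 2, 3, 4], 'b': [2, 4, 6, 8]})
    # derive the new columns without leaving the method chain
    .assign(x=lambda d: d.a + d.b, y=lambda d: d.a - d.b)
    # reorder as the last step of the chain
    .reindex(columns=['x', 'y', 'a', 'b'])
)
print(result)
```

One thing to watch: reindex fills any requested name that is not in the frame with NaN instead of raising, so a misspelled column name can go unnoticed.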

score 3 · Accepted Answer · answered Mar 27 '13 at 11:49

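def _col_seq_set(df, col_list, seq_list):
    ''' set dataframe 'df' col_list's sequence by seq_list '''
    # columns that are not being moved keep their original order
    col_not_in_col_list = [x for x in list(df.columns) if x not in col_list]
    # insert each column at its target position; pass seq_list in
    # ascending order so later inserts do not shift earlier ones
    for i in range(len(col_list)):
        col_not_in_col_list.insert(seq_list[i], col_list[i])

    return df[col_not_in_col_list]

# attach as a method so it can be called as df.col_seq_set(...)
DataFrame.col_seq_set = _col_seq_set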
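A usage sketch for this approach. To keep the example self-contained, the same logic is repeated as a plain function (`col_seq_set`, an illustrative name) rather than patched onto DataFrame:

```python
import pandas as pd

def col_seq_set(df, col_list, seq_list):
    # same logic as the answer, as a plain function instead of a patched method
    cols = [c for c in df.columns if c not in col_list]
    for col, pos in zip(col_list, seq_list):
        cols.insert(pos, col)
    return df[cols]

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'x': [5, 6], 'y': [7, 8]})

# move 'x' to position 0 and 'y' to position 1
moved = col_seq_set(df, ['x', 'y'], [0, 1])
# positions refer to the final layout, so a single column can go anywhere
mixed = col_seq_set(df, ['y'], [2])
print(list(moved.columns), list(mixed.columns))
```

The positions should be given in ascending order; otherwise a later insert at a smaller index pushes an earlier one out of place.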

score 1 · Answer 5 · answered Sep 08 '12 at 20:41

I would suggest you just write a function to do what you're describing, probably using drop (to delete columns) and insert (to insert columns at a position). There isn't an existing API function that does this.
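One way such a function could look. This is a sketch, not an existing pandas API: `move_column` is a hypothetical helper, and it uses pop in place of drop because pop removes the column and returns it in one step:

```python
import pandas as pd

def move_column(df, col_name, position):
    """Hypothetical helper: move one column to an integer position, in place."""
    values = df.pop(col_name)              # removes the column, returns it as a Series
    df.insert(position, col_name, values)  # re-inserts it at the requested position
    return df

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'x': [5, 6], 'y': [7, 8]})
move_column(df, 'x', 0)
move_column(df, 'y', 1)
print(list(df.columns))
```

Unlike the indexing-based answers, this modifies df itself rather than returning a reordered copy.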

answered Sep 08 '12 at 20:41

Wes McKinney

score 0 · Answer 6 · edited Sep 11 '12 at 08:44

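Feel free to disregard this solution, as taking the set difference of an Index and a list does not preserve the order of the original Index, if that's important. (The original form, df.columns - ['x', 'y'], relied on the minus operator acting as a set difference, which current pandas versions no longer support; Index.difference is the replacement and sorts its result by default.)

In [61]: df.reindex(columns=pd.Index(['x', 'y']).append(df.columns.difference(['x', 'y'])))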
Out[61]: 
    x  y  a  b
0   3 -1  1  2
1   6 -2  2  4
2   9 -3  3  6
3  12 -4  4  8
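If the original column order does matter, a boolean mask from Index.isin is one way to keep it. A sketch under that assumption; the example frame deliberately has 'b' before 'a' so the difference from a sorted result is visible:

```python
import pandas as pd

df = pd.DataFrame({'b': [2, 4], 'a': [1, 2], 'x': [3, 6], 'y': [1, 2]})

front = pd.Index(['x', 'y'])
# the boolean mask keeps the remaining columns in their original order
rest = df.columns[~df.columns.isin(front)]
result = df.reindex(columns=front.append(rest))
print(list(result.columns))
```

Here 'b' stays ahead of 'a', which a sorted set difference would not guarantee.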

edited Sep 11 '12 at 08:44

answered Sep 08 '12 at 23:58

Garrett
