Dropping multiple Pandas columns by Index

Question

I have a large pandas DataFrame (>100 columns). I need to drop various sets of columns, and I'm hoping there is a way of using the old

df.drop(df.columns['slices'],axis=1)

I've built selections such as:

a = df.columns[3:23]
b = df.columns[-6:]

where a and b represent the column sets I want to drop.

The following

list(df)[3:23]+list(df)[-6:]

yields the correct selection, but I can't implement it with a drop:

df.drop(df.columns[list(df)[3:23]+list(df)[-6:]],axis=1)

ValueError: operands could not be broadcast together with shapes (20,) (6,)

I looked around but can't get my answer.

Selecting last n columns and excluding last n columns in dataframe

(Below pertains to the error I receive):

python numpy ValueError: operands could not be broadcast together with shapes

This one feels like they're having a similar issue, but the 'slices' aren't separate: Deleting multiple columns based on column names in Pandas

Cheers

That works too; as usual, there is more than one way to skin a cat. This is more in line with what I was trying to get to, so thanks very much. – BAC83, Aug 09 '18 at 11:50

score 10 · Answer 1 · edited May 05 '19 at 20:03

This returns a new DataFrame with the columns removed (df itself is left unchanged):

df.drop(list(df)[2:5], axis=1)
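For the two separate ranges in the question, the same approach should extend by concatenating the sliced lists before calling drop. A minimal sketch, assuming a toy frame whose 26 columns are named A to Z (the frame and the names to_drop and trimmed are illustrative, not from the question):

```python
import pandas as pd
from string import ascii_uppercase

# Toy frame with 26 empty columns named 'A'..'Z' (an assumption for illustration).
df = pd.DataFrame(columns=list(ascii_uppercase))

# list(df) gives the column names as a plain Python list, so + concatenates
# the two slices rather than attempting element-wise addition.
to_drop = list(df)[3:10] + list(df)[-5:]

# drop returns a new frame; df itself keeps all 26 columns.
trimmed = df.drop(to_drop, axis=1)
print(list(trimmed.columns))
```

Passing the names straight to drop avoids indexing df.columns with a list of labels, which is what the attempt in the question does.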

edited May 05 '19 at 20:03

Zoe

answered Aug 09 '18 at 11:58

Chabu

score 8 · Accepted Answer · answered Aug 09 '18 at 12:43

You can use np.r_ to seamlessly combine multiple ranges / slices:

import numpy as np
import pandas as pd
from string import ascii_uppercase

df = pd.DataFrame(columns=list(ascii_uppercase))

idx = np.r_[3:10, -5:0]

print(idx)

[ 3  4  5  6  7  8  9 -5 -4 -3 -2 -1]

You can then use idx to index your columns and feed to pd.DataFrame.drop:

df.drop(df.columns[idx], axis=1, inplace=True)

print(df.columns)

Index(['A', 'B', 'C', 'K', 'L', 'M', 'N',
       'O', 'P', 'Q', 'R', 'S', 'T', 'U'], dtype='object')
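Mapped onto the slices from the question (3:23 and the last six columns), the same idea might look like the sketch below. The 120-column frame and its c0..c119 names are made up for illustration, and this variant returns a new frame instead of using inplace=True:

```python
import numpy as np
import pandas as pd

# Hypothetical wide frame: 120 columns named c0..c119, two rows of zeros.
df = pd.DataFrame(np.zeros((2, 120)), columns=[f"c{i}" for i in range(120)])

# np.r_ expands each slice into explicit integers, so the trailing
# slice is given an explicit stop: -6:0 yields -6, -5, ..., -1.
idx = np.r_[3:23, -6:0]

# Non-inplace variant: returns a new frame and leaves df untouched.
trimmed = df.drop(columns=df.columns[idx])
print(trimmed.shape)
```

Here 20 + 6 = 26 columns are selected, so trimmed should keep the remaining 94.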

answered Aug 09 '18 at 12:43

jpp

I thought there would be some np function for slice combinations but I couldn't find it. Cheers – BAC83 Aug 09 '18 at 12:47

I got caught out on the need to define the end of the list slice (the 0 in `[-n:0]`), but I think I understand now. Thanks again! – BAC83 Aug 09 '18 at 12:54

DINA TAKLIT · Answer 3 · 2019-05-29T14:10:42.183

You can use this simple solution:

cols = [3,7,10,12,14,16,18,20,22]
df.drop(df.columns[cols],axis=1,inplace=True)

The result:

    0   1   2   4   5   6   8   9    11  13      15     17      19       21
0   3   12  10  3   2   1   7   512  64  1024.0  -1.0   -1.0    -1.0    -1.0
1   5   12  10  3   2   1   7   16   2   32.0    32.0   1024.0  -1.0    -1.0
2   5   12  10  3   2   1   7   512  2   32.0    32.0   32.0    -1.0    -1.0
3   5   12  10  3   2   1   7   16   1   32.0    64.0   1024.0  -1.0    -1.0

As you can see, all the columns at the given indices have been deleted.

You can also drop by column name instead of position. If your columns are named A, B, C, etc., pass the names directly to drop rather than indexing df.columns with them, for example:

cols = ['A','B','C','F']
df.drop(cols,axis=1,inplace=True)
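To make the difference between positions and names concrete, here is a small sketch on a made-up six-column frame (the data and the names by_position and by_name are assumptions for illustration). Integer positions go through df.columns first, while labels can be handed to drop as they are:

```python
import pandas as pd

# Illustrative one-row frame with columns 'A'..'F'.
df = pd.DataFrame([[1, 2, 3, 4, 5, 6]], columns=list("ABCDEF"))

# Positions: look the labels up via df.columns, then drop them.
by_position = df.drop(df.columns[[0, 1, 2, 5]], axis=1)

# Names: pass the labels to drop directly.
by_name = df.drop(["A", "B", "C", "F"], axis=1)

print(list(by_position.columns))
print(by_position.equals(by_name))
```

Both calls should leave only columns D and E.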

score 2 · Answer 4 · answered Aug 09 '18 at 11:42

IIUC:

a = df.columns[3:23].values.tolist()
b = df.columns[-6:].values.tolist()

a.extend(b)  # a now holds both sets of column names
df.drop(a, axis=1, inplace=True)
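A possible variation that skips the conversion to lists: pandas Index objects have an append method that concatenates them, whereas + on two Index objects is, as far as I know, treated as element-wise arithmetic. A minimal sketch on a toy A to Z frame (the frame and the names both and trimmed are illustrative assumptions):

```python
import pandas as pd
from string import ascii_uppercase

# Toy frame with 26 empty columns named 'A'..'Z'.
df = pd.DataFrame(columns=list(ascii_uppercase))

# Index.append joins the two slices end to end into a single Index.
both = df.columns[3:10].append(df.columns[-5:])

# Drop the combined labels; this returns a new frame.
trimmed = df.drop(columns=both)
print(list(trimmed.columns))
```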

answered Aug 09 '18 at 11:42

shivsn

score 1 · Answer 5 · answered Aug 09 '18 at 11:48

I have run into a similar issue before and fixed it by "subtracting" one selection from the other. I'm not sure whether this will work for you, but it did for me. With a and b as the column selections from the question, keep only the columns that are not in them:

df = df.loc[:, ~df.columns.isin(a)]
df = df.loc[:, ~df.columns.isin(b)]

answered Aug 09 '18 at 11:48

Patrick Maynard
