I understand that to drop a column you use df.drop('column name', axis=1). Is there a way to drop a column using a numerical index instead of the column name?
-
I figure this will not work for the reasons shown here: http://stackoverflow.com/questions/13411544/delete-column-from-pandas-dataframe – John Nov 30 '13 at 07:37
11 Answers
You can delete the column at index i like this:
df.drop(df.columns[i], axis=1)
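This can behave unexpectedly if the DataFrame has duplicate column names, because the drop is done by label and removes every column with that name. To avoid that, either rename the column you want to delete to a unique name first, or rebuild the DataFrame by position like this: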
df = df.iloc[:, [j for j, c in enumerate(df.columns) if j != i]]
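To make the two approaches above concrete, here is a minimal sketch. The frame and its column names are made up for illustration, and it assumes the names are unique, in which case both approaches give the same result:

```python
import pandas as pd

# Hypothetical example frame; the column names are made up for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})
i = 1  # position of the column to remove

# Label-based: look up the name at position i, then drop by that name.
by_label = df.drop(df.columns[i], axis=1)

# Position-based: keep every position except i.
by_position = df.iloc[:, [j for j in range(df.shape[1]) if j != i]]

print(by_label.columns.tolist())
print(by_position.columns.tolist())
```

With duplicate names the two are no longer equivalent, since the label-based version removes every column sharing that name.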

-
I think you missed the point - they want to drop by index, not by label. Converting index into a label is just dropping by label :( – Darren May 09 '19 at 21:15
-
How to index cols, if I have to drop 100 columns that are continuous in the middle of the data frame – mArk May 28 '20 at 14:05
-
The second technique using iloc works well given duplicate column names and is very performant. Thanks. – Nick Apr 11 '21 at 17:28
Drop multiple columns like this:
cols = [1,2,4,5,12]
df.drop(df.columns[cols],axis=1,inplace=True)
inplace=True is used to make the changes in the DataFrame itself, rather than dropping the columns on a copy of it. If you need to keep your original intact, use:
df_after_dropping = df.drop(df.columns[cols],axis=1)
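A small sketch of the difference between the two calls, using a made-up six-column frame (the names a to f are hypothetical). Note that with inplace=True the call returns None, so its result should not be assigned back to df:

```python
import pandas as pd

# Hypothetical frame with six columns named a..f.
df = pd.DataFrame([list(range(6))], columns=list("abcdef"))
cols = [1, 2, 4]

# Without inplace: a new DataFrame is returned and df is left untouched.
copy_result = df.drop(df.columns[cols], axis=1)
n_before = df.shape[1]

# With inplace=True: df itself is modified and the call returns None.
ret = df.drop(df.columns[cols], axis=1, inplace=True)

print(copy_result.columns.tolist(), n_before, ret, df.columns.tolist())
```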
-
If you do not use `inplace=True`, then you will have to do `df = df.drop()` if you want to see the change in `df` itself. – muon Feb 08 '16 at 20:21
-
How to index cols, if I have to drop 100 columns that are continuous in the middle of the data frame. – mArk May 28 '20 at 14:05
-
you could do something like `col_indices = [df.columns.tolist().index(c) for c in list_of_colnames]` – muon Dec 03 '21 at 17:46
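For the contiguous-range case asked about in the comments, slicing df.columns may be enough. A minimal sketch, assuming made-up column names c0 to c9 and that positions 3 through 6 are the ones to drop:

```python
import pandas as pd

# Hypothetical frame with 10 columns named c0..c9.
df = pd.DataFrame([list(range(10))], columns=[f"c{k}" for k in range(10)])

start, stop = 3, 7  # drop positions 3, 4, 5, 6 (stop is exclusive)

# Label-based; assumes the column names are unique.
dropped = df.drop(columns=df.columns[start:stop])

# Position-based alternative that does not depend on the names at all.
keep = list(range(start)) + list(range(stop, df.shape[1]))
dropped_iloc = df.iloc[:, keep]

print(dropped.columns.tolist())
```

The same slice works for 100 columns; only start and stop change.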
If there are multiple columns with identical names, the solutions given here so far will remove all of the columns, which may not be what one is looking for. This may be the case if one is trying to remove duplicate columns except one instance. The example below clarifies this situation:
# make a df with duplicate columns 'x'
import pandas as pd
df = pd.DataFrame(list(zip(range(5), range(5), range(6, 11))), columns=['x', 'x', 'y'])
df
Out[495]:
x x y
0 0 0 6
1 1 1 7
2 2 2 8
3 3 3 9
4 4 4 10
# attempting to drop the first column according to the solution offered so far
df.drop(df.columns[0], axis = 1)
y
0 6
1 7
2 8
3 9
4 10
As you can see, both 'x' columns were dropped. Alternative solution:
column_numbers = list(range(df.shape[1]))  # list of the columns' integer positions
column_numbers.remove(0)  # remove integer position 0
df.iloc[:, column_numbers]  # return all columns except the 0th column
x y
0 0 6
1 1 7
2 2 8
3 3 9
4 4 10
As you can see, this truly removed only the 0th column (first 'x').
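If this comes up often, the same idea can be wrapped in a small helper. This is only a sketch: drop_by_position is a hypothetical name, not a pandas function, and it uses a boolean mask rather than a list of integers:

```python
import numpy as np
import pandas as pd

def drop_by_position(df, positions):
    """Return a new DataFrame without the columns at the given integer positions.

    Hypothetical helper, not part of pandas.
    """
    mask = np.ones(df.shape[1], dtype=bool)  # start by keeping every column
    mask[list(positions)] = False            # switch off the unwanted positions
    return df.iloc[:, mask]

# Frame with a duplicated column name, as in the example above.
df = pd.DataFrame([[0, 0, 6], [1, 1, 7]], columns=["x", "x", "y"])
result = drop_by_position(df, [0])
print(result.columns.tolist())
```

Because the selection is purely positional, the second 'x' survives and df itself is not modified.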

-
You're my hero. Was trying to think of a clever way to do this for way too long. – ATK7474 May 27 '20 at 15:39
-
This iloc solution is exactly what I was looking for. Dropping the first x columns becomes `df = df.iloc[:, x:]`. If you want to drop columns x through y you could do something like: `all_cols = set(range(len(df.columns))); keep_cols = sorted(all_cols - set(range(x, y + 1))); df = df.iloc[:, keep_cols]` – JDenman6 Jan 06 '21 at 15:11
-
This answer deserves more upvotes as it handles duplicate column names correctly. – u-phoria May 13 '21 at 14:13
-
@AlexandreHuat a CS Lord with less than 1500 points! ;) Thank you, anyway – Saeed Aug 24 '21 at 13:03
If you have two columns with the same name, one simple way is to rename the columns manually like this:
df.columns = ['column1', 'column2', 'column3']
Then you can drop by column index as you requested, like this:
df.drop(df.columns[1], axis=1, inplace=True)
df.columns[1] is the label of the column at index 1, so that is the column that gets dropped.
Remember that axis 1 = columns and axis 0 = rows.

You need to identify the columns by their position in the DataFrame. For example, if you want to drop the columns at (zero-based) positions 2, 3 and 5, it will be:
df.drop(df.columns[[2,3,5]], axis = 1)

You can simply supply the `columns` parameter to `df.drop`, so you don't need to specify `axis` in that case, like so:
columns_list = [1, 2, 4] # index numbers of columns you want to delete
df = df.drop(columns=df.columns[columns_list])
For reference, see the `columns` parameter here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html?highlight=drop#pandas.DataFrame.drop

If you really want to do it with integers (but why?), you could build a dictionary that maps positions to column names:
col_dict = {x: col for x, col in enumerate(df.columns)}
Then df = df.drop(col_dict[0], axis=1) will work as desired.
Edit: you can put it in a function that does this for you, though it rebuilds the dictionary on every call:
def drop_col_n(df, col_n_to_drop):
    col_dict = {x: col for x, col in enumerate(df.columns)}
    return df.drop(col_dict[col_n_to_drop], axis=1)

df = drop_col_n(df, 2)

You can use the following line to drop the first two columns (or any column you don't need):
df.drop([df.columns[0], df.columns[1]], axis=1)

A good way to keep only the columns you want, regardless of duplicate names.
For example, say the indices of the columns you want to drop are held in a list-like variable:
unnecessary_cols = [1, 4, 5, 6]
Then:
import numpy as np
df.iloc[:, np.setdiff1d(np.arange(len(df.columns)), unnecessary_cols)]
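As a worked illustration of the line above, here is a sketch on a made-up eight-column frame with repeated names. np.setdiff1d returns the remaining positions sorted, so the surviving columns keep their original order:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: eight columns, each name appearing twice.
df = pd.DataFrame([list(range(8))], columns=list("aabbccdd"))
unnecessary_cols = [1, 4, 5, 6]

# Positions that are not in unnecessary_cols, in ascending order.
keep = np.setdiff1d(np.arange(df.shape[1]), unnecessary_cols)
result = df.iloc[:, keep]

print(keep.tolist())
print(result.columns.tolist())
```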

I appreciate I'm very late to the party, but I had the same issue with a DataFrame that has a MultiIndex. Pandas really doesn't like non-unique MultiIndexes, to the degree that most of the solutions above don't work in that setting (e.g. the `.drop` function just fails with `ValueError: cannot handle a non-unique multi-index!`).
The solution I arrived at was to use `.iloc` instead. According to the documentation, you can use `iloc` with a mask (a list of True/False values marking which columns you want to keep):
With a boolean array whose length matches the columns.
df.iloc[:, [True, False, True, False]]
Combined with `df.columns.duplicated()` to identify duplicated columns, you can do this in an efficient, pandas-native way:
df = df.iloc[:, ~df.columns.duplicated()]
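A minimal sketch of that mask on a plain (non-MultiIndex) frame with made-up values. By default duplicated() marks every occurrence after the first, so the first column of each name is kept; passing keep="last" keeps the last one instead:

```python
import pandas as pd

# Hypothetical frame with two columns named 'x' holding different values.
df = pd.DataFrame([[0, 1, 6], [2, 3, 7]], columns=["x", "x", "y"])

# Default keep="first": later duplicates are flagged True and filtered out.
first_kept = df.iloc[:, ~df.columns.duplicated()]

# keep="last": earlier duplicates are flagged instead.
last_kept = df.iloc[:, ~df.columns.duplicated(keep="last")]

print(first_kept)
print(last_kept)
```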

Since there can be multiple columns with the same name, we should first rename the columns. Here is the code for this solution:
df.columns = list(range(len(df.columns)))
df = df.drop(columns=[1, 2])  # drop the second and third columns
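One side effect of this approach is that the original column names are lost. If they matter, they can be saved first and put back afterwards. A sketch under that assumption, with a made-up frame:

```python
import pandas as pd

# Hypothetical frame with a duplicated column name.
df = pd.DataFrame([[0, 0, 6], [1, 1, 7]], columns=["x", "x", "y"])

original_names = list(df.columns)    # remember the real labels
df.columns = range(df.shape[1])      # temporary positional labels 0, 1, 2
df = df.drop(columns=[1])            # drop the second column by position

# The surviving labels are the old positions, so map them back to names.
df.columns = [original_names[k] for k in df.columns]

print(df.columns.tolist())
```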
