
The most common way to delete a column from a DataFrame seems to be `del df["column_name"]`.

However, `del df.loc[:, column_name]` does not work, even though `df[column_name]` and `df.loc[:, column_name]` do essentially the same thing.

I'm asking because I'm hoping there's an equally easy way to delete rows. Rows can't be accessed with the direct `df[]` syntax, so I'd have to use `del df.loc[]`, which, as described above, doesn't work even for a column.

Why doesn't `del df.loc[:, column_name]` work, even though `del df[column_name]` does?

James Ronald
  • I assumed the most appropriate way to remove columns was to use `.drop` whilst specifying the axis. Same for rows: `df.drop([0,4,5], axis=0)` – Umar.H Jul 06 '20 at 17:58
  • @Datanovice Indeed, but that's assuming you know the index which can be hard to get sometimes. I guess you also need to know column names for the `del` usage, but in my experience row indices are harder to figure out, and it's usually rows I'm searching through in one way or another – James Ronald Jul 06 '20 at 18:36
  • Then I think you have a specific use case. Why don't you add some data to show your problem and I'll have a crack at a solution? – Umar.H Jul 06 '20 at 18:55

1 Answer


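As for why `del df.loc[:, column]` does not work: I assume it's simply not baked into the API. `del df[column]` works because the DataFrame itself supports item deletion, whereas the `.loc` indexer apparently does not. The developers made `.drop` for a reason!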
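To make that assumption a bit more concrete, here is a minimal sketch that checks whether each object exposes the `__delitem__` hook that Python's `del obj[key]` relies on. The column names are made up for illustration, and the exact exception type raised by the failing `del` may differ between Python and pandas versions, so the sketch catches both likely candidates rather than asserting one.

```python
import pandas as pd

# Toy frame; the column names are purely illustrative.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# `del obj[key]` is routed to `__delitem__` on the object's type.
df_supports_del = hasattr(type(df), "__delitem__")
loc_supports_del = hasattr(type(df.loc), "__delitem__")
print("DataFrame has __delitem__:", df_supports_del)
print(".loc indexer has __delitem__:", loc_supports_del)

# Deleting through the DataFrame works and mutates it in place.
del df["a"]

# Deleting through the .loc indexer is expected to fail.
try:
    del df.loc[:, "b"]
    loc_delete_failed = False
except (AttributeError, TypeError) as exc:
    loc_delete_failed = True
    print("del df.loc[...] failed:", type(exc).__name__, exc)
```

If the second check prints `False` on your installation, that would be consistent with the explanation above: the indexer supports getting and setting, but not deleting.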

You can read more here, where even Wes, the main author of pandas, recommends `del`, but see the other answers and comments for a wider discussion regarding best practice.

You can use `drop`. From the documentation:

Drop specified labels from rows or columns.

Remove rows or columns by specifying label names and corresponding axis, or by specifying directly index or column names. When using a multi-index, labels on different levels can be removed by specifying the level.
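As a rough illustration of the forms that passage describes, the sketch below uses a small made-up frame (the labels `x`, `y`, `z` and the MultiIndex values are assumptions for the example) to show `labels` plus `axis`, the `index=` / `columns=` keywords, and the `level=` argument for a MultiIndex.

```python
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=["x", "y", "z"])

# Labels plus axis, and the equivalent keyword form.
by_axis = df.drop(["x"], axis=0)
by_keyword = df.drop(index=["x"])

# Column removal by keyword, and rows plus columns in a single call.
without_b = df.drop(columns=["b"])
both = df.drop(index=["x"], columns=["b"])

# With a MultiIndex, `level` selects which level the label is matched against.
midx = pd.MultiIndex.from_tuples(
    [("bird", "falcon"), ("bird", "parrot"), ("mammal", "dog")]
)
mdf = pd.DataFrame({"speed": [1, 2, 3]}, index=midx)
without_falcon = mdf.drop(index="falcon", level=1)

print(by_keyword)
print(both)
print(without_falcon)
```

None of these calls modify `df` or `mdf`; each returns a new DataFrame unless `inplace=True` is passed.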

Drop on a named index, specifying axis 0 for the index (1 is for columns).

import pandas as pd

df = pd.DataFrame({'num_legs': [2, 4, 8, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [10, 2, 1, 8]},
                  index=['falcon', 'dog', 'spider', 'fish'])


print(df)

        num_legs  num_wings  num_specimen_seen
falcon         2          2                 10
dog            4          0                  2
spider         8          0                  1
fish           0          0                  8


print(df.drop(['falcon', 'dog'], axis=0))  # pass axis by keyword; positional axis is rejected by pandas 2.x
        num_legs  num_wings  num_specimen_seen
spider         8          0                  1
fish           0          0                  8

Drop based on an integer-based index.

s = pd.Series(list('abca'))

df = pd.get_dummies(s, dtype=int)  # dtype=int gives 0/1 values; newer pandas defaults to booleans

print(df)

   a  b  c
0  1  0  0
1  0  1  0
2  0  0  1
3  1  0  0

print(df.drop([0, 2], axis=0))

   a  b  c
1  0  1  0
3  1  0  0

Drop a column.

print(df.drop(['a'], axis=1))

   b  c
0  0  0
1  1  0
2  0  1
3  0  0
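Since the comments mention that row labels can be hard to know up front, here is a sketch of removing rows by condition instead, so the labels are looked up for you. The condition (`num_wings == 0`) is an arbitrary example, not something from the question.

```python
import pandas as pd

df = pd.DataFrame({'num_legs': [2, 4, 8, 0],
                   'num_wings': [2, 0, 0, 0]},
                  index=['falcon', 'dog', 'spider', 'fish'])

# Boolean mask marking the rows to remove (illustrative condition).
mask = df['num_wings'] == 0

# Option 1: find the matching labels, then hand them to drop.
dropped = df.drop(index=df[mask].index)

# Option 2: keep the complement of the mask with boolean indexing.
kept = df[~mask]

# In-place variant, the closest analogue to `del` for rows.
df.drop(index=df[mask].index, inplace=True)

print(dropped)
```

Both options return a new DataFrame; only the final call with `inplace=True` changes `df` itself.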
Umar.H