How do I drop a duplicate pandas df column based on the name of the column?

Question

How do I drop the duplicate column named "Second Column Name" based on its name? Dropping by a specific name is important because, depending on the name, I may need to keep either the first or the last occurrence.

I want to have something like this but I think it only works for rows:

df = df.drop_duplicates(subset=['Second Column Name'], keep='first')

The desired output would be:

   First Column Name  Second Column Name  Third Column Name
0                  1                   3                  6
1                  2                   5                  5

This is the code so far:

import pandas as pd

df = pd.DataFrame({
    'First Column Name': [1, 2],
    'Second Column Name': [3, 5],
    'Third Column Name': [6, 5],
    'Fourth Column Name': [4, 7],
})

df = df.rename(columns={'Fourth Column Name': 'Second Column Name'})

print(df)
   First Column Name  Second Column Name  Third Column Name  Second Column Name
0                  1                   3                  6                   4
1                  2                   5                  5                   7

score 0 · Answer 1 · answered Aug 22 '20 at 17:13

This will do it:

df = df.loc[:, ~df.columns.duplicated()]
print(df)

   First Column Name  Second Column Name  Third Column Name
0                  1                   3                  6
1                  2                   5                  5
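
To see why this works, here is a small sketch (the variable names mask_first, mask_last and df_keep_last are only for illustration). df.columns.duplicated() returns a boolean array marking repeated labels, and its keep argument decides which occurrence is left unmarked, so keep='last' should retain the right-most column instead:

```python
import pandas as pd

# Same frame as in the question, built directly with a repeated label.
df = pd.DataFrame(
    [[1, 3, 6, 4], [2, 5, 5, 7]],
    columns=["First Column Name", "Second Column Name",
             "Third Column Name", "Second Column Name"],
)

# True marks a column whose label already appeared further left.
mask_first = df.columns.duplicated(keep="first")

# With keep="last", the earlier occurrence is marked instead.
mask_last = df.columns.duplicated(keep="last")

# Inverting the mask keeps everything that is not marked as a repeat.
df_keep_last = df.loc[:, ~mask_last]
print(df_keep_last)
```

Note that this treats every repeated label the same way; it does not single out one column name.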

NYC Coder

I want to remove only duplicates with a specific name – JPWilson Aug 22 '20 at 17:22
Wait, so you know the duplicates beforehand? – NYC Coder Aug 22 '20 at 17:26
No, the duplicate names will come from dynamic variables – JPWilson Aug 22 '20 at 17:28
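
One possible way to handle the case raised in these comments, as a sketch rather than a definitive answer: combine the duplicate mask with a check on the label, so only repeats of one given name are dropped. The helper drop_duplicate_column is hypothetical (not part of pandas), and it assumes the name and the keep choice arrive as ordinary variables:

```python
import numpy as np
import pandas as pd


def drop_duplicate_column(df, name, keep="first"):
    """Hypothetical helper: drop repeats of the column `name` only."""
    # True where the column label equals the requested name.
    is_name = np.asarray(df.columns == name)
    # True for every repeat of a label except the occurrence to keep.
    is_repeat = df.columns.duplicated(keep=keep)
    # Drop only columns that are both a repeat and carry that name.
    return df.loc[:, ~(is_name & is_repeat)]


df = pd.DataFrame(
    [[1, 3, 6, 4], [2, 5, 5, 7]],
    columns=["First Column Name", "Second Column Name",
             "Third Column Name", "Second Column Name"],
)

# The name and the keep choice could come from variables at runtime.
kept_first = drop_duplicate_column(df, "Second Column Name", keep="first")
kept_last = drop_duplicate_column(df, "Second Column Name", keep="last")
print(kept_first)
print(kept_last)
```

Repeated columns with any other name are left untouched, so the helper can be called once per name with a different keep value each time.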
