How do I drop the duplicate column named "Second Column Name" based on its name? Dropping by a specific name is important because, depending on the name, I may need to keep either the first or the last occurrence.

I want to have something like this but I think it only works for rows:

df = df.drop_duplicates(subset=['Second Column Name'], keep='first')

The desired output would be:

   First Column Name  Second Column Name  Third Column Name 
0                  1                   3                  6              
1                  2                   5                  5

This is the code so far:

import pandas as pd

df = pd.DataFrame({'First Column Name': [1, 2],
                   'Second Column Name': [3, 5],
                   'Third Column Name': [6, 5],
                   'Fourth Column Name': [4, 7]})

# Renaming creates a second column with the label 'Second Column Name'
df = df.rename(columns={'Fourth Column Name': 'Second Column Name'})

print(df)
   First Column Name  Second Column Name  Third Column Name  Second Column Name
0                  1                   3                  6                   4
1                  2                   5                  5                   7
Machavity
JPWilson

1 Answer

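This will do it. df.columns.duplicated() flags every repeated column label, so inverting the mask with ~ keeps only the first occurrence of each name: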

df = df.loc[:, ~df.columns.duplicated()]
print(df)

   First Column Name  Second Column Name  Third Column Name
0                  1                   3                  6
1                  2                   5                  5
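
If you need to pick per name whether the first or the last occurrence survives, one possible approach is to combine the keep argument of Index.duplicated with a comparison against the column label. This is only a minimal sketch: the helper name drop_duplicate_column is hypothetical (not a pandas function), and it assumes the DataFrame from the question.

```python
import pandas as pd


def drop_duplicate_column(df, name, keep="first"):
    """Hypothetical helper: drop repeated columns labelled `name` only.

    keep="first" retains the leftmost column with that label,
    keep="last" retains the rightmost one.
    """
    # True for every column that repeats a label, given the keep rule
    is_repeat = df.columns.duplicated(keep=keep)
    # Restrict the drop to the requested label; other duplicates are untouched
    drop_mask = is_repeat & (df.columns == name)
    return df.loc[:, ~drop_mask]


df = pd.DataFrame({'First Column Name': [1, 2],
                   'Second Column Name': [3, 5],
                   'Third Column Name': [6, 5],
                   'Fourth Column Name': [4, 7]})
df = df.rename(columns={'Fourth Column Name': 'Second Column Name'})

first = drop_duplicate_column(df, 'Second Column Name', keep='first')
last = drop_duplicate_column(df, 'Second Column Name', keep='last')
```

With keep='first' the column holding 3 and 5 is retained; with keep='last' the column holding 4 and 7 is retained, and it stays in its original (rightmost) position.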
NYC Coder