
My problem is that my data contains many duplicate rows, so it doesn't accurately represent what is really going on. Consider the following:

    a    b
1  23   42
2  23   42
3  23   42
4  14   12
5  14   12

I want to keep only one copy of each row and eliminate all the duplicates. It should look like the following after it's done:

    a    b
1  23   42
2  14   12

Is there a function to do this?

Ravaal

1 Answer


Let's use drop_duplicates with keep='first', which keeps the first occurrence of each duplicated row and drops the rest:

df2.drop_duplicates(keep='first')

Output:

    a   b
1  23  42
4  14  12
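
Note that the surviving rows keep their original index labels (1 and 4), while the question asked for rows numbered 1 and 2. Here is a minimal sketch of the full round trip, assuming the frame is called df2 as above and was built with a 1-based index like the one shown in the question; the names deduped and result are just for illustration:

```python
import pandas as pd

# Rebuild the example frame from the question (index 1..5, columns a and b).
df2 = pd.DataFrame(
    {"a": [23, 23, 23, 14, 14], "b": [42, 42, 42, 12, 12]},
    index=[1, 2, 3, 4, 5],
)

# drop_duplicates returns a new DataFrame and leaves df2 untouched.
# keep="first" (the default) retains the first row of each duplicate group,
# so the surviving index labels are 1 and 4.
deduped = df2.drop_duplicates(keep="first")

# Optional: renumber the rows so they run 1, 2 as in the desired output.
result = deduped.reset_index(drop=True)
result.index = result.index + 1

print(result)
```

Since drop_duplicates does not modify df2 in place by default, assign the result to a variable (or back to df2) if you want to keep it.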
Scott Boston