I'm trying to remove all rows that have any duplicates. I ONLY want the unique rows. I've tried the `keep=False`
parameter of `drop_duplicates()`
with `subset=['ORDER ID', 'ITEM CODE']`, and it's just not doing the right thing.
Let's say my dataframe looks something like this:
| ORDER ID | ITEM CODE |
|----------|-----------|
| 123      | XXX       |
| 123      | YYY       |
| 123      | YYY       |
| 456      | XXX       |
| 456      | XXX       |
| 456      | XXX       |
| 789      | XXX       |
| 000      | YYY       |
I want it to look like this:
| ORDER ID | ITEM CODE |
|----------|-----------|
| 123      | XXX       |
| 789      | XXX       |
| 000      | YYY       |
As you can see, the subset would be both the ORDER ID and ITEM CODE columns, and ideally we would lose rows 2-6. (The actual dataset has a lot more columns.)
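
Here is a minimal reproducible sketch of the example above. It assumes the order IDs are stored as strings (so `000` keeps its leading zeros) and uses `df` and `result` as placeholder names; neither detail comes from the real dataset.

```python
import pandas as pd

# Toy data mirroring the example table; IDs are strings by assumption.
df = pd.DataFrame({
    "ORDER ID": ["123", "123", "123", "456", "456", "456", "789", "000"],
    "ITEM CODE": ["XXX", "YYY", "YYY", "XXX", "XXX", "XXX", "XXX", "YYY"],
})

# keep=False drops every row whose (ORDER ID, ITEM CODE) pair occurs more than once.
# drop_duplicates returns a new DataFrame by default, so the result must be assigned.
result = df.drop_duplicates(subset=["ORDER ID", "ITEM CODE"], keep=False)

print(result)
```

On this toy data the call leaves only the three rows I want, so the difference on the real data may come from something else. Two guesses, neither confirmed: the result not being assigned back (or `inplace=True` not being set), or the key columns containing stray whitespace or mixed types so that values which look identical do not compare equal.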