Let's say I have a dataframe:
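import pandas as pd

a = [1,1,2,3,4]
b = [1,1,6,7,8]
c = [2,9,3,4,5]
# column names go in a list: a set has no guaranteed order, so the labels could land on the wrong columns
ab = pd.DataFrame(list(zip(a,b,c)), columns=['col2', 'col3', 'col1'])
ab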
   col2  col3  col1
0     1     1     2
1     1     1     9
2     2     6     3
3     3     7     4
4     4     8     5
Now suppose I want the rows that are unique across n columns (here col2 and col3, but I would love a general example for any n), while keeping all columns in the dataframe and dropping only the duplicate rows, as shown below:
   col2  col3  col1
0     1     1     2
2     2     6     3
3     3     7     4
4     4     8     5
What would be the best way to do this?
This is similar to the question "Subset with unique cases, based on multiple columns", but in Python.
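For concreteness, here is a minimal sketch of one way the result above might be produced, assuming pandas' `DataFrame.drop_duplicates` with its `subset` and `keep` parameters is acceptable. The helper name `unique_on` is hypothetical and only there for illustration, and the frame is rebuilt from a dict so that the snippet runs on its own.

```python
import pandas as pd

# Same data as in the question, with explicit column names.
ab = pd.DataFrame(
    {
        "col2": [1, 1, 2, 3, 4],
        "col3": [1, 1, 6, 7, 8],
        "col1": [2, 9, 3, 4, 5],
    }
)


def unique_on(df, cols):
    """Hypothetical helper: keep the first row for each distinct
    combination of values in `cols`; all other columns are retained."""
    return df.drop_duplicates(subset=cols, keep="first")


# Works for any number of columns: pass as many names as needed.
result = unique_on(ab, ["col2", "col3"])
print(result)
```

The original index labels are kept (0, 2, 3, 4 here); chaining `.reset_index(drop=True)` would renumber them if that is preferred. Passing `keep="last"` would retain the row with col1 = 9 instead of the one with col1 = 2.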