I have a dataframe as follow:
import pandas as pd
d = {'location1': [1, 2,3,8,6], 'location2':
[2,1,4,6,8]}
df = pd.DataFrame(data=d)
The dataframe df
means there is a road between two locations. look like:
location1 location2
0 1 2
1 2 1
2 3 4
3 8 6
4 6 8
The first row means there is a road between locationID1
and locationID2
, however, the second row also encodes this information. The forth and fifth rows also have repeated information. I am trying the remove those repeated by keeping only one row. Any of row is okay.
For example, my expected output is
location1 location2
0 1 2
2 3 4
4 6 8
Any efficient way to do that because I have a large dataframe with lots of repeated rows.
Thanks a lot,