I have an undirected network of connections in a dataframe.
Source_ID Target_ID
0 1 5
1 7 2
2 12 6
3 3 9
4 16 11
5 2 7 <------The same as row 1
6 4 8
7 5 1 <------The same as row 0
8 99 81
But since this is an undirected network, row 0 and row 7 are technically the same, as are row 1 and row 5. df.drop_duplicates() isn't smart enough to eliminate these as duplicates, since it sees them as two distinct rows, at least as far as my attempts have shown.
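To make the problem reproducible, here is a minimal sketch that rebuilds the example frame above and shows that a plain drop_duplicates() call removes nothing. The variable name df follows the question; the data is copied from the table.

```python
import pandas as pd

# Example edge list copied from the table above
df = pd.DataFrame({
    "Source_ID": [1, 7, 12, 3, 16, 2, 4, 5, 99],
    "Target_ID": [5, 2, 6, 9, 11, 7, 8, 1, 81],
})

# No two rows match column-for-column, so every row is kept
remaining = len(df.drop_duplicates())
print(remaining)
```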
I also tried what I thought should work, which is using the index of Source_ID and Target_ID and setting Source_ID to be "lower" than Target_ID. But that didn't seem to produce the results I needed either.
df.drop(df.loc[df['Target_ID'] < df['Source_ID']].index.tolist(), inplace=True)
Therefore, I need to figure out a way to drop the duplicate connections (while keeping the first) such that my fixed dataframe looks like (after an index reset):
Source_ID Target_ID
0 1 5
1 7 2
2 12 6
3 3 9
4 16 11
5 4 8
6 99 81
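For reference, one possible way to get that output is sketched below. It assumes both ID columns are numeric, so that sorting each pair gives an order-independent key. The names pairs, key, is_repeat and deduped are made up for illustration.

```python
import numpy as np
import pandas as pd

# Example edge list copied from the table above
df = pd.DataFrame({
    "Source_ID": [1, 7, 12, 3, 16, 2, 4, 5, 99],
    "Target_ID": [5, 2, 6, 9, 11, 7, 8, 1, 81],
})

# Sort each row's two IDs so (5, 1) and (1, 5) produce the same key
pairs = np.sort(df[["Source_ID", "Target_ID"]].to_numpy(), axis=1)
key = pd.DataFrame(pairs, index=df.index)

# Flag every occurrence of a key after the first one
is_repeat = key.duplicated(keep="first")

# Keep the original (unsorted) rows and renumber the index
deduped = df[~is_repeat].reset_index(drop=True)
print(deduped)
```

The sorted pairs are used only to detect repeats, so the surviving rows keep their original Source_ID/Target_ID orientation (for example 7, 2 stays 7, 2 rather than becoming 2, 7).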