import pandas as pd

data = {'x': ['A', 'A', 'B', 'B', 'C', 'E', 'F'],
        'y': ['B', 'C', 'A', 'C', 'D', 'F', 'G']}
df = pd.DataFrame(data)
print(df)
I have a large DataFrame like this one (simplified here to single letters):
   x  y
0  A  B
1  A  C
2  B  A
3  B  C
4  C  D
5  E  F
6  F  G
Some pairs appear in both directions, such as row 0 (A <-> B) and row 2 (B <-> A); for me these describe the same relation.
I want to group all values that are related through the x and y columns, either directly or through a chain of other values, and give each group a new unique id.
For this example table, that means:
A = B = C = D: give this group a unique id, e.g. 90
E = F = G: give this group a unique id, e.g. 91
The result table I need should be:
   id value
0  90     A
1  90     B
2  90     C
3  90     D
4  91     E
5  91     F
6  91     G
How can I achieve this with pandas? Any help would be much appreciated!
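
To make the goal concrete, here is a minimal sketch of one possible approach, not necessarily the best one. It assumes each row can be treated as an undirected edge, so the groups are the connected components of the resulting graph. It uses a small union-find (disjoint set) structure in plain Python. The names `parent`, `find`, `union`, and `START_ID`, and the starting id of 90, are assumptions made for illustration only; none of them are part of pandas.

```python
import pandas as pd

data = {'x': ['A', 'A', 'B', 'B', 'C', 'E', 'F'],
        'y': ['B', 'C', 'A', 'C', 'D', 'F', 'G']}
df = pd.DataFrame(data)

START_ID = 90   # assumed starting id, taken from the example above
parent = {}     # maps each value to its parent in the union-find forest


def find(v):
    """Return the representative (root) of the group containing v."""
    parent.setdefault(v, v)
    while parent[v] != v:
        parent[v] = parent[parent[v]]  # path halving keeps the trees shallow
        v = parent[v]
    return v


def union(a, b):
    """Merge the groups containing a and b."""
    root_a, root_b = find(a), find(b)
    if root_a != root_b:
        parent[root_b] = root_a


# Every row links its x value to its y value; the direction is ignored.
for a, b in zip(df['x'], df['y']):
    union(a, b)

# Assign one id per group, in sorted order of the values.
values = sorted(parent)
group_ids = {}
ids = []
for v in values:
    root = find(v)
    if root not in group_ids:
        group_ids[root] = START_ID + len(group_ids)
    ids.append(group_ids[root])

result = pd.DataFrame({'id': ids, 'value': values})
print(result)
```

With this sketch, A <-> B and B <-> A end up in the same group automatically, because `union` ignores direction. For a very large DataFrame, a graph library with a connected-components routine might be faster, but that has not been measured here.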