I have a problem that I would appreciate some help with.
I have stored all IP connections from an IDPS in a data frame and am trying to identify the unique connections. A connection is unordered: IP1:IP2 is the same as IP2:IP1.
import pandas as pd
import re
data = {'IP1': ['168.125.x.1', '10.10.x.1', '10.10.x.1', '168.125.x.2', '10.10.x.2'],
        'IP2': ['10.10.x.1', '168.125.x.1', '168.125.x.1', '10.10.x.2', '168.125.x.2']}
Based on this data, the answer should be:
168.125.x.1 <-> 10.10.x.1
168.125.x.2 <-> 10.10.x.2
However, I am not getting this answer at all. I would appreciate any help on how to implement this with a data frame. This is what I tried:
df = pd.DataFrame(data)
df = df.drop_duplicates(inplace=False)
temp1 = df.loc[df['IP1'].isin(df['IP2']) &
               df['IP2'].isin(df['IP1'])]
cleanconnection = df[~df.isin(temp1)].dropna()
Thanks
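One possible approach, offered as a sketch rather than a definitive answer: since IP1:IP2 counts as the same connection as IP2:IP1, sort the two addresses within each row so both orderings collapse to one canonical pair, then drop duplicate rows. The example assumes every address is a quoted string, and that the trailing dot and the 168.12.5 spelling in the sample are typos. The column names IP_a and IP_b are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with every address quoted as a string
# (assumption: the stray trailing dot and "168.12.5" spelling are typos).
data = {
    'IP1': ['168.125.x.1', '10.10.x.1', '10.10.x.1', '168.125.x.2', '10.10.x.2'],
    'IP2': ['10.10.x.1', '168.125.x.1', '168.125.x.1', '10.10.x.2', '168.125.x.2'],
}
df = pd.DataFrame(data)

# Sort the two addresses within each row (plain string order), so that
# A:B and B:A end up as the same canonical pair.
# IP_a / IP_b are hypothetical column names for the sorted pair.
pairs = pd.DataFrame(
    np.sort(df[['IP1', 'IP2']].to_numpy(), axis=1),
    columns=['IP_a', 'IP_b'],
)

# Rows that are now identical represent the same connection.
unique_connections = pairs.drop_duplicates().reset_index(drop=True)
print(unique_connections)
```

The sort uses string comparison only to obtain a consistent order within each pair; it does not order the addresses numerically. The isin check in the attempt tests each column for membership separately, so it cannot tell which IP1 was paired with which IP2, whereas sorting within the row keeps each pair together.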