IP Address Duplicate connections cleanup (Dataframes)

Asked Apr 03 '20 at 02:35

Active Apr 03 '20 at 03:19

Viewed 22 times

I have a problem that I would appreciate some help with.

I have stored all IP connections from an IDPS in a data frame and am trying to identify the unique connections. A connection is undirected: IP1:IP2 is the same as IP2:IP1.

import pandas as pd
import re

# Every address must be a quoted string; the 'x' octets are placeholders.
data = {'IP1': ['168.125.x.1', '10.10.x.1', '10.10.x.1', '168.125.x.2', '10.10.x.2'],
        'IP2': ['10.10.x.1', '168.125.x.1', '168.125.x.1', '10.10.x.2', '168.125.x.2']}

Based on this data, the answer should be:

168.125.x.1 <-> 10.10.x.1
168.125.x.2 <-> 10.10.x.2

but I am not getting this answer at all. I would appreciate any help on how to implement this with a data frame. Here is what I have tried:

df = pd.DataFrame(data)

df = df.drop_duplicates()

temp1 = df.loc[df['IP1'].isin(df['IP2']) &
               df['IP2'].isin(df['IP1'])]

cleanconnection = df[~df.isin(temp1)].dropna()
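
A likely reason this attempt does not isolate the connections: `isin` tests whether a value appears anywhere in the other column, not whether the same row has a reversed partner. The sketch below illustrates that, assuming the sample addresses are all quoted strings with the typos removed (the `x` octets are placeholders from the question; `mask` and `remaining` are names introduced here for illustration).

```python
import pandas as pd

# Sample data modelled on the question; the 'x' octets are placeholders.
df = pd.DataFrame({
    'IP1': ['168.125.x.1', '10.10.x.1', '10.10.x.1', '168.125.x.2', '10.10.x.2'],
    'IP2': ['10.10.x.1', '168.125.x.1', '168.125.x.1', '10.10.x.2', '168.125.x.2'],
}).drop_duplicates()

# isin() is a column-wide membership test, evaluated value by value.
# It says nothing about which row the matching value came from.
mask = df['IP1'].isin(df['IP2']) & df['IP2'].isin(df['IP1'])

# Every address occurs in both columns, so the mask is True on every row.
print(mask.all())

# Filtering those rows out therefore leaves an empty frame.
remaining = df[~mask]
print(len(remaining))
```

With this data every row passes the mask, so any step that subtracts the masked rows from `df` ends up with nothing, which would match the symptom described.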

Thanks

edited Apr 03 '20 at 03:19

asked Apr 03 '20 at 02:35

user13205885

It gives you the desired output when you change "df['ipaddress2'].isin(df['ipaddress1'])]" to use IP1 and IP2; this is a coding error. – Rishi Apr 03 '20 at 02:49
Sorry, I made a mistake above, Rishi. It is IP2 and IP1: temp1 = df.loc[df['IP1'].isin(df['IP2']) & df['IP2'].isin(df['IP1'])]. It still does not give the expected result... – user13205885 Apr 03 '20 at 03:00
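
One possible approach, offered as a minimal sketch rather than a definitive answer: since IP1:IP2 counts as the same connection as IP2:IP1, sort the two addresses within each row so that both directions produce the same key, then drop duplicates on that key. The sample data assumes the addresses from the question with typos removed and every value quoted; the column names `A` and `B` and the variable names are hypothetical.

```python
import pandas as pd

# Sample data modelled on the question; the 'x' octets are placeholders.
df = pd.DataFrame({
    'IP1': ['168.125.x.1', '10.10.x.1', '10.10.x.1', '168.125.x.2', '10.10.x.2'],
    'IP2': ['10.10.x.1', '168.125.x.1', '168.125.x.1', '10.10.x.2', '168.125.x.2'],
})

# Build an order-independent key: the two endpoints of each row, sorted.
# Plain string sorting is enough here because only equality of the
# resulting pair matters, not numeric ordering of the addresses.
pairs = [tuple(sorted(p)) for p in zip(df['IP1'], df['IP2'])]

# 'A' and 'B' (hypothetical names) hold the normalised endpoints.
unique_connections = (
    pd.DataFrame(pairs, columns=['A', 'B'])
    .drop_duplicates()
    .reset_index(drop=True)
)

for a, b in unique_connections.itertuples(index=False):
    print(f'{a} <-> {b}')
```

This leaves two rows, one per connection. The endpoints come out in sorted order, so each pair may be printed the other way round from the expected output in the question while still denoting the same connection.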

0 Answers