
I have written the following code to extract all of the IP addresses from a file and print them:

with open("C:\\users\\joey\\desktop\\access.log",'r') as bestand:
    for line in bestand:
        try:
            splittedline = line.split('sftp-session')[1].split("[")[1].split("]")[0]
        except Exception:
            continue
        print splittedline

The following code prints all of the IP-Addresses of another file:

with open("C:\\users\\joey\\desktop\\exit_nodes.csv",'r') as bestand:
    for line in bestand:
        print line

How can I compare the two files, remove the duplicates, and show only the unique IP addresses?

The output at the moment looks like this:

217.172.190.19
217.210.165.43
218.250.241.229
223.18.115.229
223.133.243.101
Martin Evans
joey

1 Answer


If the order is not important, use a set:

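ips_1 = set()

with open("C:\\users\\joey\\desktop\\access.log",'r') as bestand:
    for line in bestand:
        try:
            # Take the text between the first "[" and "]" after 'sftp-session'
            ips_1.add(line.split('sftp-session')[1].split("[")[1].split("]")[0])
        except IndexError:
            # Line has no 'sftp-session' or no bracketed address; skip it
            continue

ips_2 = set()
with open("C:\\users\\joey\\desktop\\exit_nodes.csv",'r') as bestand:
    for line in bestand:
        # Strip the trailing newline so entries compare equal to those in ips_1
        ips_2.add(line.strip())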

You can then use the set methods to find which IPs are in both files, which are in only one file, or to get all unique IPs:

Which IPs are in both files?

ips_1.intersection(ips_2)

Which IPs are only in file 1?

ips_1.difference(ips_2)

All unique IPs:

ips_1.union(ips_2)
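
To try the set operations above without the actual files, here is a minimal sketch on in-memory sample data. The log line layout, the helper name `extract_ip`, and the sample addresses are assumptions made up for illustration; only the `split()` chain comes from the question. It also shows `symmetric_difference`, which may be what is wanted if "unique" means addresses that appear in exactly one of the two files.

```python
# Hypothetical sample data standing in for the two files. The log line
# layout is an assumption inferred from the split() calls in the question.
log_lines = [
    "Jan 01 00:00:01 sftp-session: login from [217.172.190.19] ok",
    "Jan 01 00:00:02 sftp-session: login from [223.18.115.229] ok",
    "Jan 01 00:00:03 sftp-session: login from [217.172.190.19] ok",
    "Jan 01 00:00:04 unrelated line without an address",
]
exit_node_lines = ["217.172.190.19\n", "218.250.241.229\n"]


def extract_ip(line):
    """Return the bracketed text after 'sftp-session', or None if absent."""
    try:
        return line.split('sftp-session')[1].split("[")[1].split("]")[0]
    except IndexError:
        return None


# Adding to a set drops repeated addresses automatically.
ips_1 = set()
for line in log_lines:
    ip = extract_ip(line)
    if ip is not None:
        ips_1.add(ip)

# Strip the trailing newline so both sets hold comparable strings.
ips_2 = set(line.strip() for line in exit_node_lines)

in_both = ips_1.intersection(ips_2)                 # present in both sources
only_in_log = ips_1.difference(ips_2)               # present only in the log
in_exactly_one = ips_1.symmetric_difference(ips_2)  # present in one source only
all_unique = ips_1.union(ips_2)                     # every address, once

# Sets are unordered; sorted() gives a stable (lexicographic) listing.
for ip in sorted(all_unique):
    print(ip)
```

Note that `sorted()` here orders the addresses as strings, not numerically by octet. The `print(ip)` form with a single argument runs under both Python 2 and Python 3.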
MaxNoe