Comparing CSV matching rows with Python

Question

I have two CSV files, each containing only one column:

littleListIPs.csv:

10.187.172.140
10.187.172.141
10.187.172.142
10.187.172.143
10.187.172.144
10.187.172.145
10.187.172.154
10.187.172.155

(...)

BigListIPs.csv:

10.187.172.146
10.187.172.147
10.187.172.148
10.187.172.149
10.187.172.150
10.187.172.151
10.187.172.152
10.187.172.153
10.187.172.154
10.187.172.155

(...)

I need a script that will compare them and create a third file (output.csv), containing every single row from littleListIPs.csv, and a column that confirms if that IP exists on the BigListIPs.csv file, like in the following output (you can place ";" instead of "|"):

10.187.172.140 | Not present in BigListIPs.csv
10.187.172.141 | Not present in BigListIPs.csv
10.187.172.142 | Not present in BigListIPs.csv
10.187.172.143 | Not present in BigListIPs.csv
10.187.172.144 | Not present in BigListIPs.csv
10.187.172.145 | Not present in BigListIPs.csv
10.187.172.154 | Present in BigListIPs.csv
10.187.172.155 | Present in BigListIPs.csv

I have seen a similar case solved here on Stack Overflow (Python: Comparing two CSV files and searching for similar items), but I could not adapt it to my needs, even though mine is a simpler case. Thanks for any help.

please post what you have tried and we can help you from there — R Nar, Nov 06 '15 at 20:38

spazm · Accepted Answer · 2015-11-06T21:25:45.263

Written in Python 2.x, since that's what I have handy.

Load the BigListIPs entries into a set. Checking membership in a list is O(n); checking membership in a set is O(1) on average.
Use with to open the files, which is good practice and ensures they are closed properly.

code:

#!/usr/bin/env python

import csv

little_ip_filename = "littleListIPs.csv"
big_ip_filename = "BigListIPs.csv"
output_filename = "results.csv"

# Load all the entries from BigListIPs into a set for quick lookup.
big_ips = set()

with open(big_ip_filename, 'r') as f:
    big_ip = csv.reader(f)
    for csv_row in big_ip:
        if csv_row:  # skip blank lines, which csv.reader yields as []
            big_ips.add(csv_row[0])

# print big_ips

with open(little_ip_filename, 'r') as input_file, open(output_filename, 'w') as output_file:
    input_csv = csv.reader(input_file)
    output_csv = csv.writer(output_file)
    for csv_row in input_csv:
        if not csv_row:  # skip blank lines, which csv.reader yields as []
            continue
        ip = csv_row[0]
        status = "Present" if ip in big_ips else "Not Present"
        output_csv.writerow([ip, status + " in BigListIPs.csv"])

littleListIPs.csv:

10.187.172.140
10.187.172.141
10.187.172.142
10.187.172.143
10.187.172.144
10.187.172.145
10.187.172.154
10.187.172.155

BigListIPs.csv:

10.187.172.146
10.187.172.147
10.187.172.148
10.187.172.149
10.187.172.150
10.187.172.151
10.187.172.152
10.187.172.153
10.187.172.154
10.187.172.155

results.csv:

10.187.172.140,Not Present in BigListIPs.csv
10.187.172.141,Not Present in BigListIPs.csv
10.187.172.142,Not Present in BigListIPs.csv
10.187.172.143,Not Present in BigListIPs.csv
10.187.172.144,Not Present in BigListIPs.csv
10.187.172.145,Not Present in BigListIPs.csv
10.187.172.154,Present in BigListIPs.csv
10.187.172.155,Present in BigListIPs.csv
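
The script above targets Python 2. A minimal Python 3 sketch of the same approach follows, under a few assumptions that are not in the original answer: the function name compare_ip_files and its parameters are made up for illustration, the files are opened with newline="" as the csv module documentation recommends, blank lines are skipped, and the delimiter defaults to ";" because the question allows it.

```python
import csv


def compare_ip_files(little_path, big_path, output_path, delimiter=";"):
    """Write each IP from little_path with a present / not present status."""
    # Load the big list into a set for fast membership checks.
    with open(big_path, newline="") as f:
        big_ips = {row[0].strip() for row in csv.reader(f) if row}

    with open(little_path, newline="") as src, \
            open(output_path, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter=delimiter)
        for row in csv.reader(src):
            if not row:  # csv.reader yields [] for a blank line
                continue
            ip = row[0].strip()
            status = "Present" if ip in big_ips else "Not present"
            writer.writerow([ip, status + " in BigListIPs.csv"])
```

Calling compare_ip_files("littleListIPs.csv", "BigListIPs.csv", "output.csv") should produce rows such as 10.187.172.154;Present in BigListIPs.csv.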

SirParselot · Answer 2 · 2015-11-06T20:51:36.837

You can just use in to check whether each IP is in the big list, and then write the result to the third file:

littlelistIPs = ['10.187.172.140', '10.187.172.141', '10.187.172.142', '10.187.172.143',
                '10.187.172.144', '10.187.172.145', '10.187.172.154', '10.187.172.155']

biglistIPs = ['10.187.172.146', '10.187.172.147', '10.187.172.148', '10.187.172.149',
              '10.187.172.150', '10.187.172.151', '10.187.172.152', '10.187.172.153',
              '10.187.172.154', '10.187.172.155']

with open('output.csv', 'w') as f:
    for i in littlelistIPs:
        if i in biglistIPs:
            f.write(i + ' | Present in BigListIPs.csv\n')
        else:
            f.write(i + ' | Not present in BigListIPs.csv\n')
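
The two lists above are hard-coded for illustration. A minimal sketch of loading them from the files instead, assuming one IP per line and that blank lines should be ignored; read_ips is a hypothetical helper name, not part of the original answer.

```python
def read_ips(path):
    """Return the non-empty, whitespace-stripped lines of a text file."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]
```

With it, littlelistIPs = read_ips('littleListIPs.csv') and biglistIPs = read_ips('BigListIPs.csv') could replace the literal lists, and the loop stays unchanged.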

edited Nov 06 '15 at 20:51

answered Nov 06 '15 at 20:47


given that BigListIPs is assumed to be big, converting it to a set will make the lookup cheaper. – spazm Nov 06 '15 at 20:50
@spazm yes, very true if there are duplicates but I was assuming there wouldn't be any. – SirParselot Nov 06 '15 at 20:53
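
A rough way to see the lookup cost spazm mentions, using data with no duplicates at all. This is only a sketch: the 100,000 addresses are synthetic, not the asker's files, and the timings will vary by machine.

```python
import timeit

# Synthetic data: 100,000 unique dotted addresses, no duplicates.
big_list = ["10.%d.%d.%d" % (i // 65536 % 256, i // 256 % 256, i % 256)
            for i in range(100000)]
big_set = set(big_list)

# An address that is absent, so the list has to be scanned to the end.
probe = "192.168.0.1"

list_time = timeit.timeit(lambda: probe in big_list, number=100)
set_time = timeit.timeit(lambda: probe in big_set, number=100)
print("list: %.6fs  set: %.6fs" % (list_time, set_time))
```

The list check compares the probe against each element in turn, while the set check is a hash lookup, so the gap does not depend on whether the big list contains duplicates.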
