0

I have 2 stock lists (New and Old). How can I compare it to see what items have been added and what had been removed (happy to add them to 2 different files added and removed)?

so far I have tired along the lines of looking row by row.

import csv

new = "new.csv"
old = "old.csv"
add_file = "add.csv"
remove_file = "remove.csv"

with open(new,encoding="utf8") as new_read, open(old,encoding="utf8") as old_read:
    new_reader = csv.DictReader(new_read)
    old_reader = csv.DictReader(old_read)
    for new_row in new_reader :
        for old_row in old_reader:
            if old_row["STOCK CODE"] == new_row["STOCK CODE"]:
                print("found")

This works for 1 item. if I add an *else: * it just keeps printing that until its found. So it's not an accurate way of comparing the files.

I have 5k worth of rows.

There must be a better way to add the differences to the 2 different files and keep the same data structure at the same time ?

N.B i have tired this link Python : Compare two csv files and print out differences 2 minor issues: 1. the data structure is not kept 2. there is not reference to the change of location

Shaggy89
  • 689
  • 1
  • 5
  • 18
  • Are both of the files in any given order? – Axe319 Jan 09 '20 at 11:06
  • What do you want the output to look like? I'm a little confused by what you mean when you say "keep the data structure the same"? They're just CSVs right? By change of location do you mean what row in the CSV it occurs in? This might be easier to understand if we had a few lines of each input csv to run on. – PyPingu Jan 09 '20 at 11:06
  • files are here https://www.dropbox.com/s/yjjc60a0517zjbn/stack_files.zip?dl=0 – Shaggy89 Jan 09 '20 at 11:20
  • @Axe319 no not in any order – Shaggy89 Jan 09 '20 at 11:21
  • By any chance, do these files come out of a database (because that would make sense if they are stock lists)? – Tomalak Jan 09 '20 at 11:25
  • they get emailed 5am every morning -_- i have python reading the email as it comes in so i want to compare the differences from yesterday's to the current days so i can update my webpage (via python) – Shaggy89 Jan 09 '20 at 11:30
  • For the example files you linked, what would you expect the output to look like? – PyPingu Jan 09 '20 at 11:36
  • Yes as the second part of my program (that is working) uploads the data to my website. – Shaggy89 Jan 09 '20 at 11:42
  • That doesn't answer the question of what you expect to be in `add.csv` and `remove.csv`? – PyPingu Jan 09 '20 at 11:47
  • So in add.csv would be the items that are not in old.csv and in remove.csv would be the items no longer in new.csv – Shaggy89 Jan 09 '20 at 11:55

1 Answers1

0

You could just read the data into memory and then compare. I used sets for the codes in this example for faster lookup.

import csv

def get_csv_data(file_name):
    data = []
    codes = set()
    with open(file_name, encoding="utf8") as csv_file:
        reader = csv.DictReader(csv_file)
        for row in reader:
            data.append(row)
            codes.add(row['STOCK CODE'])

    return data, codes

def write_csv(file_name, data, codes):
    with open(file_name, 'w', encoding="utf8", newline='') as csv_file:
        headers = list(data[0].keys())
        writer = csv.DictWriter(csv_file, fieldnames=headers)

        writer.writeheader()
        for row in data:
            if row['STOCK CODE'] not in codes:
                writer.writerow(row)

new_data, new_codes = get_csv_data('new.csv')
old_data, old_codes = get_csv_data('old.csv')

write_csv('add.csv', new_data, old_codes)
write_csv('remove.csv', old_data, new_codes)
Axe319
  • 4,255
  • 3
  • 15
  • 31
  • Cheers works for the most part.. if there is no difference the it comes up with ,,,,,,,, in the remove .csv file – Shaggy89 Jan 09 '20 at 11:50
  • Yeah, this is assuming there is a difference. You could always do your comparison first and if there's no difference not even open up the csv to write to. – Axe319 Jan 09 '20 at 11:53