I am working with Python and the csv module, trying to check each row of a file for specific values. I think my loop is wrong. Where should I add the condition?

It has to compare each new data set with the existing ones: if the same data set is already present, the loop should break; otherwise it should keep writing. This is what I tried:

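import csv

# Read every existing row into a list of dicts keyed by the header.
with open('file1.csv', 'r', newline='') as csvfile:
    csvFileArray = list(csv.DictReader(csvfile))

# firstentry is the new row to check (a dict with the same keys), built elsewhere.
for i in range(len(csvFileArray)):
    #print(csvFileArray[i])
    (firstentry == csvFileArray[i])  # result is discarded; where does the condition go?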

CSV data:

Status,School,Details
Accepted,Notre Dame / Mendoza,GMAT: 690 Round: Round 4 | Midwest
Note,Vanderbilt / Owen,GRE: 318 Round: Round 2
Waitlisted,Dartmouth / Tuck,GMAT: 720 Round: Round 2

For the condition: it should match the Status, School, and Details of the new entry. If the new entry is identical to an existing entry, break; otherwise call writerow.

Aytida

1 Answer

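import csv

seen = set()
with open("tmp.csv", "r", newline="") as f:
    for line in csv.reader(f, delimiter=","):
        row = tuple(line)  # csv.reader yields lists, which are unhashable; a set needs tuples
        if row in seen:
            break  # this row already appeared earlier in the file
        seen.add(row)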

Depending on what you are looking to do, you might also find the approach in "How can I filter lines on load in Pandas read_csv function?" useful.
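
For the case described in the question, where a new entry is compared with the rows already in the file and written only if it is absent, a minimal sketch might look like the following. The function name `append_if_new` and the constant `FIELDNAMES` are hypothetical names chosen for illustration. The sketch assumes the file already exists, has the header shown in the question, and ends with a newline.

```python
import csv

# Assumption: column names taken from the sample data in the question.
FIELDNAMES = ["Status", "School", "Details"]


def append_if_new(path, new_entry):
    """Append new_entry (a dict) to the CSV at path unless an identical row exists.

    Returns True if the row was written, False if it was already present.
    """
    with open(path, "r", newline="") as f:
        for row in csv.DictReader(f):
            # Compare only the three columns of interest.
            if all(row.get(k) == new_entry.get(k) for k in FIELDNAMES):
                return False  # duplicate found: stop without writing

    # No match was found, so append the new row.
    with open(path, "a", newline="") as f:
        csv.DictWriter(f, fieldnames=FIELDNAMES).writerow(new_entry)
    return True
```

Calling `append_if_new("file1.csv", {"Status": "Note", "School": "Vanderbilt / Owen", "Details": "GRE: 318 Round: Round 2"})` against the sample data would return False and leave the file unchanged, since that row is already there. Note that `new_entry` should contain only the three keys above; `DictWriter` raises `ValueError` on unexpected keys by default.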

rudolfovic