0

As it says on the tin, I'm writing a Python (2.7) script that looks at two spreadsheets (saved as .csv), comparing certain columns to find the rows containing the same person's information, and extracting certain data from each spreadsheet for that person and putting that data on another spreadsheet.

What I've written is included below.

The problem I'm encountering is that when I run this program, the first person and a person in the middle of the sheet from the G__Notification spreadsheet are being found and reported. However, none of the others are, despite the fact that I can manually look through the spreadsheet and compare the columns myself and find the person.

It's not that they just aren't being reported, as I included a print function at the point where a match has been found, and it only prints 2--once for each of the aforementioned people.

Have I done something bizarre here? Any help would be great.

# This script will take data from Matrix.csv and GO_Notification.csv, find the matching
# data sets, and report the data in an output folder.


import csv

# Will read data from Matrix_CSV, goFile_CSV
# Will write data to AutoGeneratorOutput.csv
f = open('AutoGeneratorOutput.csv', 'wb')
g = open('GO_Notification.csv', 'rb')
h = open('Matrix.csv', 'rb')
matrixFile_reader = csv.reader(h)
goFile_reader = csv.reader(g)
outputFile_writer = csv.writer(f)

# Create the headings in the output file
headings = ['Employee #', 'Name', 'Last 4 of SS', 'Hired', 'GO Date',\
            'PL Description', 'Department Name', 'Title', 'Supervisor'\
            'Accudose', 'Intellishelf', 'Vocera']
outputFile_writer.writerow(headings)

matrixFile_reader.next()
goFile_reader.next()

while 1:
    for goRow in goFile_reader:
        goLine = goRow
        h.seek(0)     # Return to the top of the matrixFile for the next iteration
        for matrixRow in matrixFile_reader:
            try:
                matrixLine = matrixRow
                # Compare the departments, job numbers, and PLs to find a match
                if goLine[9].strip() == matrixLine[1].strip() and goLine[11].strip() == matrixLine[5].strip() \
                   and goLine[12].strip() == matrixLine[3].strip():
                      # Here's a match
                      output = [goLine[0], goLine[1], '', goLine[2], goLine[3], goLine[9],\
                              goLine[11], goLine[13], goLine[15], matrixLine[20], matrixLine[21],\
                              matrixLine[22], matrixLine[23]]
                      outputFile_writer.writerow(output)
                      print(goLine[1])
            except StopIteration:
                pass
    break

# Close the files when finished
f.close()
g.close()
h.close()
print('Finished')

2 Answers2

1

It's a little hard to tell what you mean without sample input data. You've also got some confusing unnecessary code and removing that is the first step.

while 1:
    for foo:
       goLine=goow
       [etcetera]
    break

does the same thing as

for foo:
    goLine=goRow
    [etcetera]

so you can get rid of your "while" and "break" lines.

Also, I'm not sure why you're catching StopIteration. Delete your try / catch lines.

foobarbecue
  • 6,780
  • 4
  • 28
  • 54
0

Load each file as a table table in a DB then query using a join ;)

... well, not that stupid since Python has support for Sqlite3 + in-memory DB.
See Importing a CSV file into a sqlite3 database table using Python

Community
  • 1
  • 1
Sylvain Leroux
  • 50,096
  • 7
  • 103
  • 125