
I'm working on a program in Python that converts CSVs to lists of lists. It does this multiple times for different files, so I made it into a function. I haven't encountered errors with this, but I'm worried it isn't the most Pythonic/smartest/fastest way, because these are enormous CSVs.

import csv

searchZipCode = ...  # there's a zip code here
zipCoords = ...  # there's a file here

def parseFile(selected):
    with open(selected) as selectedFile:
        selectedReader = csv.reader(selectedFile, delimiter=',')
        for row in selectedReader:
            yield row

def parseZips():
    return parseFile(zipCoords)

zips = parseZips()
for row in zips:
    if row[0] == searchZipCode:
        searchState = row[1]
        searchLat   = row[2]
        searchLong  = row[3]
        print(searchState)

Basically, I'm wondering why the `for row` loop has to appear twice. Is there not a more elegant solution?

Al.Sal
  • Your `parseFile` function is a generator, so you are actually only going through each row once. You can see a good explanation of what generators are [here](http://stackoverflow.com/a/102632/1507867) (see the sketch after these comments). – FastTurtle Jun 25 '13 at 18:36
  • Nope, this looks pretty "pythonic": generators and contexts. The parseZips function looks like it doesn't really serve a purpose. Everything else looks good, though. – John Jun 25 '13 at 18:40
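
To make the comment about generators concrete, here is a minimal, self-contained sketch; the sample rows and the `iter_rows` name are invented for illustration and are not part of the original code:

import csv

# Hypothetical sample rows standing in for the real zip-code file.
sample = ["10001,NY,40.75,-73.99", "94105,CA,37.79,-122.39"]

def iter_rows(lines):
    # A generator: rows are produced on demand, never all stored at once.
    for row in csv.reader(lines):
        yield row

rows = iter_rows(sample)
print(next(rows))            # ['10001', 'NY', '40.75', '-73.99'] -- only one row parsed so far
print(sum(1 for _ in rows))  # 1 -- the remaining row; the data is walked exactly once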

1 Answer


You can simply compare while you are reading the rows, instead of yielding and then iterating.

def findZip(selected, search):
    results = []
    with open(selected) as f:
        reader = csv.reader(f, delimiter=',')
        for row in reader:
            if row[0] == search:
                results.append(row[1:4])
    return results

If you are looking to optimize it even more, you can break out of the loop once you find a match, provided that there's going to be only one match.

def findZip(selected, search):
    with open(selected) as f:
        reader = csv.reader(f, delimiter=',')
        for row in reader:
            if row[0] == search:
                return row[1:4]
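
With the second version, the question's lookup could then be written roughly as follows; this reuses the question's `zipCoords` and `searchZipCode` and assumes, as in the early-return version above, that only the first match matters:

matched = findZip(zipCoords, searchZipCode)
if matched:  # findZip returns None when nothing matches
    searchState, searchLat, searchLong = matched
    print(searchState)
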
Achrome
  • I was thinking the yield would keep it from taking up too much memory. There can be as many matches as needed so that could get tricky. – Al.Sal Jun 25 '13 at 19:16
  • Using a generator in this case doesn't serve any purpose. Use a generator when you want to create large data only on request. In this case, a normal function would work as is. – Achrome Jun 25 '13 at 19:23
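
If there really can be many matches and memory is the concern, one middle ground is to make the search itself a generator, so the comparison still happens while reading but matches are yielded one at a time. This is only a sketch, and the name `iter_zip_matches` is invented here rather than taken from the answer:

import csv

def iter_zip_matches(selected, search):
    # Yields [state, lat, long] for each matching row, one at a time.
    with open(selected) as f:
        for row in csv.reader(f, delimiter=','):
            if row[0] == search:
                yield row[1:4]

# Only the matches you actually consume are ever held in memory:
# for state, lat, lng in iter_zip_matches(zipCoords, searchZipCode):
#     print(state)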