import csv

with open('test.csv') as f:
    reader = csv.reader(f, skipinitialspace=True, delimiter=';')
    header = next(reader)
    # one dict per row, keyed by the header fields
    a = [dict(zip(header, row)) for row in reader]

Basically, the above code transforms a CSV into a list of dictionaries. It works for small CSV files, but for my 500 MB one I get errors. I am fairly new at the company and do not want to seem like a n00b, so is there a way to do this that works within a 32-bit address space?

P.S.: I tried csv.DictReader and it didn't work:

Traceback (most recent call last):
  File "<input>", line 3, in <module>
  File "c:\python27\Lib\csv.py", line 116, in next
    d = dict(zip(self.fieldnames, row))
MemoryError

1 Answer


A 500 MB CSV is going to take up a lot more than 500 MB in memory. Load a smaller file and compare its size on disk to the memory it uses; that gives you an idea of Python's per-object overhead.
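To make that concrete, here is a rough sketch of such a comparison. 'sample.csv' is a stand-in for any small file with the same layout, and sys.getsizeof counts only the containers, so the true footprint is larger still:

import csv
import os
import sys

# 'sample.csv' is a placeholder; use any small CSV with the same layout
path = 'sample.csv'

with open(path) as f:
    reader = csv.reader(f, skipinitialspace=True, delimiter=';')
    header = next(reader)
    rows = [dict(zip(header, row)) for row in reader]

# getsizeof counts only the list and dict containers, not the
# strings inside them, so the real footprint is even bigger
container_bytes = sys.getsizeof(rows) + sum(sys.getsizeof(d) for d in rows)
print("on disk: %d bytes, containers alone: %d bytes"
      % (os.path.getsize(path), container_bytes))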

That said, the immediate problem is that your list comprehension loads the entire file into memory at once. Try loading only the portion you actually need.
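For example, if you only need the first N rows, itertools.islice lets you stop reading early. A sketch, where 1000 is an arbitrary cutoff:

import csv
from itertools import islice

with open('test.csv') as f:
    reader = csv.reader(f, skipinitialspace=True, delimiter=';')
    header = next(reader)
    # read only the first 1000 data rows; adjust to what you need
    a = [dict(zip(header, row)) for row in islice(reader, 1000)]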

You can also avoid loading the file into memory at all, using the solutions here: Read random lines from huge CSV file in Python
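A minimal sketch of that approach, wrapping the reader in a generator so only one record exists in memory at a time:

import csv

def iter_records(path):
    """Yield one dict per CSV row, never holding the whole file."""
    with open(path) as f:
        reader = csv.reader(f, skipinitialspace=True, delimiter=';')
        header = next(reader)
        for row in reader:
            yield dict(zip(header, row))

# only one record is alive at a time
for record in iter_records('test.csv'):
    pass  # replace with whatever consumes each record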

  • In memory? That's unlikely. That, or whatever function is consuming this data needs to be made smarter – TinyTheBrontosaurus Nov 27 '17 at 19:46
  • @J.Doe: Then use 64-bit Python. Being able to use boatloads of RAM is one of the major features of 64-bit architectures like x86-64. Using tons of RAM instead of writing efficient programs becomes an option. 32-bit x86 code is basically only useful to maybe save memory when you don't need to access a lot of RAM. (32-bit pointers = lower memory consumption for pointer-heavy data structures). – Peter Cordes Nov 28 '17 at 12:01