I have a 3-column file of about 28 GB. I would like to read it with Python and put its content into a list of 3-tuples. Here's the code I'm using:
f = open(filename)
col1 = [float(l.split()[0]) for l in f]  # first pass over the whole file
f.seek(0)
col2 = [float(l.split()[1]) for l in f]  # second pass
f.seek(0)
col3 = [float(l.split()[2]) for l in f]  # third pass
f.close()
rowFormat = [col1, col2, col3]
tupleFormat = zip(*rowFormat)  # on Python 2 this builds the full list of 3-tuples at once
for ele in tupleFormat:
    pass  ### do something with ele
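For comparison, the same list of tuples could also be built in a single pass (a sketch, assuming three whitespace-separated columns per line), though I'd expect it to end up holding just as much in memory:

tupleFormat = []
with open(filename) as f:
    for line in f:
        a, b, c = line.split()
        tupleFormat.append((float(a), float(b), float(c)))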
There's no 'break' in the for loop over tupleFormat, so the script really does process the whole content of the file. While it runs, 'htop' shows it taking 156 GB of virtual memory (VIRT column) and almost as much resident memory (RES column). Why is my script using 156 GB when the file itself is only 28 GB?
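My understanding is that each float becomes a full Python object and each row a tuple holding three pointers, so every value costs far more in memory than the handful of bytes it occupies as text. Here's a quick sketch of how I sized up the per-object overhead (numbers are from 64-bit CPython and may vary by version):

import sys

# Approximate per-object sizes on 64-bit CPython; exact values vary by version.
print(sys.getsizeof(1.0))              # a float object: ~24 bytes
print(sys.getsizeof((1.0, 2.0, 3.0)))  # a 3-tuple: ~64 bytes, excluding the floats it points to
print(sys.getsizeof([]))               # an empty list: ~56 bytes, plus 8 bytes per element pointer

At ~24 bytes per float plus an 8-byte list pointer, the three column lists alone cost about 96 bytes per row, and if zip() materializes its result as a list of tuples (as it does on Python 2), that adds roughly another 70 bytes per row. Assuming ~30 bytes of text per line, 28 GB is on the order of a billion rows, which lands in the same ballpark as the 156 GB htop reports. Is that the whole story?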