I have an ASCII file that is essentially a grid of 16-bit signed integers; the file size on disk is approximately 300MB. I do not need to read the file into memory all at once, but I do need to store its contents as a single container (of containers). For initial testing of memory use, I tried lists and tuples as the inner containers, with the outer container always a list, built via a list comprehension:
with open(file, 'r') as f:
    for _ in range(6):
        t = next(f)  # skipping some header lines
    # Method 1
    grid = [line.strip().split() for line in f]  # produces a 3.3GB container
    # Method 2 (on another run)
    grid = [tuple(line.strip().split()) for line in f]  # produces a 3.7GB container
After discussing the use of the grid with the team, I need to keep it as a list of lists up to a certain point, after which I will convert it to a list of tuples for program execution.
What I am curious about is how the lines of a 300MB file, stored in a container of containers, can end up with an overall size roughly 10x that of the raw file. Does each inner container really occupy that much memory just to hold a single line?
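For reference, here is a minimal sketch of how the per-row cost could be estimated with `sys.getsizeof`, assuming CPython. The sample row and the helper `row_footprint` are hypothetical and not taken from the actual file; the point is only that `split()` produces a separate `str` object per value, so the row costs the container plus every string it references.

```python
import sys


def row_footprint(row):
    """Approximate bytes used by a row container plus the objects it holds."""
    return sys.getsizeof(row) + sum(sys.getsizeof(item) for item in row)


# Hypothetical sample line; the real file's width and values are not known here.
line = " ".join(str(v) for v in range(-500, 500)) + "\n"

as_list = line.strip().split()   # same construction as Method 1
as_tuple = tuple(as_list)        # same construction as Method 2

raw_bytes = len(line)            # ASCII text: one byte per character on disk
list_bytes = row_footprint(as_list)
tuple_bytes = row_footprint(as_tuple)

print("raw line bytes:    ", raw_bytes)
print("list of str bytes: ", list_bytes, "ratio:", round(list_bytes / raw_bytes, 1))
print("tuple of str bytes:", tuple_bytes, "ratio:", round(tuple_bytes / raw_bytes, 1))
```

The exact numbers depend on the Python version and platform, so treat the ratios as indicative only. `sys.getsizeof` also reports the container's own size without the objects it points to, which is why the helper adds the items explicitly.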