
I need to store a list of integers as a string. For example, [1, 2, 3] will be stored as '1;2;3'. To do this, I first need to convert the list of integers to a list of strings, but the memory usage of this conversion is huge.

Sample code that shows the problem:

from sys import getsizeof
import tracemalloc

tracemalloc.start()

curr, peak = tracemalloc.get_traced_memory()
print((f'Current: {round(curr/1e6)} MB\nPeak: {round(peak/1e6)} MB'))

print()

list_int = [1]*int(1e6)

curr, peak = tracemalloc.get_traced_memory()
print((f'Current: {round(curr/1e6)} MB\nPeak: {round(peak/1e6)} MB'))
print(f'Size of list_int: {getsizeof(list_int)/1e6} MB')

print()

list_str = [str(i) for i in list_int]

curr, peak = tracemalloc.get_traced_memory()
print((f'Current: {round(curr/1e6)} MB\nPeak: {round(peak/1e6)} MB'))
print(f'Size of list_str: {getsizeof(list_str)/1e6} MB')

Output:

Current: 0 MB
Peak: 0 MB

Current: 8 MB
Peak: 8 MB
Size of list_int: 8.000056 MB

Current: 66 MB
Peak: 66 MB
Size of list_str: 8.448728 MB

The memory taken by both lists is similar (8 MB), but the memory used by the program during conversion is huge (66 MB).
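
One way to see where the extra memory might be going: `sys.getsizeof` on a list reports only the list object itself (essentially its array of pointers), not the objects it refers to. The sketch below also adds up the element sizes. `deep_size_of_list` is a hypothetical helper written for this illustration, and per-object sizes are CPython-specific, so treat the numbers as indicative only.

```python
from sys import getsizeof


def deep_size_of_list(items):
    """Size of the list object plus the size of every element it references.

    Hypothetical helper for illustration: it assumes every element is a
    distinct object, so shared objects are counted more than once.
    """
    return getsizeof(items) + sum(getsizeof(item) for item in items)


list_int = [1] * 1000
list_str = [str(i) for i in list_int]

shallow = getsizeof(list_str)        # pointer array only
deep = deep_size_of_list(list_str)   # pointer array + string objects

print(f'Shallow size: {shallow} bytes')
print(f'Deep size:    {deep} bytes')
print(f'Per string:   {getsizeof(list_str[0])} bytes')
```

A one-character string costs on the order of 40 to 50 bytes on 64-bit CPython, depending on the version. A million separate string objects at about 50 bytes each, plus the two 8 MB lists, would be roughly in line with the 66 MB measured above.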

How can I solve this memory issue?

Edit: My goal is to produce a single string, so I will run ';'.join(list_str) at the end. So even if I use a generator/iterable, say list_str = map(str, list_int), the memory usage comes out the same.
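
A possible explanation is that, in CPython, `str.join` collects all items from its argument into a sequence before building the result, so a generator does not avoid the temporary strings. A minimal sketch of one workaround is to convert and join in chunks, so that only one chunk's worth of small string objects is alive at a time. The function name `join_ints_chunked` and the `chunk_size` default are assumptions made for this example, not something from the question.

```python
from itertools import islice


def join_ints_chunked(numbers, sep=';', chunk_size=10_000):
    """Join integers into one string, converting chunk_size items at a time."""
    iterator = iter(numbers)
    parts = []
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            break
        # Only this chunk's temporary str objects are alive at once;
        # they are released after being merged into one larger string.
        parts.append(sep.join(map(str, chunk)))
    # Join the per-chunk strings into the final result.
    return sep.join(parts)


result = join_ints_chunked([1, 2, 3])
print(result)  # 1;2;3
```

The trade-off is that the final `sep.join(parts)` still needs the chunk strings and the result in memory together, so peak usage is likely around twice the size of the final string plus one chunk. It can be measured with `tracemalloc` in the same way as the code above.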

  • I don't know why that happens but how about using an iterable? For example, `list_str = map(str, list_int)`. That would not store the whole list of strings, so that you use less memory up front. – j1-lee Nov 25 '21 at 06:37
  • @j1-lee Yes, but as I mentioned, my end goal is to create a string, so with the iterable you suggested, when I run ';'.join(list_str) the memory usage ends up the same. – aniketsharma00411 Nov 25 '21 at 06:39
  • Ah you are right. – j1-lee Nov 25 '21 at 06:40
  • @aniketsharma00411 With a generator, memory usage will be much lower than with a list comprehension. See my answer below. – Shekhar Samanta Nov 25 '21 at 06:45
  • @ShekharSamanta Check the edit, when we use a generator while converting the list of string to string the memory usage increases. You can run this code to check that: http://vpaste.net/hCAcu – aniketsharma00411 Nov 25 '21 at 06:49
  • 66 MB is not huge by anyone's definition. – Tim Roberts Nov 25 '21 at 06:49
  • @TimRoberts 66 MB is not huge, but when I run something similar in my application, the memory consumption for a list_str of size 60 MB is 560+ MB, which crashes my server. The code I showed is just an example. – aniketsharma00411 Nov 25 '21 at 06:51
  • And 66 MB is huge if you need it on a microcontroller like the Raspberry Pico. But OK, in that case even 8 MB for a list would seem huge. – Matthias Nov 25 '21 at 07:08

1 Answer

Use NumPy instead. Try this:

from sys import getsizeof
import tracemalloc
import numpy as np

tracemalloc.start()

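# The np.str alias was removed in NumPy 1.24; 'U1' is a fixed-width,
# one-character unicode dtype, so it only fits single-digit values.
arr = np.ones((1000000,), dtype='U1')
# Use enumerate so every position is written (indexing by value hits arr[1] only).
for idx, value in enumerate([1]*int(1e6)):
    arr[idx] = str(value)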

curr, peak = tracemalloc.get_traced_memory()
print((f'Current: {round(curr/1e6)} MB\nPeak: {round(peak/1e6)} MB'))
print(f'Size of list_str: {getsizeof(list(arr))/1e6} MB')

Output, with a bit of improvement, I think:

Current: 4 MB
Peak: 12 MB
Size of list_str: 9.000112 MB
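
The 4 MB figure is consistent with how NumPy stores fixed-width unicode strings: each character takes 4 bytes inside the array's own buffer, with no per-element Python object. The following sketch checks this and shows the main caveat of a narrow dtype. The `'U1'` width is an assumption that holds only for single-digit values.

```python
import numpy as np

n = 1_000_000
arr = np.empty(n, dtype='U1')   # fixed width: one character per element

print(arr.itemsize)             # bytes per element
print(arr.nbytes / 1e6, 'MB')   # total size of the array's data buffer

# Assigning a longer string silently truncates it to the dtype's width.
arr[0] = '123'
print(arr[0])
```

For multi-digit integers the dtype would need to be as wide as the longest number, and memory grows by 4 bytes per character of that width for every element.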
Shekhar Samanta