80

I want to print the memory size of all variables in my scope simultaneously.

Something similar to:

import sys

for obj in locals().values():
    print sys.getsizeof(obj)

But with variable names before each value so I can see which variables I need to delete or split into batches.

Ideas?

zsquare
user278016

3 Answers

131

A bit more code, but this works in Python 3 and gives sorted, human-readable output:

import sys
def sizeof_fmt(num, suffix='B'):
    ''' by Fred Cirera,  https://stackoverflow.com/a/1094933/1870254, modified'''
    for unit in ['','Ki','Mi','Gi','Ti','Pi','Ei','Zi']:
        if abs(num) < 1024.0:
            return "%3.1f %s%s" % (num, unit, suffix)
        num /= 1024.0
    return "%.1f %s%s" % (num, 'Yi', suffix)

for name, size in sorted(((name, sys.getsizeof(value)) for name, value in list(
                          locals().items())), key= lambda x: -x[1])[:10]:
    print("{:>30}: {:>8}".format(name, sizeof_fmt(size)))

Example output:

                  umis:   3.6 GiB
       barcodes_sorted:   3.6 GiB
          barcodes_idx:   3.6 GiB
              barcodes:   3.6 GiB
                  cbcs:   3.6 GiB
         reads_per_umi:   1.3 GiB
          umis_per_cbc:  59.1 MiB
         reads_per_cbc:  59.1 MiB
                   _40:  12.1 KiB
                     _:   1.6 KiB

Note that this only prints the 10 largest variables and stays silent about the rest. If you really want them all printed, remove the `[:10]` from the second-to-last line.

jan-glx
  • Nice! Can you explain what these `_40` are? To me it shows multiple `_\d+` rows. Some seem to have the exact same size like a named variable, others don't. – MoRe Feb 13 '19 at 09:52
  • 3
    @MoRe these are (probably) temporary variables holding the output of jupyter notebook cells. [see documentation](https://ipython.org/ipython-doc/3/interactive/reference.html#output-caching-system) – jan-glx Feb 14 '19 at 10:37
  • "This system obviously can potentially put heavy memory demands on your system, since it prevents Python’s garbage collector from removing any previously computed results. You can control how many results are kept in memory with the configuration option `InteractiveShell.cache_size`. If you set it to 0, output caching is disabled. You can also use the `%reset` and `%xdel` magics to clear large items from memory" – jan-glx Feb 14 '19 at 10:39
  • This snippet is really useful, although my variables totaled up to about 5.1 GB, whereas memory usage according to `resource.getrusage` was around 10.9 GB. This is in Google Colab. What could be accounting for the rest of the memory usage? – demongolem Mar 23 '20 at 16:17
  • I used this snippet for a=numpy.zeros((6340,200,200)). It shows a=1.9GB. Is it normal? – gocen Mar 31 '20 at 08:22
  • 1
    @gocen: yes. 6340*200*200 doubles *64 bit/double / (8 bit/byte) / (2^30 bytes per GB) = 1.889 GB – jan-glx Mar 31 '20 at 20:57
  • @demongolem: I don't know. And I don't know enough about how python works to answer. I can imagine libraries allocating memory outside of python, there might be memory leaks, there might even be python variables not in `locals` (e.g. `globals()`?), maybe garbage collection helps, try memory profiling and consider asking a different question. – jan-glx Mar 31 '20 at 21:04
  • @jan-glx but it is 1.6 KiB for `list = [[['0.0' for col in range(6340)] for col in range(200)] for row in range(200)]`. What is the difference? – gocen Apr 01 '20 at 08:43
  • Unfortunately this doesn't include all variables in the printout – user3901917 Dec 13 '22 at 01:41
  • It's a large notebook with plenty of variables and too much code to include here, but there are lots of variables that simply aren't returned, all of which all get returned by zsquare's solution below – user3901917 Dec 13 '22 at 18:36
  • @user3901917 Well, this answer tries to reduce noise by only printing the 10 largest variables. If you really want all printed, just remove the `[:10]` in the second last line. I added a note to the answer to make this clearer. Thanks for bringing it up. – jan-glx Dec 14 '22 at 19:28
  • I see now. Thanks for clarifying! Updated to upvote. – user3901917 Dec 15 '22 at 21:37
  • 1
    I get a "dictionary changed size during iteration" when running the code above. Here is a modification exporting to a list first: `import sys def sizeof_fmt(num, suffix='B'): ... (truncated due char limit) local_vars = list(locals().items()) variables = [(var, (sys.getsizeof(obj))) for var, obj in local_vars] variables = sorted(((var, size_value) for var, size_value in variables), key= lambda x: -x[1]) variables = [(var, sizeof_fmt(size_value)) for var, size_value in variables] for var, size_fmt in variables[:10]: print("{:>30}: {:>8}".format(var, size_fmt))` – SJGD Feb 14 '23 at 10:11
  • @SJGD probably not a common problem but changed as suggested, thanks – jan-glx Feb 14 '23 at 13:45
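As a side note on the output-caching discussion in the comments above: the `_N` entries come from IPython/Jupyter's output cache, and the magics quoted from the documentation can be used to reclaim that memory. A minimal sketch (the variable name `big_df` is just a placeholder for one of your own large objects):

# in an IPython / Jupyter session
%xdel big_df      # delete one large variable and drop IPython's cached references to it
%reset out        # clear the whole output cache (_, __, _N, Out[...])

# or disable output caching from the start, in ipython_config.py:
# c.InteractiveShell.cache_size = 0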
72

You can iterate over both the keys and the values of a dictionary using `.items()`:

from __future__ import print_function  # for Python2
import sys

local_vars = list(locals().items())  # copy first so the dict can't change size during iteration
for var, obj in local_vars:
    print(var, sys.getsizeof(obj))
TomDLT
zsquare
0

I found that for some containers I wasn't getting the correct answer, since `sys.getsizeof` only reports a container's own overhead and not the objects it holds.
Combining @jan-glx's answer above with a snippet from the post below:
How to know bytes size of python object like arrays and dictionaries? - The simple way

from __future__ import print_function
from sys import getsizeof, stderr
from itertools import chain
from collections import deque
try:
    from reprlib import repr
except ImportError:
    pass

def sizeof_fmt(num, suffix='B'):
    ''' by Fred Cirera,  https://stackoverflow.com/a/1094933/1870254, modified'''
    for unit in ['','Ki','Mi','Gi','Ti','Pi','Ei','Zi']:
        if abs(num) < 1024.0:
            return "%3.1f %s%s" % (num, unit, suffix)
        num /= 1024.0
    return "%.1f %s%s" % (num, 'Yi', suffix)

def total_size(o, handlers={}, verbose=False):
    """ Returns the approximate memory footprint an object and all of its contents.

    Automatically finds the contents of the following builtin containers and
    their subclasses:  tuple, list, deque, dict, set and frozenset.
    To search other containers, add handlers to iterate over their contents:

        handlers = {SomeContainerClass: iter,
                    OtherContainerClass: OtherContainerClass.get_elements}

    """
    dict_handler = lambda d: chain.from_iterable(d.items())
    all_handlers = {tuple: iter,
                    list: iter,
                    deque: iter,
                    dict: dict_handler,
                    set: iter,
                    frozenset: iter,
                   }
    all_handlers.update(handlers)     # user handlers take precedence
    seen = set()                      # track which object id's have already been seen
    default_size = getsizeof(0)       # estimate sizeof object without __sizeof__

    def sizeof(o):
        if id(o) in seen:       # do not double count the same object
            return 0
        seen.add(id(o))
        s = getsizeof(o, default_size)

        if verbose:
            print(s, type(o), repr(o), file=stderr)

        for typ, handler in all_handlers.items():
            if isinstance(o, typ):
                s += sum(map(sizeof, handler(o)))
                break
        return s

    return sizeof(o)


##### Example call #####

for name, size in sorted(((name, total_size(value, verbose=False)) for name, value in list(
                          locals().items())), key= lambda x: -x[1])[:20]:
    print("{:>30}: {:>8}".format(name, sizeof_fmt(size)))

    
Dara O h