
I've already found the following question, but I was wondering if there was a quicker and dirtier way of grabbing an estimate of how much memory the python interpreter is currently using for my script that doesn't rely on external libraries.

I'm coming from PHP and used to use memory_get_usage() and memory_get_peak_usage() a lot for this purpose and I was hoping to find an equivalent.

Shabbyrobe
  • Possible duplicate of [Total memory used by Python process?](https://stackoverflow.com/questions/938733/total-memory-used-by-python-process) – Don Kirkby Nov 26 '18 at 06:27

6 Answers


A simple solution for Linux and other systems with /proc/self/status is the following code, which I use in a project of mine:

def memory_usage():
    """Memory usage of the current process in kilobytes."""
    result = {'peak': 0, 'rss': 0}
    # This will only work on systems with a /proc file system
    # (like Linux).
    with open('/proc/self/status') as status:
        for line in status:
            parts = line.split()
            # 'VmPeak:' -> 'peak', 'VmRSS:' -> 'rss', etc.
            key = parts[0][2:-1].lower()
            if key in result:
                result[key] = int(parts[1])
    return result

It returns the current and peak resident memory size (which is probably what people mean when they talk about how much RAM an application is using). It is easy to extend it to grab other pieces of information from the /proc/self/status file.
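As a sketch of that extension (Linux only, values in kilobytes; the extra keys `size` and `hwm` are my own naming, mapping to the VmSize and VmHWM fields shown in the dump below):

```python
def memory_usage_full():
    """Parse several Vm* fields from /proc/self/status (Linux only).

    Returns a dict with values in kilobytes: 'peak' (VmPeak),
    'rss' (VmRSS), 'size' (VmSize) and 'hwm' (VmHWM).
    """
    result = {'peak': 0, 'rss': 0, 'size': 0, 'hwm': 0}
    with open('/proc/self/status') as status:
        for line in status:
            parts = line.split()
            # 'VmPeak:' -> 'peak', 'VmSize:' -> 'size', etc.
            key = parts[0][2:-1].lower()
            if key in result:
                result[key] = int(parts[1])
    return result

print(memory_usage_full())
```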

For the curious: the full output of cat /proc/self/status looks like this:

% cat /proc/self/status
Name:   cat
State:  R (running)
Tgid:   4145
Pid:    4145
PPid:   4103
TracerPid:      0
Uid:    1000    1000    1000    1000
Gid:    1000    1000    1000    1000
FDSize: 32
Groups: 20 24 25 29 40 44 46 100 1000 
VmPeak:     3580 kB
VmSize:     3580 kB
VmLck:         0 kB
VmHWM:       472 kB
VmRSS:       472 kB
VmData:      160 kB
VmStk:        84 kB
VmExe:        44 kB
VmLib:      1496 kB
VmPTE:        16 kB
Threads:        1
SigQ:   0/16382
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffffffffffff
Cpus_allowed:   03
Cpus_allowed_list:      0-1
Mems_allowed:   1
Mems_allowed_list:      0
voluntary_ctxt_switches:        0
nonvoluntary_ctxt_switches:     0
Martin Geisler
  • Is the peak/resident in kB or bytes? – Shabbyrobe May 23 '09 at 01:58
  • Good question -- it's in kilobytes; I've added that information to the original answer. – Martin Geisler May 23 '09 at 09:16
  • Thanks heaps for the great answer. As an aside, would you have any idea why the peak ends up above 80 MB(!!!) if I spawn a bunch of threads, even though the resident stays relatively low? Also, do you have any clues as to how to do this on Win32? – Shabbyrobe May 25 '09 at 13:14
  • Not to nitpick, but is it positively kilobytes (1000) or kibibytes (1024)? – Jan 22 '18 at 17:25

You could also use the getrusage() function from the standard library module resource. The resulting object has the attribute ru_maxrss, which gives total peak memory usage for the calling process:

>>> import resource
>>> resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
2656

The Python docs aren't clear on what the units are exactly, but the Mac OS X man page for getrusage(2) describes the units as kilobytes.

The Linux man page isn't clear, but it seems to be equivalent to the /proc/self/status information (i.e. kilobytes) described in the accepted answer. For the same process as above, running on Linux, the function listed in the accepted answer gives:

>>> memory_usage()                                    
{'peak': 6392, 'rss': 2656}

This may not be quite as easy to use as the /proc/self/status solution, but it is standard library, so (provided the units are standard) it should be cross-platform, and usable on systems which lack /proc/ (eg Mac OS X and other Unixes, maybe Windows).

The getrusage() function can also be given resource.RUSAGE_CHILDREN to get the usage for child processes, and (on some systems) resource.RUSAGE_BOTH for total (self and child) process usage.

This will cover the memory_get_usage() case, but doesn't include peak usage. I'm unsure if any other functions from the resource module can give peak usage.
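Given the unit discrepancy discussed in the comments below (Linux reports ru_maxrss in kilobytes, Mac OS X in bytes), here is a hedged normalization sketch; the platform check is my own assumption and other Unixes may behave differently:

```python
import resource
import sys


def peak_memory_bytes():
    """Peak resident set size of the current process, in bytes.

    ru_maxrss is reported in kilobytes on Linux but in bytes on
    macOS, so normalize based on the platform. Other Unixes are
    not accounted for here.
    """
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == 'darwin':
        return rss          # macOS: already bytes
    return rss * 1024       # Linux: kilobytes -> bytes


print(peak_memory_bytes())
```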

Nathan Craike
  • My OS X (Lion) gives `35819520` on a process I'm running, which I'm pretty sure is `35MB` rather than `35GB`, so it would seem to be bytes. :) – Kit Sunde Aug 06 '12 at 09:28
  • On my Ubuntu 11.10 machine, resource.getrusage() gives a value much closer to the peak value of memory_usage() than to the rss. Are you sure that ru_maxrss refers to the current memory usage and not the peak memory usage? – Phani Aug 17 '12 at 15:38
  • @Phani It seems like it's peak usage to me as well. More info about ru_maxrss in this answer: http://stackoverflow.com/a/12050966/67184 – Chris Conley Jan 22 '13 at 21:29
  • Note that the `ru_idrss` field that provides the current resident set size is currently unmaintained (Linux 3.10.7-2), so it'll return 0. This [answer](http://stackoverflow.com/questions/7205806/is-getrusage-broken-in-linux-2-6-30) has more details. – Emaad Ahmed Manzoor Nov 22 '13 at 18:06
  • Mac OS definitely returns the RSS in bytes, Linux returns it in kilobytes. – Neil Dec 06 '13 at 23:52

Accepted answer rules, but it might be easier (and more portable) to use psutil. It does the same and a lot more.

UPDATE: muppy is also very convenient (and much better documented than guppy/heapy).
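For instance, a minimal psutil sketch (requires `pip install psutil`; `memory_info()` is psutil's documented API and returns sizes in bytes):

```python
import psutil  # third-party: pip install psutil

process = psutil.Process()  # current process by default
mem = process.memory_info()
print(mem.rss)  # resident set size, in bytes
print(mem.vms)  # virtual memory size, in bytes
```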

johndodo
  • Yours is my accepted answer but I wasn't the one who asked the question so the best I can give you is an upvote. – CashCow Nov 30 '12 at 09:28
  • Thanks! I have found [muppy](http://packages.python.org/Pympler/muppy.html) to be even better in some ways, and also very nicely documented - it's worth checking out if you have memory leak problems. – johndodo Dec 03 '12 at 08:29
  • For specifics, see the (almost same) answer: http://stackoverflow.com/a/21632554/1959808 – 0 _ Dec 06 '15 at 01:45

Try heapy.

dfa

/proc/self/status has the following relevant keys:

  • VmPeak: Peak virtual memory size.
  • VmSize: Virtual memory size.
  • VmHWM: Peak resident set size ("high water mark").
  • VmRSS: Resident set size.

So if the concern is resident memory, the following code can be used to retrieve it:

def get_proc_status(keys=None):
    """Parse /proc/self/status into a dict, or return selected values."""
    with open('/proc/self/status') as f:
        data = dict(map(str.strip, line.split(':', 1)) for line in f)

    return tuple(data[k] for k in keys) if keys else data

peak, current = get_proc_status(('VmHWM', 'VmRSS'))
print(peak, current)  # outputs: 14280 kB 13696 kB

Here's an article by memory_profiler's author that explains why getrusage's ru_maxrss isn't always a practical measure. Also note that VmHWM may differ from ru_maxrss (in some cases I see ru_maxrss being greater). But in the simple case they are the same:

import resource


def report():
    maxrss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    peak, current = get_proc_status(('VmHWM', 'VmRSS'))
    print(current, peak, maxrss)


report()

s = ' ' * 2 ** 28  # 256MiB
report()

s = None
report()

In addition, here's a comprehensible and informative case study by the atop authors that explains what kernel, virtual, and resident memory are, and how they are interdependent.

saaj

The same kind of data that's in /proc/self/status is also in /proc/self/statm. However, it's easier to parse, because it's just a space-delimited list of several statistics. I haven't been able to tell whether both files are always present.

/proc/[pid]/statm

Provides information about memory usage, measured in pages. The columns are:

  • size (1) total program size (same as VmSize in /proc/[pid]/status)
  • resident (2) resident set size (same as VmRSS in /proc/[pid]/status)
  • shared (3) number of resident shared pages (i.e., backed by a file) (same as RssFile+RssShmem in /proc/[pid]/status)
  • text (4) text (code)
  • lib (5) library (unused since Linux 2.6; always 0)
  • data (6) data + stack
  • dt (7) dirty pages (unused since Linux 2.6; always 0)

Here's a simple example:

from pathlib import Path
from resource import getpagesize

PAGESIZE = getpagesize()
PATH = Path('/proc/self/statm')


def get_resident_set_size() -> int:
    """Return the current resident set size in bytes."""
    # statm columns are: size resident shared text lib data dt
    statm = PATH.read_text()
    fields = statm.split()
    return int(fields[1]) * PAGESIZE


data = []
start_memory = get_resident_set_size()
for _ in range(10):
    data.append('X' * 100000)
    print(get_resident_set_size() - start_memory)

That produces a list that looks something like this:

0
0
368640
368640
368640
638976
638976
909312
909312
909312

You can see that it jumps by about 300,000 bytes after roughly 3 allocations of 100,000 bytes.

Don Kirkby