I have a Python application that, over time, consumes gigabytes of memory. To track down the memory usage I installed guppy and printed the heap at regular intervals. The report shows the following:

Partition of a set of 43325494 objects. Total size = 7524458264 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0 2556102   6 2678794896  36 2678794896  36 dict of cocotb.binary.BinaryValue
...

Objects of type BinaryValue are created in dozens of modules in my application. To make progress with debugging I need the line number, filename, or variable name associated with these objects. Is there some way to obtain this information? If not, what strategies can I use to find the root cause of the issue?

Note: Since the Python application is tightly coupled to a C program, it is not possible to run the Python code interactively, and any debug steps have to be instrumented directly in the code.

vijayvithal

1 Answer

The name through which each object is directly referenced is shown by the byvia attribute, for example:

heap()[0].byvia

Since your set is a "dict of [something]", the via will always be .__dict__, so looking at the referrers is more useful:

heap()[0].referrers.byvia
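To illustrate why the extra referrers step matters, here is a minimal sketch using only the standard library `gc` module as a stand-in for guppy. It is not guppy's implementation. The class `BinaryValueStandIn`, the list `leaky_cache`, and both helper functions are hypothetical names made up for this example. The instance dict is referenced only by its owning instance (the `.__dict__` relation), while the instance itself is referenced by whatever container is keeping it alive.

```python
import gc


class BinaryValueStandIn:
    """Hypothetical stand-in for cocotb.binary.BinaryValue."""

    def __init__(self, value):
        self.value = value


def owners_of(instance_dict):
    """Return the objects whose __dict__ is exactly instance_dict."""
    return [r for r in gc.get_referrers(instance_dict)
            if getattr(r, "__dict__", None) is instance_dict]


def holders_of(obj):
    """Return the lists and dicts that hold a reference to obj."""
    return [r for r in gc.get_referrers(obj) if isinstance(r, (list, dict))]


# Hypothetical module-level container that keeps instances alive.
leaky_cache = []
leaky_cache.append(BinaryValueStandIn(5))

inst = leaky_cache[0]
owner = owners_of(inst.__dict__)   # one step back: the instance itself
holders = holders_of(inst)         # two steps back: the real culprit
```

One step back from the dict only tells you it belongs to an instance. The second step is the one that points at the container responsible for the growth.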

The reference pattern is shown in the rp attribute, and the shortest path from the root (of guppy's traversal tree) is shown in the sp attribute:

heap()[0].rp
heap()[0].sp

I personally prefer filtering down to the set of objects of interest, then taking one single object from that set (e.g. .byid[0]) and computing the shortest path to it.
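Because the application cannot be run interactively, these steps can be wrapped in a function that is called from the instrumented code. The following is a rough sketch, assuming guppy is importable as `from guppy import hpy` (the guppy3 package on Python 3). The function name `dump_first_row` and its `stream` parameter are hypothetical, and row 0 is used only because it is the row of interest in the report above.

```python
import sys

try:
    from guppy import hpy  # assumption: guppy (guppy3 on Python 3) is installed
except ImportError:  # keeps the sketch importable where guppy is missing
    hpy = None


def dump_first_row(stream=sys.stderr):
    """Write referrer details for one object from row 0 of the heap.

    Returns True if a report was written, False if guppy is unavailable.
    """
    if hpy is None:
        stream.write("guppy is not installed; no heap report\n")
        return False
    heap = hpy().heap()
    row = heap[0]        # row 0 of the partition, as in the report
    one = row.byid[0]    # a single object taken from that set
    stream.write(str(one.referrers.byvia) + "\n")  # how its referrers are reached
    stream.write(str(one.sp) + "\n")               # shortest path from the root
    return True
```

Calling `dump_first_row(open("heap_report.txt", "a"))` at the same point where the heap is already printed would append the path information to a file (the filename is only an example).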

Documentation for all of these attributes is available for both the Python 3 version and the Python 2 version of guppy.