
I'm trying to develop a better understanding of how much memory is allocated on the heap in C++. I've written a small test program that does nothing but fill a number of 2D vectors. I'm running this on a 64-bit Linux VM and using Valgrind's Massif tool to profile the memory usage.

The environment I'm running this test in: a Linux VM running in VirtualBox on Windows 10. VM configuration: base memory 5248 MB, 4 CPUs, execution cap at 100%, VDI disk (dynamically allocated storage).

C++ memory profiling test program:

/**
 * g++ -std=c++11 test.cpp -o test.o
 */

#include <string>
#include <vector>
#include <iostream>

using namespace std;

int main(int argc, char **arg) {
    int n = stoi(arg[1]);
    vector<vector<int> > matrix1(n);
    vector<vector<int> > matrix2(n);
    vector<vector<int> > matrix3(n);
    vector<vector<int> > matrix4(n);
    vector<vector<int> > matrix5(n);
    vector<vector<int> > matrix6(n);
    vector<vector<int> > matrix7(n);
    vector<vector<int> > matrix8(n);

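    // Fill each n x n matrix element by element. push_back grows each row's
    // buffer geometrically, reallocating and copying when capacity runs out.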
    for (int i=0; i<n; ++i) {
        for (int j=0; j<n; ++j) {
            matrix1[i].push_back(j);
        }
    }

    for (int i=0; i<n; ++i) {
        for (int j=0; j<n; ++j) {
            matrix2[i].push_back(j);
        }
    }

    for (int i=0; i<n; ++i) {
        for (int j=0; j<n; ++j) {
            matrix3[i].push_back(j);
        }
    }

    for (int i=0; i<n; ++i) {
        for (int j=0; j<n; ++j) {
            matrix4[i].push_back(j);
        }
    }

    for (int i=0; i<n; ++i) {
        for (int j=0; j<n; ++j) {
            matrix5[i].push_back(j);
        }
    }

    for (int i=0; i<n; ++i) {
        for (int j=0; j<n; ++j) {
            matrix6[i].push_back(j);
        }
    }

    for (int i=0; i<n; ++i) {
        for (int j=0; j<n; ++j) {
            matrix7[i].push_back(j);
        }
    }

    for (int i=0; i<n; ++i) {
        for (int j=0; j<n; ++j) {
            matrix8[i].push_back(j);
        }
    }
}

I run the following bash script to extract memory profiles at different values of n (test.o is the program above, compiled with g++ -std=c++11; g++ version is 5.3.0):

valgrind --tool=massif --massif-out-file=massif-n250.txt ./test.o 250
valgrind --tool=massif --massif-out-file=massif-n500.txt ./test.o 500
valgrind --tool=massif --massif-out-file=massif-n1000.txt ./test.o 1000
valgrind --tool=massif --massif-out-file=massif-n2000.txt ./test.o 2000
valgrind --tool=massif --massif-out-file=massif-n4000.txt ./test.o 4000
valgrind --tool=massif --massif-out-file=massif-n8000.txt ./test.o 8000
valgrind --tool=massif --massif-out-file=massif-n16000.txt ./test.o 16000
valgrind --tool=massif --massif-out-file=massif-n32000.txt ./test.o 32000

This gives me the following results:

|-------|------------------------|
| n     | peak heap memory usage |
|-------|------------------------|
| 250   | 2.1 MiB                |
| 500   | 7.9 MiB                |
| 1000  | 31.2 MiB               |
| 2000  | 124.8 MiB              |
| 4000  | 496.5 MiB              |
| 8000  | 1.9 GiB                |
| 16000 | 6.2 GiB                |
| 32000 | 6.1 GiB                |
|-------|------------------------|

Each matrix holds n^2 ints and there are 8 matrices in total, so I expected the memory usage to be roughly f(n) = 8 * n^2 * sizeof(int).
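
As a rough sanity check, the expected payload under that formula can be computed directly; this sketch assumes 4-byte ints and ignores allocator metadata and the spare capacity vector keeps:

#include <cstdio>

int main() {
    // Expected payload in MiB for the tested values of n,
    // assuming sizeof(int) == 4 and no allocator or vector overhead.
    const long ns[] = {250, 500, 1000, 2000, 4000, 8000, 16000, 32000};
    for (long n : ns) {
        double mib = 8.0 * n * n * sizeof(int) / (1024.0 * 1024.0);
        std::printf("n=%-6ld expected ~%.1f MiB\n", n, mib);
    }
}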

Question 1: From n=250 to n=8000, why is the memory usage more or less multiplied by 4 each time n doubles?

From n=16000 to n=32000 something strange happens: Valgrind actually reports a decrease in memory usage.

Question 2: What is happening between n=16000 and n=32000? How can the peak heap memory be lower when, in theory, more data should be allocated?

See below the massif-visualizer output for n=16000 and n=32000.

[massif-visualizer screenshots: n=16000 and n=32000]

Gio
  • I've [plotted your results](http://fooplot.com/#W3sidHlwZSI6MCwiZXEiOiI4KnheMi8zMDAwMDAiLCJjb2xvciI6IiMyQjI0RkYifSx7InR5cGUiOjMsImVxIjpbWyIyNTAiLCIyLjEiXSxbIjUwMCIsIjcuOSJdLFsiMTAwMCIsIjMxLjIiXSxbIjIwMDAiLCIxMjQuOCJdLFsiNDAwMCIsIjQ5Ni41Il0sWyI4MDAwIiwiMTkwMCJdLFsiMTYwMDAiLCI2MjAwIl0sWyIzMjAwMCIsIjYxMDAiXV0sImNvbG9yIjoiIzAwMDAwMCJ9LHsidHlwZSI6MTAwMCwid2luZG93IjpbIjAiLCIzMjAwMCIsIjAiLCI4MDAwIl19XQ--) – YSC Aug 21 '17 at 14:00
  • I would recommend tagging this question with the compiler and platform you are running your test on. The internal mechanisms of memory allocation are determined by your implementation; C++ makes few impositions on the subject. – François Andrieux Aug 21 '17 at 14:03
  • My understanding is that how `vector` manages its memory depends on the compiler to some extent. Typically the memory management of `vector` includes some degree of pre-allocation as well as some degree of prediction for future allocations. Take a look at https://frogatto.com/2009/11/17/how-cs-vector-works-the-gritty-details/ – Richard Chambers Aug 21 '17 at 14:09
  • Doesn't using push_back actually alter the statistics you're looking for, mostly because of the need to constantly reallocate and copy the arrays when they run out of space? I was thinking that a better way to measure the real allocation would be to use .resize or .reserve? That way you wouldn't pay the overhead of the copy. Please feel free to crush my understanding... – Marco Aug 21 '17 at 23:03
  • The test profiling program reflects a real program which uses push_back; if push_back alters the statistics, then that should be reflected in the profiling results, which is a good thing for understanding how memory is allocated in my real program. But yes, I believe you are right that if in a real-world program you can use .reserve or .resize, you should probably do that to reduce the reallocation overhead (see the sketch after these comments); a detailed article on this matter is available here: https://frogatto.com/2009/11/17/how-cs-vector-works-the-gritty-details/ – Gio Aug 22 '17 at 09:45
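
For illustration, a minimal sketch of the .reserve variant discussed in the comments above; this is a hypothetical modification, not the program that was actually profiled:

#include <vector>

int main() {
    const int n = 1000;
    std::vector<std::vector<int> > matrix(n);
    for (int i = 0; i < n; ++i) {
        matrix[i].reserve(n);           // allocate the row's full capacity up front
        for (int j = 0; j < n; ++j) {
            matrix[i].push_back(j);     // no reallocation, no copying of existing elements
        }
    }
}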

1 Answer


1) Because the sizes of your matrix vectors (and thus their memory footprint) grow as n^2, doubling n leads to a quadrupling of memory usage. Any deviations from the exact relationship (as opposed to the asymptotic one) are due to other factors, e.g. the metadata used by malloc / std::allocator and the capacity-doubling growth strategy used by vector.
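
To see the vector growth behaviour in isolation, here is a minimal sketch (my illustration, not from the question); the growth factor is implementation-defined, and libstdc++, used by g++ 5.3.0, doubles the capacity:

#include <cstdio>
#include <vector>

int main() {
    std::vector<int> v;
    auto last_cap = v.capacity();
    for (int j = 0; j < 1000; ++j) {
        v.push_back(j);
        if (v.capacity() != last_cap) {
            // Each reallocation briefly holds both the old and the new buffer,
            // which also shows up in a heap profile as a transient spike.
            std::printf("size=%zu capacity=%zu\n", v.size(), v.capacity());
            last_cap = v.capacity();
        }
    }
}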

2) You are beginning to run out of memory, so Linux is starting to page some of it out; use --pages-as-heap=yes if you want to see the total (active + paged) memory usage. (Source: http://valgrind.org/docs/manual/ms-manual.html)
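
For a sense of scale: at n=32000 the payload alone is 8 * 32000^2 * 4 bytes, about 30.5 GiB, far more than the roughly 5 GiB of RAM configured for the VM, so the system cannot keep all of it resident.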

meowgoesthedog
  • I've tried to run my test at n=16000 with the option --pages-as-heap=yes, but unfortunately this causes Valgrind to crash. It starts off with a large number of "brk segment overflow in thread #1, can't grow to 0x4a2c000" errors, until Valgrind eventually runs out of memory and crashes. – Gio Aug 21 '17 at 14:47
  • @Gio maybe [this post](https://stackoverflow.com/questions/35129135/valgrind-reporting-a-segment-overflow) will help you – meowgoesthedog Aug 21 '17 at 14:50
  • Also the latest Valgrind, 3.13, has relaxed the memory limitations somewhat. From http://valgrind.org/docs/manual/dist.news.html: `The amount of memory that Valgrind can use has been increased from 64GB to 128GB. In particular this means your application can allocate up to about 60GB when running on Memcheck.` – Paul Floyd Aug 21 '17 at 15:35