I develop software that usually uses both OpenGL and the NVIDIA CUDA SDK. Recently I started looking for ways to reduce the run-time memory footprint, and I noticed the following (Debug and Release builds differ only by 4-7 MB):
Application startup - less than 1 MB total
OpenGL 4.5 context creation (+ GLEW loader init) - 45 MB total
CUDA 8.0 context creation (Driver API) - 114 MB total
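For reference, the setup being measured boils down to the following. This is only a minimal sketch: I use GLFW here as a stand-in for the actual window/context creation code and error handling is stripped, but the GLEW and CUDA Driver API calls and the three measurement points are the same.

```cpp
// Minimal sketch of the measured setup (error handling removed).
// GLFW is only a stand-in for my window/context creation code.
#include <GL/glew.h>      // must come before other GL headers
#include <GLFW/glfw3.h>
#include <cuda.h>         // CUDA Driver API
#include <cstdio>

int main()
{
    // Measurement point 1: plain process startup (< 1 MB).
    std::getchar();

    // Measurement point 2: OpenGL 4.5 context + GLEW init (~45 MB total).
    glfwInit();
    glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 4);
    glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 5);
    // "Headless" in my test simply means an invisible window:
    // glfwWindowHint(GLFW_VISIBLE, GLFW_FALSE);
    GLFWwindow* window = glfwCreateWindow(640, 360, "memtest", nullptr, nullptr);
    glfwMakeContextCurrent(window);
    glewInit();
    std::getchar();

    // Measurement point 3: CUDA context via the Driver API (~114 MB total).
    cuInit(0);
    CUdevice device = 0;
    cuDeviceGet(&device, 0);
    CUcontext context = nullptr;
    cuCtxCreate(&context, 0, device);
    std::getchar();   // read the counters in VS Diagnostic Tools here

    cuCtxDestroy(context);
    glfwDestroyWindow(window);
    glfwTerminate();
    return 0;
}
```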
If I create the OpenGL context in "headless" mode, the GL context uses about 3 MB less, which probably corresponds to the default framebuffer allocation. That is plausible for a 640x360 window: 640 x 360 x 4 bytes is roughly 0.9 MB per buffer, so two color buffers plus a depth/stencil buffer add up to roughly 2.6 MB.
So once the OpenGL and CUDA contexts are up, the process already consumes 114 MB.
Now, I don't have deep knowledge of the OS-specific work that happens under the hood during GL and CUDA context creation, but 45 MB for GL and 68 MB for CUDA seems like a lot to me. I know that several megabytes usually go to system framebuffers and function pointers, and that the bulk of the allocations probably happens on the driver side, but exceeding 100 MB with just "empty" contexts looks like too much.
I would like to know:
Why does GL/CUDA context creation consume such a considerable amount of memory?
Are there ways to optimize that?
The system under test: Windows 10 64-bit, NVIDIA GTX 960 (driver version 388.31), 8 GB RAM, Visual Studio 2015, 64-bit C++ console project.
I measure memory consumption using the Visual Studio built-in Diagnostic Tools -> Process Memory section.
UPDATE
I tried Process Explorer, as suggested by datenwolf. Here is a screenshot of what I got (my process is at the bottom, highlighted in yellow):
I would appreciate some explanation of that info. I have always looked at "Private Bytes" in the VS Diagnostic Tools window, but here I also see "Working Set", "WS Private", etc. Which one correctly shows how much memory my process currently uses? 281,320 K looks way too high, because, as I said above, at startup the process does nothing except create the CUDA and OpenGL contexts.
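For what it is worth, here is a small sketch of how I could dump the two counters in question from inside the process itself, using PSAPI (link against psapi.lib). As far as I understand, PrivateUsage corresponds to "Private Bytes" (memory committed exclusively for my process) and WorkingSetSize corresponds to "Working Set" (physical RAM currently mapped into the process, including pages shared with other processes, such as driver DLLs), but I would appreciate confirmation.

```cpp
// Sketch: print the counters being compared, via PSAPI (link psapi.lib).
#include <windows.h>
#include <psapi.h>
#include <cstdio>

void printMemoryCounters()
{
    PROCESS_MEMORY_COUNTERS_EX pmc = {};
    if (GetProcessMemoryInfo(GetCurrentProcess(),
                             reinterpret_cast<PROCESS_MEMORY_COUNTERS*>(&pmc),
                             sizeof(pmc)))
    {
        // PrivateUsage   ~ "Private Bytes": committed memory exclusive to this process.
        // WorkingSetSize ~ "Working Set":   physical RAM currently mapped into the process,
        //                                   including shared pages (driver DLLs etc.).
        std::printf("Private Bytes: %zu KB\n", pmc.PrivateUsage   / 1024);
        std::printf("Working Set:   %zu KB\n", pmc.WorkingSetSize / 1024);
    }
}
```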