1

I have a project written in C. I need to find out how much stack (local variables, ...) and heap memory (allocated with malloc) the process uses, so that I can decide whether a particular microcontroller (my current controller has 30 KB of RAM) meets my project's minimum RAM/stack/heap requirements.

I tried /proc/pid/smaps, but it shows a minimum of 4 kB of stack even if the file contains only two local integer variables (I think it is showing the page size or a memory range).

The output of the top command is not useful for this requirement.

Is there any tool that reports the stack used by a process (with moderate accuracy, in bytes) in real time, broken down by variables etc.? The maximum value reached over the process lifetime would also be fine. (Later I need to set up a CI job to collect these figures.)

At least I could find the heap usage with a malloc wrapper API like the one below (I don't know an easy way to account for deallocated memory).

E.g.:

    size_t usedMem = 0; /* global variable */

    void *call_malloc(size_t n)
    {
        usedMem += n;
        return malloc(n);
    }

Galaxo
  • For such a memory-constrained MCU system you should not use heap allocations at all. You should also not try to rework an application designed and built for a normal PC-like system into fitting on the embedded system. For such a small system you need to set up the requirements, analysis, design and of course the implementation specifically for that target system. Programming for small embedded systems is *very* different from programming for a normal PC. – Some programmer dude Dec 07 '22 at 13:08
  • Which compiler are you using? You want to use a memory profiler like Valgrind, Intel VTune or similar. I don't see any reliable way to measure this using the OS because the runtimes behave differently when the code runs on the OS vs. the MCU – ndu Dec 07 '22 at 13:13
  • How about just running it under debugger and seeing SP values? And for heap, see all the various ways of tracing malloc allocations. – hyde Dec 07 '22 at 14:06
  • The reason you see 4 KB is that it is the page size (as you said). That is the minimum amount the OS can allocate to anything. When you ask `malloc` for memory, and it does not own enough, it asks the kernel for more pages. `malloc` will subdivide the pages as it gives out memory. – Jason Dec 07 '22 at 14:56
  • If you have recursive functions then you don't know how deep the stack can get. – stark Dec 08 '22 at 03:22
  • 1) I am using the gcc compiler. 2) I am using mbed-coap, so malloc is used. 3) We work on different chipsets, but we have written C code that runs on Linux/Ubuntu to measure memory leaks (Valgrind), code size, and so on, and report them in GitLab CI. 4) So I need to fetch these values from the process every time new code is added. Any tool is fine, Windows/Linux, free/paid. (I need to find out the stack consumed.) – Galaxo Dec 08 '22 at 04:52
  • I'm sorry but you can't really compare the memory usage of a full Linux application to an (almost) bare-metal RTOS application. And looking through the code for the mbed-coap library, it seems you need to pass in pointers to functions for memory handling. Why not create a small array which you manage yourself in your code, and let the library "allocate" from that small array? Then you can keep track of exactly how much memory is actually used, as well as easily limit its memory usage to fit your use-cases without wasting too many bytes. – Some programmer dude Dec 08 '22 at 11:59

2 Answers

2

I found a reasonable solution.

While compiling, use the -fstack-usage flag, e.g.: gcc -g -fstack-usage filename.c

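Add the same flag to CFLAGS in the makefile. There is no need to run the executable. After compiling, a file with the same name and a .su extension will be in that folder, listing each function with its stack usage in bytes and a qualifier (static, dynamic or bounded). It can be opened with cat, vim, notepad, etc.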
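
For a CI job, the .su files can be scanned for the largest frame. Below is a minimal sketch, assuming each line has the form `file.c:line:col:function<TAB>bytes<TAB>qualifier`; the helper names `su_parse_line` and `su_max_frame` are hypothetical and not part of GCC.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse one line of a GCC .su file, assumed to look like
 *   file.c:12:6:func<TAB>48<TAB>static
 * Returns 1 and stores the byte count on success, 0 otherwise. */
int su_parse_line(const char *line, unsigned long *bytes)
{
    assert(line != NULL && bytes != NULL);
    const char *tab = strchr(line, '\t');   /* end of the location:function field */
    if (tab == NULL)
        return 0;
    char *end;
    unsigned long value = strtoul(tab + 1, &end, 10);
    if (end == tab + 1)                     /* no digits after the first tab */
        return 0;
    *bytes = value;
    return 1;
}

/* Largest per-function frame size found in one .su file, or -1 if it cannot be opened. */
long su_max_frame(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char line[1024];
    unsigned long bytes = 0, max = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        if (su_parse_line(line, &bytes) && bytes > max)
            max = bytes;
    }
    fclose(f);
    return (long)max;
}
```

Note that these are per-function frame sizes. The worst-case stack depth is the sum along the deepest call chain, which this sketch does not compute, and functions marked dynamic can use more than the listed value.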

For heap memory calculation, simply use Valgrind (for example, its Massif heap profiler, which records peak heap usage).
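
Where Valgrind is not an option, the wrapper idea from the question can be extended to account for frees by storing each block's size in a small header in front of it. Below is a minimal sketch; all names (`tracked_malloc`, `tracked_free`, the counters) are hypothetical, and it assumes a C11 compiler for `max_align_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counters: bytes currently allocated and the highest value seen. */
static size_t heap_in_use = 0;
static size_t heap_peak = 0;

/* Header stored in front of each block; the union keeps the payload aligned. */
typedef union {
    size_t size;
    max_align_t align;
} block_header;

void *tracked_malloc(size_t n)
{
    block_header *h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->size = n;                      /* remember the requested size for the free */
    heap_in_use += n;
    if (heap_in_use > heap_peak)
        heap_peak = heap_in_use;
    return h + 1;                     /* caller sees the memory after the header */
}

void tracked_free(void *p)
{
    if (p == NULL)
        return;
    block_header *h = (block_header *)p - 1;
    assert(heap_in_use >= h->size);   /* sanity check on the bookkeeping */
    heap_in_use -= h->size;
    free(h);
}

size_t tracked_heap_in_use(void) { return heap_in_use; }
size_t tracked_heap_peak(void)   { return heap_peak; }
```

The counters hold requested bytes only. The header and the allocator's own overhead are not included, so the real footprint on the target will be somewhat higher. Pointers returned by `tracked_malloc` must be released with `tracked_free`, never with plain `free`.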

PS: While digging further I found the answer below: How to determine maximum stack usage in embedded system with gcc?

Stephen Ostermiller
Galaxo
0

If you run your code with the very basic command

/usr/bin/time --verbose ${executable}

you will get output like the following. If you focus on "Maximum resident set size", and consider the values for "Average stack size" and "Average total size" (i.e. stack + heap), would that address your needs?

Command being timed: "{your_executable}"
User time (seconds): 0.00
System time (seconds): 0.01
Percent of CPU this job got: 90%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.01
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 4032
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 385
Voluntary context switches: 5
Involuntary context switches: 84
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0

This is also discussed more expansively here.

Eric Marceau