
Working a lot with microcontrollers and C++, it is important for me to know that I do not perform dynamic memory allocations. However, I would like to get the most out of the standard library (STD). What would be the best strategy to determine whether a function or class from STD uses dynamic memory allocation?

So far I have come up with these options:

  1. Read and understand the STD code. This is of course possible, but let's be honest: it is not the easiest code to read and there is a lot of it.
  2. A variation on reading the code could be to have a script search for memory allocations and highlight those parts to make the code easier to read. This would still require figuring out where the functions that allocate memory are used, and so forth.
  3. Just test what I would like to use and watch the memory with the debugger. So far I have been using this method, but it is a reactive approach. I would like to know beforehand, when designing code, what I can use from STD. Also, what is there to say that there are not some (edge) cases where memory is allocated? Those might not show up in such a limited test.
  4. Finally, I could regularly scan the generated assembler code for memory allocations. I suspect this could be scripted and included in the toolchain, but again this is a reactive method.

If you see any other options or have experience doing something similar, please let me know.

p.s. I work mainly with ARM Cortex-Mx chips at this moment compiling with GCC.

Bart
  • 1
    You can just look up each feature on cppreference before you use it. See if they allow you to provide an allocator. In general containers dynamically allocate, algorithms don't. – François Andrieux Aug 02 '21 at 11:42
  • 2
    As a test you can override the `new` operators and break in then/log them to see if they are called. – François Andrieux Aug 02 '21 at 11:45
  • @FrançoisAndrieux Thanks for responding. True but we need to know exactly. The test is a good option but what about malloc? Is that ever used in STD? – Bart Aug 02 '21 at 11:46
  • 5
    Does this answer your question? [GCC: How to disable heap usage entirely on an MCU?](https://stackoverflow.com/questions/40130374/gcc-how-to-disable-heap-usage-entirely-on-an-mcu) – Artyer Aug 02 '21 at 11:49
  • I don't know if a standard library implementation is allowed to use `malloc` directly or not, but it would be sloppy if it did. `operator new` is a customization point. – François Andrieux Aug 02 '21 at 11:51
  • 3
    Did you try to let the linker generate a map file, and scan that? Still kind of reactive, but no runtime test is necessary. – the busybee Aug 02 '21 at 11:54
  • @artyer Good tip, it would help together with the tip of francois to override new. – Bart Aug 02 '21 at 11:55
  • @thebusybee The map file is generated but I have not yet thought of using it for this. – Bart Aug 02 '21 at 11:56
  • If the language you are using does not make it clear whether dynamic allocation is used or not, it is unsuitable for embedded systems. I would recommend picking the C language for all new projects; C++ is becoming increasingly irrelevant for embedded. – Lundin Aug 04 '21 at 13:06
  • @lundin I was not asking so much about the language as about the standard library. I could have asked the same about the standard functions provided by C or any supporting library. Now, C++ is a bit "sacred" to me. ;-) Interesting that you mention that C++ is becoming irrelevant, especially as ARM (the biggest MCU designer) chose C++ for its platform Mbed OS. – Bart Aug 04 '21 at 13:27
  • @Bart The C++ standardization is heading in the wrong direction from an embedded system's point of view. There are numerous big language flaws that make C++ unsuitable, most notably the non-existent type punning. But the various standard libs were always very much unsuitable, such as the string, vector, set, map etc containers. – Lundin Aug 04 '21 at 20:11
  • 4
    @bart That is an interesting opinion you have as I have the exact opposite experience and got great benefit from features added in recent years. Think of type traits, span, std::array, automatic looping over plain arrays etc. Now I fully agree string, vector and map are unusable and that is exactly why I asked this question. – Bart Aug 05 '21 at 07:21
  • @Bart You mention you're working primarily with cortex-M. Any particular RTOS or is this a "bare-metal" environment? – Jon Reeves Aug 16 '21 at 22:42
  • @JonReeves Bare-metal in this case. – Bart Aug 17 '21 at 05:53
  • 1
    I don't see this mentioned yet: Be aware that the language _itself_ is allowed to (and does) allocate heap memory in certain circumstances. Exceptions that are thrown are often placed in heap storage since they require dynamic lifetime. Additionally, if you are planning to use coroutines, they are explicitly stated to use `new` and `delete` for the additional coroutine state they need for resumption (though the compiler may optimize this out in the right cases). – Human-Compiler Aug 18 '21 at 02:58

5 Answers

5

You have some very good suggestions in the comments, but no actual answers, so I will attempt an answer.

In essence you are implying some difference between C and C++ that does not really exist. How do you know that stdlib functions don't allocate memory?

Some STL functions are allowed to allocate memory and they are supposed to use allocators. For example, vectors take a template parameter for an alternative allocator (pool allocators are a common choice). There is even a standard function for discovering if a type uses memory.

But... some types like std::function sometimes use memory allocation and sometimes do not, depending on the size of the callable they store, so your paranoia is not entirely unjustified.
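One way to probe cases like this on your own toolchain is to replace the global `operator new` with a counting version and measure individual operations. The sketch below assumes a host or test build with exceptions enabled. The names `g_new_calls`, `count_allocations`, `g_sink` and the `probe_*` functions are illustrative, not part of any library. Whether the two `std::function` probes report an allocation depends on the standard library implementation, which is exactly what the probe is meant to reveal.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>
#include <vector>

// Number of calls to the replaced global operator new so far.
// Note: over-aligned allocations (std::align_val_t overloads) are not counted here.
static std::size_t g_new_calls = 0;

// Replacement for the global allocation function: count, then defer to malloc.
void* operator new(std::size_t size) {
    ++g_new_calls;
    if (void* p = std::malloc(size != 0 ? size : 1)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Written to by every probe so the compiler cannot discard the probed object.
const void* volatile g_sink = nullptr;

// Hypothetical helper: runs `probe` and returns how often it reached operator new.
template <typename Probe>
std::size_t count_allocations(Probe&& probe) {
    const std::size_t before = g_new_calls;
    probe();
    return g_new_calls - before;
}

// Sorting a std::array works in place.
std::size_t probe_array_sort() {
    return count_allocations([] {
        std::array<int, 4> a{3, 1, 4, 2};
        std::sort(a.begin(), a.end());
        g_sink = a.data();
    });
}

// Reserving capacity in a std::vector with the default allocator needs heap storage.
std::size_t probe_vector_reserve() {
    return count_allocations([] {
        std::vector<int> v;
        v.reserve(64);
        g_sink = v.data();
    });
}

// A capture-less lambda: may fit in std::function's internal buffer (implementation-defined).
std::size_t probe_small_function() {
    return count_allocations([] {
        std::function<int()> f = [] { return 1; };
        g_sink = &f;
    });
}

// A lambda with a 256-byte capture: may force std::function onto the heap (implementation-defined).
std::size_t probe_large_function() {
    return count_allocations([] {
        std::array<char, 256> big{};
        std::function<char()> f = [big] { return big[0]; };
        g_sink = &f;
    });
}
```

In a unit test this could be used as `assert(probe_array_sort() == 0);`, with one probe per standard library feature you intend to rely on. Rerunning the probes after a toolchain or library update shows whether anything changed.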

C++ allocates via new/delete. New/Delete allocate via malloc/free.

So the real question is, can you override malloc/free? The answer is yes, see this answer https://stackoverflow.com/a/12173140/440558. This way you can track all allocations, and catch your error at run-time, which is not bad.

You can go better, if you are really hardcore. You can edit the standard "runtime C library" to rename malloc/free to something else. This is possible with "objcopy" which is part of the gcc tool chain. After renaming the malloc/free, to say ma11oc/fr33, any call to allocate/free memory will no longer link. Link your executable with "-nostdlib" and "-nodefaultlibs" options to gcc, and instead link your own set of libs, which you generated with objcopy.

To be honest, I've only seen this done successfully once, and by a programmer who did not trust objcopy, so he just manually found the labels "malloc" and "free" using a binary editor and changed them. It definitely works though.

Edit: As pointed out by Fureeish (see comments), it is not guaranteed by the C++ standard that new/delete use the C allocator functions. It is however, a very common implementation, and your question does specifically mention GCC. In 30 years of development, I have never seen a C++ program that runs two heaps (one for C, and one for C++) just because the standard allows for it. There would simply be no advantage in it. That doesn't preclude the possibility that there may be an advantage in the future though.
Just to be clear, my answer assumes new USES malloc to allocate memory. This doesn't mean you can assume that every new call calls malloc though, as there may be caching involved, and the operator new may be overloaded to use anything at all at the global level. See here for GCC/C++ allocator schemes.

https://gcc.gnu.org/onlinedocs/libstdc++/manual/memory.html

Yet another edit:
If you want to get technical - it depends on the version of libstdc++ you are using. You can find operator new in new_op.cc, in the (what I assume is the official) source repository

(I will stop now)

Tiger4Hire
  • 1
    "*New/Delete allocate via malloc/free*" - they're not required to do so, if we're being technically correct. – Fureeish Aug 17 '21 at 21:42
  • This is a fair point. Experience using gcc leads me to believe that sharing technologies (like address sanitiser) between C and C++ is just too big an advantage for this to change quickly. I will adjust my answer to take this into account though. Thanks for the nudge. – Tiger4Hire Aug 18 '21 at 09:15
  • @Tiger4Hire I will be using a slight variation of your answer in a dedicated project that unit tests various STD functions on target. This way I hope to identify the different cases, as with `std::function`. It would also allow me to update the lib to a newer version and run the tests again. Thank you all for looking at my question! – Bart Aug 23 '21 at 07:22
3

The options you listed are pretty comprehensive, I think I would just add some practical color to a couple of them.

Option 1: if you have the source code for the specific standard library implementation you're using, you can "simplify" the process of reading it by generating a static call graph and reading that instead. In fact the llvm opt tool can do this for you, as demonstrated in this question. If you were to do this, in theory you could just look at a given method and see if it goes to an allocation function of any kind. No source code reading required, purely visual.

Option 4: scripting this is easier than you think. Prerequisites: make sure you're building with -ffunction-sections and linking with --gc-sections (-Wl,--gc-sections when linking through gcc), which together allow the linker to completely discard functions which are never called. When you generate a release build, you can simply use nm and grep on the ELF file to see if, for example, malloc appears in the binary at all.

For example I have a bare metal cortex-M based embedded system which I know for a fact has no dynamic memory allocation, but links against a common standard library implementation. On the debug build I can do the following:

$ nm Debug/Project.axf | grep malloc
700172bc T malloc
$

Here malloc is found because dead code has not been stripped.

On the release build it looks like this:

$ nm Release/Project.axf | grep malloc
$

grep here will return "0" if a match was found and something other than "0" if it wasn't, so if you were to use this in a script it would be something like:

# Fail the build if any symbol containing "malloc" survived dead-code stripping.
if nm Release/Project.axf | grep -q malloc; then
    echo "error: something called malloc"
    exit 1
fi

There's a mountain of disclaimers and caveats that come with any of these approaches. Keep in mind that embedded systems in particular use a wide variety of different standard library implementations, and each implementation is free to do pretty much whatever it wants with regard to memory management.

In fact they don't even have to call malloc and free, they could implement their own dynamic allocators. Granted this is somewhat unlikely, but it is possible, and thus grepping for malloc isn't actually sufficient unless you know for a fact that all memory management in your standard library implementation goes through malloc and free.

If you're serious about avoiding all forms of dynamic memory allocation, the only sure way I know of (and have used myself) is simply to remove the heap entirely. On most bare metal embedded systems I've worked with, the heap start address, end address, and size are almost always provided as symbols in the linker script. You should remove or rename these symbols. If anything is using the heap, you'll get a linker error, which is what you want.

To give a very concrete example, newlib is a very common libc implementation for embedded systems. Its malloc implementation requires that the common sbrk() function be present in the system. For bare metal systems, sbrk() is just implemented by incrementing a pointer that starts at the end symbol provided by the linker script.

If you were using newlib, and you didn't want to mess with the linker script, you could still replace sbrk() with a function that simply hard faults so you catch any attempt to allocate memory immediately. This in my opinion would still be much better than trying to stare at heap pointers on a running system.
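A minimal sketch of such a replacement is shown below. It assumes a newlib-style port where the application supplies `_sbrk` with C linkage; the exact name and signature depend on your libc and its syscall stubs, so check those first. The hook `g_heap_request_hook` and the function `trap_on_heap_request` are hypothetical names introduced here so that the behaviour can be exercised in a host test. `__builtin_trap()` is a GCC/Clang builtin; on target you might prefer a breakpoint instruction or a call into your own fault handler.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>

// Hypothetical hook type: invoked for every heap growth request.
using HeapRequestHook = void (*)(std::ptrdiff_t increment);

// Default behaviour: stop immediately so the offending call chain
// is visible in the debugger or fault handler.
static void trap_on_heap_request(std::ptrdiff_t /*increment*/) {
    __builtin_trap();
}

// Can be pointed at a recording function in a test build.
HeapRequestHook g_heap_request_hook = trap_on_heap_request;

// Assumption: the libc's malloc obtains memory only through this function.
// It never hands out memory: it reports the request and then fails it.
extern "C" void* _sbrk(std::ptrdiff_t increment) {
    g_heap_request_hook(increment);
    errno = ENOMEM;
    return reinterpret_cast<void*>(-1);  // conventional sbrk failure value
}
```

With the default hook, the first attempt to grow the heap stops the program at the point of the request. If the hook is replaced by one that returns, `malloc` should see the failure value and report that no memory is available.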

Of course your actual system may be different, and you may have a different libc implementation that you're using. This question can really only be answered to any reasonable satisfaction in the exact context of your system, so you'll probably have to do some of your own homework. Chances are it's pretty similar to what I've described here.

One of the great things about bare metal embedded systems is the amount of flexibility that they provide. Unfortunately this also means there are so many variables that it's almost impossible to answer questions directly unless you know all of the details, which we don't here. Hopefully this will give you a better starting point than staring at a debugger window.

Jon Reeves
2

To make sure you do NOT use dynamic memory allocation, you can override the global new operator so that it always throws an exception. Then run unit tests against all your use of the library functions you want to use.

You may need help from the linker to avoid use of malloc and free as technically you can't override them.

Note: This would be in the test environment. You are simply validating that your code does not use dynamic allocation. Once you have done that validation, you don't need the override anymore so it would not be in place in the production code.


Are you sure you want to avoid them?

Sure, you don't want to use dynamic memory management that is designed for generic systems. That would definitely be a bad idea.

BUT does the toolchain you use not come with an implementation specific to your hardware that does an intelligent job for that hardware? Or does it have a special way to compile that lets you use only a known piece of memory that you have pre-sized and aligned for the data area?


Moving to containers. Most STL containers allow you to specialize them with an allocator. You can write your own allocator that does not use dynamic memory.
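As one illustration of this idea, C++17 already ships building blocks for it in `<memory_resource>`. The sketch below backs a `std::pmr::vector` with a fixed static buffer and uses `std::pmr::null_memory_resource()` as the upstream, so the container can never fall back to the heap. It assumes a C++17 toolchain whose library provides `<memory_resource>` and, for the overflow case, a build with exceptions enabled (for example a host test environment). The buffer size and the names `g_storage`, `vector_lives_in_static_buffer` and `overflow_is_rejected` are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory_resource>
#include <new>
#include <vector>

// Pre-sized, suitably aligned storage that stands in for the heap.
alignas(std::max_align_t) static std::byte g_storage[256];

// Fills a vector and checks that its elements live inside g_storage.
bool vector_lives_in_static_buffer() {
    std::pmr::monotonic_buffer_resource pool(
        g_storage, sizeof g_storage, std::pmr::null_memory_resource());
    std::pmr::vector<int> v(&pool);
    v.reserve(8);
    for (int i = 0; i < 8; ++i) v.push_back(i);

    const auto begin = reinterpret_cast<std::uintptr_t>(g_storage);
    const auto data = reinterpret_cast<std::uintptr_t>(v.data());
    return data >= begin &&
           data + v.size() * sizeof(int) <= begin + sizeof g_storage;
}

// Requests more than the buffer holds. The null upstream resource
// refuses by throwing std::bad_alloc instead of touching the heap.
bool overflow_is_rejected() {
    std::pmr::monotonic_buffer_resource pool(
        g_storage, sizeof g_storage, std::pmr::null_memory_resource());
    std::pmr::vector<int> v(&pool);
    try {
        v.resize(1024);  // needs far more than 256 bytes
    } catch (const std::bad_alloc&) {
        return true;
    }
    return false;
}
```

A monotonic resource never reuses released memory, so it suits containers that are sized once. Whether a fixed pool like this is acceptable at all depends on the coding standard that applies to the project.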

Martin York
  • The current platform (arm cortex-m) does not allow for exceptions to be thrown. The applications I make require by regulations and/or code standards no dynamic memory to be used out of safety considerations. A memory pool would be allowed in less restrictive applications. – Bart Aug 19 '21 at 06:21
  • @Bart I would not have that on your cortex. But in your test environment. Once you have confirmed that no dynamic allocation is done then you don't need the override. – Martin York Aug 19 '21 at 17:22
  • @Bart: It was a good point. I added that note after your comment. – Martin York Aug 19 '21 at 18:51
1

Generally you can check (suitably thorough) documentation to see whether the function (e.g., a constructor) can throw std::bad_alloc. (The inverse is often phrased as noexcept, since that exception is often the only one risked by an operation.) There is the exception of std::inplace_merge, which becomes slower rather than throwing if allocation fails.

Davis Herring
1

The gcc linker supports a -Map option which will generate a link map with all the symbols in your executable. If anything in your application does dynamic memory allocation unintentionally, you will find a section with *alloc and free functions.
If you start with a program with no allocation, you can check the map after every compile to see if you have introduced one through the library function calls.

I used this method to identify an unexpected dynamic allocation introduced by using a VLA.

AShelly
  • new/delete are not required to call alloc/free. Though, a quick check of your implementation should tell you if they do or not. – Martin York Aug 19 '21 at 01:35
  • My current IDE (Atollic/STM32CubeIDE) and debugger Segger Ozone allow me to see this. It is however a manual operation to do so. A good final check before releasing code though. – Bart Aug 19 '21 at 06:30