
A common type of bug in C programs is that the program uses some uninitialized data, most often assuming something is zero when it has in fact never been initialized to zero. Such a program can seem to work because those memory locations just happen to be zero, but then one day there is some garbage there and your program crashes.

I know that valgrind is a great tool to find such problems. But sometimes valgrind cannot be used, for example if the program does memory allocation in a nonstandard way.

My question: is there some compiler option for gcc (or clang) that asks the compiler to initialize local variables to some nonzero "poison" values, in order to expose that kind of bug?

I think it should be technically possible for the compiler to do that, to insert some instructions at each function call to put that data into the memory space of stack variables that would normally be uninitialized. There would be some performance cost, but cheap compared to using valgrind, and also valgrind may not be possible to use in some cases.

Edit: let me clarify that this question is not about compiler warnings. Of course compiler warnings are very helpful, they should be turned on and taken care of, but that does not solve all problems with uninitialized data. For example, the program may take the address of a local variable and pass that to a function, then the compiler will not know if the address is passed to allow the function to copy data there (which would be fine) or if the function will use the data pointed to (which would mean using uninitialized data).
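To make the case above concrete, here is a minimal sketch (all names are hypothetical, not taken from any real code base). The caller hands the address of an uninitialized local to a callback it knows nothing about, so it cannot tell whether the callback will only write through the pointer or will read from it first:

```c
#include <stddef.h>

/* A callback that receives the address of the caller's local variable. */
typedef void (*int_callback)(int *p);

/* Only writes through the pointer: fine with an uninitialized local. */
void store_seven(int *p)
{
    *p = 7;
}

/* Reads the pointee before writing: a bug if the local was never set. */
void add_one(int *p)
{
    *p = *p + 1;
}

/* x is deliberately left uninitialized. From this function alone the
 * compiler cannot know whether cb reads *p or only writes it, so it
 * has no solid basis for a warning here. */
int call_with_local(int_callback cb)
{
    int x;
    cb(&x);
    return x;
}
```

Here `call_with_local(store_seven)` is well defined and returns 7, while `call_with_local(add_one)` would use uninitialized data without any warning being required at the call site.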

Elias
  • most any good compiler will warn when an uninitialized variable is used as a source value. This is one (of the many) excellent reasons to enable the warnings when compiling – user3629249 Jun 03 '20 at 23:30
  • @Elias - I'm not seeing any compiler option that would init locals as described. Though I have an idea; first, would adding some user-provided code to the mix be OK? – Milag Jun 06 '20 at 22:34
  • @Milag adding some code can be OK, please share your idea! – Elias Jun 07 '20 at 06:56
  • have a look at clang static analyzer, i.e. `scan-build make ...` – bbonev Jun 10 '20 at 21:59

4 Answers


Yes -- clang has the -fsanitize=memory option.

Here's a short excerpt from the docs:

If a bug is detected, the program will print an error message to stderr and exit with a non-zero exit code.

% ./a.out
WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x7f45944b418a in main umr.cc:6
    #1 0x7f45938b676c in __libc_start_main libc-start.c:226

You can also use -fsanitize-memory-track-origins to get even more information about the problem.

% clang -fsanitize=memory -fsanitize-memory-track-origins=2 -fno-omit-frame-pointer -g -O2 umr2.cc
% ./a.out
WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x7f7893912f0b in main umr2.cc:7
    #1 0x7f789249b76c in __libc_start_main libc-start.c:226

  Uninitialized value was stored to memory at
    #0 0x7f78938b5c25 in __msan_chain_origin msan.cc:484
    #1 0x7f7893912ecd in main umr2.cc:6

  Uninitialized value was created by a heap allocation
    #0 0x7f7893901cbd in operator new[](unsigned long) msan_new_delete.cc:44
    #1 0x7f7893912e06 in main umr2.cc:4

See the full documentation (linked above) for details about usage, runtime cost, and other tips.
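For a C program, a minimal sketch of the kind of bug this is meant to catch could look like the following. The function name and the `initialize` parameter are made up for illustration; they are not from the MemorySanitizer docs:

```c
#include <stdlib.h>

/* Returns 1 if the first element of a freshly allocated array is
 * nonzero, 0 if it is zero, and -1 if the allocation fails.
 * When initialize == 0, nothing is ever stored to a[0], so the
 * branch below depends on an uninitialized heap value. */
int first_element_is_set(int initialize)
{
    int *a = malloc(10 * sizeof *a);
    if (a == NULL)
        return -1;

    if (initialize)
        a[0] = 5;

    /* Uninitialized read when initialize == 0. */
    int result = a[0] ? 1 : 0;

    free(a);
    return result;
}
```

Calling `first_element_is_set(argc - 1)` from `main` in a binary built with `clang -fsanitize=memory -fno-omit-frame-pointer -g` would be expected to produce a use-of-uninitialized-value report like the one above when no arguments are given, though optimization can change what is actually reported.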

Snild Dolkow

In the crude category, you can write a simple function that calls alloca() to grab a big hunk of stack space, memset()s it or otherwise initializes it, and then returns. Call it right before the function call you want to test.
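A rough sketch of that, assuming gcc or clang on a platform that provides `<alloca.h>`. The name `poison_stack`, the 64 KiB size and the 0xA5 fill byte are arbitrary choices for illustration. A volatile loop stands in for memset() here so that the compiler cannot discard the stores as dead:

```c
#include <alloca.h>
#include <stddef.h>

#define POISON_BYTES (64 * 1024)  /* arbitrary; keep well below the stack limit */
#define POISON_VALUE 0xA5         /* arbitrary nonzero fill byte */

/* Fills a chunk of stack below the caller's frame with POISON_VALUE
 * and returns the number of bytes written. noinline keeps the buffer
 * in this function's own frame, so it is released again on return. */
__attribute__((noinline))
size_t poison_stack(void)
{
    volatile unsigned char *p = alloca(POISON_BYTES);
    size_t i;

    for (i = 0; i < POISON_BYTES; i++)
        p[i] = POISON_VALUE;

    return i;
}
```

Call `poison_stack();` immediately before the suspect call. Uninitialized locals in that function are then likely, but in no way guaranteed, to contain 0xA5 bytes instead of zeros.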

  • Thanks, that's an interesting idea. It is basically what I was looking for except that I would like it done for all function calls. Typically you will not know which function is causing trouble, that is rather what you are trying to find out. If there was a way to make the compiler automatically insert something like that before every function call, that would be great. – Elias Jun 08 '20 at 13:10

Compiler does that by default:

int foo()
{
    int y;

    if(y > 0) return 0;
    return 1;
}

and compiler warns you:

<source>: In function 'foo':
<source>:11:7: warning: 'y' is used uninitialized in this function [-Wuninitialized]
   11 |     if(y > 0) return 0;
      |       ^

Compiler returned: 0

Just enable all warnings and do not ignore them. You do not need anything else.

0___________
  • compiler returning 0 indicates the compiler thinks it was ok to use an uninitialized variable. Suggest (for `gcc`) to include the option: `-Werror` – user3629249 Jun 03 '20 at 23:34
  • It is not needed. It is **only** needed for those who **ignore** warnings. – 0___________ Jun 03 '20 at 23:36
  • Compiler warnings do not solve all problems, I edited the question to clarify this. – Elias Jun 03 '20 at 23:44
  • -Wall -Werror in all the projects my teams work on - always – pm100 Jun 04 '20 at 00:17
  • Does it mean that if you allow compiling with warnings, your team ignores them? Does it have to be enforced? – 0___________ Jun 04 '20 at 08:22
  • @P__J__ *It is not needed.. It is only needed for those who ignore warnings.* Not true. Automated builds won't fail without `-Werror`. – Andrew Henle Jun 07 '20 at 15:15
  • @pm100 - FWIW, I've worked in multiple groups that _chose_ those options; results tracked after adding them showed sustained trends of fewer regressions. – Milag Jun 07 '20 at 18:30

To summarize the topic so far:

  • Q: compiler option to automate initializing locals?
  • the init may include added code
  • compiler warnings and their merit have been covered

Short answer: there's no apparent gcc option to initialize locals as described, at least not on its own.

gcc's -finstrument-functions is normally used for custom profiling, but an unconventional, user-provided hook routine could use it to init the space consumed by locals within the caller's stack frame. But can that be done reliably?

After building sources with -finstrument-functions, there will be compiler generated calls to __cyg_profile_func_enter() and __cyg_profile_func_exit(). For discussion, these are aliased here to the shorter syms cyg_enter() and cyg_exit().

Create a separate file, eg cyg.c, to be built without -finstrument-functions -- avoiding recursion. Add a stub routine cyg_exit() and provide cyg_enter() with content like this:

  • obtain info for the caller's stack frame
  • set locals low_addr and size for the caller's frame, adjusting as needed
  • and write a pattern, eg: memset(low_addr, '\x1f', size)

If this idea works -- probably with refinement and limitations -- then on return from cyg_enter() the values of the locals in the caller's frame are based on the pattern.
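As a starting point, the hooks in cyg.c could be sketched as below. This is only a skeleton under stated assumptions: the hook names and signatures are the ones gcc calls for -finstrument-functions, but the counter, the recorded frame address and the two accessor functions are hypothetical bookkeeping, and the actual stack write is deliberately left as a comment because the frame bounds are ABI-specific and a wrong range would corrupt saved registers or return addresses.

```c
/* cyg.c -- build this file WITHOUT -finstrument-functions. */
#include <stddef.h>
#include <stdint.h>

/* Extra guard so the hooks are never instrumented themselves. */
#define NO_INSTRUMENT __attribute__((no_instrument_function))

static size_t enter_count;         /* hypothetical: number of hook calls */
static uintptr_t last_hook_frame;  /* hypothetical: frame seen by last call */

NO_INSTRUMENT void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;

    /* Frame address of the hook itself. The instrumented function is
     * this hook's caller, so its frame lies just beyond this one. */
    last_hook_frame = (uintptr_t)__builtin_frame_address(0);
    enter_count++;

    /* Left out on purpose: derive low_addr and size for the caller's
     * locals, then memset(low_addr, '\x1f', size). */
}

/* Stub: nothing to do on function exit. */
NO_INSTRUMENT void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
}

NO_INSTRUMENT size_t cyg_enter_count(void)
{
    return enter_count;
}

NO_INSTRUMENT uintptr_t cyg_last_hook_frame(void)
{
    return last_hook_frame;
}
```

Linking this object with sources compiled with -finstrument-functions makes every instrumented function call the enter hook on entry, which is where the pattern write would go once the frame arithmetic is worked out for the target platform.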

====

Generally I've found that some compiler warnings lead to a more careful coding style that avoids surprises. Absent a compiler option to init locals as described, I would not promote a private stack-writer method for general use, although it might have some utility. Remember that this idea is limited to a possible method to init locals. For comments about the merits of scribbling on the stack, please create a separate posting as needed.

Milag