I noticed that if you allocate a char array inside of a function like this

void example()
{
    char dataBuffer[100] = { 0 };
}

and then examine the disassembly of the function in IDA, the compiler actually inserts a call to memset() to initialize the char array. After I reversed it, it looks something like this

memset(stackPointer + offset + 1, 0, 100);

The raw assembly looks like

addic     r3, r1, 0x220
addic     r3, r3, 1
clrldi    r3, r3, 32
li        r4, 0
li        r5, 0x64
bl        memset

But if I change the example() function to

void example()
{
    char dataBuffer[100];
}

then the call to memset() is not inserted, as I saw when examining the disassembly in IDA. So my question is: if the char array is not initialized to zero, is it still safe to work with? For example

void example()
{
    char dataBuffer[100];
    strcpy(dataBuffer, "Just some random text");
    strcat(dataBuffer, "blah blah blah example text\0"); // the explicit \0 is probably not required: strcpy() appends a null terminator and strcat() moves it to the end of the string
}

Should I expect any UB when writing to or reading from the char array like this, given that it is not zero-initialized by the memset() call that = { 0 } inserts?

Coder1337
  • See [Has C++ standard changed with respect to the use of indeterminate values and undefined behavior in C++14?](https://stackoverflow.com/q/23415661/1708801). The contents will be indeterminate; if we produce an indeterminate value (except for unsigned narrow char types), that is UB, but writing to the elements gives them a valid value. – Shafik Yaghmour Mar 15 '18 at 16:11
  • It is safe to work with, assuming you treat it as being in a garbage state to begin with. If you do unsafe things with it (e.g., treating it as containing valid data), then it is not safe. – Eljay Mar 15 '18 at 16:12
  • If `memset` can write safely to uninitialized memory, then why should `strcpy` not be able to write safely to uninitialized memory? – Jabberwocky Mar 15 '18 at 16:15
  • @MichaelWalz what is done in assembly does not necessarily have the same semantics as it does in C++. The implementation may be using platform-specific assumptions that may not be valid in general. So the logic is slightly flawed without deeper analysis. It is dangerous to extrapolate from the implementation, especially with respect to UB. – Shafik Yaghmour Mar 15 '18 at 16:19
  • @ShafikYaghmour IMO the question remains the same if all references to assembly code are removed. In any (standard-conforming) C/C++ implementation you _can_ write to uninitialized local variables, otherwise local variables could not be used at all. – Jabberwocky Mar 15 '18 at 16:22
  • @MichaelWalz the conclusion that it is safe to write to is correct ... the implication that if the implementation does X it should be safe for you to do X is not correct in the "general case". We cannot reason about UB based on the implementation, since it is not constrained as the user is. – Shafik Yaghmour Mar 15 '18 at 16:24
  • Reading uninitialized variables is Undefined Behaviour. If you do that, the program has no meaning and the compiler is allowed to do whatever it wants (but is not required to warn you). So no. It is *not* safe to read, but it is ok to write. – Jesper Juhl Mar 15 '18 at 16:27

2 Answers

It's perfectly safe to work with it as an array with garbage data. This means writing into it is safe, but reading from it before you have written to it is not; you simply don't know what is in it yet. The function strcpy doesn't read from the array it gets (or, more specifically, from the destination pointer it gets); it just writes to it. So it's safe. strcat does read the destination, but only up to the null terminator that strcpy has already written.

After you are done writing into your char buffer and come to use it, you are going to go through it until you encounter a null (0) character. That null character was put there by the last write into the buffer. After that null character comes garbage if you didn't initialize the array, and zeros if you did. In both cases it doesn't matter, since you are not going to read past the null character.
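
As a minimal sketch of the point above (the helper name `buildMessage` is hypothetical and not part of the question), the buffer below starts out uninitialized, yet every byte that is later read has been written first:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: builds the text in an uninitialized stack buffer
// and returns a copy of it.
std::string buildMessage()
{
    char dataBuffer[100];  // contents are indeterminate at this point

    // strcpy only writes to the destination: 21 characters plus '\0'.
    std::strcpy(dataBuffer, "Just some random text");

    // strcat reads the destination, but only up to the '\0' that strcpy
    // just wrote, then appends 27 characters and a new '\0'.
    std::strcat(dataBuffer, "blah blah blah example text");

    // Constructing the string reads up to the terminator and no further,
    // so the garbage bytes after it are never touched.
    return std::string(dataBuffer);
}
```

Calling `buildMessage()` should give the same 48-character result whether or not `dataBuffer` was declared with `= { 0 }`.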

See: http://www.cplusplus.com/reference/cstring/strcpy/

It uses a very similar example to the code you provided.

Megadardery
  • 1) You should mention that reading uninitialized data is Undefined Behaviour. 2) I'd recommend cppreference.com as a better reference site than cplusplus.com. – Jesper Juhl Mar 15 '18 at 16:31

The line

char dataBuffer[100];

calls the variable dataBuffer into existence, and thus also associates memory with it. However, as an optimization, this memory is not initialized. C is designed not to perform any unnecessary work, and you are working in the C subset of C++ here.
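
The difference between the two declarations can be sketched as follows. The helper name `zeroInitializedIsAllZero` is made up for illustration; the sketch only checks that `= { 0 }` leaves the same bytes as an explicit `memset`, and it never reads the other buffer before writing to it:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns true if a buffer declared with = { 0 }
// holds the same bytes as one that was zeroed explicitly with memset.
bool zeroInitializedIsAllZero()
{
    char initialized[100] = { 0 };  // every element is initialized to zero
    char reference[100];            // indeterminate until it is written

    // Write to the second buffer before reading it.
    std::memset(reference, 0, sizeof reference);

    return std::memcmp(initialized, reference, sizeof initialized) == 0;
}
```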

That said, if your compiler can prove that you don't actually use the memory, it does not need to allocate it. But such an optimization would not be detectable from within your running, standard compliant code by definition. Your code will run as if the memory had been allocated. (This as-if rule is the basis for pretty much all optimizations that your compiler is allowed to perform.)

Your strcpy() and strcat() calls are fine, as they do not overrun the allocated buffer. But it is better to forget that strcpy() and strcat() exist; there are better, safer functions to use nowadays.
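
The answer does not name the safer functions, so the following is only one possible sketch, assuming `std::snprintf` as the bounded alternative. The helper name `buildBounded` is hypothetical. It never writes more than `dstSize` bytes and reports whether the text was truncated:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: formats both pieces into dst without writing more
// than dstSize bytes. Returns true only if the whole text fit.
bool buildBounded(char* dst, std::size_t dstSize)
{
    // snprintf returns the length the full text would have had,
    // not counting the terminating '\0', or a negative value on error.
    int needed = std::snprintf(dst, dstSize, "%s%s",
                               "Just some random text",
                               "blah blah blah example text");

    return needed >= 0 && static_cast<std::size_t>(needed) < dstSize;
}
```

With a 100-byte buffer the call should report success; with a 10-byte buffer it should report truncation and leave a terminated 9-character string rather than overrunning the array.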

cmaster - reinstate monica