0

Consider this snippet:

void init_seed(char *key)
{
    char *seed = key;

    size_t seed_len = strlen(seed);

    // Make sure the seed is at least 12 bytes long
    if (seed_len < 12) {
        char new_seed[13];

        for (int i = 0; i < 12; i++)
            new_seed[i] = seed[i % seed_len];

        new_seed[12] = '\0';
        seed = new_seed;
    }

    /* Use the seed variable */
}

The reason I declared it this way is that I do not want to use malloc() in this function, because it will complicate the function quite a lot.

The function works as intended (gcc 4.8.4). However, does declaring new_seed inside the if statement and then assigning it to seed cause undefined behavior?

Pieter van den Ham
  • 4,381
  • 3
  • 26
  • 41
  • 6
    Yes, this is undefined behavior for accessing an object after the end of its lifetime. – EOF Nov 01 '16 at 14:49
  • 2
    You want to know if using an object past its storage duration (lifetime). What did you find out reading your C book? – too honest for this site Nov 01 '16 at 14:50
  • 1
    You might have no choice than to use `malloc` (or `strdup`) and `realloc`. Or think about a redesign? – Some programmer dude Nov 01 '16 at 14:52
  • I'm sorry for the seemingly stupid question - perhaps I lacked the right Google search terms :). Thanks! – Pieter van den Ham Nov 01 '16 at 14:55
  • 2
    Simply move the array declaration outside the if statement and it will work fine. 13 bytes more or less on the stack is no big deal. – Lundin Nov 01 '16 at 14:56
  • 1
    Run the program using Valgrind (or alike) and see it complaining. – alk Nov 01 '16 at 14:58
  • 1
    Your code doesn't do what it's intended to do: if you give it an input key of length 4, you'll read 8 bytes in the memcpy which is also UB. – Yexo Nov 01 '16 at 15:00
  • @Yexo Ah I see, thank you – Pieter van den Ham Nov 01 '16 at 15:56
  • As it stands after the edit `new_seed` isn't `0`-terminated if it had been padded after the `strcpy()`, that is if `strlen()` returns less then 12. – alk Nov 01 '16 at 16:15
  • this is tagged `c++`, but the concept is the same: http://stackoverflow.com/questions/6441218/can-a-local-variables-memory-be-accessed-outside-its-scope/6445794#6445794 – yano Nov 01 '16 at 16:16
  • @alk Thanks, fixed – Pieter van den Ham Nov 01 '16 at 16:21
  • @yano I saw that question, however, my brain did somehow not manage to see the connection between these two questions – Pieter van den Ham Nov 01 '16 at 16:22
  • heh oh ok .. it's slightly different (that question talks about using a function variable outside of the function, whereas here you only have one function. It's the same scoping concept though.. if you have a variable defined in `{....}` you cannot use it outside of `{....}` without further action (return it, copy it somewhere else, etc). What I linked is the "classic" SO answer to the variable scope question. – yano Nov 01 '16 at 16:25
  • @yano My understanding was that a function's stack is released whenever it returns. Since the `new_seed` is still within the function, I didn't think this would create problems (well, it doesn't, only _theoretical ub_ problems, it seems). – Pieter van den Ham Nov 01 '16 at 16:29
  • 1
    it's not theoretical UB, it is UB. Once a variable goes out of scope (even within a function), `c` makes no requirements on what should happen to it. The reality is, in most cases, if you access your data immediately after it goes out of scope, it will still be there, since most compilers aren't going to generate code to scrub the stack and destroy data, that would be costly. For a sensitive-data application, however, that may be exactly what you want to do. The bottom line is you no longer own that memory and you cannot depend on its contents once it's out of scope. – yano Nov 01 '16 at 16:38
  • @yano: If code does a `goto` to a location before the declaration and then falls through the declaration a second time, would pointers to the object remain valid? By my understanding, if code did a `goto` before and then used a second `goto` to skip the declaration, the array would be required to hold the same contents; if the declaration is re-executed, the values in the array would become Indeterminate but pointers to it would remain valid. Does that jibe with your understanding? – supercat Nov 02 '16 at 17:00
  • @supercat Hmmm, can't say I entirely understand what you're asking,, something generically along the lines of "What happens to scoped variables if I use `goto` to randomly jump around in my code?" This warrants its own question if there isn't a dupe, code up specifically what you're describing there. I've only ever used `goto` to jump to a label in the same `{....}` code block, so I don't know for sure what happens if you jump out of that block. My guess is the same scoping rules would apply. – yano Nov 02 '16 at 18:15
  • @yano: If a `goto` leaves a block, the lifetime of any automatic variables declared therein ends; that has always been clear. What's unclear are the weird corner cases where code uses a `goto` to a location which is within the same block as a variable whose address has been taken. – supercat Nov 02 '16 at 18:27
  • @supercat "whose address has been taken". Sorry, still not following this. My understanding is, if I have `{ int a; .... }`, there's one `int a` with the same address in that entire block (shadowing aside since I don't know those rules). You can have as many pointers as you want pointing to `int a`, but there's still just one integer there with one address. Sorry, I don't think I'm helping much, and based on our rep difference I may not be able to anyway... ask a question! Seeing some code would go a long way to understanding your question. – yano Nov 02 '16 at 18:47

1 Answers1

3

Yes.

Once new_seed is out of scope, you no longer own any memory that was allocated for it.

So the behaviour on dereferencing the newly assigned value of seed is undefined.

Bathsheba
  • 231,907
  • 34
  • 361
  • 483
  • 1
    Bathsheba, for clarity I suggest you change "_any memory that was allocated for it_" to "_the memory of any automatic variable that was used for it_" ("allocated" generally suggests by use of `malloc`.) – Paul Ogilvie Nov 01 '16 at 15:00