3

While reading this, I came across a case of undefined behavior (UB) that I don't understand. I'm hoping you can clarify:

size_t f(int x)
{
    size_t a;
    if(x) // either x nonzero or UB
        a = 42;
    return a; 
}

I guess the UB is due to `a` not having an initialized value, but isn't that defined behavior? Meaning, `f(0)` will return the value held by variable `a`, whatever it is (I consider this to be something like `rand()`). Must we know what value the code snippet returns for the code to have well-defined behavior?

Sourav Ghosh
CIsForCookies
  • The return value of `rand()` is defined since it is based on `srand()`. You will not be able to know what `a` is, hence undefined – Erik W May 15 '17 at 13:25
  • @ErikW sure rand is defined, no doubt about it. But since you can't know rand()'s ret val, how is it different? – CIsForCookies May 15 '17 at 13:27
  • 2
    In addition to Sourav's answer: don't forget that the **compiler may assume anything it wants**, including that UB won't/can't happen. In your example the compiler may decide that UB _can't happen_ and therefore that `x` is always non-zero. You might read it as: if `x` is `0` then I can do anything I want, so I behave as if it's non-zero and (as in your linked example) always return 42 (or format your hard drive if the generated code is slightly faster...) – Adriano Repetti May 15 '17 at 13:29
  • 5
    **Undefined** is not the same as _random_. – too honest for this site May 15 '17 at 13:39
  • 1
    @ErikW "...since it is based on srand()"? That is completely irrelevant. – too honest for this site May 15 '17 at 13:44
  • @Olaf I mean that the behavior (return value) is reproducible with the help of `srand` – Erik W May 15 '17 at 13:47
  • 1
    @ErikW And that is incorrect. `rand` has defined behaviour by itself; no need to use `srand`. – too honest for this site May 15 '17 at 13:49

3 Answers

5

Meaning, f(0) will return the value held by variable a, whatever it is...

Well, in your case,

  • a is an automatic local variable
  • it can have a trap representation
  • it does not have its address taken.

So yes, by definition, this causes undefined behavior.

Quoting C11, chapter §6.3.2.1

[...] If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
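
As a hedged illustration of the quoted rule (not part of the original question): the simplest way to keep every path defined is to give a an initializer, so that it is never read while uninitialized. The name f_defined and the fallback value 0 are assumptions chosen for this sketch.

```c
#include <stddef.h>

/* Hypothetical variant of f from the question. `a` has an initializer,
   so no path reads it while it is uninitialized. The fallback value 0
   is an arbitrary choice for this sketch. */
size_t f_defined(int x)
{
    size_t a = 0;
    if (x)
        a = 42;
    return a;
}
```

With this change, f_defined(0) is required to return 0 and any non-zero argument yields 42, so the compiler has no license to assume that x is non-zero.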


Sourav Ghosh
  • trying to understand here... 1) a is automatic local variable and I didn't init it, hence UB? 2) tried to read about it here (http://stackoverflow.com/a/6725981/3512538) but didn't understand, could you elaborate? 3) I don't understand why this is relevant. Do you mean that a has no address, hence it can't hold values, hence UB? – CIsForCookies May 15 '17 at 13:33
  • 2
    @CIsForCookies Read and understand the text carefully and completely. Don't just pick some keywords! – too honest for this site May 15 '17 at 13:40
  • 1
    As far as paragraph 6.3.2.1 is concerned, it doesn't matter whether the object's type affords trap representations. – John Bollinger May 15 '17 at 13:50
  • @JohnBollinger but does paragraph 6.3.2.1 link reasons 1 & 3 together? I understand that an uninitialized automatic local variable can cause UB, but is that connected to its (non-existing) address? And if so, how? – CIsForCookies May 15 '17 at 13:57
  • @CIsForCookies, Sourav helpfully provided the relevant text of paragraph 6.3.2.1 right in this answer. Did you consider *reading* it? Is something about it unclear? – John Bollinger May 15 '17 at 14:06
  • 1
    Interesting. That sentence from the standard seems to mean that `int a; int *p = &a; return a;` doesn't exhibit UB because of the use of the `&`. I'm sure that can't be what's intended. I wonder if there's another requirement that is still violated. (I think that the intention of the 'register' qualification is so that `scanf("%d", p);` or `scanf("%d", &a);` (with no variable `p` at all) doesn't necessarily cause UB.) But it feels a little edge-case-like. – Jonathan Leffler May 15 '17 at 15:27
2

Supplemental to @SouravGhosh's answer, it is important to understand that having undefined behavior is a property of certain combinations of language constructs and of certain runtime evaluations a program may perform, as specified by the standard. It is not a function of an analysis of what a compiler or program might do; in fact, it is more the opposite: a license to compilers and programs, releasing them from any particular constraint.

Therefore, although the standard is fairly logical and consistent about declaring UB, it is not very useful to approach the question by asking why a particular construct has UB or why a particular evaluation may or does exhibit UB. There are reasons for the standard specifying what it does, but the primary answer to why a thing has UB is always "because the standard says so."

John Bollinger
2

Undefined Behavior is a license for an implementation to process code in whatever way the author judges to be most suitable for the intended purpose. Some implementations included logic to trap in cases where an automatic variable was read without having been written first, even if the types otherwise had no trap representations; the authors of the Standard were almost certainly aware of such behavior and judged it useful. The Standard specifies only one situation where things may trap, but only in defined fashion (conversion from a larger integer type to a smaller one); in all other cases where things may trap, the authors of the Standard simply left the behavior Undefined rather than trying to go into any detail about how particular traps work, whether they are recoverable, etc.

Additionally, automatic variables are often mapped to registers that are larger than the variables in question, and even types which don't have trap representations may behave oddly in such cases. Consider, for example:

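#include <stdint.h>

volatile uint16_t v;
uint32_t x(uint32_t a, uint32_t b)
{
  uint16_t temp;     // deliberately left uninitialized
  if (b) temp = v;   // temp is written only when b is non-zero
  return temp;       // reads an uninitialized temp when b == 0
}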

If b is non-zero, then temp will get loaded with v, and the act of loading v will cause temp to hold some value 0-65535. If b is zero, however, the compiler can't load temp with v (because of the volatile qualifier). If temp had been assigned to a 32-bit register (on some platforms, it might logically be assigned the same one used for a), the function may behave as though temp held a value which is larger than 65535. The simplest way for the Standard to allow for such a possibility is to say that returning temp in the above situation would be Undefined Behavior. This is not because the Standard expects implementations to do anything particularly wonky in cases where the caller ends up ignoring the return value (if the caller was going to use the return value, the caller presumably wouldn't have passed b==0), but because leaving things to implementers' judgment is easier than trying to formulate perfect one-size-fits-all rules for such things.
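
For contrast, here is a minimal sketch of a variant whose result is defined on every path. The name x_defined and the initial value 0 are assumptions made for illustration, not something the Standard or any particular compiler prescribes.

```c
#include <stdint.h>

volatile uint16_t v;

/* Hypothetical variant of x. temp starts at 0, so the returned value
   always fits in 16 bits whether or not v is read. */
uint32_t x_defined(uint32_t a, uint32_t b)
{
    (void)a;               /* unused, kept to mirror the original signature */
    uint16_t temp = 0;
    if (b)
        temp = v;          /* the volatile read happens only when b is non-zero */
    return temp;
}
```

Because temp is always written before it is read, the result is guaranteed to lie in the range 0-65535 regardless of which register the compiler picks for it.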

Modern C implementers no longer treat Undefined Behavior as an invitation to exercise judgment, but rather as an invitation to assume no judgment is required. Consequently, they may behave in ways that can disrupt program execution even if the value of the uninitialized variable is used for no purpose except to pass it through code that doesn't know if it's meaningful, to code that ultimately ignores it.

supercat