12

According to this post, an indeterminate value is:

3.17.2
1 indeterminate value
either an unspecified value or a trap representation

According to google, the definition of indeterminate is:

  • Not certain, known, or established
  • Left doubtful; vague.

According to thefreedictionary, determinable is:

  • capable of being determined

According to merriam-webster, to determine (in the particular context) is:

  • to find out or come to a decision about by investigation, reasoning, or calculation

So, common sense dictates that even though an indeterminate value is unknown during compile time, it is perfectly determinable during runtime, e.g. you can always read whatever happens to occupy that memory location.

Or am I wrong? If so, why?

EDIT: To clarify, I post this in relation to what became a heated argument with a user who attempted to convince me that an indeterminate value is indeterminable, which I very much doubt.

EDIT 2: To clarify, by "determinable" I don't mean a stable or usable value; even if it is a garbage value for uninitialized memory, the value of that garbage can still be determined. I mean that trying to determine that value will still yield some value rather than ... no action. So this value must come from some memory, allocated as storage for the still-indeterminate value; I highly doubt a compiler will actually use, say, a random number generator just for the sake of coming up with some arbitrary value.

dtech
  • There was a defect report that came several months after your question and recently had a proposed solution that directly covers this topic, and basically says indeterminate values are not stable: they can change from evaluation to evaluation. See my answer for details. – Shafik Yaghmour Aug 01 '14 at 20:02

6 Answers

17

The fact that it is indeterminate not only means that it is unpredictable at the first read, it also means that it is not guaranteed to be stable. This means that reading the same uninitialized variable twice is not guaranteed to produce the same value. For this reason you cannot really "determine" that value by reading it. (See DR#260 for the initial discussion on the subject from 2004 and DR#451 reaffirming that position in 2014.)

For example, a variable a might be assigned to occupy a CPU register R1 (instead of a memory location) within a certain timeframe. For establishing the optimal variable-to-register assignment schedule, the language-level concept of "object lifetime" is not sufficiently precise. The CPU registers are managed by an optimizing compiler based on a much more precise concept of "value lifetime". Value lifetime begins when a variable gets assigned a determinate value. Value lifetime ends when the previously assigned determinate value is read for the last time. Value lifetime determines the timeframe during which a variable is associated with a specific CPU register. Outside of that assigned timeframe the same register R1 might be associated with a completely different variable b. Trying to read an uninitialized variable a outside its value lifetime might actually result in reading variable b, which might be actively changing.

In this code sample

{
  int i, j;

  for (i = 0; i < 10; ++i)
    printf("%d\n", j);

  for (j = 0; j < 10; ++j)
    printf("%d\n", 42);
}

the compiler can easily determine that even though the object lifetimes of i and j overlap, their value lifetimes do not overlap at all, meaning that both i and j can be assigned to the same CPU register. If something like that happens, you might easily discover that the first loop prints the constantly changing value of i on each iteration. This is perfectly consistent with the idea of the value of j being indeterminate.

Note that this optimization does not necessarily require CPU registers. For another example, a smart optimizing compiler concerned with preserving valuable stack space might analyze the value lifetimes in the above code sample and transform it into

{
  int i;

  for (i = 0; i < 10; ++i)
    printf("%d\n", <future location of j>);
}

{
  int j;

  for (j = 0; j < 10; ++j)
    printf("%d\n", 42);
}

with variables i and j occupying the same location in memory at different times. In this case the first loop might again end up printing the value of i on each iteration.

AnT stands with Russia
  • Will I get an arbitrary value from a register variable as from a regular one, or something else? – dtech Jun 30 '13 at 21:21
  • 2
    @ddriver: I'm not sure what you mean by "register variable" here. Any variable can be stored in a register without your knowledge, which means that you can observe the same behavior from any variable. Also, theoretically the same technique can be applied to save stack memory, so the effect might be observed even without any register-related optimizations. – AnT stands with Russia Jun 30 '13 at 22:00
  • I mean a value that never gets out of the register and into cache/ram, I suspect registers hold leftover garbage much like any other memory. So it will still have a valid binary representation. – dtech Jun 30 '13 at 22:07
  • 3
    @ddriver: Yes, but that's not the point. The point is that the tight relationship between variable and register does not begin until a *determinate* value is *written* into the variable. *Reading* a variable does not begin that relationship. So, until that moment the register is free to change as much as it wants for any external reasons, meaning that each read might easily produce a different value. – AnT stands with Russia Jun 30 '13 at 22:09
  • 2
    Out of curiosity, I tested your sample code: with `gcc` it prints all `j` equal to `0`, but `clang` prints 2 or 3 different values, even in `-O0` builds. – rodrigo Jun 30 '13 at 22:46
  • @rodrigo: GCC literally replaces `j` with `0` in the first `printf`. A more convoluted example is needed to trick GCC into that behavior, if it is at all possible. – AnT stands with Russia Jun 30 '13 at 23:00
  • 1
    For me GCC 4.8 produces consistent results. All `j`s are 2686856. MSVC2012 won't even compile... – dtech Jun 30 '13 at 23:43
  • Things are worse than what you describe. The code `i=*p; j=i; doStuff(); k=i;` could be replaced with `j=*p; doStuff(); k=*p;` if the compiler could determine that `doStuff` could not alter `*p` *in any case where it would have a determinate value. Consequently, even though j and k were both loaded from i, and nothing should have been able to alter i, there would be no guarantee j and k would be the same. – supercat May 07 '16 at 16:20
  • This answer does not cite anything authoritative or comments from the C committee. What reason is there to believe it? – Eric Postpischil Dec 19 '17 at 13:03
  • @Eric Postpischil: The awesome power of the author's authority that trumps that of any stinking committee :) Meanwhile, I ordered some of my underlings to add references to DR#260 and DR#451 to the answer. – AnT stands with Russia Dec 19 '17 at 17:50
11

Two successive reads of an indeterminate value can give two different values. Moreover reading an indeterminate value invokes undefined behavior in case of a trap representation.

In DR#260, the C Committee wrote:

An indeterminate value may be represented by any bit pattern. The C Standard lays down no requirement that two inspections of the bits representing a given value will observe the same bit-pattern only that the observed pattern on each occasion will be a valid representation of the value.

[...] In reaching our response we noted that requiring immutable bit patterns for indeterminate values would reduce optimization opportunities. For example, it would require tracking of the actual bit-patterns of indeterminate values if the memory containing them were paged out. That seems an unnecessary constraint on optimizers with no compensatory benefit to programmers.

ouah
  • There is no requirement for the value to be persistent, only to get it; obviously an indeterminate value is useless in any practical context. My point was that even if the programmer left it undetermined, the value is still technically determined by a binary representation, whatever it may be. – dtech Jun 30 '13 at 22:04
  • 3
    @ddriver I understand what you're saying and I think the top 3 answerers understand it as well. But the thing is, if you read an uninitialized variable, the compiler may not even bother to bring you a (garbage) binary representation. See Pascal's link, it addresses and explains exactly what you're asking. – Theodoros Chatzigiannakis Jun 30 '13 at 22:23
  • @TheodorosChatzigiannakis - I understand this, but is it standard or implementation specific? – dtech Jun 30 '13 at 22:37
  • 2
    @ddriver C11 says that reading an uninitialized automatic object invokes undefined behavior (6.3.2.1p2): *"If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined."* See DR#338 for more information http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm – ouah Jun 30 '13 at 22:43
  • @ddriver The fact that the value is indeterminate is standard. The fact that reading it invokes undefined behavior is standard. What actually happens is implementation specific. – Theodoros Chatzigiannakis Jun 30 '13 at 22:43
  • @ddriver "Reading it invokes UB" means that when the compiler detects access to such memory, it *may* allow you to read a garbage binary representation, but it may as well remove the expression completely (likely), it may link it to an RNG, it may replace it with code to crash the process or it may invoke demons from a hell dimension, all without violating the standard. Some of these potential approaches (like crashing the process) don't yield any kind of value at all, so the value can truly be *non-determinable*. Remember, the compiler deals in semantics first and foremost. – Theodoros Chatzigiannakis Jun 30 '13 at 22:58
  • So, all in all, depending on the implementation and specific usage context I can either get random garbage through a binary representation or some other random garbage which is not any binary representation or if extra lucky, a crash? – dtech Jun 30 '13 at 23:04
  • @ddriver Strictly speaking, no, I wouldn't say "either/or", as these were simply a few examples of what could happen - but the list is definitely not exhaustive. All in all, I'd say that invoking undefined behavior practically means that *anything* can happen, depending on *any* number of factors. – Theodoros Chatzigiannakis Jun 30 '13 at 23:11
  • @TheodorosChatzigiannakis - but why? The scenario can very easily be narrowed down to well defined behavior, what is the benefit of having "anything can happen" versus defined and safe behavior? Also, I wouldn't use "anything" since I can think of plenty of things that could not possibly happen in this scenario, not in a quadrillion years... I am asking because I myself implement (so far a very naive) compiler and decided to do exactly that, just fetch the value from memory, whatever it may be. If the value is referenced, its location is not reused for other values. – dtech Jun 30 '13 at 23:20
  • @ddriver: Undefined behavior is just a catch all for "who gives a crap? Just don't do that". So: who gives a crap? Just don't do that. That's all it means, you won't find any more meaning. Reason about it all you want, as far as the language is concerned: just don't. – GManNickG Jun 30 '13 at 23:25
  • It is a loose end, that is what it is, and loose ends ... are never good in my book. So I'll keep it nice and tied in my implementation. – dtech Jun 30 '13 at 23:26
  • 4
    @ddriver Often things are left undefined in order to give the compiler some space to perform performance and memory optimizations. (There may be other benefits too that I can't think of right now.) As for why, there isn't really any definitive answer, other than that the designers of the C language valued those benefits enough to allow this tradeoff. Other languages have less UB because their designers valued predictability a lot more. It's not a mistake of the designers of C (nor is it an oversight or a loose end), it's simply a language design decision. – Theodoros Chatzigiannakis Jun 30 '13 at 23:27
  • @TheodorosChatzigiannakis: The authors of the Standard thought that people producing implementations should be better able to judge the needs of their customers than the people on the Committee. Unfortunately, compiler writers have interpreted the permission to exercise judgment as though the authors of the Committee had determined that compiler users have almost no needs worthy of respect. – supercat Feb 20 '17 at 23:27
  • The DR quoted in this answer says only that the bits representing a value may be different in different observations, not that the value may change. E.g., padding bits may be different. It even says the bits will be a valid representation of **the** value, not **a** value. – Eric Postpischil Dec 19 '17 at 13:06
  • @EricPostpischil: Interesting point, but modern interpretations of the Standard go way beyond that. If every attempt to read an indeterminate value of a type with no padding bits or trap representations were guaranteed to yield arbitrary value of that type, but repeated reads were not required to be consistent, code needing consistency could read a value and write it back. Unfortunately, many optimizers will strip out code like `x=x;` entirely rather than replacing it with an intrinsic that would mean "if x is indeterminate, make it unspecified; else do nothing", even... – supercat Dec 19 '17 at 19:17
  • ...when a later optimizer won't achieve such semantics otherwise, and compiler writers seem to have convinced the Committee that fixing such broken "optimizations" would require an impractical amount of work. – supercat Dec 19 '17 at 19:18
5

The C90 standard made it clear that reading from an indeterminate location was undefined behavior. More recent standards are not so clear any more (indeterminate memory is “either an unspecified value or a trap representation”), but compilers still optimize in a way that is only excusable if reading an indeterminate location is undefined behavior, for instance, multiplying the integer in an uninitialized variable by two can produce an odd result.

So, in short, no, you can't read whatever happens to occupy indeterminate memory.

Pascal Cuoq
  • I post it in a limited context, the value only needs to be determined, not used in a situation it can potentially fail, like dividing by an indeterminate value. – dtech Jun 30 '13 at 21:25
  • 1
    @ddriver In `unsigned int j; … j *= 2;` it is the using of an indeterminate value, not the multiplication by two, that is undefined behavior and causes the result of a multiplication by two to be odd. Unsigned multiplication is always defined. – Pascal Cuoq Jun 30 '13 at 21:29
  • The linked article contains some great clarification of various misconceptions we often have about uninitialized locals. +1. – Theodoros Chatzigiannakis Jun 30 '13 at 21:42
2

We cannot determine the value of an indeterminate value, even under operations that would normally lead to predictable results, such as multiplication by zero. The value is "wobbly" according to the newly proposed language (see the update below).

We can find the details for this in defect report #451: Instability of uninitialized automatic variables, which had a proposed resolution about a year after this question was asked.

This defect report covers very similar ground to your question. Three questions were addressed:

  1. Can an uninitialized variable with automatic storage duration (of a type that does not have trap values, whose address has been taken so 6.3.2.1p2 does not apply, and which is not volatile) change its value without direct action of the program?
  2. If the answer to question 1 is "yes", then how far can this kind of "instability" propagate?
  3. If "unstable" values can propagate through function arguments into a called function, can calling a C standard library function exhibit undefined behavior because of this?

and provided the following examples with further questions:

unsigned char x[1]; /* intentionally uninitialized */
printf("%d\n", x[0]);
printf("%d\n", x[0]);

Does the standard allow an implementation to let this code print two different values? And if so, if we insert either of the following three statements

x[0] = x[0];
x[0] += 0;
x[0] *= 0;

between the declaration and the printf statements, is this behavior still allowed? Or alternatively, can these printf statements exhibit undefined behavior instead of having to print a reasonable number?

The proposed resolution, which seems unlikely to change much, is:

  • The answer to question 1 is "yes", an uninitialized value under the conditions described can appear to change its value.
  • The answer to question 2 is that any operation performed on indeterminate values will have an indeterminate value as a result.
  • The answer to question 3 is that library functions will exhibit undefined behavior when used on indeterminate values.
  • These answers are appropriate for all types that do not have trap representations.
  • This viewpoint reaffirms the C99 DR260 position.
  • The committee agrees that this area would benefit from a new definition of something akin to a "wobbly" value and that this should be considered in any subsequent revision of this standard.
  • The committee also notes that padding bytes within structures are possibly a distinct form of "wobbly" representation.

Update to address edit

Part of the discussion includes this comment:

  • Strong sentiment formed, in part based on prior experience in developing Annex L, that a new category of "wobbly" value is needed. The underlying issue is that modern compilers track value propagation, and uninitialized values synthesized for an initial use of an object may be discarded as inconsequential prior to synthesizing a different value for a subsequent use. To require otherwise defeats important compiler optimizations. All uses of "wobbly" values might be deemed undefined behavior.

So you will be able to determine a value but the value could change at each evaluation.

Shafik Yaghmour
  • So, if you can determine the value, be that even an unstable garbage value, this value must come from somewhere? E.g. the compiler has allocated some storage for it and you get a value from that address, even if the variable is indeterminate you can get a value out of the binary representation of its storage location. I think that logically, if it were impossible to determine an indeterminate value, then the compiler would not even allow at least direct access, but this is not done since obviously you can do it indirectly, so even indeterminate values must come from some storage location. – dtech Aug 02 '14 at 13:35
  • @ddriver: Many kinds of programs that deal with things like sparse arrays and hash tables may benefit from loose guarantees regarding the values of uninitialized storage. Such code may, for example, be able to initialize a million-entry sparse-array data structure using two stores rather than a million. If an implementation can guarantee that `uint32_t x=(volatile uint32_t)(table->arr[y]);` will at worst set `x` to some arbitrary number 0 to 4294967295 even if `table->arr[y]` is uninitialized, even if it offers no guarantee that repeated reads to the same uninitialized element... – supercat Feb 20 '17 at 23:18
  • ...would yield the same value, that would allow code to save a million stores at what should be minimal cost to useful optimizations. Unfortunately, compiler writers have lost sight of what should be a guiding principle of optimization: first, do no harm. – supercat Feb 20 '17 at 23:23
1

When the standard introduces a term like indeterminate, it is a normative term: the standard's definition applies, and not a dictionary definition. This means that an indeterminate value is nothing more or less than an unspecified value, or a trap representation. Ordinary English meanings of indeterminate are not applicable.

Even terms that are not defined in the standard may be normative, via the inclusion of normative references. For instance, section 2 of the C99 standard normatively includes a document called ISO/IEC 2382−1:1993, Information technology — Vocabulary — Part 1: Fundamental terms.

This means that if a term is used in the standard and is not defined in the text (not introduced in italics and explained, and not given in the terms section), it might nevertheless be a word from the above vocabulary document; in that case, the definition from that standard applies.

Kaz
  • Yes, the standard "shadows" the definition for "indeterminate", but I am not aware of the standard introducing new meaning to "determine" or "determinable". The big question IMO is where does the value for an indeterminate value come from, and my assumption is from allocated auto stack storage. I've written a compiler myself and I allocate storage upon first usage of a variable, be that writing to it, reading from it or reading its address, cuz I think that's what makes sense. But a lot of people seem to infer that might not be the case, so I wonder if not from storage, then from where? – dtech Aug 03 '14 at 10:17
  • 1
    @ddriver "value for an indeterminate value" is semantically incorrect. You probably mean "what is the value for an object that has an indeterminate value", and the answer is "indeterminate". – M.M May 24 '15 at 06:06
-1

The authors of the Standard recognized that there are some cases where it might be expensive for an implementation to ensure that code reading an indeterminate value won't behave in ways inconsistent with the Standard (e.g. the value of a uint16_t might not be in the range 0..65535). While many implementations could cheaply offer useful behavioral guarantees about how indeterminate values behave in more cases than the Standard requires, variations among hardware platforms and application fields mean that no single set of guarantees would be optimal for all purposes. Consequently, the Standard simply punts the matter as a Quality of Implementation issue.

The Standard would certainly allow a garbage-quality-but-conforming implementation to treat almost any use of e.g. an uninitialized uint16_t as an invitation to release nasal demons. It says nothing about whether high-quality implementations that are suitable for various purposes can do likewise (and still be viewed as high-quality implementations suitable for those purposes). If one needs to accommodate implementations that are designed to trap on possible unintended data leakage, one may need to explicitly clear objects in some cases where their value will ultimately be ignored, but where the implementation couldn't prove that it would never leak information. Likewise, if one needs to accommodate implementations whose "optimizers" are designed on the basis of what low-quality-but-conforming implementations are allowed to do, rather than what high-quality general-purpose implementations should do, such "optimizers" may make it necessary to add otherwise-unnecessary code to clear objects even when code doesn't care about the value (thus reducing efficiency), in order to avoid having the "optimizers" treat the failure to do so as an invitation to behave nonsensically.

supercat