Connection between uninitialized variables and type safety

Question

Why is using a variable that has not been initialized considered not type-safe?

I'm reading Bjarne Stroustrup's beginner book (Programming Principles and Practice Using C++) from the C++ book guide on this site.

There is a part in the book about type-safety that states:

A program - or a part of a program - is type-safe when objects are used only according to the rules for their type. For example, using a variable before it has been initialized is not considered type-safe.

Then the book provides the following code as an example:

 int main() {
        double x; // we "forgot" to initialize
                  // the value of x is undefined

        double y = x; // the value of y is undefined
        double z = 2.0+x; // the meaning of + and the value of z are undefined
 }

I understand that a local variable that is not initialized will have an indeterminate value and that reading this variable will cause undefined behavior. What I do not understand is how this is connected to type-safety. We still know the types from the variable's definition.

Why does the comment in the above code state that the meaning of + is undefined when both 2.0 and x are double, and + is defined for double + double?

Explaining UB to new programmers is not easy. Grouping uninitialized variables in the type-unsafe bracket as a didactic device is not that wrong. — Hans Passant, Jun 23 '18 at 14:52
Reference for [evaluation of indeterminate value being UB](https://stackoverflow.com/q/23415661/1708801) — Shafik Yaghmour, Jun 23 '18 at 19:38
The Stroustrup citation is a definition: "Let's say that *type-safe* means that an object is used according to the rules for its type". The rules for a type can then be divided into two categories: (1) those that are always validated by the compiler; (2) those that the compiler may fail to validate. In the case of access to an indeterminate value, most compilers will complain about your example code, but they could not perform this validation if you passed the indeterminate value to a function defined in another translation unit. – Oliv, Jun 24 '18 at 08:17

Joseph D. · Accepted Answer · 2018-06-23T16:58:04.097

Undefined behavior means the output could be what you expect or some indeterminate value that may be outside the valid range of a type.

One clear example of undefined behavior is signed integer overflow:

unsigned int i;   // uninitialized
int x = i + 2;    // indeterminate value
if (x + 1 > x) {} // undefined behavior due to signed overflow

x can have a value outside the valid range of int if i holds the maximum value of unsigned int.

Thus, type safety is not guaranteed for expressions having indeterminate values.

edited Jun 23 '18 at 16:58

answered Jun 23 '18 at 14:37

I have trouble seeing how overflow is related to the topic at hand. If `i` is indeterminate who cares whether or not it overflows? – Brian Cain Jun 23 '18 at 15:46
@BrianCain, for the updated example, `i` affects how `x` will be used. If `i` is indeterminate, usage of `x` is UB. – Joseph D. Jun 23 '18 at 15:56
But... overflowing is still UB with determinate values. I don't see the relation. – Passer By Jun 23 '18 at 16:55
@PasserBy, what I am trying to say is that an `int` can be assigned an indeterminate value such as `UINT_MAX`, which is clearly outside its type range. – Joseph D. Jun 23 '18 at 16:56
Thank you for the answer. Is it correct to say that an uninitialized variable with an indeterminate value can cause undefined behavior in an expression so it is not guaranteed that the variables in the expression will have the behavior that is defined by their types and this causes the program to be non type-safe? – CyberMarmot Jun 23 '18 at 16:57
An indeterminate value and a random value are different things. Compilers can do much more aggressive things with indeterminate values. – Passer By Jun 23 '18 at 16:57
@PasserBy, replaced the word *"random"*. Thank you for pointing that out! – Joseph D. Jun 23 '18 at 16:58

MikeTheCoder · Answer 2 · 2018-06-26T14:32:01.047

@codekaizer and @Shankar are right: undefined behavior is, by definition, not type safe behavior. How that applies to primitive types is a little harder to wrap your head around, though. It seems reasonable that any appropriately long sequence of bits could be a valid int. As @BoPersson pointed out below, this is not strictly true and implementations are free to include values which cause interrupts under arithmetic. For integers this practically only applies to 0 when used to divide, but that does not mean the standard doesn't allow for an integer version of something like floating point NaN on a suitably unusual architecture.

The reader may find an example with virtual functions more intuitively illustrative of why uninitialized variables aren't type safe. Consider:

#include <iostream>

struct Base {
    virtual int foo() const = 0;
};

struct DerivedA : public Base {
    int foo() const override { return 10; }
};

struct DerivedB : public Base {
    int foo() const override { return -10; }
};

int main() {
    Base* abstractStructPtr;  // deliberately left uninitialized
    // Undefined behavior: the pointer does not refer to any Base object.
    std::cout << abstractStructPtr->foo() << std::endl;
    return 0;
}

The type of abstractStructPtr means you can call foo() on it. The expression is valid: abstractStructPtr has a type, that is why you can call foo(). However, the implementation of foo() lives in derived classes.

Since abstractStructPtr isn't initialized, the data it points to isn't guaranteed to be structured in such a way that it can fulfill the call to foo(). In other words, while the type of abstractStructPtr is Base*, there is no guarantee that the data it points to is actually a Base object of any kind. Calling foo() is thus undefined behavior and not type safe. Anything could happen; in practice it will probably just crash with a memory access violation, but it might not! Kablooey.

If it's polymorphic, dereferencing a *null pointer* throws a [bad_typeid](https://en.cppreference.com/w/cpp/types/bad_typeid). But if it's not, it [has a type](https://wandbox.org/permlink/NcsJGodm8f4OK8ni). — Joseph D., Jun 23 '18 at 16:44
@codekaizer `bad_typeid` is thrown only within a `typeid` expression, hence the name. – Passer By, Jun 23 '18 at 16:53
@PasserBy, right. I'm just elaborating on the statement *"abstractStructPtr wasn't initialized means the data it points to doesn't have a type"* – Joseph D., Jun 23 '18 at 16:55
*"any sequence of bits"* is not required to be a valid value, even for an `int`. The standard allows [trap representations](https://en.cppreference.com/w/cpp/types/numeric_limits/traps) where incorrect values might terminate the program. This may indeed be what Bjarne had in mind. — Bo Persson, Jun 24 '18 at 08:48
Updated and edited some more, @BoPersson. I know the person who asked the question has chosen an answer, but I want to get this right. — MikeTheCoder, Jun 25 '18 at 15:24
I've clarified trap representations as they apply to integers. — MikeTheCoder, Jun 26 '18 at 14:32

score 0 · Answer 3 · answered Jun 23 '18 at 14:25

Even though 'x' was declared double, since it was not initialized, it has a random bit pattern in memory and that pattern might not represent any valid double precision number. Hence the "meaning of z" is undefined.

Any random bit pattern can represent a double. – yakobyd Jun 23 '18 at 15:06
@yakobyd: That's why he said a "valid double precision number". There's no guarantee in the standard that "any random bit pattern" can represent a valid double. Even under IEEE-754, NaN is not exactly "valid". – Nicol Bolas Jun 23 '18 at 15:07
