5

Why does the following code run?

#include <iostream>
class A {
    int num;
    public:
        void foo(){ num=5; std::cout<< "num="; std::cout<<num;}
};

int main() {
    A* a;
    a->foo();
    return 0;
}

The output is

num=5

I compile this using gcc and I get only the following compiler warning at line 10:

(warning: 'a' is used uninitialized in this function)

But as per my understanding, shouldn't this code not run at all? And how come it's assigning the value 5 to num when num doesn't exist because no object of type A has been created yet?

Apoorva Iyer
  • possible duplicate of [Where exactly does C++ standard say dereferencing an uninitialized pointer is undefined behavior?](http://stackoverflow.com/questions/4285895/where-exactly-does-c-standard-say-dereferencing-an-uninitialized-pointer-is-und) – Nawaz Mar 19 '11 at 06:30
  • +1. An additional question: If you didn't have the member `num`, should you be able to expect this to work? e.g., if it only contained `std::cout << "num=";`, so was locally stateless (I am legitimately asking, not just providing food for thought) – Merlyn Morgan-Graham Mar 19 '11 at 06:39
  • @Merlyn Morgan-Graham: No. Here the code de-references an uninitialized pointer. This is undefined behavior (ie the program can potentially do anything (including seeming to work)). – Martin York Mar 19 '11 at 06:55
  • @Martin: I guess I can see the spec not defining this, so you shouldn't rely on it, but what might a compiler do that would cause this not to work (assuming no inheritance, and no data members) – Merlyn Morgan-Graham Mar 19 '11 at 08:26
  • @Martin: Don't worry, I don't plan on doing this. I simply want to understand in which cases the compiler might generate code that could cause this to break. I understand UB and dereferencing uninitialized data. However, this "dereference operator" doesn't (necessarily) dereference anything. It would only be used to initialize the `this` pointer, and the `this` pointer is unused in my scenario. Of course the second anyone adds/uses object state it will break, but that isn't what I'm interested in. Besides vtables, why would any C++ compiler dereference an unused this pointer? – Merlyn Morgan-Graham Mar 20 '11 at 00:13
  • @Merlyn Morgan-Graham: In addition to the explicit a->foo() there is an implicit this->num. It is a fruitless exercise asking what the compiler will do. The standard leaves so much leeway for the compiler in order to allow for the maximum optimizations that you cannot make any predictions. What works on compiler A may well fail on compiler B because it uses a completely different strategy to implement the functionality. Even the same compiler may react completely differently with different flags. – Martin York Mar 20 '11 at 00:29
  • The fact that the actual memory location is not de-referenced to retrieve stuff does not make any difference. I have seen hardware that will try to pre-load memory into the local cache when an address is loaded into one of the special address registers. If that memory does not belong to you it will generate a page fault. So just calling the function on this hardware may potentially generate a page fault even if no members are accessed. – Martin York Mar 20 '11 at 00:30
  • @Martin: So, my original scenario was to remove num entirely (your responses made me realize I might not have made that clear). You have pretty much answered my question with the info about that hardware pre-loading memory. I guess my remaining question is will a C++ compiler optimize out passing `this` if it never gets used? – Merlyn Morgan-Graham Mar 20 '11 at 07:27
  • @Merlyn Morgan-Graham: That is unknowable. Every compiler can and does use different optimization techniques. – Martin York Mar 20 '11 at 16:28

7 Answers

4

The code produces undefined behavior, because it attempts to dereference an uninitialized pointer. Undefined behavior is unpredictable and follows no logic whatsoever. For this reason, any questions about why your code does something or doesn't do something make no sense.

You are asking why it runs? It doesn't run. It produces undefined behavior.

You are asking how it is assigning 5 to a non-existing member? It doesn't assign anything to anything. It produces undefined behavior.

You are saying the output is 5? Wrong. The output is not 5. There's no meaningful output. The code produces undefined behavior. Just because it somehow happened to print 5 in your experiment means absolutely nothing and has no meaningful explanation.

AnT stands with Russia
2

A* a; is an uninitialized pointer.

The value you see is garbage, and you are lucky you did not end up with a crash.

There is no initialization here.

There is no assignment here.

Your class happens to be simple enough that more serious issues are not exhibited.

A* a(0); would most likely lead to a crash. An uninitialized pointer would also lead to a crash in some cases, and the problem is more easily reproduced with more complex types.

This is the consequence of dealing with uninitialized pointers and objects, and it points out the importance of compiler warnings.

justin
  • I doubt that the value is garbage. If I change the value of num in foo from num = 5 to num = _**any number**_, I get that number at the output. – Apoorva Iyer Mar 19 '11 at 06:18
  • it certainly *is* garbage - by writing the value of num, you are overwriting the address of something else nearby in memory. – justin Mar 19 '11 at 06:20
  • Apoorva Iyer, you obviously don't understand the concept of "undefined behavior." – titaniumdecoy Mar 19 '11 at 06:20
  • Oh that way. So i'm overwriting some other garbage address! I thought you meant that the value of num was garbage. This makes more sense. Thank you! – Apoorva Iyer Mar 19 '11 at 06:22
  • also, it's the value a 'acquires' by lack of assignment that is garbage (specifically). beyond that, you are either accessing a valid or invalid memory address (== crash or very mysterious behaviour). – justin Mar 19 '11 at 06:22
  • exactly! so what you see is (by chance) a valid address when it does not crash. that is how you are able to read/write it without bad access. but doing so will just read/write a memory block you have no intention of reading/writing. – justin Mar 19 '11 at 06:27
2

You haven't initialized *a.

Try this:

#include <iostream>

class A
{
    int num;
    public:
        void foo(){ std::cout<< "num="; num=5; std::cout<<num;}
};

int main()
{
    A* a = new A();  // a now points to a real A object
    a->foo();
    delete a;        // release the object allocated with new
    return 0;
}

Not initializing pointers (properly) can lead to undefined behavior. If you're lucky, your pointer points to a location in the heap which is up for initialization*. (Assuming no exception is thrown when you do this.) If you're unlucky, you'll overwrite a portion of the memory being used for other purposes. If you're really unlucky, this will go unnoticed.

This is not safe code; a "hacker" could probably exploit it.

*Of course, even when you access that location, there's no guarantee it won't be "initialized" later.


"Lucky" (actually, being "lucky" makes it more difficult to debug your program):

// uninitialized memory 0x00000042 to 0x0000004B
A* a;
// a = 0x00000042;
*a = "lalalalala";
// "Nothing" happens

"Unlucky" (makes it easier to debug your program, so I don't consider it "unlucky", really):

void* a;
// a = &main;
*a = "lalalalala";
// Not good. *Might* cause a crash.
// Perhaps someone can tell me exactly what'll happen?
Mateen Ulhaq
  • I deliberately haven't initialized *a. The point was that I can set the value of 'num' inside foo() without initializing *a! – Apoorva Iyer Mar 19 '11 at 06:20
1

This is what I think happens.

a->foo(); works because you are just calling A::foo(a).

a is a pointer variable on main's call stack. Calling foo() may cause a segmentation fault when the address stored in a is accessed, but if it does not, foo() simply overwrites 4 bytes of memory (the size of an int here) at an offset from that address with the value 5, and then reads the same value back.

Am I right or wrong? Please let me know, I am learning about call stacks and would appreciate any feedback on my answer.

Also look at the following code

#include<iostream>
class A {
    int num;
    public:
        void foo(){ num=5; std::cout<< "num="; std::cout<<num;}
};

int main() {

    A* a;
    std::cout<<"sizeof(A*) is "<<sizeof(A*)<<std::endl;
    std::cout<<"sizeof int is "<<sizeof(int)<<std::endl;
    int buffer=44;
    std::cout<<"buffer is "<<buffer<<std::endl;
    a=(A*)&buffer;  // point a at buffer's storage, so foo() writes into buffer

    a->foo();
    std::cout<<"\nbuffer is "<<buffer<<std::endl;
    return 0;
}
SingerOfTheFall
hnharsh
1
A* a;
a->foo();

That invokes undefined behaviour. Most commonly it crashes the program.

The section §4.1/1 from the C++03 Standard says,

An lvalue (3.10) of a non-function, non-array type T can be converted to an rvalue. If T is an incomplete type, a program that necessitates this conversion is ill-formed. If the object to which the lvalue refers is not an object of type T and is not an object of a type derived from T, or if the object is uninitialized, a program that necessitates this conversion has undefined behavior. If T is a non-class type, the type of the rvalue is the cv-unqualified version of T. Otherwise, the type of the rvalue is T.

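See this similar topic: [Where exactly does C++ standard say dereferencing an uninitialized pointer is undefined behavior?](http://stackoverflow.com/questions/4285895/where-exactly-does-c-standard-say-dereferencing-an-uninitialized-pointer-is-und)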


And how come it's assigning the value 5 to num when num doesn't exist because no object of type A has been created yet.

It's called being lucky. But it won't always happen.

Community
Nawaz
0

Upon object creation, the class members are allocated for that particular object even if you don't use the keyword new, since the object is pointer to class. So your code runs fine and gives you the value of num, but GCC issues a warning because you've not instantiated the object explicitly.

Kushal
0

I'll point (hehe) you to a previous answer of mine to a very similar question: Tiny crashing program

Basically you're overwriting the envs stack variable with your pointer because you haven't added envs to the main declaration.

Since envs is an array of arrays (strings), it's actually very much allocated, and you're overwriting the first pointer in that list with your 5, then reading it again to print with cout.

Now this is an answer to why it happens. You should obviously not rely on this.

Community
Blindy