I was trying to learn about dangling pointers, so I made a quick test with an inner local scope inside the main function and a pointer defined outside that inner scope. Inside the inner scope I define and initialize a local variable, and I assign its address to the pointer.

Here is the example:

#include <stdio.h>

int main()
{
    int *ptr = NULL;

    //Start of local scope    
    {
        int a = 10;
        ptr = &a;
        printf("Address of a: %p\nValue of ptr: %p\n", &a, ptr);
    }
    //End of local scope

    printf("\nDereferenced value of ptr: %d", *ptr);

    return 0;
}

The output is:

Address of a: 0x7ffcacf6146c
Value of ptr: 0x7ffcacf6146c

Dereferenced value of ptr: 10

I was expecting a segmentation fault or some other visible undefined behaviour: once the local scope is left, the local variable's lifetime has ended, so I expected its value to be erased.

Indeed, the variable is gone: it is impossible to access it by name outside the scope, since it no longer exists. But the value stored in it continues to exist at the same address. After the local scope is left, isn't the value supposed to be erased along with the variable it was assigned to? Isn't the memory location that the variable occupied cleared of its contents once the end of the local scope is reached?

Isn't this memory location, once freed, supposed to be returned to the OS, making it inaccessible from the program? Or does it remain at the program's disposal until the program terminates and execution control is handed back to the OS?

One more code example. Let's modify the example above and define (without initializing) another variable of the same type, this time outside the scope, after it. In all the tests I did, it occupied the same memory location and, even more, appeared to be initialized with the same value, simply because it occupies the memory location where the value was stored through the previous variable.

#include <stdio.h>

int main()
{
    int *ptr = NULL;
    
    //Start of local scope
    {
        int a = 10;
        ptr = &a;
        printf("Address of a: %p\nValue of ptr: %p\n", &a, ptr);
    }
    //End of local scope

    int b;

    printf("\nAddress of b: %p\nValue of b: %d\n", &b, b);

    printf("\nDereferenced value of ptr: %d", *ptr);

    return 0;
}

Output is:

Address of a: 0x7fff5f9faecc
Value of ptr: 0x7fff5f9faecc

Address of b: 0x7fff5f9faecc
Value of b: 10

Dereferenced value of ptr: 10
Sandrious
  • *I was expecting some segmentation fault error or **undefined behaviour*** - well, you got it. Any behavior can be seen when it is undefined. – Eugene Sh. Jan 05 '23 at 16:03
  • If you left a book in a drawer in your hotel room and you checked out but still have the room key, is it guaranteed that the book will be gone if you try to use the key an hour after you check out? – dbush Jan 05 '23 at 16:08
  • "I drove off the road and my car didn't explode!" Sometimes you drive off a cliff, other times it's on to a lawn. – tadman Jan 05 '23 at 16:09
  • https://godbolt.org/z/xfcjq1GT5 – OldProgrammer Jan 05 '23 at 16:10
  • And that's why undefined behavior is so perniciously dangerous - it can be undetected until something completely unrelated and maybe even completely random exposes it. – Andrew Henle Jan 05 '23 at 16:10
  • This is undefined behavior; you were lucky this time, otherwise it could have resulted in a segmentation fault. – Hackaholic Jan 05 '23 at 16:12
  • @Sandrious Remember, C is a relatively low-level language, designed for simplicity and efficiency. It sounds like you expected some kind of error, but think about it: what combination of mechanisms would have generated that error? There's generally no mechanism (either in the compiler, or in the code that the compiler generates) to check for this kind of error. Similarly, there's generally no mechanism to explicitly wipe out the memory formerly used by now-returned functions. – Steve Summit Jan 05 '23 at 16:22
  • (cont'd.) Any such mechanisms would require (a) more time to implement, (b) more time to run, and (c) would exact costs on correct programs, to little purpose. So, instead, C's philosophy in cases like these (where your program does something undefined) is generally to do nothing special, and to just let whatever happens, happen. Since "undefined behavior" means "anything can happen", that's fine. But it sometimes means that "whatever happens" is not only that you get no error, but that, in this case, the dangling value is still there. – Steve Summit Jan 05 '23 at 16:22
  • @SteveSummit I don't expect some complex mechanism to review my program, but I know that the compiler can display segmentation faults, either through a visible error message or by terminating the program when the point of the access violation is reached. My initial expectation was that, once the local scope is left, the address is returned to the OS, and that trying to dereference it via the pointer would cause exactly some sort of segmentation fault. I was extremely surprised when exactly the opposite happened. – Sandrious Jan 05 '23 at 20:47
  • @Sandrious Two things: (1) It is (as you acknowledge later) the CPU and MMU, not the compiler, that is going to cause segmentation faults. But, (2) segmentation faults happen when you try to access memory that isn't there. The stack frame of a function that has just exited, however, is still very much there, because it's (in all probability) about to be used by the next function that gets called. Memory is automatically allocated to the stack as it grows, but I've never heard of an OS deallocating (that is, removing the page table entries for) portions of the stack as it shrinks. – Steve Summit Jan 05 '23 at 20:51
  • @SteveSummit And not just the opposite, but the address was still accessible, and the value was there too. I was expecting at least to be deleted and the memory location - empty. And by "undefined behaviour" I was expecting the empty location, when dereferenced, to produce random numbers, different for every run of the program. None of this happened. The variable got deleted, the value remains on the address. The pointer access it, new variable occupies the same address and automatically accepts the value, which is there... I'm completely confused. – Sandrious Jan 05 '23 at 20:56
  • @SteveSummit And also - have in mind that I'm not familiar with Assembly language. – Sandrious Jan 05 '23 at 20:56
  • @Sandrious Completely understand about the "not familiar with Assembly language", and I'm sorry if my explanations seemed to assume that familiarity. The key is in your statements "I was expecting at least to be deleted and the memory location - empty" and "I was expecting the empty location, when dereferenced, to produce random numbers". There is, basically, no such thing as "deletion" in this sense. Nor is there any such thing as an "empty" location. Memory either exists or it doesn't, and if it exists, it always contains *some* bit pattern. And it's almost never "random". – Steve Summit Jan 05 '23 at 21:02
  • @SteveSummit I apologize, I didn't mean that you assumed that familiarity. Still, I'm willing to learn Assembly someday (maybe NASM), but I plan to advance in "C" first. Even so, I already see how closely related the two languages are, so I think learning assembly will be highly beneficial for me and will help me understand computers better. How the computer decides with what bit pattern to fill memory locations, if they are uninitialized? – Sandrious Jan 06 '23 at 12:02
  • @Sandrious *How the computer decides with what bit pattern to fill memory locations, if they are uninitialized?* Wrong question. "Uninitialized" meant that nobody decided to fill them with anything! They'll typically contain bit patterns which might as well be random. – Steve Summit Jan 20 '23 at 17:45
  • Another analogy: Imagine that a house burns down. It's a bad fire; much of the building collapses. Then a bulldozer comes along and knocks the rest of the wreckage down, since it's unsafe. Then a truck comes along and takes most of the wreckage away. Then it's just a vacant lot for a while, with a few bits of brick and wood scattered around, and if you dug maybe you'd find parts of the former building's foundation, or basement. – Steve Summit Jan 20 '23 at 17:45
  • Then I buy the vacant lot, and I hire an architect and some carpenters, and ask them to build a new building there. Now: *How do they decide what the existing condition of the lot is, for them to build on?* They don't! They're absolutely stuck with whatever condition the lot is in now, and it's up to them to start digging or bulldozing to put the lot in the condition they need to build the new building they've designed for me. – Steve Summit Jan 20 '23 at 17:45
  • Now, with all of that said, in the computer world, there are a few things we can usually say. (1) When an operating system runs a brand-new program in a brand-new process, all memory starts out as 0, because the operating system explicitly clears it all, because it would be Bad if there were any data values left over from the previous program that ran. – Steve Summit Jan 20 '23 at 17:45
  • (2) A function that runs for the first time in a new part of the stack — that is, deeper into the stack than any of this program's functions have called before — will typically also find 0's, because again, the operating system zero-fills new stack pages as it allocates them. – Steve Summit Jan 20 '23 at 17:45
  • (3) A function that finds itself running with its stack frame in a region of the stack that *was* previously used — that is, not deeper than any function before it — will find that its memory contains whatever random or non-random garbage that some other, recent function left behind. (This is a lot like the burned-out building analogy.) – Steve Summit Jan 20 '23 at 17:46
  • (4) Memory that you obtain with `malloc` might start out full of all 0's, or it might contain some random garbage from the previous time your program used (and then freed) that memory. It's difficult to say, and completely unpredictable. – Steve Summit Jan 20 '23 at 17:46
  • Finally, (5) code that does not run under an operating system — that is, embedded code, or an operating system itself — finds that its memory initially contains whatever the actual, hardware memory circuits start up as, which as far as I know depends on the particular memory technology. It might be all 0 bits for some technologies, and all 1 bits for others, or random bits for yet others. – Steve Summit Jan 20 '23 at 17:46
  • So, bottom line, your program has to be careful to initialize its variables appropriately; it has to take care to not use uninitialized variables. Note, too, the important distinction concerning default initialization in C between ["static" and "automatic" duration variables](https://stackoverflow.com/questions/51329671). – Steve Summit Jan 20 '23 at 17:46
  • Also, be aware that everything I've said in this overlong series of comments is how things "typically" work under "conventional" computer architectures and operating systems. But in the end the only thing we can say for sure about the initial value of an uninitialized variable is that it is unpredictable. – Steve Summit Jan 20 '23 at 17:52
