
So I have a little program

#include <iostream>
using namespace std;

void lol() {
    cout << "How did we get here?"<<std::endl;
}

int main()
{
   long a, b, z[10];
   cin >> a >> b;
   z[a] = b;
}

You can run it via an online compiler here

The program has no purpose, but it has one bug or feature; I do not know which. If you run main and enter something like 13 2015, you will probably get nothing, but if you enter the two magic numbers 13 and 4196608, you get an error. Moreover, the program executes the function void lol() and prints the line How did we get here?.

I ran nm ./main and found my function void lol() at address 0000000000400900 (hexadecimal), which equals 4196608 in decimal.

That means that the program "jumps" to this address for some reason and executes void lol(). Moreover, if I change the first number, nothing happens: main 10 4196608, main 11 4196608, main 12 4196608, main 14 4196608, main 15 4196608 all behave the same, with no errors. But as soon as I enter 13, I get this interesting behaviour.

Can anyone explain what's going on here?

AstroCB
Ascelhem
  • wow. you discovered undefined behaviour. Well, it's best you learn early. But rest assured, it's not "interesting". It's a pain you'll be avoiding for all the time spent programming – sehe Feb 03 '15 at 13:14
  • Could you give me some more information on which system you are running your test? I could not replicate your behavior on my machine. – André Puel Feb 03 '15 at 13:21
  • You have discovered one way how buffer overflows cause security flaws. – drescherjm Feb 03 '15 at 13:38
  • I would try it, but I'm not sure if Raid works on nasal demons. – Martin James Feb 03 '15 at 13:48

2 Answers


If the input for a is above 9 (or negative), you access z[a] out of bounds (a buffer overflow), since you declared the array as long z[10].

This is typical undefined behavior (UB).

UB is very bad, see this answer of mine, or for more background:

The only way to explain a particular instance of undefined behavior is to dive into all the implementation-specific details (compiler, optimization, operating system, machine code, processor, etc.). You could spend years on this. (Perhaps in your case the return address on the call stack has been overwritten by the address of the lol function.)

Community
Basile Starynkevitch
  • I understand that it is an undefined behavior, but why it happens when I enter number 13? Not 11, 10 or 15, but 13? – Ascelhem Feb 03 '15 at 13:15
  • @Ascelhem if you understand assembly, you can study the output of your compiler and maybe figure it out. – eerorika Feb 03 '15 at 13:19
  • My guess is that you are messing with the stack unwinding code used by the exception system. – André Puel Feb 03 '15 at 13:20
  • I didn't know that the `ret` instruction takes the address from the stack, so @basile-starynkevitch's guess is more accurate. – André Puel Feb 03 '15 at 13:31

Using the information that Basile Starynkevitch gave us, I ran some experiments whose results strongly suggest that you are overwriting the return address.

I created an intermediate function main2() which returns to main(), so we know what to expect in the stack slot that holds the function's return address. My code prints the previous value of z[a] so that I can compare it with the address of the caller, i.e. the main() function:

#include <iostream>
#include <string>
using namespace std;

void lol() {
    cout << "How did we get here?"<<std::endl;
}

int main(int argc, char** argv);

void main2()
{
   long a, b, z[10];
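   b = reinterpret_cast<long>(&lol);  // address we want execution to jump to
   a = 15; // The offset depends on your machine; I found 15 by trial and error
   std::cout << "z[a] was " << z[a] << std::endl;  // out-of-bounds read of the saved return address
   std::cout << "main() was " << reinterpret_cast<long>(&main) << std::endl;
   z[a] = b;  // deliberately out of bounds (UB): overwrites the return address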
}

int main(int argc, char** argv) {
    main2();
} 

The output is:

z[a] was 4196677
main() was 4196657
How did we get here?
Segmentation fault (core dumped)

I don't know the size of each instruction once compiled for 64-bit x86, but the call instruction comes only a few instructions into the implementation of main in the assembly:

main:
.LFB1022:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $16, %rsp
    movl    %edi, -4(%rbp)
    movq    %rsi, -16(%rbp)
    call    _Z5main2v

which could explain the 20-byte offset between the address of main (4196657) and the value originally in z[a] (4196677): the saved return address points to the instruction just after the call.

André Puel