
I am practicing reverse engineering software. I am using Microsoft Visual Studio. I created an empty project and then created an empty file which I called main.cpp. I then wrote and compiled the following code:

int main()
{
    char* str = "hello matthew";

    int x = 15;

    return 0;
}

When I brought the release version of the executable over to BinText and IDA Pro, the string "hello matthew" was nowhere to be found. I also could not find the value 15, either in base 10 or in hexadecimal.

I cannot begin to understand reverse engineering if I cannot find the references to the values I am looking for in the executable.

My theory is that because my program does absolutely nothing that the compiler just omitted it all, but I do not know for sure. Does anyone know why I cannot locate that string or the value 15 in the executable when I disassemble it?

Matthew
  • 8
    Yes, they have been optimized out. Either do something with them or do a debug build. – Jester May 15 '18 at 22:17
  • 2
    It’s a common optimization to remove unused variables. – Pete Becker May 15 '18 at 22:18
  • 1
    They have probably been optimised away in the release version. Add a statement to output them. –  May 15 '18 at 22:18
  • 3
    "base 10 or hexadecimal" is irrelevant at this point, it's all binary. And it is nearly-guaranteed that all single bytes will appear *somewhere* in an executable ... the question is whether they appear anywhere *meaningfully*. – o11c May 15 '18 at 22:37
  • 1
    Matt Godbolt's CppCon2017 talk [“What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid”](https://youtu.be/bSkpMdDe4g4) would be a good starting point for looking at compiler output. http://godbolt.org/ has MSVC installed, as well as gcc/clang/ICC (and some non-x86 compilers). – Peter Cordes May 16 '18 at 05:28

2 Answers

1

I cannot begin to understand reverse engineering ...

The first step is to actually understand how the program is built out.

Before you can understand how to reverse a program, you need to understand how it's compiled and built; reversing a binary built for Windows is vastly different from reversing a binary for a *nix system.

To that end, since you're using Visual Studio, you can see this answer (option 2), which explains how to enable assembly output for your code. Alternatively, if you're compiling from the command line, you can pass /FAs and /Fa to generate an assembly listing with the source interleaved.

Your code produces the following assembly:

; Listing generated by Microsoft (R) Optimizing Compiler Version 18.00.40629.0 

    TITLE   C:\Code\test\test.cpp
    .686P
    .XMM
    include listing.inc
    .model  flat

INCLUDELIB LIBCMT
INCLUDELIB OLDNAMES

CONST   SEGMENT
$SG2548 DB  'hello matthew', 00H
CONST   ENDS
PUBLIC  _main
; Function compile flags: /Odtp
; File c:\code\test\test.cpp
_TEXT   SEGMENT
_x$ = -8                        ; size = 4
_str$ = -4                      ; size = 4
_main   PROC

; 2    : {

    push    ebp
    mov ebp, esp
    sub esp, 8

; 3    :     char* str = "hello matthew";

    mov DWORD PTR _str$[ebp], OFFSET $SG2548

; 4    : 
; 5    :     int x = 15;

    mov DWORD PTR _x$[ebp], 15          ; 0000000fH

; 6    : 
; 7    :     return 0;

    xor eax, eax

; 8    : }

    mov esp, ebp
    pop ebp
    ret 0
_main   ENDP
_TEXT   ENDS
END

While this is helpful for understanding what your code is doing and how, one of the best ways to start reversing is to throw a binary into a debugger, for example by attaching Visual Studio to an executable, and view the assembly as the program is running.

It can depend on what you're after, since a binary could potentially be obfuscated; that is to say, there could be strings within the binary, but they could be encrypted or just scrambled so as to be unreadable until decrypted or unscrambled by some function within the binary.
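As a rough illustration of that kind of scrambling, here is a minimal sketch that assumes a single-byte XOR scheme. The key value and the name xor_bytes are made up for this example; real obfuscators are usually far more involved.

```cpp
#include <string>

// Hypothetical single-byte key, chosen only for illustration.
constexpr unsigned char kKey = 0x5A;

// XOR every byte with the key. Applying the function twice restores the
// input, so the same routine both scrambles and unscrambles.
std::string xor_bytes(const std::string& in, unsigned char key = kKey) {
    std::string out = in;
    for (char& c : out) {
        c = static_cast<char>(static_cast<unsigned char>(c) ^ key);
    }
    return out;
}
```

If only the output of xor_bytes("hello matthew") were stored in the binary, a plain string search would not find "hello matthew"; the readable text would exist only in memory after the program calls xor_bytes on the stored bytes at run time.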

So just searching for strings won't necessarily give you anything, and trying to search for a specific binary value in the assembled code is like trying to find a needle in a stack of needles. Know why you're trying to reverse a program, then attack that vector.
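For a sense of what a string search over a binary amounts to, here is a minimal sketch that collects runs of printable ASCII bytes from a buffer. The function name find_ascii_strings and the default minimum length of 4 are assumptions made for this example, not how BinText or IDA Pro is actually implemented, and real tools typically also look for wide-character strings.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collect every run of printable ASCII bytes that is at least min_len long.
// Any other byte (including the terminating 0) ends the current run.
std::vector<std::string> find_ascii_strings(const std::vector<unsigned char>& data,
                                            std::size_t min_len = 4) {
    std::vector<std::string> found;
    std::string run;
    for (unsigned char b : data) {
        if (b >= 0x20 && b <= 0x7E) {
            run.push_back(static_cast<char>(b));
        } else {
            if (run.size() >= min_len) found.push_back(run);
            run.clear();
        }
    }
    // A run that reaches the end of the buffer still counts.
    if (run.size() >= min_len) found.push_back(run);
    return found;
}
```

A scan like this can only report bytes that are actually present in the file. If the compiler dropped the string literal, or the bytes are stored scrambled, there is nothing readable for it to find.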

Does anyone know why I cannot locate that string or the value 15 in the executable when I disassemble it?

As has been mentioned, and as you have guessed, the "release" binary you're searching through was optimized, and the compiler just removed the unused variables so the assembly was essentially returning 0.

I hope that can help.

txtechhelp
  • *so the assembly was essentially pushing the stack/instruction pointers around, then pop'ing them back...* Unlikely; MSVC omits the frame pointer with optimization enabled. You should just get `xor eax,eax` / `ret` for `main`. (Yup, that's what happens: https://godbolt.org/g/8XJrT5) IDK why you compiled obsolete 32-bit code instead of x86-64 code, but even there MSVC doesn't waste instructions on a stack frame in a function that just returns zero. – Peter Cordes May 16 '18 at 03:28
  • @PeterCordes .. I built "obsolete" 32-bit code because it was on a 32-bit machine :) that aside, yes, MSVC will optimize out all the useless bits, but OP doesn't seem to know where to start, so getting them in the right direction, they can then start the journey to amass the knowledge you and I have ;) – txtechhelp May 16 '18 at 03:34
1

The main reason is that your code does nothing useful with x and str, so they are entirely redundant and there is no need for them to exist in the compiled program. The compiler's optimizer therefore removes them from the compiled code.

If you really want to see them in the compiled code under a debugger, you need to either use them or tell the compiler not to optimize this part of the code.

This is how to tell the compiler not to optimize away these variables, using the volatile qualifier:

#include <iostream>

int main(int argc, char** argv) {
    const char* volatile str = "hello matthew";
    volatile int x = 15;
    return 0;
}

With this change, your variables are included in the compiled code, as IDA Pro shows.

Or, as I also said, just use them:

#include <iostream>

int main(int argc, char** argv) {
    const char* str = "hello matthew";
    int x = 15;
    std::cout << str << x;
    return 0;
}