
I have a working C program that stores a simple function, which returns the character 'd', as machine code in a byte array.

char foo() {
  return 'd';
}

char byte_array[] = {0xb8,0x64,0x00,0x00,0x00,0xc3};

Then, it executes this function from the byte_array and prints its output.

#include <stdlib.h>
#include <stdio.h>
#include <sys/mman.h>

char byte_array[] = {0xb8,0x64,0x00,0x00,0x00,0xc3};

int main() {
  void *addr = (void*)((unsigned long)byte_array & ((0UL - 1UL) ^ 0xfff)); /*get memory page*/
  int ans = mprotect(addr, 1, PROT_READ|PROT_WRITE|PROT_EXEC); /*set page attributes*/

  if (ans) {
    perror("mprotect");
    exit(EXIT_FAILURE);
  }

  char (*func)();
  func = (char (*)()) byte_array;
  char function_return = (char)(*func)();

  printf("%c\n", function_return);

  return 0;
}

How can I change this code in order to handle functions like the following?

const char* foo() {
  return "string";
}

I've tried this way, but it just prints a weird character to the console:

const char* (*func)();
func = (const char* (*)()) byte_array;
const char* function_return = (const char*)(*func)();
Peter Cordes
  • Try printing the `char` returned from your call to `func()` as hex with the `%hhx` format specifier instead of `%c`; you'll see the hex code for the `char` returned. Also, `char (*func)();` declares `func` to be a pointer to a function that takes an indeterminate number of arguments, not a function that takes no arguments. That would be `char (*func)(void);` – Andrew Henle Feb 13 '22 at 13:49
  • @AndrewHenle there is no simple way of doing it. – 0___________ Feb 13 '22 at 13:59
  • @0___________ I'd consider using a local variable or allocating one via `malloc()` easy. – Andrew Henle Feb 13 '22 at 14:01
  • @AndrewHenle how do you want to call `malloc`? bytecode will not know the reference to it. Accessing local array via reference after function return is UB. – 0___________ Feb 13 '22 at 14:02
  • @0___________ `char *tmpArray = malloc(sizeof(byte_array)); memcpy(tmpArray,byte_array,sizeof(byte_array));` Geez. – Andrew Henle Feb 13 '22 at 14:03
  • @AndrewHenle you do not understand. Bytecode has all addresses "fixed", with no relocations or symbols. You can't use any functions in it – 0___________ Feb 13 '22 at 14:04
  • Nevermind the fact that the posted code calls `mprotect()` to set the permissions needed to execute the bytecode. – Andrew Henle Feb 13 '22 at 14:04
  • So if it has alignment requirements, align it properly. How hard is that to get right? – Andrew Henle Feb 13 '22 at 14:05
  • bytecode does not know where malloc is located in the "main" program. You can't call it. There is no simple way of doing it – 0___________ Feb 13 '22 at 14:06
  • @0___________ What on God's good Earth are you talking about?!?! No one's trying to call `malloc()` from bytecode. – Andrew Henle Feb 13 '22 at 14:07
  • @AndrewHenle he wants to return reference to the char array with the string – 0___________ Feb 13 '22 at 14:09
  • You'd be much better off compiling a shared library and using `dlopen()` and friends – Shawn Feb 13 '22 at 14:15
  • @0___________ No, **read the question**: "I have a working c program that has the simple function that returns `d` char encoded in a byte array. `char foo() { return 'd'; }`" – Andrew Henle Feb 13 '22 at 14:15
  • @AndrewHenle rather you read the question: `How to change this code in order to handle functions like that one` – 0___________ Feb 13 '22 at 14:17

2 Answers


gcc and similar

String literals are stored in the .rodata section. It is very unlikely that your program will have .rodata at the same address as it had when you compiled your "bytecode".

There is no simple workaround: you can't store the array in the .data section either, for exactly the same reason.

I believe I have found a workaround:

const char * __attribute__((noinline)) foo(void)
{
    code_start:
    asm volatile("call get_ip");
    asm volatile("get_ip:");
    asm volatile("pop %rax");
    asm volatile("jmp_start:");
    asm volatile("add $str_start, %rax");
    asm volatile("sub $get_ip, %rax");
    asm volatile("jmp str_end");
    str_start:
    asm volatile("str_start:");
    asm volatile(".string \"Hello world\"");
    asm volatile("str_end:");
}
0___________
  • So is there simply no way to return more than one character? – Jakub Wrobel Feb 13 '22 at 14:27
  • @JakubWrobel There is no simple way of returning reference to object – 0___________ Feb 13 '22 at 14:32
  • @JakubWrobel it has to be a part of the code. there is no guarantee that the reference returned can be read by the program at all. – 0___________ Feb 13 '22 at 15:01
  • while compiling your workaround i get `relocation R_X86_64_32S against .text can not be used when making a PIE object recompile with -fPIE`. Unfortunately adding `-fPIE` doesn't help. – Jakub Wrobel Feb 13 '22 at 15:07
  • @JakubWrobel: It's totally normal for shellcode to contain a string like `/bin/sh` and generate its address in a register in a position-independent way (unlike the code in this answer which uses both absolute addresses separately as 32-bit immediates for `add`, instead of `lea str_start(%rip), %rax` / `ret` in an `__attribute__((naked))` function or hand-written asm. Or at worst, use the call/pop trick (which is the only way to read EIP in 32-bit mode) and `add $str_start - get_ip, %rax`, if you didn't want to take advantage of x86-64 RIP-relative addressing.) – Peter Cordes Feb 13 '22 at 17:12
  • See [How to load address of function or label into register](https://stackoverflow.com/q/57212012) – Peter Cordes Feb 13 '22 at 17:12
  • This is very clunky. If you want to use `call` at all, put the `call` right in front of the string so it pushes the actual string address. (Either `call` forward over the string data, or call backward if you want shellcode that doesn't contain a `00` byte.) Otherwise just use a RIP-relative LEA: `lea str_start(%rip), %rax`. – Peter Cordes May 08 '22 at 12:11
  • Either way, don't use the *absolute* addresses (`$str_start` and `$get_ip`) as immediates separately. That will happen to work if you get it to link, since it only matters that the distance is right. Thus you could `add $str_start - get_ip, %rax` since the distance will be an assemble-time constant even though each address might be too big for a 32-bit-sign-extended immediate. – Peter Cordes May 08 '22 at 12:12
  • Also, it's really awkward and not officially supported to use separate `asm` statements for each instruction and jump around between them. Just put this in one asm statement at global scope, or a separate `.s` file since you just want the machine code bytes anyway. `foo: call str_end; .string "hi" ;` `str_end: pop %rax; ret`. Or re-arrange that so the call is backwards, so the rel32 doesn't contain zeros. – Peter Cordes May 08 '22 at 12:15

You have to write the function in asm yourself; a compiler-generated function to return a pointer to a string-literal will put it in .rodata, not with the machine code bytes of the function.

Keep in mind that the pointer will be to a byte inside this array, unless you define the contents of the array in an asm source file or with inline asm, so that you can get the linker to include a reference to a string literal in .rodata. Then you're just writing a function called byte_array and declaring it strangely in C (as a char array instead of a function), which you can totally do.

But usually the only point of having machine-code bytes in an array like this is for them to be self-contained, and position-independent so they'd still work if copied somewhere. Like shellcode, although apparently you're not worried about avoiding 00 bytes like most shellcode would. (Many paths for getting bytes into buffer overflows involve 0-terminated C strings.)

The most straightforward option is an LEA with a RIP-relative addressing mode, an x86-64 feature that makes it trivial to get a nearby (±2 GiB) address into a register in a fully position-independent way.

# AT&T syntax, assemble with   as  or gcc -c foo.s
# .section .data    # or wherever you want to put this.  Or just assemble to get the bytes
byte_array:
    lea  .Lmystring(%rip), %rax      # not shellcode safe, forward LEA has zeros
    ret
 # *not* in a separate section like one would normally use; contiguous with the code
 .Lmystring: .asciz "Hello world\n"

You could put this inside an asm("... \n" "...\n"); statement at global scope in your C source and declare extern char byte_array[]. An assembler doesn't care whether you write ret or .byte 0xc3, regardless of what section you're in, and the C compiler certainly doesn't know.

Or you can just assemble it in a .s by itself and look at the machine code with a disassembler:

$ gcc -c foo.s
$ objdump -drwC -Mintel foo.o

0000000000000000 <byte_array>:
   0:   48 8d 05 01 00 00 00    lea    rax,[rip+0x1]        # 8 <byte_array+0x8>
   7:   c3                      ret    
   8:   48                      rex.W
   9:   65 6c                   gs ins BYTE PTR es:[rdi],dx
   b:   6c                      ins    BYTE PTR es:[rdi],dx
   c:   6f                      outs   dx,DWORD PTR ds:[rsi]
   d:   20 77 6f                and    BYTE PTR [rdi+0x6f],dh
  10:   72 6c                   jb     7e <byte_array+0x7e>
  12:   64 0a 00                or     al,BYTE PTR fs:[rax]
# total bytes: 0x15 = 21

The bytes after the c3 ret are of course just the ASCII codes for the string, but the coding-space for x86 machine code is almost all used; most byte sequences will decode as something. (There are some illegal combinations, though, e.g. some instructions like prefetcht0 require their operand to be memory, not a register, and there are several illegal 1-byte opcodes like the bytes that are aaa or push ds in 32-bit mode.)


To make this shellcode safe avoiding 00, the LEA has to come after the string, so the 4-byte rel32 is negative, so the high bytes are 0xff not 0x00. You can jmp forward over the string, or you could put the string earlier in your payload before a nop slide. Or for your byte array case, instead of calling to the first byte of the array, (char (*)()) (byte_array + 11) or whatever.
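
To see why the direction matters, here is a small sketch that encodes a displacement the way x86 stores a rel32 (little-endian two's complement) and counts its 00 bytes. The helper names `encode_rel32` and `zero_bytes` are hypothetical. Note that a negative displacement only guarantees non-zero high bytes for short distances; the low byte can still be 00 (for example at a distance of exactly -256).

```c
#include <assert.h>
#include <stdint.h>

/* Stores a signed 32-bit displacement in x86 little-endian byte order. */
void encode_rel32(int32_t disp, unsigned char out[4])
{
    uint32_t u = (uint32_t)disp;              /* two's-complement bit pattern */
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)(u >> (8 * i));
}

/* Counts how many of the four encoded bytes are 00. */
int zero_bytes(const unsigned char b[4])
{
    int n = 0;
    for (int i = 0; i < 4; i++)
        n += (b[i] == 0x00);
    return n;
}
```

With these helpers, a forward displacement of +1 encodes as `01 00 00 00` (three zero bytes), while a backward displacement of -20 encodes as `ec ff ff ff` (none).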

But if you do either of those, it can't end with a 00 so you'd have to store a 0, e.g. after xor-zeroing a register. That means the bytes have to be in write+exec memory, which a normal build won't do, although you're using mprotect anyway. Making the array local to a function (so the bytes get copied to the stack) and compiling with gcc -zexecstack will also do that; see "How to get c code to execute hex machine code?"

One terminating 00 byte at the end of your shellcode does get copied by strcat and similar buffer overflows, so we don't need to do anything special to make sure the terminator for the .asciz makes it through.

Another thing you can do is lea string+0x20202020(%rip), %rax / sub $0x20202020, %rax to make all the bytes of the LEA non-zero (and printable ASCII.)
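
The arithmetic behind that bias can be modelled in plain C. This is only a sketch of the address calculation, not generated machine code; `biased_disp`, `resolve` and `all_printable` are illustrative names, and the printable check covers the four displacement bytes only, not the LEA opcode bytes.

```c
#include <assert.h>
#include <stdint.h>

#define BIAS 0x20202020          /* same constant as in the text */

/* Displacement stored in the LEA: the real distance plus the bias. */
int32_t biased_disp(int32_t real_disp)
{
    return real_disp + BIAS;     /* assumes real_disp is small enough not to overflow */
}

/* Value left in RAX after  lea disp(%rip),%rax ; sub $BIAS,%rax
 * where rip is the address of the instruction following the LEA. */
uint64_t resolve(uint64_t rip, int32_t real_disp)
{
    uint64_t after_lea = rip + (uint64_t)(int64_t)biased_disp(real_disp);
    return after_lea - BIAS;
}

/* Returns 1 if all four little-endian bytes of v are printable ASCII. */
int all_printable(int32_t v)
{
    uint32_t u = (uint32_t)v;
    for (int i = 0; i < 4; i++) {
        unsigned char b = (unsigned char)(u >> (8 * i));
        if (b < 0x20 || b > 0x7e)
            return 0;
    }
    return 1;
}
```

For a real distance of 1, the stored displacement becomes 0x20202021 (bytes `21 20 20 20`), and the subtraction brings the result back to `rip + 1`. Distances above 0x5e push the low byte past 0x7e, so the printable property only holds for short forward distances.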


call/pop shellcode trick

It could save a byte of code-size to use a common shellcode trick of using a call to push the address of the byte following it (where you put the string), and then pop that.

(@0___________'s answer uses a very cumbersome variation on that where they use call/pop to get RIP, like 32-bit mode where RIP-relative addressing isn't available. Then separately add one absolute address and subtract another, instead of just adding the difference.)

# .section .data
byte_array:                    # very inefficient, use LEA instead
            jmp .Ldo_call     # the call has to be backwards to avoid 00 bytes in its rel32, but jmp rel8 can go forward
.Lback:     pop  %rax
            ret
.Ldo_call:  call .Lback       # pushes string address, jumps backwards
            .asciz "Hello world\n"

This is the standard shellcode version, making sure the call goes backwards so that its rel32 contains no 00 bytes.

This one is bad for performance because it has a call not matched with a ret, and the call's offset isn't +0 (which is special-cased for return-address prediction for the call/pop trick that 32-bit PIC code uses to find its own EIP.)
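
The jmp/call/pop listing can be checked without running it. The bytes below are a hand-assembled encoding of that listing (an assumption: they were worked out by hand for this sketch, not taken from the answer), and the helpers with made-up names confirm three things: the call pushes the offset of the string, its target is the `pop`, and the only 00 byte is the string terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hand-assembled encoding of the jmp / pop / ret / call listing. */
static const unsigned char shell_blob[] = {
    0xeb, 0x02,                      /* 0: jmp  .Ldo_call   (rel8 = +2, to offset 4) */
    0x58,                            /* 2: .Lback: pop rax                           */
    0xc3,                            /* 3: ret                                       */
    0xe8, 0xf9, 0xff, 0xff, 0xff,    /* 4: .Ldo_call: call .Lback   (rel32 = -7)     */
    'H', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '\n', 0   /* 9: string    */
};

enum { CALL_OFFSET = 4, CALL_LENGTH = 5 };

/* Offset of the return address the call pushes: the byte right after it. */
size_t pushed_offset(void)
{
    return CALL_OFFSET + CALL_LENGTH;
}

/* Offset the call jumps to: return address plus the signed rel32. */
size_t call_target(void)
{
    const unsigned char *p = shell_blob + CALL_OFFSET + 1;
    uint32_t rel32 = (uint32_t)p[0]       | (uint32_t)p[1] << 8 |
                     (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
    return pushed_offset() + (size_t)(int32_t)rel32;   /* unsigned wrap gives 9 - 7 */
}

/* Number of 00 bytes, not counting the final string terminator. */
int zero_bytes_before_terminator(void)
{
    int n = 0;
    for (size_t i = 0; i + 1 < sizeof shell_blob; i++)
        n += (shell_blob[i] == 0x00);
    return n;
}
```

The pushed offset is 9, where the string starts, and the call target is 2, the `pop %rax`. That is exactly the property the trick relies on: `pop` retrieves the string's address and `ret` hands it back to the caller.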

If you don't care about being shellcode, just call forward over the bytes whose address you want to push.

# .section .data
byte_array:
      call 0f            # using GAS local label syntax, forward to next 0:
   .asciz "Hello world\n"
0:    pop  %rax
      ret

The call rel32 itself will have the last 3 bytes being 0, and the .asciz contains a 0 byte before the 58 pop rax / C3 ret, as you can see:

   0:   e8 0d 00 00 00          call   12 <byte_array+0x12>
   5:   48                      rex.W
   ... ASCII bytes ...
   f:   64 0a 00                or     al,BYTE PTR fs:[rax]
  12:   58                      pop    rax
  13:   c3                      ret
# total bytes: 0x14 = 20.

So it's one byte shorter than the LEA version, the same as if both were using a forward jmp rel8 to avoid 00 bytes. (Although as I said, for LEA that alone doesn't solve the problem if you want your string to be 0 terminated.)

Peter Cordes