0

So in our university assignment we were asked to change the output of two sequential printf("%s", s1); printf("%s", s2); functions without touching the variables. The intent was for us to use the theoretical knowledge of the memory layout of the process on Linux based systems.

The expected output is for the first printf call to output both s1 and s2 split by space(s), and for the output of the second to remain as intended by the original application. The values and sizes of s1, and s2 are known.

My initial idea was to malloc(0) and subtract the offset equal to the length of the string (+1 for the chunk size value) , and then cast that as a char *. Since the 2 string values are very small (definitely less than 4KiB, which is the page size) I was hoping that there'd be just one region allocated for heap, and hence it is linear.

But from what it seems I get values of zero, which means I'm looking through an un-initialized memory, or something different then the strings I hoped to locate.

Below is the code in question:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

void heap_attack() { 
    // alternatively it can have signature of 
    // void heap_attack(void * v);
    // but the task is assumed to be solvable with the original signature
    
}

int main(int argc, char const *argv[])
{
    char * s1 = malloc(9);
    char * s2 = malloc(9);

    if (s1 == NULL || s2 == NULL) return EXIT_FAILURE;

    strcpy(s1, "s0000000");
    strcpy(s2, "s1111111");

    heap_attack();

    printf("student 1: %s\n", s1);
    printf("student 2: %s\n", s2);

    free(s1);
    free(s2);

    return 0;
}

My implementation of the heap_attack begins as follows:

void heap_attack() {
  char * heap_end = malloc(0) - 1; // 1 for the size fragment preceding writable space
  char * s1 = heap_end - 9;
  printf("%s", s1); // here i expected to see the s1111111 string printed to stdout
}
trincot
  • 317,000
  • 35
  • 244
  • 286
Nikolai Savulkin
  • 697
  • 5
  • 12
  • Show the full program you are asked to modify the output for. You might have misunderstood the assignment. – Eugene Sh. May 18 '22 at 16:53
  • @EugeneSh. I've added it – Nikolai Savulkin May 18 '22 at 17:04
  • try malloc(9) instead? – user253751 May 18 '22 at 17:05
  • @user253751 why would that make a difference? I use that malloc just to grab the current limit of the heap? – Nikolai Savulkin May 18 '22 at 17:07
  • OK, malloc-ing some dummy buffer and going from there seems to be the right direction. It is just `malloc` is not expected to allocate any valid memory for size `0`, so replace this `0` with some positive value you can offset later. – Eugene Sh. May 18 '22 at 17:07
  • Re "*The intent was for us to use the theoretical knowledge of the memory layout.*" "*subtract the offset equal to the length of the string*", Then you haven't learned your lesson. [This](https://godbolt.org/z/nxW83vv4n) shows the difference between `student1` and `student2` isn't 9, so why would it be 9 between one of them and you're new block? – ikegami May 18 '22 at 17:09
  • @EugeneSh. i just had similar idea so i malloced 1 byte and offset 2 instead of 1, however that still goes to zero straight away. – Nikolai Savulkin May 18 '22 at 17:09
  • @ikegami why so? in the first line i malloc 0 (or 1) and then offset 1 (or 2) to get to the end of the string, passing the freshly allocated size fragment. Then I offset again by 9, which is length of the string + null terminator. I thought one char consumes exactly one byte, so it should be 9 bytes offset from the end of the string? – Nikolai Savulkin May 18 '22 at 17:13
  • It's not the size of the string that's wrong. The string is indeed 9 bytes in size. – ikegami May 18 '22 at 17:14
  • If I were to guess from the numbers given, it allocates in blocks of 16 bytes, and there's 16 bytes of overhead, possibly preceding the start address. – ikegami May 18 '22 at 17:16
  • uh, note that my earlier link accidentally used `-fsanitize=address`, which you could have an effect. That said, removing that still gives the same delta. – ikegami May 18 '22 at 17:17
  • @ikegami could you elaborate on why you reason in such manner? I also did did try to offset assuming that size fragment is equal to sizeof(size_t), but i never considered that it over allocates the actual data part. I mean what would be the reason for that? – Nikolai Savulkin May 18 '22 at 17:19
  • Student exercises like this one, involving exploiting specific software, should be associated with specific documentation for the specific software provided by the instructor. They are not appropriate for general-knowledge questions on Stack Overflow, certainly not if a specific implementation of the C standard library is not identified, including its specific version number. – Eric Postpischil May 18 '22 at 17:21
  • Re "*could you elaborate on why you reason in such manner?*", First of all, the memory needs to be suitably aligned for any data type, since malloc has no idea how the memory will be used. Here, we're probably talking about 64-bit alignment, so the amount of memory used will have to be a multiple of 8. Therefore, at least 16 bytes are reserved to store 9 bytes of data. – ikegami May 18 '22 at 17:36
  • Then there's necessarily overhead for each block (size, etc). Storing it in the block makes more sense than storing it elsewhere. So that's some of the space used up. 8 bytes seems kinda small. 16 is probably the minimum. // 16 bytes for data and 16 bytes for overhead sums up exactly to the amount seen. So it makes it a good guess. Doesn't mean it's right, though. – ikegami May 18 '22 at 17:37
  • @EricPostpischil This assignment was not instructed to be tested on a single setup, while It might be more specific when it comes to general case, this question still involves common theoretical knowledge, and at least in my humble opinion is adequate. Perhaps the C tag should be accompanied by some more general "theoretical" one, but aside of that it doesnt look improper to me. – Nikolai Savulkin May 18 '22 at 17:37
  • Related reading: [The Lost Art of Structure Packing](http://www.catb.org/esr/structure-packing/), sections 3, 4 and 5. – ikegami May 18 '22 at 17:41
  • 1
    @HerbalTea: Re “This assignment was not instructed to be tested on a single setup”: That is a defect in the assignment. The C standard does not specify any requirements for the layout of allocated memory, so an assignment like this **must** depend on specifics of a C implementation. – Eric Postpischil May 18 '22 at 17:42

2 Answers2

1

Assuming that you are working on GNU/Linux using glibc, which is the most common setup, then you can make some assumptions that will help you solve the problem.

  1. As you say, both allocated chunks will reside into the same newly initialized (linear) heap, which usually spans multiple pages (several KBs). FYI: page size on Linux x86 is 4K, not 4M.
  2. Right after initialization of the heap, consecutive allocations (malloc() calls without any free() in the middle) will allocate contiguous chunks of memory, so the first allocated string will be right before the second.
  3. You can know the structure used by the glibc allocator by looking at the source code (pick the correct version, running /lib/x86_64-linux-gnu/libc.so.6 will print the version). You can also look at this other answer of mine where I briefly explain the internal layout of a malloc_chunk.
  4. Either by looking at the source code or by testing, you can notice that malloc'd chunks are actually rounded up in size to multiples of 2 * size_t.

Assumptions 1 and 2 (which again we can make in this specific environment) guarantee that:

  • s2 > s1 (that is, s2 is in memory after s1)
  • There should be exactly (s2 - s1 - strlen(s2) - 1 bytes between the two strings, and this value will not change unless strlen(s2) changes.
  • The next allocation malloc(x) will be after s2, and always at the same fixed offset from s2, which you can easily calculate once and then use (assuming s2 keeps the same length).

Assumption 3 above will help you figure out the actual size of the chunks for the calculations you need. For malloc(9) the corresponding chunk (header included) would be 32 bytes (16 for header + 16 for data assuming sizeof(size_t) == 8). Furthermore malloc_usable_size() will give you the exact size excluding the header.

Filling up those bytes with spaces will accomplish what you wish for. However, in doing so, you will destroy the chunk header for s1, and any later attempt at freeing the (corrupted) chunk will most likely cause a crash. You can avoid free() altogether in your case since you don't really need it. The operating system will reclaim memory anyway on program termination.

A possible implementation of your heap_attack() would be:

void heap_attack(void) {
    char *top_chunk = malloc(1);
    char *top_chunk_header = top_chunk - 2 * sizeof(size_t);
    char *s2 = top_chunk_header - 16;       // assuming strlen(s2) <= 15
    char *s1 = s2 - 2 * sizeof(size_t) - 16 // assuming strlen(s1) <= 15;

    // Fill memory between s1 + 8 and s2 with spaces
}
Marco Bonelli
  • 63,369
  • 21
  • 118
  • 128
1
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define STRING_DISTANCE         (0x20)
#define ARBITRARY_STACK_OFFSET  (20)

// The strategy is as follows: We can tell the difference from the contigous heap blocks (0x20) in advance (at least for the same machine)
// So given s1 - we would like to simply overwrite everything between with spaces - that way when it prints s1, it will print spaces until s2 and its null terminator
// The only challenge is to find s1, the way we do that here is by simply traveling up the stack and finding our two strings, assuming we know their offsets.
void heap_attack()
{ 
    size_t a = 0;
    void * s1 = 0;

    for (uintptr_t * stack_ptr = &a; a < ARBITRARY_STACK_OFFSET; a += 1)    // Travel up the stack, from a variable in our frame, to the function calling us.
    {
        if (stack_ptr[a] - (uintptr_t)s1 == STRING_DISTANCE)
        {
            printf("stack offset - %lu\ns1 - %p\n", a, (void *)stack_ptr[a]);
            break;
        }

        s1 = stack_ptr[a];
    }

    for (char * x = (char *)s1 + strlen(s1); x < s1 + STRING_DISTANCE; x += 1)
    {
        *x = ' ';
    }
}

int main()
{
    char * s1 = malloc(9);
    char * s2 = malloc(9);

    printf("%p - %p\n", s1, s2);

    if (s1 == NULL || s2 == NULL) return EXIT_FAILURE;

    strcpy(s1, "s0000000");
    strcpy(s2, "s1111111");

    heap_attack();

    printf("student 1: %s\n", s1);
    printf("student 2: %s\n", s2);

    // I disabled the frees because I corrupted the blocks, corrupting the blocks is neccessary so it would print spaces between them
    // If we don't want to corrupt the blocks, another option would be to also override the format string, but that's outside the scope of this challenge imo
    // free(s1);
    // free(s2);


    return 0;
}

And if we choose the heap solution:

void heap_attack()
{ 
    char * s1 = (char *)malloc(9) - STRING_DISTANCE * 2;

    for (char * x = (char *)s1 + strlen(s1); x < s1 + STRING_DISTANCE; x += 1)
    {
        *x = ' ';
    }
}
malkaroee
  • 187
  • 1
  • 6
  • 1
    So what you do there, is in fact go up the stack and obtain the heap pointers from the stack, rather then traversing the heap (if its even possible, i don't know now), right? – Nikolai Savulkin May 18 '22 at 17:29
  • @HerbalTea yes, exactly. It's definitely possible to go through the heap as well, with a 9 allocation - and it probably is the intended solution. The reason I didn't go down that path is actually an interesting bug I had, I tried to allocate another 9-lengthed string in `heap_attack()`, but it fell in another page. Eventually I found out that the reason for it is actually me printing the pointers in main! I will now post the heap solution :) – malkaroee May 18 '22 at 17:37
  • 1
    Thank you for clarification, In fact the stack based solution looks more elegant to me as it does not require that much traversing. I would base my solution on that one instead (ofc crediting the original source). Thanks you, that is a nice one to remember – Nikolai Savulkin May 18 '22 at 17:41