I'm learning C and don't understand how to handle the case where memory areas returned by malloc overlap. Here is a small demo program:

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <stdint.h>

void print_mem(char *from, int64_t amount) {
    int step = 0x8;
    for (int i = 0; i < amount; i = i + step) {
        printf("%x = ", from + i);
        for (int j = 0; j < step; ++j) {
            int c = *(from + i + j);
            if (isprint(c) == 0) {
                c = ' ';
            }
            printf("%02x (%c) ", *(from + i + j), c);
        }
        printf("\n");
    }
}

int main() {
    // len = 48
    // 222222222222222222222222222222222222222222222222
    // 111111111111111111111111111111111111111111111111

    char *str1 = (char *) malloc(0x10 * sizeof(char));
    printf("str1: 0x%x\n", str1);
    char *str2 = (char *) malloc(0x10 * sizeof(char));
    printf("str2: 0x%x\n", str2);

    printf("\n\nInitial memory layout\n");
    print_mem(str1, 0x80);

    printf("\n\nType str2: ");
    scanf("%s", str2);
    printf("Memory after scanf str2\n");
    print_mem(str1, 0x80);

    printf("\n\nType str1: ");
    scanf("%s", str1);
    printf("Memory after scanf str1\n");
    print_mem(str1, 0x80);

    printf("\n\nstr2 = %s\n", str2);
    printf("str1 = %s\n", str1);

    return 0;
}

And the output

str1: 0x64c010
str2: 0x64c020


Initial memory layout
64c010 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c018 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c020 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c028 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c030 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c038 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c040 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c048 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c050 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c058 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c060 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c068 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c070 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c078 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c080 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c088 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 


Type str2: 222222222222222222222222222222222222222222222222
Memory after scanf str2
64c010 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c018 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c020 = 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 
64c028 = 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 
64c030 = 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 
64c038 = 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 
64c040 = 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 
64c048 = 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 
64c050 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c058 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c060 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c068 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c070 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c078 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c080 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c088 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 


Type str1: 111111111111111111111111111111111111111111111111
Memory after scanf str1
64c010 = 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 
64c018 = 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 
64c020 = 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 
64c028 = 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 
64c030 = 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 
64c038 = 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 31 (1) 
64c040 = 00 ( ) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 
64c048 = 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 32 (2) 
64c050 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c058 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c060 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c068 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c070 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c078 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c080 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 
64c088 = 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 00 ( ) 


str2 = 11111111111111111111111111111111
str1 = 111111111111111111111111111111111111111111111111

I tried scanf("%s10", str2); but it also did not work as expected when I entered long lines.

zvar
  • The memory allocated by `malloc` is *not* initialized in any way. Its contents are *indeterminate*. Therefore it might contain values that are not equal to the string null terminator, and printing the contents of the strings could then lead to *undefined behavior*. – Some programmer dude Jan 28 '23 at 11:08
  • And your code is going out of bounds of the allocated memory, which *also* leads to undefined behavior. In short, your code does bad things, its behavior can't be relied upon, and there are no conclusions to be drawn from building or running it. – Some programmer dude Jan 28 '23 at 11:10
  • Those allocations **don't** overlap. They are just placed in consecutive locations in memory. This is typical and likely desired behavior, even though the C standard puts no constraints on it. Why do you want to "handle" this? A valid program should not care about it. – tstanisl Jan 28 '23 at 11:10
  • As for the overlapping bit, there's no overlap. Each `malloc` call will allocate unique and non-overlapping memory from the heap. Any overlap you see comes from the bad behavior of your code. – Some programmer dude Jan 28 '23 at 11:11
  • On a different note, to print a `void *` pointer (you *must* cast your pointers) use the `%p` format specifier, not `%x`, which is for `unsigned int`. Mismatching format and argument type also leads to undefined behavior. – Some programmer dude Jan 28 '23 at 11:12
  • They aren't overlapping. You allocate (in the most jerk way possible) 16 bytes for `str1` and 16 bytes for `str2` (allocations you erroneously fail to check, btw). When you then input data into the memory at `str2`, does it impact memory at `str1`? No! Because they do not overlap. When you input data into the memory at `str1` it *does* overwrite memory at `str2`, but that's because you have a buffer overflow. Don't use `scanf()` for strings. – torstenvl Jan 28 '23 at 11:21
  • Don't cast the return of `malloc()` and don't multiply by the size of a type. You're just being cruel to people reading your code. https://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc – torstenvl Jan 28 '23 at 11:24

1 Answer

From The Open Group Base Specifications:

The malloc() function shall allocate unused space for an object whose size in bytes is specified by size and whose value is unspecified.

The order and contiguity of storage allocated by successive calls to malloc() is unspecified. The pointer returned if the allocation succeeds shall be suitably aligned so that it may be assigned to a pointer to any type of object and then used to access such an object in the space allocated (until the space is explicitly freed or reallocated). Each such allocation shall yield a pointer to an object disjoint from any other object. The pointer returned points to the start (lowest byte address) of the allocated space. If the space cannot be allocated, a null pointer shall be returned. If the size of the space requested is 0, the behavior is implementation-defined: the value returned shall be either a null pointer or a unique pointer.

Each allocation yields a pointer to an object disjoint from any other object, so no memory overlaps. The order and contiguity are also unspecified: the allocations may or may not be stored in consecutive memory locations.

What you have here is a heap-based buffer overflow. You're overwriting the contents of str2 when you write past the end of str1, because the allocations happen to be placed consecutively in memory in this case. Does this mean that the allocations overlap? No, they do not.
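To make the "disjoint but possibly adjacent" point concrete, here is a minimal sketch that checks whether two allocated ranges share any byte. The helper names ranges_overlap and two_mallocs_overlap are hypothetical and not part of any library. The sketch also assumes that uintptr_t is available and that converting pointers to it preserves their ordering, which holds on common flat-address platforms but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if the byte ranges [a, a + a_len) and
 * [b, b + b_len) share at least one byte, 0 otherwise. The pointers are
 * converted to uintptr_t because relational comparison of pointers into
 * different objects is undefined in C. */
static int ranges_overlap(const void *a, size_t a_len,
                          const void *b, size_t b_len)
{
    assert(a != NULL && b != NULL);
    uintptr_t a_lo = (uintptr_t)a, a_hi = a_lo + a_len;
    uintptr_t b_lo = (uintptr_t)b, b_hi = b_lo + b_len;
    return a_lo < b_hi && b_lo < a_hi;
}

/* Allocates two 16-byte blocks, as in the question, and reports whether
 * they overlap. Returns -1 if either allocation fails. */
static int two_mallocs_overlap(void)
{
    char *str1 = malloc(16);
    char *str2 = malloc(16);
    if (str1 == NULL || str2 == NULL) {
        free(str1);
        free(str2);
        return -1;
    }
    int result = ranges_overlap(str1, 16, str2, 16);
    printf("str1: %p\nstr2: %p\noverlap: %d\n",
           (void *)str1, (void *)str2, result);
    free(str1);
    free(str2);
    return result;
}
```

Calling two_mallocs_overlap() should report no overlap, even when the two addresses are only 0x10 apart as in the question's output. Adjacent is not the same as overlapping.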

That being said, you should:

  1. Check the return value of malloc(). It returns NULL on failure.
  2. Not cast the result of malloc() and family.¹ It's redundant and may hide a bug.
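The two points above could look like the following sketch. The helper name alloc_buffer is made up for illustration, and writing an empty string into the new buffer is an extra choice of this sketch, not something malloc() does.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocates `size` bytes, or prints a message and
 * returns NULL if the allocation fails. No cast is needed because
 * void * converts implicitly to any object pointer type in C. */
static char *alloc_buffer(size_t size)
{
    assert(size > 0);
    char *buf = malloc(size);
    if (buf == NULL) {
        fprintf(stderr, "malloc of %zu bytes failed\n", size);
        return NULL;
    }
    buf[0] = '\0';  /* malloc does not initialize; start as an empty string */
    return buf;
}
```

A caller would write something like char *str1 = alloc_buffer(0x10); and stop if the result is NULL, then call free(str1) once the buffer is no longer needed.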

Aside: The correct format specifier to print a pointer is %p, and the pointer must be cast to void *.

printf("str1: %p\n", (void *) str1); 

Else the code invokes undefined behaviour.

Re:

I tried scanf("%s10", str2); but it also did not work as expected when I entered long lines.

Answer: scanf() should not be used as a user-input interface. Switch to fgets. The call to scanf() can be replaced with:

fgets (buf, sizeof buf, stdin);
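Note that sizeof buf only gives the buffer size when buf is an array. For a pointer returned by malloc(), as in the question, the size has to be passed explicitly. Here is a minimal sketch under that assumption. The helper name read_line is hypothetical, and stripping the newline is a convenience added here.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads at most size - 1 characters of one line from
 * `in` into buf and strips the trailing newline, if one was read.
 * Returns buf on success, NULL on end-of-file or read error. */
static char *read_line(char *buf, size_t size, FILE *in)
{
    assert(buf != NULL && in != NULL && size > 0 && size <= INT_MAX);
    if (fgets(buf, (int)size, in) == NULL) {
        return NULL;
    }
    buf[strcspn(buf, "\n")] = '\0';  /* no-op if the line was truncated */
    return buf;
}
```

With the question's 16-byte allocation, read_line(str2, 0x10, stdin) stores at most 15 characters plus the terminator. Any extra input stays in the stream for the next read instead of spilling into neighbouring memory.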

But if you insist upon using scanf(), then

scanf("%s10", str2); 

should be:

scanf ("%9s", str2);

And check its return value.
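The field width has to sit between % and s. In "%s10" the trailing 10 is treated as literal characters to match after an unbounded %s, so it offers no protection. The sketch below uses sscanf() on a string so the behaviour is easy to reproduce. The helper name read_word16 is hypothetical, and the width 15 is chosen to fit the 16-byte buffers from the question.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the first whitespace-delimited word of
 * `input` into a 16-byte buffer. "%15s" stores at most 15 characters
 * plus the terminating '\0'. Returns the number of characters stored,
 * or -1 if no word was found. */
static int read_word16(const char *input, char out[16])
{
    assert(input != NULL && out != NULL);
    if (sscanf(input, "%15s", out) != 1) {
        return -1;  /* check the return value: nothing was converted */
    }
    return (int)strlen(out);
}
```

Passing the 48-character line from the question to read_word16() stores only the first 15 characters, so the write stays inside the buffer.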

[1] Do I cast the result of malloc? https://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc

Harith
  • Thanks for answering. As I understand the general idea, it is my responsibility to make sure that memory is populated correctly. I should somehow check for the case where only 16 bytes were allocated and more than 16 bytes are about to be written into memory. The C language does not care about this; it just allocates the requested amount of memory. – zvar Jan 28 '23 at 19:18