printf("sub_str before pattern: %s\r\n", sub_s1 - source_str); // Memory corruption
You're taking the difference of two pointers, and printing it as though it were a pointer to a string. The result of the subtraction is an integer (of type ptrdiff_t): the number of characters between the two pointers. In practice, on your machine, printf probably interprets this small number as a memory address. A small address like this is most likely unmapped on your system, so your program crashes. But depending on the platform, on the compiler, on optimization settings, on what else there is in your program, and on the phase of the Moon, anything could happen. It's undefined behavior.
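If the intent was to print the part of source_str that comes before the match, one way is to pass the pointer difference as a precision to %.*s rather than as the string argument. The sketch below assumes sub_s1 came from strstr (the original code doesn't show this); the helper name prefix_before and the output buffer are hypothetical, introduced only so the result can be checked.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the part of source_str that precedes the
   first occurrence of pattern into out (truncated to out_size - 1
   characters, always null-terminated).
   Returns the prefix length, or -1 if pattern does not occur. */
int prefix_before(const char *source_str, const char *pattern,
                  char *out, size_t out_size)
{
    /* Assumption: the match pointer is obtained with strstr. */
    const char *sub_s1 = strstr(source_str, pattern);
    if (sub_s1 == NULL || out_size == 0) {
        return -1;
    }
    /* The pointer difference is a count of characters, not a string. */
    int prefix_len = (int)(sub_s1 - source_str);
    /* %.*s takes an int precision followed by the string: it prints at
       most prefix_len characters starting at source_str. */
    snprintf(out, out_size, "%.*s", prefix_len, source_str);
    return prefix_len;
}
```

The same format works directly with printf: printf("sub_str before pattern: %.*s\r\n", (int)(sub_s1 - source_str), source_str);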
Any half-decent compiler would tell you that there's a type mismatch between the %s directive and the argument. Turn those warnings on. For example, with GCC:
gcc -Wall -Wextra -Werror -O my_program.c
char *new_str = (char *)malloc(…);
strcat(new_str, '\0');
strcat(new_str, "…");
The first call to strcat attempts to append '\0'. This is a character, not a string. Since this is the character 0, and C doesn't distinguish between characters and numbers, it is just a weird way of writing the integer 0. And any integer constant expression with the value 0 is a valid null pointer constant. So strcat(new_str, '\0') is equivalent to strcat(new_str, NULL), which will probably crash when it attempts to dereference the null pointer. Depending on the compiler optimizations, it's also possible that the compiler will decide that this block of code is never executed: dereferencing a null pointer is undefined behavior, so as far as the compiler is concerned, this can't happen. This is a case where you can plausibly expect undefined behavior to cause the compiler to do something that looks preposterous, but makes perfect sense from the way the compiler sees the program.
Even if you'd written strcat(new_str, "\0") as you probably intended, that would be pointless. As far as string functions are concerned, "\0" is just a roundabout way of writing "": there's always a null terminator at the end of a string literal¹, and a string ends at its first null byte. And appending an empty string to a string doesn't change it.
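A quick way to convince yourself is to compare the two literals. This is a minimal sketch; the function names are made up for illustration. The only observable difference is the size of the underlying array, because "\0" stores the explicit null byte plus the implicit terminator.

```c
#include <string.h>

/* Both literals are empty as far as string functions are concerned:
   strlen stops at the first null byte. */
size_t length_of_backslash_zero(void) { return strlen("\0"); }
size_t length_of_empty(void)          { return strlen(""); }

/* The arrays differ in size: "\0" holds two null bytes, "" holds one. */
size_t size_of_backslash_zero(void) { return sizeof "\0"; }
size_t size_of_empty(void)          { return sizeof ""; }

/* Appending either literal to a properly terminated string is a no-op.
   Returns 1 if the buffer is unchanged afterwards. */
int append_empty_changes_nothing(void)
{
    char buf[16] = "abc";
    strcat(buf, "\0");
    strcat(buf, "");
    return strcmp(buf, "abc") == 0;
}
```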
And there's another problem with the strcat calls. At this point, the content of new_str is not initialized. But strcat (if called correctly, even for strcat(new_str, ""), unless the compiler optimizes this away) scans this uninitialized memory looking for the first null byte. Because the memory is uninitialized, there's no guarantee that the allocated block contains a null byte, so strcat may run off the end of the buffer and read from an unmapped address, or it may find a null byte somewhere past the end and start writing there, corrupting whatever happens to be stored at that location. Or it may make demons fly out of your nose: once again it's undefined behavior.
Before you do anything with the newly allocated memory area, make it contain the empty string: set the first character to 0. And before that, check that malloc succeeded. It will always succeed in your toy program, but not in the real world.
char *new_str = malloc(…);
if (new_str == NULL) {
    return NULL; // or whatever you want to do to handle the error
}
new_str[0] = 0;
strcat(new_str, …);
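Put together, the pattern might look like the sketch below. The function name concat_copy, its two parameters, and the size computation are assumptions made for the sake of a complete example; your code elides the allocation size, so adapt that part to whatever you actually need to store.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding first
   followed by second, or NULL if the allocation fails.
   The caller is responsible for calling free() on the result. */
char *concat_copy(const char *first, const char *second)
{
    /* Assumed size: both strings plus one byte for the terminator. */
    size_t size = strlen(first) + strlen(second) + 1;
    char *new_str = malloc(size);
    if (new_str == NULL) {
        return NULL; /* report the failure to the caller */
    }
    new_str[0] = '\0';        /* make the buffer a valid empty string */
    strcat(new_str, first);   /* now strcat finds a terminator at once */
    strcat(new_str, second);
    return new_str;
}
```

A caller would check the result for NULL before using it and free it when done.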
¹ The only time there isn't a null terminator at the end of a "…" is when you use it to initialize an array and the characters that are spelled out fill the whole array without leaving room for a null terminator.
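To illustrate the footnote, here is a small sketch (the function and array names are made up). Note that this is valid in C but not in C++, and some compilers may warn about the exact-fit initializer.

```c
#include <string.h>

/* Returns 1 if the arrays have the layout described in the footnote. */
int exact_fit_has_no_terminator(void)
{
    char exact[3] = "abc"; /* three characters, no room for a terminator */
    char roomy[4] = "abc"; /* the fourth element holds the terminator */

    /* exact is NOT a string: passing it to strlen or strcat would be
       undefined behavior, so only memcmp with an explicit length is used. */
    return sizeof exact == 3
        && memcmp(exact, "abc", 3) == 0
        && roomy[3] == '\0';
}
```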