2

So I read that strcat function is to be used carefully as the destination string should be large enough to hold contents of its own and source string. And it was true for the following program that I wrote:

#include <stdio.h>
#include <string.h>

int main(){
    char *src, *dest;
    printf("Enter Source String : ");
    fgets(src, 10, stdin);
    printf("Enter destination String : ");
    fgets(dest, 20, stdin);
    strcat(dest, src);
    printf("Concatenated string is %s", dest);
    return 0;
}

But not true for the one that I wrote here:

#include <stdio.h>
#include <string.h>

int main(){
    char src[11] = "Hello ABC";
    char dest[15] = "Hello DEFGIJK";
    strcat(dest, src);
    printf("concatenated string %s", dest);
    getchar();
    return 0;
}

This program ends up adding both without considering that destination string is not large enough. Why is it so?

Micha Wiedenmann
  • 19,979
  • 21
  • 92
  • 137
  • 14
    Undefined behavior includes "appears to work". – Andrew Henle Aug 15 '18 at 14:22
  • 1
    You probably expected some kind of error popping up. While there was no error reported, your code is still wrong in the sense that: It is not defined what your code will do when executed. This might be a subtle point but an important point. See for example: https://stackoverflow.com/questions/2397984/undefined-unspecified-and-implementation-defined-behavior or https://en.wikipedia.org/wiki/Undefined_behavior – Micha Wiedenmann Aug 15 '18 at 14:26
  • Welcome to the site! Check out the [how-to-ask page](https://stackoverflow.com/help/how-to-ask) for more about asking questions that will attract quality answers. You can [edit your question](https://stackoverflow.com/posts/51860543/edit) to include more information if commenters request it. Thanks for including the code you tried --- what output did you get from both programs? – cxw Aug 15 '18 at 14:27
  • Possible duplicate of [Undefined, unspecified and implementation-defined behavior](https://stackoverflow.com/questions/2397984/undefined-unspecified-and-implementation-defined-behavior) – Micha Wiedenmann Aug 15 '18 at 14:29
  • 2
    Your first program is even worse then your second one. Where do the pointers `src` and `dest` point?? – Steve Summit Aug 15 '18 at 14:31
  • It is your job to determine if the destination is large enough. – Christian Gibbons Aug 15 '18 at 14:31
  • 1
    Where do you think the input is written by `fgets(src, 10, stdin);`; especially in case of the first code example? That is an important foundation for the discussion where strcat operates. – Yunnosch Aug 15 '18 at 14:34
  • Err... this is because the "strcat function in C assumes the destination string is large enough to hold contents of source string and its own”. As in, this is the caller's responsibility. Not the function, not the compiler. – Lundin Aug 15 '18 at 14:46
  • Also please study [Crash or “segmentation fault” when data is copied/scanned/read to an uninitialized pointer](https://stackoverflow.com/questions/37549594/crash-or-segmentation-fault-when-data-is-copied-scanned-read-to-an-uninitializ). – Lundin Aug 15 '18 at 14:48

5 Answers5

10

The strcat function has no way of knowing exactly how long the destination buffer is, so it assumes that the buffer passed to it is large enough. If it's not, you invoke undefined behavior by writing past the end of the buffer. That's what's happening in the second piece of code.

The first piece of code is also invalid because both src and dest are uninitialized pointers. When you pass them to fgets, it reads whatever garbage value they contain, treats it as a valid address, then tries to write values to that invalid address. This is also undefined behavior.

One of the things that makes C fast is that it doesn't check to make sure you follow the rules. It just tells you the rules and assumes that you follow them, and if you don't bad things may or may not happen. In your particular case it appeared to work but there's no guarantee of that.

For example, when I ran your second piece of code it also appeared to work. But if I changed it to this:

#include <stdio.h>
#include <string.h>

int main(){
    char dest[15] = "Hello DEFGIJK";
    strcat(dest, "Hello ABC XXXXXXXXXX");
    printf("concatenated string %s", dest);
    return 0;
}

The program crashes.

dbush
  • 205,898
  • 23
  • 218
  • 273
8

I think your confusion is not actually about the definition of strcat. Your real confusion is that you assumed that the C compiler would enforce all the "rules". That assumption is quite false.

Yes, the first argument to strcat must be a pointer to memory sufficient to store the concatenated result. In both of your programs, that requirement is violated. You may be getting the impression, from the lack of error messages in either program, that perhaps the rule isn't what you thought it was, that somehow it's valid to call strcat even when the first argument is not a pointer to enough memory. But no, that's not the case: calling strcat when there's not enough memory is definitely wrong. The fact that there were no error messages, or that one or both programs appeared to "work", proves nothing.

Here's an analogy. (You may even have had this experience when you were a child.) Suppose your mother tells you not to run across the street, because you might get hit by a car. Suppose you run across the street anyway, and do not get hit by a car. Do you conclude that your mother's advice was incorrect? Is this a valid conclusion?

In summary, what you read was correct: strcat must be used carefully. But let's rephrase that: you must be careful when calling strcat. If you're not careful, all sorts of things can go wrong, without any warning. In fact, many style guides recommend not using functions such as strcat at all, because they're so easy to misuse if you're careless. (Functions such as strcat can be used perfectly safely as long as you're careful -- but of course not all programmers are sufficiently careful.)

Steve Summit
  • 45,437
  • 7
  • 70
  • 103
4

The strcat() function is indeed to be used carefully because it doesn't protect you from anything. If the source string isn't NULL-terminated, the destination string isn't NULL-terminated, or the destination string doesn't have enough space, strcat will still copy data. Therefore, it is easy to overwrite data you didn't mean to overwrite. It is your responsibility to make sure you have enough space. Using strncat() instead of strcat will also give you some extra safety.

Edit Here's an example:

#include <stdio.h>
#include <string.h>

int main()
{
    char s1[16] = {0};
    char s2[16] = {0};
    strcpy(s2, "0123456789abcdefOOPS WAY TOO LONG");
      /* ^^^ purposefully copy too much data into s2 */
    printf("-%s-\n",s1);
    return 0;
}

I never assigned to s1, so the output should ideally be --. However, because of how the compiler happened to arrange s1 and s2 in memory, the output I actually got was -OOPS WAY TOO LONG-. The strcpy(s2,...) overwrote the contents of s1 as well.

On gcc, -Wall or -Wstringop-overflow will help you detect situations like this one, where the compiler knows the size of the source string. However, in general, the compiler can't know how big your data will be. Therefore, you have to write code that makes sure you don't copy more than you have room for.

Community
  • 1
  • 1
cxw
  • 16,685
  • 2
  • 45
  • 81
  • 3
    The primary advantage of `strncat` is that it motivates you to think about the issue of buffer size. Using that function instead of `strcat` is in no way a silver bullet, however. You still need to perform exactly the same analysis, and you can still produce undefined behavior if you get that wrong. – John Bollinger Aug 15 '18 at 14:36
  • "*Never*" might be an exaggeration - but then again, perhaps not: if you're in a position to make guarantees, you almost certainly know the length of the initial string, so could more efficiently use `strcpy()` or `memcpy()` instead. – Toby Speight Aug 15 '18 at 14:36
  • 2
    @TobySpeight edited. I personally think "never use `strcat`" is valid advice for a *beginning* C programmer, and as far as I can tell from the question the OP is just starting out. – cxw Aug 15 '18 at 14:40
  • I think I agree with you - it's very rare that you don't know the sizes of both strings but do have an upper bound on the total length. The only case I can think of is where you've formatted numbers with a known range (but even that can catch you out when the surrounding code changes). – Toby Speight Aug 15 '18 at 14:42
  • 4
    @JohnBollinger: I think it is even more valid to advise "never use `strncat()`". Do you know what's wrong with the code: `char data[16] = ""; strncat(data, "ABCDEFGHIJKLMNOP", sizeof(data));`? Most people don't realize that's a buffer overflow — which is why I regard `strncat()` as lethal. If you know enough about the lengths of the strings to be able to use `strncat()` safely, you know enough to use `memmove()` (or `memcpy()` — but that's a separate discussion) instead. – Jonathan Leffler Aug 15 '18 at 15:31
3

Both snippets invoke undefined behavior - the first because src and dest are not initialized to point anywhere meaningful, and the second because you are writing past the end of the array.

C does not enforce any kind of bounds checking on array accesses - you won't get an "Index out of range" exception if you try to write past the end of an array. You may get a runtime error if you try to access past a page boundary or clobber something important like the frame pointer, but otherwise you just risk corrupting data in your program.

Yes, you are responsible for making sure the target buffer is large enough for the final string. Otherwise the results are unpredictable.

John Bode
  • 119,563
  • 19
  • 122
  • 198
1

I'd like to point out what is actually happening in the 2nd program in order to illustrate the problem.

It allocates 15 bytes at the memory location starting at dest and copies 14 bytes into it (including the null terminator):

    char dest[15] = "Hello DEFGIJK";

...and 11 bytes at src with 10 bytes copied into it:

    char src[11] = "Hello ABC";

The strcat() call then copies 10 bytes (9 chars plus the null terminator) from src into dest, starting right after the 'K' in dest. The resulting string at dest will be 23 bytes long including the null terminator. The problem is, you allocated only 15 bytes at dest, and the memory adjacent to that memory will be overwritten, i.e. corrupted, leading to program instability, wrong results, data corruption, etc.

Note that the strcat() function knows nothing about the amount of memory you've allocated at dest (or src, for that matter). It is up to you to make sure you've allocated enough memory at dest to prevent memory corruption.

By the way, the first program doesn't allocate memory at dest or src at all, so your calls to fgets() are corrupting memory starting at those locations.