I understand that allocating memory for a string of length n requires n+1 bytes due to the NULL character. However, the question is: what if you allocate 10 chars but enter an 11 char string?

#include <stdio.h>
#include <stdlib.h>
int main(){
    int n;
    char *str;
    printf("How long is your string? ");
    scanf("%d", &n);
    str = malloc(n+1);
    if (str == NULL) printf("Uh oh.\n");
    scanf("%s", str);
    printf("Your string is: %s\n", str);
}

I tried running the program but the result is still the same as n+1.

Mureinik
  • _" if you allocate 10 chars but enter an 11 char string"_ you have _undefined behavior_ so just don't let that happen. You can't trust anything such a program does. – Ted Lyngmo Jan 13 '23 at 17:15
  • The program asked how long a string you intend to enter, and you ***lied*** to it, telling the program you would only enter 10 characters, but you actually entered 11. That is ***undefined behavior***. Anything *can* happen. Due to architectural reasons, the most common result is the program **appears** to work properly, even though the behavior is not guaranteed. – abelenky Jan 13 '23 at 17:18
  • Nomenclature: "... due to the NULL character." --> `NULL` is the _null pointer constant_, best used in pointer contexts. What is best here is _null character_ as that matches how the C spec describes it. – chux - Reinstate Monica Jan 13 '23 at 19:14

4 Answers

1

If you allocated a buffer of 10 characters but wrote 11 characters to it, you're writing to memory you haven't allocated. This is undefined behavior: it may happen to work, it may crash with a segmentation fault, or it may do something completely different. In short, don't rely on it.
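
One way to avoid the overrun is to not depend on the length the user announces at all, and to grow the buffer as characters arrive. The following is only a minimal sketch of that idea; `read_word_dynamic()` is a hypothetical helper name, not something from the question, and the starting capacity of 16 is an arbitrary assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one whitespace-delimited word of any length,
   growing the buffer as needed. Returns a malloc'd string, or NULL on
   allocation failure or if no word could be read. */
char *read_word_dynamic(FILE *in)
{
    size_t cap = 16, len = 0;   /* 16 is an arbitrary starting capacity */
    char *buf, *tmp;
    int c;

    assert(in != NULL);

    /* Skip leading whitespace, as "%s" does. */
    while ((c = getc(in)) != EOF && isspace(c))
        ;
    if (c == EOF)
        return NULL;

    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    do {
        if (len + 1 >= cap) {   /* keep room for the null character */
            tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    } while ((c = getc(in)) != EOF && !isspace(c));

    if (c != EOF)
        ungetc(c, in);          /* leave the delimiter unread */
    buf[len] = '\0';
    return buf;
}
```

A caller would use it as `char *str = read_word_dynamic(stdin);`, check the result against NULL, and `free(str)` when done. Every write is checked against the current capacity, so the buffer is never written past its end.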

Mureinik
1

If you overrun an area of memory given to you by malloc(), you corrupt the heap. If you're lucky, your program will crash right away, or when you free the memory, or when it uses the chunk of memory right after the area you overran. When your program crashes, you'll notice the bug and have a chance to fix it.

If you're unlucky your code goes into production, and some cybercriminal figures out how to exploit your overrun memory to trick your program into running some malicious code or using some malicious data they fed you. If you're really unlucky, you get featured in Krebs On Security or some other information security news outlet.

Don't do this. If you're not confident of your ability to avoid doing it, don't use C. Instead use a language with a native string data type. Seriously.

O. Jones
1

what if you allocate 10 chars but enter an 11 char string?

scanf("%s", str); has undefined behavior (UB) when the input is longer than the buffer. Anything may happen, including the program appearing to work, which is what "I tried running the program but the result is still the same as n+1." describes.

Instead always use a width with scanf() and "%s" to stop reading once str[] is full. Example:

char str[10+1];
scanf("%10s", str);
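
The width in "%10s" has to be written into the format string itself; unlike printf(), scanf() does not take the width from an argument (in scanf() a `*` means "do not store"). With a run-time n, one option is to build the format string first. This is a minimal sketch under that assumption; `read_word()` is a hypothetical helper name, and it takes a `FILE *` only so it can be used on streams other than stdin.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one whitespace-delimited word of at most n
   characters from `in`. Returns a malloc'd string, or NULL on failure. */
char *read_word(FILE *in, size_t n)
{
    char fmt[32];               /* '%' + digits of n + 's' + '\0' fits easily */
    char *str;

    assert(in != NULL && n > 0);    /* a width of 0 is not valid */

    str = malloc(n + 1);        /* n characters plus the null character */
    if (str == NULL)
        return NULL;

    /* Build e.g. "%10s" so fscanf() stops storing after n characters. */
    snprintf(fmt, sizeof fmt, "%%%zus", n);
    if (fscanf(in, fmt, str) != 1) {
        free(str);
        return NULL;
    }
    return str;
}
```

Called as `read_word(stdin, n)`, it stores at most n characters; anything beyond that stays in the input stream for the next read rather than overflowing the buffer.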

Since n is variable here, consider using fgets() to read a line of input instead. Better yet, use fgets() for all user input and drop the scanf() calls altogether until you understand why scanf() is bad.

Note that fgets() also reads and saves a trailing '\n'.

str = malloc(n+1);
if (str == NULL) { printf("Uh oh.\n"); return 1; }
if (fgets(str, n+1, stdin)) {
  str[strcspn(str, "\n")] = 0; // Lop off potential trailing \n
  printf("Your string is: %s\n", str);
}
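
The fgets() approach can be wrapped so that the caller also learns whether the input fit. The following is a rough sketch, not a definitive implementation; `read_line()` and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from `in` into buf (capacity cap),
   strip the trailing '\n', and discard any characters that did not fit.
   Returns 1 if the whole line fit, 0 if it was truncated, -1 on EOF/error. */
int read_line(FILE *in, char *buf, size_t cap)
{
    size_t len;
    int c;

    assert(in != NULL && buf != NULL && cap > 1 && cap <= (size_t)INT_MAX);

    if (fgets(buf, (int)cap, in) == NULL)
        return -1;

    len = strcspn(buf, "\n");
    if (buf[len] == '\n') {     /* complete line: lop off the newline */
        buf[len] = '\0';
        return 1;
    }

    /* No newline stored: the buffer filled up or the input ended. */
    c = getc(in);
    if (c == '\n' || c == EOF)
        return 1;               /* line fit exactly, or last line had no '\n' */

    while ((c = getc(in)) != '\n' && c != EOF)
        ;                       /* discard the rest of the over-long line */
    return 0;
}
```

With `str = malloc(n+1)`, a call such as `read_line(stdin, str, n+1)` never writes more than n+1 bytes. If the length was read earlier with scanf("%d", &n), the rest of that line, at least its '\n', is still unread, so the first call would return an empty string unless that remainder is consumed first.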
chux - Reinstate Monica
  • [`strchr`](https://en.cppreference.com/w/cpp/string/byte/strchr) may come in handy instead of `strcspn` though – Ted Lyngmo Jan 13 '23 at 19:34
  • @TedLyngmo `strchr()` is more likely to [use wrong](https://stackoverflow.com/a/27729970/2410359). How would you suggest using it? – chux - Reinstate Monica Jan 13 '23 at 19:38
  • I suspect a trick question :-) `strchr` to find it, dereference the pointer, assign `'\0'` – Ted Lyngmo Jan 13 '23 at 19:47
  • @TedLyngmo `strchr(str, '\n')` may return `NULL` when `fgets()` fills the buffer, when the last line lacks a `'\n'`, or when a _null character_ was read, so an untested pointer with "dereference the pointer, assign '\0'" risks trouble. – chux - Reinstate Monica Jan 13 '23 at 19:54
  • `if (fgets(str, n+1, stdin))` - makes that a non-issue, right? – Ted Lyngmo Jan 13 '23 at 19:55
  • @TedLyngmo Nej. `fgets(str, n+1, stdin)` returns non-`NULL` and `strchr(str, '\n')` returns `NULL` when the last line of input lacked a `'\n'` or `fgets()` read a _null character_. Not common, yet possible. `strcspn(str, "\n")` handles both cases and `strcspn(str, "\n\r")` nicely handles reading from _foreign_ shells where unexpected line endings occur. – chux - Reinstate Monica Jan 13 '23 at 19:57
  • ( _"Nej"_ ? - What? Are you from Sweden too? :-) ) - I don't mean that normal checking should be skipped. `strchr`, check, then assign. – Ted Lyngmo Jan 13 '23 at 20:15
  • @Ted No, more like a _kärlek förlorad_. – chux - Reinstate Monica Jan 13 '23 at 20:39
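
The two approaches discussed in the comments above can be put side by side. This is only an illustrative sketch; the function names `chomp_strcspn()` and `chomp_strchr()` are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* strcspn() variant: also safe when no '\n' is present, because strcspn()
   then returns the index of the terminating null character. */
void chomp_strcspn(char *s)
{
    assert(s != NULL);
    s[strcspn(s, "\n")] = '\0';
}

/* strchr() variant: strchr() returns NULL when no '\n' is present,
   so the result must be checked before it is dereferenced. */
void chomp_strchr(char *s)
{
    char *nl;

    assert(s != NULL);
    nl = strchr(s, '\n');
    if (nl != NULL)
        *nl = '\0';
}
```

Both leave a string without a '\n' unchanged; the difference is that the strchr() version needs the explicit NULL check, while the strcspn() version has no pointer to get wrong.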
0

When you write 11 bytes to a 10-byte buffer, the last byte will be out-of-bounds. Depending on several factors, the program may crash, have unexpected and weird behavior, or may run just fine (i.e., what you are seeing). In other words, the behavior is undefined. You pretty much always want to avoid this, because it is unsafe and unpredictable.

Try writing a bigger string to your 10-byte buffer, such as 20 bytes or 30 bytes. You will see problems start to appear.

Lught
  • and if you write 5000 bytes you will almost surely get a problem. The more bytes you overwrite, the more likely it is to overwrite an important one. – user253751 Jan 13 '23 at 17:24
  • _"Depending on several factors, the program may crash, have unexpected and weird behavior, or may run just fine (i.e., what you are seeing)."_ - Such a program **_has_** UB. Period. It's also really hard to determine that "it's running fine" just by seeing the expected output on the screen. It could also reformat the harddisk in the background. – Ted Lyngmo Jan 13 '23 at 17:25
  • @user253751 Or will hit some unmapped address... – Eugene Sh. Jan 13 '23 at 17:25
  • @TedLyngmo "Such a program has UB. Period" That is exactly what I said: "In other words, the behavior is undefined." – Lught Jan 13 '23 at 18:42
  • _"the program **may** [...] have unexpected ..."_ is what I objected to. It's better to be clear. The program _has_ undefined behavior and can do just about anything. – Ted Lyngmo Jan 13 '23 at 18:43
  • Please re-read what I wrote. Your objection makes no sense. – Lught Jan 13 '23 at 18:44
  • What factors are you taking into account? What factor would make you conclude that the program "runs fine"? – Ted Lyngmo Jan 13 '23 at 18:47
  • That does not matter. In the simple example OP posted, it is possible that the overflowed memory may have no adverse effect on the rest of the program. That is all I am saying. – Lught Jan 13 '23 at 18:52