
I'm studying C, and in particular pointers and memory access. I wrote this simple program to count the number of chars in a string, but the "problem" is that it works when it should give me an error.
More specifically, even if I enter more characters than expected (5), the function mylen is still able to correctly count the number of characters. I also tried with very large numbers (>50). Why?

#include <stdio.h>

int mylen(char*);

int main(){
    char mystring[5];
    printf("Enter a string: ");
    scanf("%s", &mystring);
    int lenght = mylen(mystring);
    printf("The string is %d chars long", lenght);
    return 0;
}
int mylen (char *ptr) {
    int len;
    len = 0;
    char t;
    while (*ptr != '\0'){
        ptr++;
        len ++;
    }
    return len;
}

P.S. I'm aware there are more efficient and secure ways to read a string, like getline(), but I'm curious about this specific issue. Thanks for any clarification.

shark_sh
    This is undefined behavior as far as C is concerned. Sometimes it may produce the results you expect, sometimes it may not. It's compiler implementation dependent and technically unrelated to the C standard. – h0r53 Jun 06 '23 at 16:33
  • If you are asking "When I provide a string of length 10, it still correctly computes the string length of 10, why?" It's because you are still likely writing beyond the boundary of the array. You're just overwriting adjacent data, which is extremely dangerous. But in your use case you still likely wrote `X` bytes onto the stack. You just overwrote `X - 5` bytes adjacent to your 5-byte buffer (this assumes `char` is 1 byte, which it generally is, though it's technically system-dependent). – h0r53 Jun 06 '23 at 16:35
    *it works when it should give me error* There's no "should" involved in any way, shape, or form when you invoke undefined behavior. – Andrew Henle Jun 06 '23 at 16:41
  • buffer overrun can produce all kinds of side effects including variable corruption and serious security risks. You might see some data corruption if you change the print statement to read `printf("The string %s is %d chars long", mystring, lenght);` – Tim Randall Jun 06 '23 at 16:41
  • Let's see if I understood it well: there is not a strict correlation like right code -> no errors / wrong code -> error, but rather right code -> predictable behaviour / wrong code -> unpredictable or undefined behaviour. More specifically, in this code, since `scanf()` doesn't itself impose a limit, it simply writes N adjacent chars in memory, starting from the `&mystring` address, while in the subsequent function, since the only stopping condition is the `'\0'` terminator, the behaviour is similar. Is that right? – shark_sh Jun 07 '23 at 07:43

1 Answer


This is undefined behavior as far as C is concerned. Sometimes it may produce the results you expect, sometimes it may not. What actually happens is implementation-dependent and outside what the C standard specifies.

If you are asking "When I provide a string of length 10, it still correctly computes the string length of 10, why?" It's because you are still likely writing beyond the boundary of the array. You're just overwriting adjacent data, which is extremely dangerous. But in your use case you still likely wrote X bytes onto the stack. You just overwrote X - 5 bytes adjacent to your 5-byte buffer (this assumes char is 1 byte, which it generally is, though it's technically system-dependent).

Try giving a very long string, such as 100 or more characters. You'll likely encounter a segmentation fault due to writing beyond the stack frame and corrupting the return address. Or you may overwrite a stack canary and produce a Stack Smashing Detected error.

h0r53