#include <stdio.h>
#include <string.h>

int main ()
{
   char *strA = "abc";
   int tam_strA = strlen(strA);
   
   char strB[tam_strA];
   strB[0] = 'a';
   strB[1] = 'b';
   strB[2] = 'c';
   strB[3] = 'd';   /* out of bounds: valid indices are 0..2 */
   strB[9] = 'z';   /* out of bounds as well */
   
   printf("%c", strB[9]);
   
   return 0;
}

It prints 'z' normally. Why doesn't it cause a segmentation fault? I'm accessing an index that shouldn't exist, because the size (number of elements) of strB is equal to tam_strA, which is 3.

Also, is there any difference or problem with writing char strB[strlen(strA)]; instead?

Sourav Ghosh
  • Undefined behaviour means anything can happen. Your code demonstrates one possible behaviour; there are a myriad others. You might get a crash if you printed via `strA`. – Jonathan Leffler Dec 21 '21 at 05:35
  • If `strB[9]` is in memory that you do not own, and the hardware is configured to cause a system fault, that is what will happen. But if `strB[9]` is in memory that you do own, it won't cause a problem unless there is competition for use of that memory location. If you don't interfere with any other use (or vice versa), there won't be a problem. Suppose you go to the theatre and sit in a seat you didn't book. You might enjoy the whole show if no-one else wants to use the same seat; otherwise, there will be trouble. – Weather Vane Dec 21 '21 at 07:10
  • [What is undefined behavior and how does it work?](https://software.codidact.com/posts/277486) – Lundin Dec 21 '21 at 07:34
  • C does not protect you from shooting yourself in the foot (it's called **Undefined Behaviour** when you do so). The idea behind C not checking everything for you is that you can check yourself when you need to ... and when you don't need to your compiled program is streamlined by design. – pmg Dec 21 '21 at 10:37
  • Please choose a title which provides information about your question rather than a generic statement. Thank you. – user438383 Dec 21 '21 at 10:45

2 Answers


The C language does not specify anything that stops you from accessing invalid memory, nor does it guarantee a segmentation fault. The only promise it makes is that attempting to access invalid memory causes undefined behavior.

Segmentation fault is one of the possible outcomes, NOT the ONLY one.

That said, the only problem with

 char strB[strlen(strA)];

is that strB will not be long enough to hold the content of strA, because it lacks one byte for the null terminator. Byte-wise use will be fine, but if you copy the content of strA (or any content of the same length) into strB and use strB as a string, you'll run past the allocated memory (in the absence of a null terminator), invoking undefined behavior.

Sourav Ghosh

You only get a segmentation fault when accessing memory that you do not own. You own your entire stack, so strB[9] is a valid memory access in the eyes of the OS. You still shouldn't do this, because the compiler doesn't know that you're using that memory and might decide to use it for something else. Staying within declared bounds also improves readability and reduces programmer mistakes. And the standard defines accessing memory outside a declared object as undefined behaviour, so you can't do it safely. Declaring a variable like int x; (or an array) tells the compiler that you will use the memory at x.

This is actually related to this question: "Why does the first element outside of a defined array default to zero?" Read the much more detailed answers over there.

Shambhav
  • "You own your entire stack" This is not entirely correct, see https://en.wikipedia.org/wiki/Stack_buffer_overflow. Also there isn't necessarily an OS on the computer running a C program. – Lundin Dec 21 '21 at 09:22
  • @Lundin There's a difference between memory ownership in the views of the compiler and of the OS. As Segmentation Fault occurs due to the OS, I'm talking about that. Have you even read my answer properly? – Shambhav Dec 21 '21 at 09:37
  • `SIGSEGV` is defined as (C17 7.14) "an invalid access to storage". *nix-like OS treats that in a certain way but that's just OS-specific behavior. – Lundin Dec 21 '21 at 09:41
  • @Lundin The entire stack is valid memory to access unless we're talking about an obscure platform or an alternate universe. Accessing invalid stack memory is and forever will be "undefined behaviour". – Shambhav Dec 21 '21 at 09:48
  • @Lundin And if there's no OS, there's no memory management and so, there's no invalid memory access. Your point makes no sense to me. You are trying to use the precise meaning of "invalid memory access" but it's made implicitly clear that I'm not using the precise meaning. The lines "strB[9] is a valid memory access in the eyes of the OS. The reason that you shouldn't do this is because the compiler doesn't know that you're using that memory, so it might decide to use that memory for other uses." seem to make everything clear. I just don't get you. – Shambhav Dec 21 '21 at 09:51
  • The link I posted mention the use of so-called stack canaries, check it out. Also C doesn't specify the direction in which the stack grows. On a system with down-counting stack (which is most common), `strB[9]` need not be a stack memory address. – Lundin Dec 21 '21 at 09:53
  • As for when there's no OS, there can still be a MMU even in bare metal microcontroller applications. – Lundin Dec 21 '21 at 09:54
  • @Lundin Surely that doesn't happen while accessing your own stack. – Shambhav Dec 21 '21 at 09:59
  • It is perfectly possible and even considered good practice to insert stack canaries in valid stack memory to protect against stack overflows. If a canary gets overwritten it should result in some form of critical error. The best solution is to map the stack so that it overflows into memory where write access results in a hardware exception though, but that isn't possible on all systems. – Lundin Dec 21 '21 at 10:01
  • @Lundin If you read my answer properly, you'll see the line "The reason that you shouldn't do this is because..." Nobody is saying that doing such thing is fine. Yes it's possible, that doesn't invalidate my answer. – Shambhav Dec 21 '21 at 10:03
  • Except the compiler (libs) _could_ know. For example if I roll out my own definition of `alloca` for some embedded system, I could have it check in the end if the `alloca` call killed the canary, then throw an error. Signal handling isn't usually used in such systems, but there is no reason why I couldn't be using `SIGSEGV`. – Lundin Dec 21 '21 at 10:07
  • Also unrelated to this whole discussion, the C compiler might not even generate correct machine code for doing pointer arithmetic outside the bounds of an array, since it is UB. Then everything called stacks and memory protection becomes irrelevant. – Lundin Dec 21 '21 at 10:08
  • @Lundin Are you proposing that the user might be using canaries, that it raises SIGSEGV, and that this invalidates my answer? The problem with that is that SIGSEGV is normally raised when there is a segmentation fault; it doesn't cause the segmentation fault. And we're talking about segmentation faults here, not SIGSEGV. – Shambhav Dec 21 '21 at 10:10
  • @Lundin Yep. That's "undefined behaviour", I haven't claimed the contrary. Okay, I'll state it in my answer. – Shambhav Dec 21 '21 at 10:12
  • All I'm saying is that from a general C programming point of view, you can't go make any assumptions about what will happen. If the OP had mentioned a very specific OS in their question then it would be another story. – Lundin Dec 21 '21 at 10:13
  • @Lundin You own the stack in every OS and accessing any part of it wouldn't be Segmentation fault. That fact is consistent. In the other things, you're right but not in this one. – Shambhav Dec 21 '21 at 10:18
  • Even hosted OS `alloca` works like I described, see for example https://stackoverflow.com/questions/69406966/how-does-alloca-work-on-a-memory-level and https://stackoverflow.com/questions/1629685/when-and-how-to-use-gccs-stack-protection-feature – Lundin Dec 21 '21 at 10:21
  • @Lundin How's that relevant though? – Shambhav Dec 21 '21 at 10:24
  • The particular memory write might not *in itself* cause a segfault, but corrupting a location might cause one indirectly. For example, you corrupt a return address on the stack, and when that function (which might be way below the offending one in the call hierarchy) returns, it is to an invalid address that causes a segfault. Or it may be to effectively random code at a valid address, and *that* code causes a segfault by illegal memory access. The segfault may be completely unrelated to the original error. – Weather Vane Dec 21 '21 at 10:49
  • @WeatherVane Yep. That's undefined behaviour, anything can happen. It could fill your RAM with 1s. The chance is very low, but anything could happen if timed just right (wrong) and in the right (wrong) place. Saying it's possible is like saying it's possible to go into the enemy camp while open firing. You could; you could do it harmlessly but... That's undefined behaviour, it's a catch phrase to shorten ISO documents and answers. – Shambhav Dec 21 '21 at 15:17
  • @ShambhavGautam no, it isn't. It is the outcome of what the C Standard terms as 'undefined behaviour' which is that *the standard* does not define what will happen. It is also untrue that anything your imagination can dream up is a possible outcome, although some strange things can and do happen. You also say "`strB[9]` is a valid memory access in the eyes of the OS" but it might not be if it maps to forbidden memory. The C standard has nothing to say about stacks or heaps, how or where data is stored. – Weather Vane Dec 21 '21 at 16:23
  • @WeatherVane `strB[9]` is surely stack memory in this case. It's not certain, but if we start stating these little facts, won't it take a day? The RAM could be filled with 1s if you corrupt your own code and then the code does that. Or is it not possible? "Undefined behaviour" is a catch-all phrase though; the standard not defining it means anything can happen unless the compiler defines it. – Shambhav Dec 22 '21 at 01:52