I fail to understand in detail why program1 would segfault and program2 would not.

Program1:

void cpy(char* p, char* q) {
  while(*p++ = *q++);
}

int main() {
  char* p;
  char* q = "Bhargava";
  cpy(p, q); 
  std::cout << "p = " << &p << std::endl;
}

Program2:

void cpy(char* p, char* q) {
  while(*p++ = *q++);
}

int main() {
  char* p;
  char* q = "Bhargava";
  cpy(p, q); 
  std::cout << "p = " << p << std::endl;
}

What is the harm in printing the address of the variable 'p' in program1?

Hemant Bhargava
  • Both of these have undefined behavior - they both may or may not crash (or do any number of "illogical" things), it's "random" – UnholySheep Apr 09 '20 at 18:48
  • @UnholySheep, Exactly, who would be allocating memory in program2, Right? I had this confusion. – Hemant Bhargava Apr 09 '20 at 18:50
  • char* p; there is no memory space for the characters you are copying to this. – QuentinUK Apr 09 '20 at 18:50
  • @QuentinUK, Right. I concur. And then in what cases would you expect #2 to work? Why is it giving me the right result? I would never expect it to. – Hemant Bhargava Apr 09 '20 at 18:52
  • *Undefined behavior* means that *anything* can happen - including a (seemingly) correct result – UnholySheep Apr 09 '20 at 18:53
  • Try putting another string between p and q. It is just a fluke of what the compiler has produced. With different compilers you'd get different results. – QuentinUK Apr 09 '20 at 18:56
  • Both programs have well-defined behavior. They are invalid because C++ does not allow conversion of array of const-qualified chars (string literal) to pointer to non-const-qualified char that happens on line `char* q = "Bhargava";`. – user7860670 Apr 09 '20 at 18:58
  • Side note: `char* q = "Bhargava";` is also invalid code. `"Bhargava"` is a string literal, a `const char` array of the correct size to hold the string. Because it is `const`, you cannot legally assign it to a non-`const` pointer. Some compilers allow it for legacy reasons, but they shouldn't. It's a nasty source of bugs. – user4581301 Apr 09 '20 at 18:59
  • You should turn on as many warnings as possible. If you can't explain why each one is ok, then don't expect the program to do anything meaningful. – cigien Apr 09 '20 at 19:02

2 Answers

Both programs have undefined behavior already at the call

cpy(p, q);

because p has type char* and is default-initialized with automatic storage duration, so its value is indeterminate. Copying an indeterminate value (here, into the function parameter) already has undefined behavior unless the value is of type std::byte or an unsigned narrow character type.

The pointer arithmetic and dereferencing inside cpy are further operations with undefined behavior, and they are probably of more practical relevance than the copy mentioned above. If p is considered to hold some random value, then cpy writes to memory at a random address; when the operating system does not allow that access, the result is a segmentation fault.
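A determinate value, even a null one, can at least be tested before it is used. The following is only a minimal sketch of that idea: cpy_checked is a hypothetical helper that does not appear in the question, and it assumes the caller has initialized the pointer. A null check cannot detect an indeterminate or dangling pointer, so it does not make the original programs valid.

```cpp
// Hypothetical helper, not part of the question: the same copy loop as cpy,
// but it refuses to read or write through a null pointer.
bool cpy_checked(char* dst, const char* src) {
    if (dst == nullptr || src == nullptr) {
        return false;  // nothing is read or written
    }
    while ((*dst++ = *src++) != '\0') {
        // copies each character, including the terminating '\0'
    }
    return true;
}
```

With `char* p = nullptr;` the call `cpy_checked(p, q)` returns false instead of writing anywhere; with p pointing at a large enough buffer it copies the string and returns true.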

Undefined behavior means that you have no guarantee on the program behavior. The program might fail with an error or it might not and it might produce seemingly correct output or it might not.

In practice, the compiler is probably optimizing away the dereferencing of p in one of the programs but not the other, so that the compiled program never performs the memory access that would cause the segmentation fault. As mentioned above, though, you have no guarantees, and the compiler can output anything.


The line

char* q = "Bhargava";

has not been allowed since C++11 and should at least produce a compiler diagnostic to that effect. Even before C++11, it was deprecated in standard C++. "Bhargava" has type const char[N] for some N (here N = 9: eight characters plus the terminating null character), so it can be assigned to const char*, but not to char*. The latter was originally allowed by the language only for backwards compatibility with C.
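The type of the literal can be checked at compile time. This is a small sketch rather than anything from the original programs; literal_size and first_name are hypothetical names introduced only for illustration.

```cpp
#include <cstddef>
#include <type_traits>

// A string literal is an lvalue of type "array of N const char", so decltype
// yields a reference to that array type. "Bhargava" is 8 characters + '\0'.
static_assert(std::is_same<decltype("Bhargava"), const char (&)[9]>::value,
              "a string literal is an array of const char");

// Hypothetical helper: the array size, terminator included.
constexpr std::size_t literal_size() { return sizeof("Bhargava"); }

// Hypothetical helper: the array decays to const char*, which is the only
// pointer-to-char type it converts to since C++11.
const char* first_name() { return "Bhargava"; }

// char* bad = "Bhargava";  // ill-formed since C++11: drops the const
```

Uncommenting the last line should produce a diagnostic on a conforming compiler; some compilers report it only as a warning unless stricter flags are enabled.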

walnut

Both programs exhibit undefined behavior because p does not point at any valid memory. Its value is uninitialized and thus indeterminate, so calling cpy() with p as the destination will write to random memory, if it does not crash outright.

That being said, << &p works because &p is the address of the variable p itself, which is a valid address. The type of &p is char**, for which operator<< has no specific overload, but it does have an overload for const void*, to which char** is implicitly convertible. That overload simply prints the address as-is.
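The conversion can be observed by comparing the two ways of streaming the same address. This is a sketch only; format_address and format_pointer_to_pointer are hypothetical helper names, and the textual form of the address is implementation-defined.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: streams an address through the const void* overload.
std::string format_address(const void* address) {
    std::ostringstream out;
    out << address;
    return out.str();
}

// Hypothetical helper: streams a char** directly. There is no char** overload,
// so the argument is implicitly converted to const void* and printed as an address.
std::string format_pointer_to_pointer(char** pp) {
    std::ostringstream out;
    out << pp;
    return out.str();
}
```

For any `char* p`, `format_pointer_to_pointer(&p)` and `format_address(&p)` yield the same text, because both end up in the same overload. Neither call reads the value stored in p.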

By contrast, << p does not work because operator<< does have a specific overload for character pointers (it takes const char*, to which char* converts). That overload treats the address as the start of a C-style null-terminated string and prints characters starting at that address until a '\0' character is reached. Since p does not point to a valid C string, the behavior is undefined.
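The difference between the two overloads shows up when the same valid pointer is streamed both ways. The helper names format_as_text and format_as_address below are hypothetical and exist only for this sketch; both assume the pointer refers to a valid null-terminated string.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: uses the const char* overload, which reads characters
// until it finds '\0'. The pointer must refer to a valid C string.
std::string format_as_text(const char* s) {
    std::ostringstream out;
    out << s;
    return out.str();
}

// Hypothetical helper: the cast selects the const void* overload instead,
// so only the address is printed and the pointed-to memory is never read.
std::string format_as_address(const char* s) {
    std::ostringstream out;
    out << static_cast<const void*>(s);
    return out.str();
}
```

Given `char buffer[10] = "Bhargava";`, `format_as_text(buffer)` yields the string itself, while `format_as_address(buffer)` yields an implementation-defined representation of where the buffer lives.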

To make both programs work correctly, you need to do this instead:

Program1:

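#include <iostream>

// Copies the null-terminated string q into p, including the terminator.
// The caller must make sure p points to enough writable memory.
void cpy(char* p, const char* q) {
  while ((*p++ = *q++) != '\0');
}

int main() {
  char buffer[10];  // room for "Bhargava" plus '\0' (9 bytes)
  char* p = buffer;
  const char* q = "Bhargava";
  cpy(p, q);
  std::cout << "p = " << &p << std::endl;  // prints the address of p itself
}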

Program2:

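#include <iostream>

// Copies the null-terminated string q into p, including the terminator.
// The caller must make sure p points to enough writable memory.
void cpy(char* p, const char* q) {
  while ((*p++ = *q++) != '\0');
}

int main() {
  char buffer[10];  // room for "Bhargava" plus '\0' (9 bytes)
  char* p = buffer;
  const char* q = "Bhargava";
  cpy(p, q);
  std::cout << "p = " << p << std::endl;  // prints the copied string
}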
Remy Lebeau