I can't understand why var comes out as 6. How is it being calculated?

#include <iostream>
using namespace std;
  
int main()
{
    char *A[] = { "abyx", "dbta", "cccc"};
    int var = *(A+1) - *A+1;
    cout << "1: " << (*(A+1)) << "\n";
    cout << "2: " << (*A+1) << "\n";
    cout << "char: " << var << "\n";
    cout << &A[0][1] - &A[1][0] << std::endl;
}
Hemant Kr
  • 1
    The two items are not `*(A+1)` and `*A+1` - the two items are `*(A+1)` and `*A`, and so the difference is 5 (not 6) because there are 5 characters in each string (don't forget the '\0' at the end of each string). – Jerry Jeremiah Nov 29 '21 at 07:32
  • 2
    Nice question. The full analysis of this is not trivial - you're also on the edge of undefined behaviour here too. I can't answer as I've been up all night with poorly children! – Bathsheba Nov 29 '21 at 07:34
  • The distance between the array elements is `&A[1] - &A[0]` or, equivalently, `A + 1 - A`, which is obviously 1. You're subtracting the values of the array elements, not their locations. (The problem will probably become clearer if you use an array of `int` instead.) – molbdnilo Nov 29 '21 at 08:49

1 Answer

To begin, this code is not well-formed C++. What you have is an array of pointers, where each pointer holds the address of a string literal. Because the array holds pointers to non-const char, while string literals are arrays of const char, any sensible C++ compiler should at least emit a warning.

But anyway, what you have is an array that might look like this in memory (on a 64-bit system):

// char *A[] = { "abyx", "dbta", "cccc" };

        01234567
       +--------+
0x0000 | A[0]   | ----> address of "abyx\0"
       +--------+
0x0008 | A[1]   | ----> address of "dbta\0"
       +--------+
0x0010 | A[2]   | ----> address of "cccc\0"
       +--------+

Each cell in that array is a pointer value. It points off somewhere in memory to wherever your program stored these string literals. And to foreshadow, you don't know where it did that, or how they are arranged.
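
To make that concrete, here is a minimal sketch of what you can rely on with an array of pointers. The helper names are mine, not from the question, and I have used const char * so that it compiles as standard C++. The distance between the array elements is well-defined and says nothing about where the literals live:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: distance between adjacent ELEMENTS of the pointer array.
// Both addresses lie inside the same array object, so this is well-defined.
std::ptrdiff_t element_distance()
{
    const char *A[] = { "abyx", "dbta", "cccc" };
    return &A[1] - &A[0];              // 1, wherever the literals are stored
}

// Hypothetical helper: number of pointers held by the array.
std::size_t element_count()
{
    const char *A[] = { "abyx", "dbta", "cccc" };
    return sizeof(A) / sizeof(A[0]);   // 3; A stores pointers, not characters
}

// Example use: both facts hold on any conforming implementation.
void check_pointer_array()
{
    assert(element_distance() == 1);
    assert(element_count() == 3);
}
```

Note that neither helper ever subtracts one literal's address from another's; that is the part with no guaranteed result.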

Let's compare that to if it had been defined as a fixed-size array of char values:

// char B[][5] = { "abyx", "dbta", "cccc" };

        01234567
       +--------+
0x0000 |abyx~dbt| (the value '~' denotes a NUL byte)
       +--------+
0x0008 |a~cccc~?|
       +--------+
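
As a rough check of that picture, the sketch below (helper names are my own) asks the compiler for the sizes involved. Each row is 5 bytes because of the terminating NUL, and the three rows are contiguous, so the whole array is 15 bytes:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: bytes occupied by one row of B, terminator included.
std::size_t row_size()
{
    char B[][5] = { "abyx", "dbta", "cccc" };
    return sizeof(B[0]);               // 5: four letters plus '\0'
}

// Hypothetical helper: bytes occupied by the whole array.
std::size_t total_size()
{
    char B[][5] = { "abyx", "dbta", "cccc" };
    return sizeof(B);                  // 15: three rows of 5 bytes each
}

// Example use: the visible length is 4, but the storage per string is 5.
void check_char_array()
{
    char B[][5] = { "abyx", "dbta", "cccc" };
    assert(std::strlen(B[1]) == 4);    // strlen does not count the terminator
    assert(B[1][4] == '\0');           // the fifth byte of each row is the NUL
    assert(row_size() == 5);
    assert(total_size() == 15);
}
```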

It seems that by pure chance (and by no means guaranteed), your compiler has arranged the string literals in memory like the second example, and so you ended up with something like:

char B[][5] = { "abyx", "dbta", "cccc" };
char *A[] = { B[0], B[1], B[2] };

Let's put that aside for now (but keep it in mind) and talk about your var calculation:

int var = *(A+1) - *A+1;

You have grouped these things together with whitespace, but whitespace has no effect on how the expression is parsed. The unary dereference operator binds more tightly than binary addition and subtraction, and those two binary operators have equal precedence and group left-to-right. If I add an absurd number of parentheses to illustrate the grouping, it is in fact this:

int var = ((*(A+1)) - (*A)) + (1);

So what that does is take the pointer A[1], subtract the pointer A[0], then add 1. Because, by pure luck, your compiler arranged these literals like the array B, you get the 5-character difference between the two NUL-terminated strings, plus 1, which is 6.
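
If you want to reproduce that result without relying on luck, one option is to point every element of A into a single buffer. This is only a sketch: the buffer buf and the function name are my own, and the layout is an assumption that mimics what your compiler happened to do with the literals.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: the question's expression, but with all pointers aimed
// into ONE array, so the pointer subtraction is well-defined.
std::ptrdiff_t var_from_flat_buffer()
{
    // Assumed layout: three 4-letter strings, each followed by '\0' (15 bytes).
    static const char buf[] = "abyx\0dbta\0cccc";
    assert(sizeof(buf) == 15);

    const char *A[] = { buf, buf + 5, buf + 10 };

    std::ptrdiff_t gap = *(A+1) - *A;  // 5: four letters plus the terminator
    return gap + 1;                    // 6, the same value as *(A+1) - *A+1
}
```

Here the 5 comes from the layout I chose explicitly, rather than from wherever the compiler decided to put three separate literals.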

If you actually wanted to subtract the value *A+1 from *(A+1) then you need to put it in parentheses:

int var = *(A+1) - (*A+1);  // equivalent to *(A+1) - *A - 1

But, again, the resulting value is still not guaranteed because of the way you are doing arithmetic on unrelated pointers.
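
To see the two groupings side by side with a defined result, here is a sketch that again assumes a single buffer of my own making (buf, A and the function names are not from the question):

```cpp
#include <cassert>
#include <cstddef>

// Assumed layout: all three strings in one array, 5 bytes apart.
static const char buf[] = "abyx\0dbta\0cccc";
static const char *A[] = { buf, buf + 5, buf + 10 };

// Parsed as (*(A+1) - *A) + 1: pointer difference first, then add 1.
std::ptrdiff_t without_parentheses()
{
    return *(A+1) - *A+1;              // 5 + 1 = 6
}

// Parsed as *(A+1) - (*A + 1): advance A[0] by one char, then subtract.
std::ptrdiff_t with_parentheses()
{
    return *(A+1) - (*A+1);            // 5 - 1 = 4
}

// Example use: the two spellings differ by exactly 2.
void check_grouping()
{
    assert(without_parentheses() - with_parentheses() == 2);
}
```

With the original array of string literals, neither value is guaranteed; the buffer is what makes the numbers above dependable.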

Let's circle back to why I went to the trouble of drawing out bits of memory and comparing with some other array named B. Two reasons:

  1. To illustrate that the addresses of these string literals as used in your program are not predictable (and as a result, subtracting one of those pointers from another has undefined behavior).

  2. To show one possible representation of that data (which could be made explicit if you defined A the way I defined B) and ensure you understand that the strings are NUL-terminated in memory (meaning there's one extra character allocated).

Hopefully this clears up some confusion.

paddy