53

As the heading says, what is the difference between

char a[] = ?string?; and
char *p = ?string?;

I was asked this question in an interview. I don't even understand the statement.

char a[] = ?string?

What is the ? operator here? Is it part of the string, or does it have some specific meaning?

R. Martinho Fernandes
Sachin Mhetre
  • 14
    Bet the interviewer meant " instead of ?. The ? symbol is used for the ternary (conditional) operator, but this is not valid syntax for it. –  Feb 27 '12 at 05:08
  • 12
    This is probably a case of [mojibake](http://en.wikipedia.org/wiki/Mojibake). This isn't C++. – André Caron Feb 27 '12 at 05:10
  • Surely doesn't compile on any C or C++ compiler. – Mohammad Dehghan Feb 27 '12 at 05:11
  • @Sachin `return blah ? doFoo() : doOtherFoo();` is one example of the ? operator. It is functionally equivalent to `if (blah) return doFoo(); else return doOtherFoo();`. –  Feb 27 '12 at 05:12
  • 9
    It's possible that the question was using the begin/end quotes, and your font for some reason couldn't find them, so rendered them as `?`s. – Nicol Bolas Feb 27 '12 at 05:28
  • 6
    My guess: Code was copied into MS Word, quotes were converted, and then somehow converted back. Or there is a missing `#define ? "`. Don't know if that compiles, though. – Residuum Mar 09 '12 at 14:11
  • 3
    [difference-between-char-str-string-and-char-str-string](http://stackoverflow.com/questions/3862842/difference-between-char-str-string-and-char-str-string) – Bo Persson Mar 12 '12 at 08:22
  • @jpaugh the question was modified (question marks were replaced with double quotes where applicable) to improve visibility on Google and it was explicitly asked to not roll it back. Why did you do that? Totally unwise move. – golem Aug 06 '15 at 03:14
  • The question had been modified to the point of being unintelligible! I had to read the original question to even understand what was being asked. – jpaugh Aug 06 '15 at 03:16
  • Specifically, the question marks were *part of the OP's question*, so removing them made it impossible to understand what was being asked. – jpaugh Aug 06 '15 at 03:18

8 Answers8

105

The ? seems to be a typo; it is not syntactically valid. So this answer assumes the ? is a typo and explains what the interviewer probably meant to ask.


The two are distinctly different. For a start:

  1. The first creates an array.
  2. The second creates a pointer.

Read on for more detailed explanation:

The Array version:

char a[] = "string";  

Creates an array that is large enough to hold the string literal "string", including its null terminator. The array a is initialized with the string literal "string" and can be modified later. The array's size is known at compile time, so the sizeof operator can be used to determine it.
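
As a minimal sketch of the array behaviour described above: the array owns a private, writable copy of the characters. The helper name modify_array_copy and its out parameter are made up for illustration.

```c
#include <string.h>

/* Hypothetical helper: initializes an array from a literal, modifies the
 * array, and copies the result into out (which must hold at least 7 bytes). */
void modify_array_copy(char *out)
{
    char a[] = "string";        /* 7 bytes: six characters plus '\0' */

    a[0] = 'S';                 /* fine: a is an ordinary writable array */
    memcpy(out, a, sizeof a);   /* sizeof a covers the whole array */
}
```

Calling it with a buffer of at least 7 bytes leaves "String" in that buffer; the literal itself is never touched.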


The pointer version:

char *p  = "string"; 

Creates a pointer that points to the string literal "string". This can be faster than the array version because no characters are copied, but the string pointed to should not be changed, because it may be located in read-only, implementation-defined memory. Modifying a string literal results in undefined behavior.

In fact, C++03 deprecates [Ref 1] the use of a string literal without the const keyword. So the declaration should be:

const char *p = "string";

Also, you need to use the strlen() function, not sizeof, to find the length of the string, since the sizeof operator will just give you the size of the pointer variable.
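
The sizeof/strlen distinction can be sketched as follows. The four helper names are hypothetical; each one just wraps a single expression so the results can be compared.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers comparing sizeof and strlen for both declarations. */

size_t array_sizeof(void)
{
    char a[] = "string";
    return sizeof a;            /* size of the whole array, '\0' included */
}

size_t array_strlen(void)
{
    char a[] = "string";
    return strlen(a);           /* characters before the '\0' */
}

size_t pointer_sizeof(void)
{
    const char *p = "string";
    return sizeof p;            /* size of the pointer, not of the string */
}

size_t pointer_strlen(void)
{
    const char *p = "string";
    return strlen(p);           /* characters before the '\0' */
}
```

array_sizeof() is 7 and both strlen variants are 6, while pointer_sizeof() equals sizeof(char *), which depends on the platform.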


Which version is better and which one shall I use?

Depends on the Usage.

  • If you do not need to make any changes to the string, use the pointer version.
  • If you intend to change the data, use the array version.

Note: The following is specific to C, not C++.

Using a string literal without the const keyword is perfectly valid in C. However, modifying a string literal is still undefined behavior in C [Ref 2].

This brings up an interesting question,
What is the difference between char* and const char* when used with string literals in C?


For Standardese Fans:
[Ref 1]C++03 Standard: §4.2/2

A string literal (2.13.4) that is not a wide string literal can be converted to an rvalue of type “pointer to char”; a wide string literal can be converted to an rvalue of type “pointer to wchar_t”. In either case, the result is a pointer to the first element of the array. This conversion is considered only when there is an explicit appropriate pointer target type, and not when there is a general need to convert from an lvalue to an rvalue. [Note: this conversion is deprecated. See Annex D. ] For the purpose of ranking in overload resolution (13.3.3.1.1), this conversion is considered an array-to-pointer conversion followed by a qualification conversion (4.4). [Example: "abc" is converted to “pointer to const char” as an array-to-pointer conversion, and then to “pointer to char” as a qualification conversion. ]

C++11 simply removes the above conversion, which means that such code is ill-formed in C++11.

[Ref 2]C99 standard 6.4.5/5 "String Literals - Semantics":

In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence; for wide string literals, the array elements have type wchar_t, and are initialized with the sequence of wide characters...

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
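
The "unspecified whether these arrays are distinct" clause can be illustrated with a small sketch. The function names are hypothetical, and the result of the second function is deliberately not something to rely on.

```c
#include <stdbool.h>

/* Two arrays initialized from the same literal are always separate objects. */
bool arrays_are_distinct(void)
{
    char a[] = "string";
    char b[] = "string";
    return &a[0] != &b[0];      /* always true: each array has its own storage */
}

/* Two pointers to identical literals may or may not share storage.
 * The standard leaves this unspecified, so either result is possible. */
bool literals_share_storage(void)
{
    const char *p = "string";
    const char *q = "string";
    return p == q;              /* unspecified: depends on the compiler */
}
```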

Community
Alok Save
  • 2
    Any technical reasoning for the downvote is very much appreciated. – Alok Save Aug 27 '12 at 06:32
  • What is meant by ''This is faster''? – Karlis Olte Mar 12 '15 at 12:42
  • I disagree with `If you do not need to make any changes to the string, use the pointer version.` - if you do not need to make any changes to the string, you probably want to use `const char a[] = "string";`, i.e. just add a `const`. It avoids a relocation when the dynamic linker does its work during startup (on Linux, at least). See [How to write shared libraries](http://www.akkadia.org/drepper/dsohowto.pdf) section 2.4.1 for a longer discussion. – Frerich Raabe Dec 03 '15 at 14:31
90

The first one is an array; the other is a pointer.

The array declaration char a[6]; requests that space for six characters be set aside, to be known by the name a. That is, there is a location named a at which six characters can sit. The pointer declaration char *p; on the other hand, requests a place which holds a pointer. The pointer is to be known by the name p, and can point to any char (or contiguous array of chars) anywhere.

The statements

 char a[] = "string";
 char *p = "string"; 

would result in data structures which could be represented like this:

     +---+---+---+---+---+---+----+
  a: | s | t | r | i | n | g | \0 |
     +---+---+---+---+---+---+----+
     +-----+     +---+---+---+---+---+---+---+ 
  p: |  *======> | s | t | r | i | n | g |\0 |    
     +-----+     +---+---+---+---+---+---+---+ 

It is important to realize that a reference like x[3] generates different code depending on whether x is an array or a pointer. Given the declarations above, when the compiler sees the expression a[3], it emits code to start at the location a, move three elements past it, and fetch the character there. When it sees the expression p[3], it emits code to start at the location p, fetch the pointer value there, add three element sizes to the pointer, and finally fetch the character pointed to. In the example above, both a[3] and p[3] happen to be the character i, but the compiler gets there differently.
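
A rough way to observe the difference from within C is to compare addresses. This is only a sketch with made-up function names; it shows that the array is the storage itself, while the pointer is a separate object holding an address.

```c
#include <stdbool.h>

/* Both index expressions yield the same character. */
bool same_character(void)
{
    char a[] = "string";
    const char *p = "string";
    return a[3] == 'i' && p[3] == 'i';
}

/* The array's address is the address of its first element, because the
 * array is the storage for the characters. */
bool array_is_its_own_storage(void)
{
    char a[] = "string";
    return (void *)&a == (void *)&a[0];
}

/* The pointer is a separate object: its own address differs from the
 * address it holds. */
bool pointer_is_separate_object(void)
{
    const char *p = "string";
    return (const void *)&p != (const void *)p;
}
```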

Source: comp.lang.c FAQ list · Question 6.2

Lightness Races in Orbit
  • I bet @Sachin considered cases like `const char *p = "string"; p = "another string"; printf("%c", p[3]);` (Yes, `char *p = "string";` will be a compile error) – nodakai Mar 15 '12 at 15:20
12
char a[] = "string";

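This allocates the array on the stack (assuming the declaration is inside a function) and copies the string into it.

char *p = "string";

This creates a pointer on the stack (again, assuming a local variable) that points to the literal, which typically lives in the data segment of the process.

The ? is presumably the result of whoever wrote it not knowing what they were doing.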

Ignacio Vazquez-Abrams
  • 6
    Given the question's asking for details on something trivial, I think the answer should explore the possibilities in more depth. Specifically, "`char a[]`... allocates...on the stack" assumes it's inside a function and not global, and further refers to `a[]` while not mentioning that inside a function there's actually a run-time copy of the entire text from the constant data segment to the stack. `char*` usage which creates a non-`const` pointer - on the stack or as a global in the data segment - and initialises it at runtime or (probably) compile time respectively to address the const text. – Tony Delroy Feb 29 '12 at 12:41
  • 1
    This answer is wrong. The 2nd code snippet creates a compilation error :P – BЈовић Mar 09 '12 at 11:29
  • 2
    @VJovic: Indeed it is. Declaring a pointer to a string literal without the `const` qualifier is deprecated in C++03, so the second snippet is not legal C++ code. – Alok Save Mar 09 '12 at 14:10
  • 1
    It compiles with some compilers (for example Microsoft Visual C++ 2010), so when saying it creates a compilation error, you should be more specific: give the compiler version, or say (as was mentioned in another answer) that this is against the C++ standard (C++03, C++11). – Dainius Dec 31 '12 at 14:06
8

Stack, heap, data segment (and BSS), and text segment are the four segments of process memory. All local variables are on the stack. Memory dynamically allocated with malloc and calloc is on the heap. All global and static variables are in the data segment. The text segment holds the program's machine code and some constants.

Of these four segments, the text segment is READ ONLY; the other three are READ and WRITE.

char a[] = "string"; - This statement allocates 7 bytes on the stack (because it is a local variable) and stores all 6 characters (s, t, r, i, n, g) plus the null character (\0) at the end.

char *p = "string"; - This statement allocates 4 bytes (on a 32-bit machine) on the stack (because this is also a local variable) to hold a pointer to the constant string "string". The 7 bytes of the constant string (6 characters plus \0) are typically placed in the text segment. This is a constant value. The pointer variable p just points to that string.

Now a[0] (the index can be 0 to 5) accesses the first character of the string, which is on the stack. So we can also write at this position: a[0] = 'x'. This operation is allowed because we have READ WRITE access to the stack.

But p[0] = 'x' will typically lead to a crash, because we have only READ access to the text segment. A segmentation fault occurs if we write to the text segment.

But you can change the value of the variable p itself, because it is a local variable on the stack, like below:

char *p = "string";
printf("%s", p);
p = "start";
printf("%s", p);

This is allowed. Here we are changing the address stored in the pointer variable p to the address of the string "start" (again, "start" is also read-only data in the text segment). If you want to modify the characters that p points to, use dynamically allocated memory.

char *p = NULL;
p = malloc(sizeof(char)*7);
strcpy(p, "string");

Now the p[0] = 'x' operation is allowed, because we are writing to the heap.
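
A slightly more defensive version of the heap approach might look like the sketch below. The function name make_modified_copy is hypothetical; the allocation check and the caller-frees convention are additions that the snippet above leaves out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a writable heap copy of "string" with its
 * first character replaced, or NULL if allocation fails.
 * The caller owns the result and must free() it. */
char *make_modified_copy(void)
{
    const char *literal = "string";
    char *p = malloc(strlen(literal) + 1);   /* +1 for the terminating '\0' */

    if (p == NULL)
        return NULL;

    strcpy(p, literal);
    p[0] = 'x';                              /* fine: heap memory is writable */
    return p;
}
```

The literal stays untouched; only the heap copy changes.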

rashok
  • 1
    great explanation, only thing I am not sure about, will string literal be stored in text segment or in data segment? – ADJ Jun 10 '13 at 13:48
  • String literals are read-only data, so most compilers store them in the text segment. – rashok Jun 18 '13 at 04:58
  • @rajaashok So, the bytes for the string literal are stored in text segment, and this is true for every C compiler?, and for C++? – Wyvern666 Sep 04 '13 at 19:16
6

char *p = "string"; creates a pointer to read-only memory where the string literal "string" is stored. Trying to modify the string that p points to leads to undefined behaviour.

char a[] = "string"; creates an array and initializes its contents with the string literal "string".

Matthew Murdoch
LihO
3

They differ in where the characters are stored. Ideally the second one should use const char *.

The first one

char buf[] = "hello";

creates an automatic buffer big enough to hold the characters and copies them in (including the null terminator).

The second one

const char * buf = "hello";

uses const and simply creates a pointer to memory that usually lives in static storage, which must not be modified.

The converse (of the fact you can modify the first safely and not the second) is that it is safe to return the second pointer from a function, but not the first. This is because the second one will remain a valid memory pointer outside the scope of the function, the first will not.

const char * sayHello()
{
     const char * buf = "hello";
     return buf; // valid
}

const char * sayHelloBroken()
{
     char buf[] = "hello";
     return buf; // invalid
}
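
If a writable result is needed, one common way around the sayHelloBroken() problem is to let the caller supply the storage. The following is only a sketch; the name say_hello_into and its parameters are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alternative to sayHelloBroken(): the caller supplies the
 * buffer, so the returned pointer stays valid after the function returns. */
char *say_hello_into(char *buf, size_t size)
{
    static const char text[] = "hello";

    if (buf == NULL || size < sizeof text)
        return NULL;                 /* not enough room for "hello" plus '\0' */

    memcpy(buf, text, sizeof text);
    return buf;                      /* points at the caller's storage */
}
```

The returned pointer refers to the caller's own array, so it stays valid for as long as that array does.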
CashCow
1

a declares an array of char values: a null-terminated array of chars.

p declares a pointer, which refers to an immutable, terminated, C string, whose exact storage location is implementation-defined. Note that this should be const-qualified (e.g. const char *p = "string";).

If you print them out using std::cout << "a: " << sizeof(a) << "\np: " << sizeof(p) << std::endl;, you will see the difference in their sizes (note: values may vary by system):

a: 7
p: 8

Here what is ? operator? Is it a part of a string or it has some specific meaning?

char a[] = ?string?

I assume they were once double quotes "string", which potentially were converted to "smart quotes", then could not be represented as such along the way, and were converted to ?.

justin
  • @Sachan Only rarely (e.g. when you must mutate the `char` buffer). – justin Feb 27 '12 at 05:54
  • If you don't need to mutate the `char` buffer, then it should be a `[static] const char a[] = "xyz";`, not `const char* p = "xyz";` nor `const char* const p = "xyz";` - the former implies p may be moved to point elsewhere and if that's not intended then it's better not to allow the possibility, and both ask the compiler for space for both the pointer and text - IMHO - just shows a lack of accurate mental model of what's being asked of the compiler, and wastes space and time in an unoptimised build. – Tony Delroy Feb 29 '12 at 12:49
  • @Justin: I have no intent or interest in being insulting. You say I'm wrong in many contexts - I only spoke about the "don't need to mutate context", so please explain the many scenarios in which I'm wrong, and how `[static] const char[]` results in "wasted space and time". – Tony Delroy Mar 01 '12 at 03:02
0

C and C++ have very similar Pointer to Array relationships...

I can't speak for the exact memory locations of the two statements you are asking about, but I found these articles interesting and useful for understanding some of the differences between a char pointer declaration and a char array declaration.

For clarity:

C Pointer and Array relationship

C++ Pointer to an Array

I think it's important to remember that an array, in C and C++, is converted ("decays") in most expressions to a pointer to its first element. Consequently, you can perform pointer arithmetic on the array.

char *p = "string"; <--- This is a pointer that points to the first address of a character string.

the following is also possible:

char *p;
char a[] = "string";

p = a; 

At this point p now references the first memory address of a (the address of the first element)

and so *p == 's'

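*(++p) == 't' and so on (or *(p+1) == 't'). Note that *(p++) would still yield 's', because the post-increment returns the old pointer value.

and *(a+1) would also equal 't'; *(a++) would not compile, though, because an array is not a modifiable lvalue.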
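
The pointer arithmetic described here can be checked with a short sketch. The function name pointer_arithmetic_demo is hypothetical; it returns true only if every expression has the value stated in its comment.

```c
#include <stdbool.h>

/* Hypothetical check of the pointer arithmetic on a and p. */
bool pointer_arithmetic_demo(void)
{
    char a[] = "string";
    char *p = a;                    /* a decays to &a[0] */
    bool ok = true;

    ok = ok && (*p == 's');
    ok = ok && (*(p + 1) == 't');   /* p itself is unchanged */
    ok = ok && (*(a + 1) == 't');   /* works because a decays to a pointer */
    ok = ok && (*p++ == 's');       /* post-increment yields the old value */
    ok = ok && (*p == 't');         /* p now points at the second character */
    /* a++ would not compile: an array is not a modifiable lvalue */
    return ok;
}
```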

meltdownmonk