
I am trying to understand the relationship between strings, arrays, and pointers.

The book I am reading has a program in which it initializes a variable as follows:

char* szString= "Name";

As I understand it, a C-style string is simply an array of chars. An array is simply a shorthand version of referring to the pointer (which stores the first value of the array) and an offset. That is, array[5] evaluates to the same thing as the expression *(array+5).

So, from my understanding and testing, szString is in fact initialized as a pointer to the first element of the array storing "Name". I can deduce this because the output of:

cout << *szString;

is the character "N".

My understanding of the statement

cout << szString;

outputting the characters "Name", is that the method cout interprets the argument szString as a string type and prints out all the characters until the NUL character. On the other hand, for the argument *szString, a different version of this method is used that supports C-style strings.

Therefore, if I can initialize a char type pointer to address the first element in an array of chars (a C-style string), why can I not initialize an int type pointer to the first element in an array of integers as follows:

int* intArray = {1,2,3};
BusyProgrammer
  • "An array is simply a shorthand version of referring to the pointer " - no, it isn't. Read a good C or C++ textbook to find out what it actually is. –  Apr 22 '17 at 21:08
  • +1. I don't know why folks are downvoting this; it's an interesting question. I've always taken it for granted that `"Name"` will happily set aside static storage that we can take a pointer to, but that `{1,2,3}` will not; but now seeing this question, I realize I have no idea *why* that is. – ruakh Apr 22 '17 at 21:14
  • The title question has a problem: In C, (as post was tagged) an `int *` _can_ be initialized to an array of integers. `int *ptr = (int []){1,2,3};` – chux - Reinstate Monica Apr 22 '17 at 21:14
  • @Neil Butterworth Thanks, any references to suggest? – Filip Gajowniczek Apr 22 '17 at 21:15
  • @FilipGajowniczek [C11](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) – chux - Reinstate Monica Apr 22 '17 at 21:17
  • Yeah, I don't get the downvotes and close votes either. It's a legitimate question and it's not overly broad, as the close voters seem to think. – Carey Gregory Apr 22 '17 at 21:17
  • @chux why do I have to cast the expression {1,2,3} to an integer array type? – Filip Gajowniczek Apr 22 '17 at 21:19
  • Short answer: Because string literals have historically had special status in the C language. – Raymond Chen Apr 22 '17 at 21:19
  • @FilipGajowniczek There is no _cast_ in `(int []){1,2,3}`. It is just the way in C to declare a compound literal of `int` array type. – chux - Reinstate Monica Apr 22 '17 at 21:20
  • @chux So the difference here is that I was attempting to pass a list (I don't know if this is the proper term) of integer type literals rather than a single integer array type literal? – Filip Gajowniczek Apr 22 '17 at 21:26
  • Helpful reading: [Syntactic Sugar](https://en.wikipedia.org/wiki/Syntactic_sugar) – user4581301 Apr 22 '17 at 21:32
  • @FilipGajowniczek The original post was tagged C and C++. These 2 languages have diverged significantly since C99. As post is now C++ only, my C comment does not seem to apply. Note, NMDV, yet C & C++ tagging is a DV magnet. – chux - Reinstate Monica Apr 22 '17 at 21:45
  • _why can I not initialize an int type pointer to the first element in an array of integers_ -- `{1,2,3}` is not an array, it's called a [list initialization](http://en.cppreference.com/w/cpp/language/list_initialization) and it's context-dependent what it actually does. Btw, `char* str = { 'a', 'b', 'c' }` isn't allowed either. – zett42 Apr 22 '17 at 22:00
  • See [How can a char pointer be initialized with a string (Array of characters) but an int pointer not with an array of integer?](http://stackoverflow.com/q/35954500/3049655) – Spikatrix Apr 23 '17 at 02:21
  • Your declaration of `szString` is _not_ valid C++. There should be a `const` before (or after) `char`. – Marc van Leeuwen Apr 23 '17 at 04:27

3 Answers


a C-style string is simply an array of chars

Correct.

An array is simply a shorthand version of referring to the pointer (which stores the first value of the array) and an offset.

No, not really.

the method cout interprets the argument szString as a string type and prints out all the characters until the NUL character

cout is not a "method", but its operator<< works this way yes.
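
As a rough sketch of the overload selection involved (the helper names `streamPointer` and `streamFirstChar` are made up for illustration, and a `std::ostringstream` stands in for `cout` so the result can be inspected):

```cpp
#include <sstream>
#include <string>

// Streams the whole C-style string: the const char* overload of operator<<
// writes characters until it reaches the terminating '\0'.
std::string streamPointer(const char* s) {
    std::ostringstream out;
    out << s;
    return out.str();
}

// Streams a single character: dereferencing the pointer yields a char,
// which selects the char overload of operator<< instead.
std::string streamFirstChar(const char* s) {
    std::ostringstream out;
    out << *s;
    return out.str();
}
```

Calling `streamPointer("Name")` yields `Name`, while `streamFirstChar("Name")` yields just `N`. The argument type alone decides which overload runs.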

Why can a char pointer variable be initialized to a string but an int pointer variable can not be initialized to an array of integers?

The simple answer is that string literals are special, otherwise we would not be able to use them.

In many ways, including this way, the language standards dictate special handling for both string literals and char*s.

why can I not initialize an int type pointer to the first element in an array of integers

C++ could have ultimately extended the syntax of other pointer initialisations to do a similar thing, but it didn't actually need to because instead we have the far superior:

std::vector<int> myInts{1,2,3};
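
If a raw `int*` is still wanted, it can be taken from storage that has a name. A minimal sketch, assuming a hypothetical helper `sumThroughPointer` that is not part of the question:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: adds up n ints reached through a raw pointer.
int sumThroughPointer(const int* p, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += p[i];  // p[i] is defined as *(p + i)
    }
    return total;
}

// The vector owns the storage; data() exposes a pointer to its first element.
int sumOfVector() {
    std::vector<int> myInts{1, 2, 3};
    return sumThroughPointer(myInts.data(), myInts.size());
}

// A named array owns the storage; its name converts to a pointer
// to the first element when used to initialize an int*.
int sumOfNamedArray() {
    int storage[] = {1, 2, 3};
    int* first = storage;
    return sumThroughPointer(first, 3);
}
```

In both cases the pointer refers to an object that exists independently of the braced list, which is exactly what `int* intArray = {1,2,3};` lacks.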
Lightness Races in Orbit
  • Re: "string literals are special, otherwise we would not be able to use them": Obviously we're able to use array initializers without their being "special" in this way. Can you clarify why string literals must be different? – ruakh Apr 22 '17 at 21:21
  • @ruakh: Okay, we would not be able to use them without massive inconvenience, endless copies etc. – Lightness Races in Orbit Apr 22 '17 at 21:22
  • @ruakh We would have to say things like `{ 'a','b','c', 0 }` instead of `"abc"`. –  Apr 22 '17 at 21:30
  • @BoundaryImposition So my understanding that an array is basically an indexed collection of items of the same type that is stored sequentially in memory is incorrect? I can do the following: `int nArray[5]={1,2,3,4,5};` `int* nArrayPtr= &nArray;` `cout << *(nArray+2);` which would output 3, would it not? – Filip Gajowniczek Apr 22 '17 at 21:32
  • @FilipGajowniczek: No, that is totally correct (notwithstanding [the bug with your second statement, which will break compilation](http://coliru.stacked-crooked.com/a/fc4a70ea0038f076)). But it's also not what you claimed in the question. If anything, your claim was backwards: a pointer is a shorthand way of referring to an array. – Lightness Races in Orbit Apr 22 '17 at 21:34
  • @NeilButterworth: Sorry, you seem to be confused. The topic here is that string literals have the special feature of (optionally) providing their own storage, so that you can write `char const * foo = "foo"`. That's not the same as asking why string literals exist at all. – ruakh Apr 22 '17 at 21:37
  • @ruakh: That's a good explanation – Lightness Races in Orbit Apr 22 '17 at 21:41
  • Yeah, and the cout doesn't use the pointer variable either, oops. The claim I am trying to make is: the notation `array[5]` is interpreted by the C++ compiler as `*(array + 5)`. In other words, the address of the first element of the array in memory plus an offset of 5 objects? – Filip Gajowniczek Apr 22 '17 at 21:49
  • @FilipGajowniczek: That is how array subscripting is defined, yes. – Lightness Races in Orbit Apr 22 '17 at 21:50
  • @ruakh `char * p = { 'a','b','c', 0 };` also "provides its own storage". –  Apr 22 '17 at 22:24
  • @NeilButterworth: But it's not a literal. – Lightness Races in Orbit Apr 22 '17 at 22:27
  • @Bound And so what? I didn't say it was. –  Apr 22 '17 at 22:28
  • @NeilButterworth: That's not true: `char * p = { 'a','b','c', 0 }` is not even valid C++. (Maybe you're thinking of C?) – ruakh Apr 22 '17 at 22:34
  • @ruakh Argh, meant of course `char p[] = { 'a','b','c', 0 };` –  Apr 22 '17 at 22:41
  • @NeilButterworth: In `char p[] = { 'a','b','c', 0 }` it's `p` that's providing the storage (allocated on the stack), not the array initializer. It's the same as writing `char p[4]; p[0] = 'a'; p[1] = 'b'; p[2] = 'c'; p[3] = '\0';`. – ruakh Apr 22 '17 at 22:47
  • @ruakh OK, I see what you mean now, and you are right. –  Apr 22 '17 at 22:53
  • In what way is using library templates "far superior"? – ShadSterling Apr 23 '17 at 04:52
  • @Polyergic: So very many ways. Perform some research on why we prefer standard library containers in contemporary C++ over manual memory management, rigid arrays, and all that dren. – Lightness Races in Orbit Apr 23 '17 at 13:27
  • @BoundaryImposition, that's not an answer; A link with a summary would be nice. I learned C++ before the STL had taken over everything and haven't used it much since; now when I see code that avoids using pointers or primitive arrays I wonder why it was written C++. There are valuable features you gain that way, but they come at a cost, and it's not obvious why your way is universally superior to having the sort of primitive initializer the OP asked about. – ShadSterling Apr 23 '17 at 22:25
  • @Polyergic: It's all the answer you're going to get; I'm not your personal research assistant. :) – Lightness Races in Orbit Apr 23 '17 at 22:30

The short answer is that there exist character array literals, but no int array literals.

A string literal is a literal value of array type, and it is an lvalue, so that's something whose address you can take and store. The object designated by such a value has static storage duration, so pointers obtained this way remain valid throughout the entire program.

By contrast, there is no literal of type "array of int", and no unnamed int array lvalues.

Don't confuse this with the braced initialization lists, which are not expressions and therefore not values! Braced lists can be used to initialize variables of array type, but they are not themselves values.
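
To make the contrast concrete, here is a small sketch (the function names `literalAddress` and `namedIntArray` are invented for this illustration):

```cpp
// A string literal has type "array of N const char", where N counts
// the terminating '\0'.
static_assert(sizeof("Name") == 5, "four letters plus the terminator");

// The literal is an lvalue with static storage duration, so its address
// can be taken and the resulting pointer outlives this function call.
const char* literalAddress() {
    const char (*whole)[5] = &"Name";  // pointer to the array itself
    return *whole;                     // converts to a pointer to 'N'
}

// There is no int array literal, so the storage has to be given a name.
// Declaring it static gives it the same lifetime as a string literal.
const int* namedIntArray() {
    static const int values[] = {1, 2, 3};
    return values;
}
```

The braced list `{1, 2, 3}` only supplies initial values for `values`; the array object that the pointer refers to is `values` itself.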

If anything, the only odd-man-out in the language grammar is that it is permissible to initialize a char array with a braced list containing a string literal: char a[] = {"foo"}; Think of this as a kind of copy initialization; a is a copy of the literal lvalue.
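
A short sketch of that copy behaviour, using a made-up function name `bracedLiteralMakesCopy`:

```cpp
#include <cstring>

// Returns true when the braced form behaves like plain copy initialization:
// the array gets its own four chars, and changing them touches nothing else.
bool bracedLiteralMakesCopy() {
    char a[] = {"foo"};  // same meaning as: char a[] = "foo";
    char b[] = "foo";    // a second, independent copy
    a[0] = 'g';          // fine: a is our own modifiable array
    return sizeof(a) == 4
        && std::strcmp(a, "goo") == 0
        && std::strcmp(b, "foo") == 0;
}
```

Because `a` is a separate array object, writing to it is well defined, unlike writing through a pointer to the literal.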

Kerrek SB

As a beginner I had a similar question. Please look at this post and the answers.

The declaration const char* szString = "Name"; initializes the pointer szString with the address of the initial element of an array whose contents are "Name" (followed by a terminating '\0' null character).

There's no implicit conversion from int to int*, other than 0 being a special case, as a null pointer.
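
These conversion rules can be checked at compile time. A minimal sketch using `std::is_convertible` from `<type_traits>` (the function `nullFromZero` is a hypothetical name for illustration):

```cpp
#include <type_traits>

// An int value does not convert to int*.
static_assert(!std::is_convertible<int, int*>::value,
              "no implicit int -> int* conversion");

// A named int array does convert to a pointer to its first element.
static_assert(std::is_convertible<int (&)[3], int*>::value,
              "array-to-pointer conversion applies to int arrays");

// An array of const char converts to const char*, but not to char*.
static_assert(std::is_convertible<const char (&)[5], const char*>::value,
              "const char array -> const char*");
static_assert(!std::is_convertible<const char (&)[5], char*>::value,
              "const cannot be dropped implicitly");

// The literal 0 is the special case: it is a null pointer constant.
int* nullFromZero() {
    int* p = 0;
    return p;
}

// int* bad = {1, 2, 3};  // error: a pointer is a scalar and takes one initializer
// int* alsoBad = {1};    // error: 1 is an int, not a null pointer constant
```

So the braced form fails for `int*` not because of anything array-specific, but because each element in the braces is an `int` being offered to a pointer.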

BlooB
  • Actually your line of code will break the compilation, because initialising a pointer to non-`const` `char` with a string literal is illegal. – Lightness Races in Orbit Apr 22 '17 at 21:36
  • I added const. My assumption was: As long as no attempt is made to change anything after assignment, the code will compile and run. Please correct me if I am wrong. [here](http://www.c4learn.com/c-programming/difference-between-char-pointer-char-array/) – BlooB Apr 22 '17 at 21:54
  • I'm confused. My program compiled when I used `char* szString="Name";` – Filip Gajowniczek Apr 22 '17 at 21:58
  • @FilipGajowniczek: Your compiler is old/antiquated _and_ has warnings turned off. Not a good combination. – Lightness Races in Orbit Apr 22 '17 at 21:59
  • @dirty_feri: You are wrong. It was deprecated since C++98 (meaning "don't do it") and illegal since C++11 (meaning "you can't do it"). Don't reference some random out of date tutorial; reference _documentation_, the language standard, and a well peer-reviewed book written by an expert. – Lightness Races in Orbit Apr 22 '17 at 21:59
  • @BoundaryImposition I'm using visual studio 2017 – Filip Gajowniczek Apr 22 '17 at 22:00
  • @Filip: It's called "Visual Studio". It must have an extension. That is illegal C++. – Lightness Races in Orbit Apr 22 '17 at 22:00
  • @BoundaryImposition as you can tell I'm a newbie haha, I wouldn't even know how to get/check if I'm using an extension. – Filip Gajowniczek Apr 22 '17 at 22:03
  • @FilipGajowniczek: It's likely baked in. "Extension" in this context means something that Microsoft added to C++ when they created their implementation of the language standard. There are other examples of non-compliances throughout Visual Studio. – Lightness Races in Orbit Apr 22 '17 at 22:05
  • @BoundaryImposition so the issue with the statement is that the pointer is not of a constant char type? What is the difference between initializing it from a string literal (i.e. the letters which are hardcoded and not changeable per my understanding) and assign it from a variable. The char variable is always represented as a single byte in memory is it not? – Filip Gajowniczek Apr 22 '17 at 22:11
  • @BoundaryImposition For some reason it ran in my VS. I will check it out; thank you for the correction. – BlooB Apr 22 '17 at 22:15
  • @dirty_feri: See above. We've established that VS is non-compliant in this regard. – Lightness Races in Orbit Apr 22 '17 at 22:26
  • @FilipGajowniczek: I'm sorry, what specifically are you referring to now? Too many cross-conversations! – Lightness Races in Orbit Apr 22 '17 at 22:26
  • @BoundaryImposition I do not understand why we need to add the const to the char* szString variable. It seems to me like you are saying that because the szString variable is being assigned a string literal it must be of the constant type, but how does this differ from assigning a string from a variable. Isn't a pointer to a string always 4 bytes? Why does the source matter. – Filip Gajowniczek Apr 22 '17 at 22:43
  • @FilipGajowniczek: The source matters because string literals may not be modified, ever. Without the `const`, you just have to remember not to attempt modifying them; with it, the compiler enforces that rule for you. They should really have required `const` right from the first version of C++ but hey ho. (And, no, a pointer may be 4 bytes, 8 bytes, any other number of bytes; it depends on your system.) – Lightness Races in Orbit Apr 22 '17 at 22:50
  • @BoundaryImposition how does one even go about modifying a string literal is what I do not understand. They are hardcoded into the source code? You are correct the pointer to a char does vary depending on the hardware I had not thought of that. – Filip Gajowniczek Apr 22 '17 at 22:59
  • @FilipGajowniczek: Yes they are hardcoded (usually into the executable's "data" area) and that's why you can't successfully/safely modify them. But, if you leave out the `const` on a C++98 (or non-compliant) implementation, then you can certainly try: `szString[0] = '!';` Then watch the sparks fly. – Lightness Races in Orbit Apr 22 '17 at 23:25
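
For anyone following the `const` discussion, a small sketch of the safe alternative (the function name `modifiedCopy` is invented here): keep the pointer to the literal `const`, and make a copy when the characters need to change.

```cpp
#include <string>

// Builds "<modified copy>/<original literal>" to show that changing
// the copy leaves the literal untouched.
std::string modifiedCopy() {
    const char* szString = "Name";  // pointer to the read-only literal
    // szString[0] = '!';           // would not compile: the chars are const

    char copy[] = "Name";           // array initialized from the literal
    copy[0] = '!';                  // fine: copy is our own storage

    return std::string(copy) + "/" + szString;
}
```

With the `const` in place, the attempt to write through `szString` is rejected at compile time instead of misbehaving at run time.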