I am trying to learn pointers in C but am getting mixed up with the following concepts:
char *string = "hello";
char *string2;
What is the main difference between:
A.) *string2 = string;
then
B.) string2 = "bye";
Some pictures may help.
Assume the following memory map (addresses are completely arbitrary and don't reflect any known architecture):
Item        Address       0x00   0x01   0x02   0x03
----        -------       ----   ----   ----   ----
"hello"     0x00501234    'h'    'e'    'l'    'l'
            0x00501238    'o'    0x00
"bye"       0x0050123A    'b'    'y'
            0x0050123C    'e'    0x00   0x??   0x??
...
string      0x80FF0000    0x00   0x50   0x12   0x34
string2     0x80FF0004    0x??   0x??   0x??   0x??
This shows the situation after the declarations. "hello" and "bye" are string literals, stored as arrays of char "somewhere" in memory, such that they are available over the lifetime of the program. Note that attempting to modify the contents of a string literal invokes undefined behavior; you don't want to pass string literals (or pointer expressions like string that evaluate to the addresses of string literals) as the arguments that functions like scanf, strtok, fgets, etc. write to.
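As a rough sketch of the safe alternative, the function below copies the text into a writable array before handing it to strtok, which overwrites delimiters with zero bytes. The name count_words and the 64-byte buffer are made up for this illustration; they are not from the question.

```c
#include <string.h>

/* Hypothetical helper: counts space-separated words in `text`.
 * strtok writes '\0' bytes into its first argument, so we tokenize
 * a writable copy rather than the caller's (possibly literal) string. */
int count_words(const char *text)
{
    char buf[64];                        /* writable storage we own */
    int count = 0;

    strncpy(buf, text, sizeof buf - 1);  /* copy at most 63 characters */
    buf[sizeof buf - 1] = '\0';          /* strncpy may not terminate */

    for (char *tok = strtok(buf, " "); tok != NULL; tok = strtok(NULL, " "))
        count++;

    return count;
}
```

Calling count_words("hello bye") is fine here because the literal is only read; the writes land in buf.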
string is a pointer to char, containing the address of the string literal "hello". string2 is also a pointer to char, and its value is indeterminate (0x?? represents an unknown byte value).
When you write
string2 = "bye";
you assign the address of "bye" (0x0050123A) to string2, so our memory map now looks like this:
Item        Address       0x00   0x01   0x02   0x03
----        -------       ----   ----   ----   ----
"hello"     0x00501234    'h'    'e'    'l'    'l'
            0x00501238    'o'    0x00
"bye"       0x0050123A    'b'    'y'
            0x0050123C    'e'    0x00   0x??   0x??
...
string      0x80FF0000    0x00   0x50   0x12   0x34
string2     0x80FF0004    0x00   0x50   0x12   0x3A
Seems simple enough, right?
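A small sketch of the same idea in code: assigning to a pointer changes which address it holds and copies no characters. The function name reassign_demo is invented for the example, and the pointers are declared const char * (my choice, not the question's) so the compiler would reject accidental writes to the literals.

```c
#include <string.h>

/* Returns 1 if, after the assignments, string2 holds the same
 * address as string and "hello" is still intact. */
int reassign_demo(void)
{
    const char *string = "hello";
    const char *string2;

    string2 = "bye";    /* string2 now holds the address of "bye" */
    string2 = string;   /* string2 now holds the same address as string */

    /* Same address, same characters; nothing was copied or modified. */
    return string2 == string && strcmp(string2, "hello") == 0;
}
```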
Now let's look at the statement
*string2 = string;
There are a couple of problems here.
First, a digression: declarations in C are centered around the types of expressions, not objects. string2 is a pointer to a character; to access the character value, we must dereference string2 with the unary * operator:
char x = *string2;
The type of the expression *string2 is char, so the declaration becomes
char *string2;
By extension, the type of the expression string2 is char *, or pointer to char.
So when you write
*string2 = string;
you're attempting to assign a value of type char * (string) to an expression of type char (*string2). That's not going to work, because char * and char are not compatible types. This error shows up at translation (compile) time. If you had written
*string2 = *string;
then both expressions have type char, and the assignment is legal.
However, if you haven't assigned anything to string2 yet, its value is indeterminate; it contains a random bit string that may or may not correspond to a valid, writable address. Attempting to dereference a random, potentially invalid pointer value invokes undefined behavior; it may appear to work fine, it may crash outright, it may do anything in between. This problem won't show up until runtime. Even better, if you assigned the string literal "bye" to string2, then you run into the problem described above; you're trying to modify the contents of a string literal. Again, that's a problem that's not going to show up until runtime.
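One way to make *string2 = *string well defined is to point string2 at writable storage first. The sketch below assumes a local array named buffer and a function named copy_first_char, neither of which is in the question; string is declared const char * here as a precaution of my own.

```c
/* Copies the first character of "hello" through a valid pointer
 * and returns the character that ended up in the buffer. */
char copy_first_char(void)
{
    const char *string = "hello";
    char buffer[8] = "bye";   /* writable array initialized from a literal */
    char *string2 = buffer;   /* string2 now holds a valid, writable address */

    *string2 = *string;       /* char assigned to char: buffer now holds "hye" */

    return buffer[0];
}
```

The array is a private copy of the characters 'b', 'y', 'e', so writing to it does not touch the literal itself.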
There are some subtle inferences being made by other answerers, missing the POV of a newbie.
char *string = "hello";
Declares a pointer variable which is initialized to point at a character array (a good type match traditionally).
The statement
*string = "hello";
dereferences what should be a pointer variable and assigns a value to the pointed-to location. (It is not a variable declaration; that has to be done above it somewhere.) However, since string has type char * (so *string has type char) and the right side of the assignment is an expression with a pointer value, there is a type mismatch. This can be fixed in two ways, depending on the intent of the statement:
string = "hello"; /* with "char *" expressions on both sides */
or
*string = 'h'; /* with "char" expressions on both sides */
The first reassigns string to point to memory containing a sequence of characters (hello\000). The second assignment changes the character pointed to by string to the char value 'h', which is only valid if string currently points to writable memory rather than a string literal.
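Both fixes can be tried side by side in a short sketch. The names two_fixes_demo and storage are hypothetical, and the second fix writes into a local array because writing through a pointer to a literal would be undefined behavior.

```c
#include <string.h>

/* Returns 1 if both corrected assignments behave as described. */
int two_fixes_demo(void)
{
    char storage[] = "jello";   /* writable array: 'j','e','l','l','o','\0' */
    const char *string = "bye";

    string = "hello";           /* fix 1: pointer = pointer, string is re-aimed */
    int repointed = strcmp(string, "hello") == 0;

    char *p = storage;
    *p = 'h';                   /* fix 2: char = char, storage now holds "hello" */
    int overwritten = strcmp(storage, "hello") == 0;

    return repointed && overwritten;
}
```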
Admittedly, this is a slightly confusing subject which all C programmers go through a little pain learning to grasp. The pointer declaration syntax has a slightly different (though related) effect from the same text in a statement. Get more practice and experience writing and compiling expressions involving pointers, and eventually my words will make perfect sense.
AFTER EDIT:
The difference is that A) will not compile, and even if it did, it would be undefined behavior, because you're dereferencing an uninitialized pointer.
Also, please don't change your question drastically after posting it.
*string can be read as "whatever string points to", which is a char. Assigning "bye" to it makes no sense.
A C string is just an array of characters ending in a null character. C string literals like "hello" above could be viewed as "returning" a pointer to the first element of the character array { 'h', 'e', 'l', 'l', 'o', '\0' }.
Thus, char *string = "bye" makes sense while char string = "bye" doesn't.
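As a quick check that a literal really is an array with a terminating zero byte, the three hypothetical helper functions below (names invented for this example) report its size, its length, and its last element.

```c
#include <string.h>

/* sizeof sees the whole array, including the terminating '\0'. */
size_t literal_size(void)   { return sizeof "hello"; }

/* strlen counts characters up to, but not including, the '\0'. */
size_t literal_length(void) { return strlen("hello"); }

/* Indexing works on the literal because it is an array of char. */
char literal_last(void)     { return "hello"[5]; }
```

The size is one larger than the length because of the terminator, which is also what sits at index 5.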
char * is a pointer to a character. A literal such as "hello" evaluates to a pointer to the first character of the string. Therefore, string = "bye" is meaningful: it makes string point to the first character of the string "bye".
*string, on the other hand, is the character pointed to by string. It's not a pointer but a small integer (8 bits on most platforms). This is why the assignment *string = "bye" is meaningless: compilers diagnose the type mismatch, and if the write through string goes ahead anyway, it will probably lead to a segmentation fault, because the memory segment where "bye" is stored is typically read-only.