16

I am new to programming. I am learning C as my first programming language. I found something strange to understand.

I have learnt that in C we can represent a String as a sequence of characters like this (using a char array):

char status[10] = "Married";   

I have learnt that the problem with this approach is that we have to specify the size of the status array at compile time.

But now I have learned that we can use a char pointer to denote a string, like this:

char status[10] = "Married";
char *strPtr;
strPtr = status;

I don't understand it properly. My questions are -

  1. How can I get char at index 4 (that is i in Married) using the strPtr?

  2. In status there is a null character (\0) at the end of the string represented by the char array - M-a-r-r-i-e-d-\0. So by using the null character (\0) we can understand the end of the string. When we use strPtr, how can we understand the end of the string?

KajolK
  • 2
    1) `*(strPtr+4)` will give you the char `i`. 2) `strPtr` is also pointing to `status` so it'll have the (same) null character (`\0`) at the end. – P.P Mar 31 '15 at 10:48
  • 2
    Because arrays decay to pointers to their first elements, arrays and pointers can often be used interchangeably, so using the dereference operator on an array will work, as will using the array-indexing operator on a pointer. Also, `*(arrayOrPointer + X)` is equivalent to `arrayOrPointer[X]`. – Some programmer dude Mar 31 '15 at 10:48
  • check [What does `sizeof(&array)` return?](http://stackoverflow.com/questions/15177420/what-does-sizeofarray-return/15177499#15177499) – Grijesh Chauhan Mar 31 '15 at 16:09

6 Answers

15
char *strPtr;
strPtr = status;

Now your pointer strPtr is pointing to the first character in the array and you can do

int i =0;
while( strPtr[i] != '\0')
{
  printf("%c ",strPtr[i]);
  i++;
}

Writing *strPtr is called dereferencing the pointer: it gives you the value stored at the location the pointer is pointing to.

Note that the following comparison is always true:

strPtr[4] == *(strPtr + 4)

Both will get you the value stored at the index 4 of the array.
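As a minimal sketch of both points from the question, the two helpers below read one character through the pointer and find the end of the string by scanning for '\0'. The names count_chars and char_at are made up for this illustration; count_chars does by hand what the standard strlen() does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of characters before the terminating '\0',
   found by indexing through the pointer until the terminator is seen. */
size_t count_chars(const char *strPtr)
{
    assert(strPtr != NULL);
    size_t i = 0;
    while (strPtr[i] != '\0') {
        i++;
    }
    return i;
}

/* Hypothetical helper: the character at a given index, written in
   pointer-arithmetic form. The caller must keep index within the array. */
char char_at(const char *strPtr, size_t index)
{
    assert(strPtr != NULL);
    return *(strPtr + index);   /* same as strPtr[index] */
}
```

With char status[10] = "Married"; and strPtr = status;, char_at(strPtr, 4) yields 'i' and count_chars(strPtr) yields 7.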

Note the difference between a pointer and an array name:

----------------------------------
| s  | t  | r  | i  | n | g | \0 |
----------------------------------
  |
strPtr
status

strPtr++ will make your pointer point to the next element in the array.

----------------------------------
| s  | t  | r  | i  | n | g | \0 |
----------------------------------
       |
      strPtr

Whereas you can't do this for the array name

status++ is not allowed because an array is not a modifiable lvalue.
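A small sketch of the same idea in code: the function below (find_end is a name invented here) advances its own pointer variable until it reaches the terminator. Passing status works because the array name converts to a pointer to its first element, and only the function's local copy of that pointer is incremented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns a pointer to the terminating '\0' by
   advancing a local copy of the pointer one element at a time. */
const char *find_end(const char *strPtr)
{
    assert(strPtr != NULL);
    while (*strPtr != '\0') {
        strPtr++;   /* allowed: strPtr is a pointer variable, not an array */
    }
    return strPtr;
}
```

For char status[10] = "Married";, the difference find_end(status) - status is 7, the number of characters before the '\0'.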

haccks
Gopi
  • Wow! that means we can use pointer name- `strPtr` as an array? I will try it now. Thanks for your reply. – KajolK Mar 31 '15 at 10:51
  • @KajolK Check the edits to get the difference between an array and a pointer – Gopi Mar 31 '15 at 10:57
  • In C, an array index is just a dereference of a pointer after addition: `x[i]` just means `*(x + i)`. – Mark Cidade Mar 31 '15 at 15:57
  • 1
    @KajolK; Always remember [pointers are not arrays and vice-versa](http://www.c-faq.com/aryptr/aryptr2.html). – haccks Mar 31 '15 at 21:31
  • @KajolK: No, it means that the indexing operator `[]` requires a pointer, not an array, as one of its operands. Often that pointer will be the result of the implicit conversion of an array name. – Keith Thompson Mar 31 '15 at 22:00
4

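In an expression, status[i] is mere syntactic sugar for *(status+i). (In the declaration char status[10], the [10] is the array size rather than an index; used as an expression, status[10] would refer to the element one past the end of this array.)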

The \0 terminator is what the library functions check, under the hood, to find the end of a string. If you were implementing a string handler yourself you could do the same, or you could ignore it and pass the size along with the string as a separate parameter, or you could (don't!) choose some other termination symbol.

This isn't just true of char arrays, or 'strings'. A C array is a contiguous block of like-typed objects, and in most expressions its name converts to a pointer to the first element. C does not check that your subscripts stay within the 'end' specified at declaration, whether you write array[offset] or *(array+offset); you need to check this for yourself.
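One way to do that check yourself is to pass the length along with the pointer. The sketch below assumes a made-up helper, get_checked, that refuses out-of-range indexes instead of reading past the end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: stores array[index] in *out and returns true,
   or returns false (leaving *out untouched) if index is out of range. */
bool get_checked(const char *array, size_t length, size_t index, char *out)
{
    assert(array != NULL && out != NULL);
    if (index >= length) {
        return false;
    }
    *out = *(array + index);   /* same as array[index] */
    return true;
}
```

Called as get_checked(status, sizeof status, 4, &c) on char status[10] = "Married";, it succeeds and stores 'i'; with index 10 it returns false.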

OJFord
  • 10,522
  • 8
  • 64
  • 98
  • @MattMcNabb `status[10]` is not undefined behavior here; the standard defines dereferencing _one past_ the last element of the array. – Blacklight Shining Mar 31 '15 at 15:54
  • @BlacklightShining it only defines that if there is guaranteed to be allocated memory at that location (and some other conditions are true), which is not the case with `char status[10];` – M.M Mar 31 '15 at 19:59
  • `status[10]` means the same thing as `*(status+10)` anywhere _except_ in the specific kind of context the question is asking about. –  Apr 25 '15 at 21:48
4

Good to know:

char status[10] = "Married";

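is essentially shorthand for the equivalent:

char status[10]; // 10 bytes (on the stack, if declared inside a function)
status[0] = 'M';
status[1] = 'a';
...
status[6] = 'd';
status[7] = '\0'; // same as 0
status[8] = '\0'; // elements the initializer does not cover
status[9] = '\0'; // are set to zero as well

Nothing more, nothing less.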

Also:

char c = status[3];

is exactly the same as

char c = *(status+3);
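
A quick way to convince yourself is to build the array by hand and compare it with the initialized one. This is only a sketch: fill_by_hand is a hypothetical helper, and it zeroes the whole buffer first because elements not covered by an initializer are set to zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fills dest element by element, the way the
   initializer "Married" would fill an array of at least 8 chars. */
void fill_by_hand(char *dest, size_t size)
{
    assert(dest != NULL && size >= 8);
    memset(dest, 0, size);   /* everything past the string is zero */
    dest[0] = 'M';
    dest[1] = 'a';
    dest[2] = 'r';
    dest[3] = 'r';
    dest[4] = 'i';
    dest[5] = 'e';
    dest[6] = 'd';
    dest[7] = '\0';          /* same as 0 */
}
```

After char status[10] = "Married"; char manual[10]; fill_by_hand(manual, sizeof manual);, comparing the two with memcmp(status, manual, sizeof status) gives 0.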
DrKoch
  • 9,556
  • 2
  • 34
  • 43
3

To get the character at index 4 through strPtr, you just use strPtr[4] (this also works for status).

To find the end of the string when using strPtr, you need to go through the characters and look for the terminating \0. This is what printf("%s", strPtr) does when it prints the string (and also when it parses the "%s" expression, which is just another string). To find the number of valid characters in a string in C, you use the strlen() function. Oh, and make sure you don't do something like this:

char a[3];
strcpy(a, "Hello!");

This will write 7 bytes (six characters plus the terminating \0) into a three-byte memory space, which is undefined behavior and will likely overwrite something you don't want overwritten.
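One possible way to avoid the overflow is to tell the copying function how big the destination is. The sketch below uses the standard snprintf, which never writes more than the given number of bytes and always terminates the result when that number is nonzero; copy_bounded is a name made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies src into dest, which holds dest_size bytes.
   The copy is truncated if it does not fit; the return value is true
   only when the whole string, including its '\0', fit into dest. */
bool copy_bounded(char *dest, size_t dest_size, const char *src)
{
    assert(dest != NULL && src != NULL && dest_size > 0);
    int needed = snprintf(dest, dest_size, "%s", src);
    return needed >= 0 && (size_t)needed < dest_size;
}
```

With char a[3];, the call copy_bounded(a, sizeof a, "Hello!") stores "He" plus the terminator and returns false, so the caller can tell that the text was cut short.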

che
  • 12,097
  • 7
  • 42
  • 71
3

I'm going to make a provocative statement: the way to think of this is that C doesn't have strings. C only has arrays of char. And despite its name, char is actually a numeric type ('A', for example, is just a funny way to write a number, usually 65).

An array of char is not really different from an array of int or any other array of numeric type; it's just that the language offers some extra ways to write objects of type char and arrays of them, and there is a general convention (systematized with functions like strlen) for how to interpret data stored in char arrays as being representations of strings.

char status[10];     // declares an array of `char` of length 10. 
char *strPtr;        // declare a pointer to `char`
strPtr = status;     // make `strPtr` point to the first element of `status`

// Declare an array of 6 `char`, and initialize it.
char hello[6] = {'H', 'e', 'l', 'l', 'o', '\0'};

// Shorthand notation for initializing an array of 6 `char` as above
char world[6] = "World";

// I want to store numeric data in this one!
char other[6] = {0, 1, 2, 3, 4, 5};

// "World" is shorthand for a constant array of 6 `char`. This is
// shorthand for telling the compiler to actually put that array in
// memory someplace, and initialize worldPtr to point to that memory.
const char *worldPtr = "World";

// This does the same thing as above. But it's still a *constant* array.
// You should *never* do this; it should be syntactically illegal to
// make a nonconstant `char*` to point to it. This is only allowed for
// historical reasons.
char *helloPtr = "Hello";
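
To illustrate the claim that char is a numeric type, here is a small sketch that simply adds up the elements of a char array as numbers. The function name sum_chars is invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: treats each char as a small integer and adds
   up the first count elements. */
int sum_chars(const char *data, size_t count)
{
    assert(data != NULL);
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += data[i];   /* each char is promoted to int */
    }
    return total;
}
```

For the array other declared above, sum_chars(other, sizeof other) adds 0 through 5 and gives 15. The same function accepts hello or world just as well, because they are arrays of the same type.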
1

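The '\0' at the end of the string is an add-on designed for ease and safety, and you do not strictly need it while you still have the array itself (rather than a pointer to it). You can find the last character of the string by using 'sizeof' like this: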

char status[] = "Married"; 

size_t szLastCharstatus = sizeof(status) / sizeof(status[0]) - 2;

char chLastChar = status[szLastCharstatus];

Detailed explanation:

sizeof(status)

Returns the number of bytes the array occupies.

sizeof(status[0])

Returns the number of bytes the first element occupies (and so does every other element).

Dividing the first value by the second gives us the number of elements in the array. To get the index of the last character we then subtract 1 twice: once because array indexes count from zero, and once because the last element of the array is the '\0'.

Also note that arrays are not pointers and vice versa. An array has its own type and a constant size, and it converts implicitly to a pointer to its first element. Arrays can be passed around by pointer or by value (the second requires wrapping the array in a struct).

Note that I'm using 'size_t', which is a typedef for an unsigned integer type used to store the size of an object.
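
A short sketch of where the sizeof approach stops working: once the array is passed to a function it has been converted to a pointer, and sizeof then reports the size of the pointer. Both function names below are invented for this illustration; last_char falls back on the '\0' terminator through strlen().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: shows what sizeof sees inside a function.
   s is a pointer here, so this is sizeof(char *), not the array size. */
size_t size_seen_by_function(const char *s)
{
    return sizeof s;
}

/* Hypothetical helper: last character before the terminating '\0',
   or '\0' itself for an empty string. */
char last_char(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);   /* scans for the terminator */
    return len == 0 ? '\0' : s[len - 1];
}
```

For char status[] = "Married";, sizeof(status) is 8 where the array is declared, size_seen_by_function(status) is the size of a pointer on your platform, and last_char(status) is 'd'.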

AnArrayOfFunctions
  • 3,452
  • 2
  • 29
  • 66
  • 1
    That will give the number of element in the array, but not the length of the string contained in the array. Since only 8 characters are initialized, the last two will have indeterminate values and reading them will lead to undefined behavior. Also note that this only works on "proper" arrays, once an array has decayed to a pointer the `sizeof` trick will no longer work. And the terminator character is not useless, all the standard C string functions rely on it being there. – Some programmer dude Mar 31 '15 at 10:52
  • 1
    Also note that the question is tagged `C`, so no `std` namespace. – Some programmer dude Mar 31 '15 at 10:53
  • @Joachim Pileborg However it's a performance cost sometimes and avoiding it requires complex syntax. The trick will work as long as arrays are passed either by value or pointer. Also fixed for 'C'. – AnArrayOfFunctions Mar 31 '15 at 11:07
  • 1
    The trailing nullbyte isn't useless at all—because of pointer decay, it's actually the _only_ way to find the end of the string if it's e.g. passed to a function as an argument. – Blacklight Shining Mar 31 '15 at 15:58