The first half of your question is equivalent to this:
I'm new to life and started to learn about road traffic a few weeks back. I have read in a book that you should wait for the green light before entering the intersection, but when I enter the intersection without waiting, it works properly. How is it possible?
In other words, you just got lucky. It just so happened that, even though you constructed an array of characters without a proper \0 terminator, there happened to be a 0 byte in memory just after the e in apple, so it worked anyway. But it's not at all guaranteed to work, any more than it's guaranteed that you can keep crossing the street against the light and not, eventually, get hit.
Moving on to your second question, when you read that "char is a subset of integer datatype", that does not at all mean that anywhere you would ordinarily use a char, you can also use int.
Here are some characters in memory. Each of them is one byte in size:
char c1 = 'p', c2 = 'e', c3 = 'a', c4 = 'r';
+---+ +---+
c1: | p | c2: | e |
+---+ +---+
+---+ +---+
c3: | a | c4: | r |
+---+ +---+
Here are some ints in memory. On a modern machine, each of them is probably four bytes in size:
int i1 = 'p', i2 = 'e', i3 = 'a', i4 = 'r';
+---+---+---+---+ +---+---+---+---+
i1: | p | i2: | e |
+---+---+---+---+ +---+---+---+---+
+---+---+---+---+ +---+---+---+---+
i3: | a | i4: | r |
+---+---+---+---+ +---+---+---+---+
Here is an array of char, properly null-terminated:
char ca[] = { 'p', 'e', 'a', 'r', '\0' };
+---+---+---+---+---+
ca: | p | e | a | r |\0 |
+---+---+---+---+---+
When printf prints this string, or strlen computes its length, they start at the beginning and move along the string one byte at a time, until they find the \0.
But here is an array of int:
int ia[] = { 'p', 'e', 'a', 'r', '\0' };
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
ia: | p | e | a | r | \0 |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
But I've drawn it slightly wrong, because in reality the three extra bytes in each int aren't filled with empty space; they're filled with zero bytes. (It's as if we wanted to represent the number 1 with leading zeroes, that is, as 0001.) So the more accurate picture, on a little-endian machine where the low-order byte of each int comes first, looks like this:
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
ia: | p \0 \0 \0 | e \0 \0 \0 | a \0 \0 \0 | r \0 \0 \0 | \0 \0 \0 \0|
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
So when printf or strlen start at the beginning and process the array one byte at a time looking for the terminating \0, they find one immediately, just after the first letter.
An important point to consider here is that printf and strlen are defined to operate on arrays of char. And because of the way C works, they had no way of knowing that you had cheated and passed an array of int instead. They literally took that same memory and treated it as if it were an array of char, and so got a very different result than what you expected.
Because it's easy to make mistakes like this, good compilers will warn you if you do. For your code, my compiler gave me these warnings:
warning: incompatible pointer types passing 'int [5]' to parameter of type 'const char *'
warning: format specifies type 'char *' but the argument has type 'int *'
Those messages refer to type char *, which is pointer-to-char, because when you pass an array to a function, what actually gets passed is a pointer to the array's first element. (That's a topic for another day, but it has a lot to do with what I said about printf and strlen "literally taking that same memory and treating it as if" it were an array of characters.)