Oftentimes, it is about managing expectations.
Let's start with a small thought experiment (or travel back in time to the early days of computing), where there are no high-level programming languages, just machine code and an assembler. There, you would write something like this (with CPU-specific instructions) to represent a string and compute its length:
arr: db 'a','b','c',0 ; the trailing 0 is the terminator strlen looks for
strlen: ; RDI (pointer to string) -> RAX (length of string)
    ; RAX: length counter and return value
    ; CL: used for the null character test
    xor rax, rax        ; set RAX to 0
strlen_loop:
    mov cl, [rdi]       ; load CL with the byte pointed to by the argument
    test cl, cl         ; is it the null terminator?
    jz strlen_loop_done
    inc rdi             ; look at the next byte of the argument
    inc rax             ; increment the length counter
    jmp strlen_loop
strlen_loop_done:
    ret                 ; RAX holds the length of the zero-terminated string
Compared to that, writing the same function in C is much simpler.
- We do not have to care about register allocation (which register does what).
- We do not rely on the instruction set of a specific CPU.
- We do not have to look up the calling conventions or ABI of the target system (how arguments are passed, etc.).
#include <stddef.h> /* size_t */

size_t strlen(const char* s) {
    size_t l = 0;
    while (*s) { /* stop at the null terminator */
        l++;
        s++;
    }
    return l;
}
The convention that "strings" are just pointers to chars (bytes) ending in a null terminator is admittedly quite arbitrary, but it comes with the C programming language. It is just a convention. The compiler itself knows almost nothing about it (well, it does know to append a terminating null to string literals). But when you call strlen(), it cannot distinguish a string from a plain byte array. Why? Because there is no specific string type.
As such, the C version is just about as clever as the assembly version I gave above: both rely on the C string convention. The assembler does not check it, and neither does the C compiler, because, let's be honest, C's main accomplishments are the bullet items I gave above.
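To make that concrete, here is a minimal sketch. The helper name is_terminated and the two arrays are made up for illustration. It shows that the only thing the compiler contributes is the extra null byte on a string literal; once either array is passed as a pointer, the size is gone and the callee has to trust the convention.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reports whether a buffer of n bytes contains a
   null terminator. The caller must supply n, because the pointer alone
   carries no size information. */
static int is_terminated(const char *buf, size_t n) {
    assert(buf != NULL);
    return memchr(buf, '\0', n) != NULL;
}

/* Initialized from a string literal: the compiler appends '\0',
   so this array occupies 4 bytes. */
static const char text[] = "abc";

/* A plain byte array: exactly 3 bytes, no terminator. */
static const char bytes[3] = { 'a', 'b', 'c' };
```

Both text and bytes decay to const char* when passed to a function. Calling strlen on text is fine; calling it on bytes compiles without complaint but reads past the end of the array, which is undefined behavior.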
So, to manage your expectations about C, think of it as a slightly abstracted, glorified assembly language.
If you are annoyed by the C string convention (after all, strlen is O(n) in time complexity), you can still come up with your own string type, perhaps like this:
typedef struct String_tag {
    size_t length;
    char data[];
} String_t;
Then write yourself helpers (to create a string on the heap) and macros (to create a string on the stack, with the non-standard but widely available alloca or something similar), and build your own string library around that type.
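As a starting point, a heap helper for the struct above might look like the following sketch. The names string_new and string_length are hypothetical, and storing an extra trailing null byte is my own assumption to keep the data usable with existing C APIs; it is not required by the type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct String_tag {
    size_t length;
    char data[]; /* flexible array member (C99) */
} String_t;

/* Hypothetical helper: copy n bytes from src into a new heap-allocated
   String_t. Returns NULL on allocation failure or size overflow.
   Release the result with free(). */
static String_t *string_new(const char *src, size_t n) {
    assert(src != NULL || n == 0);
    if (n > SIZE_MAX - sizeof(String_t) - 1) {
        return NULL; /* header + n + 1 would overflow size_t */
    }
    String_t *s = malloc(sizeof *s + n + 1); /* +1 for a courtesy '\0' */
    if (s == NULL) {
        return NULL;
    }
    s->length = n;
    if (n > 0) {
        memcpy(s->data, src, n);
    }
    s->data[n] = '\0';
    return s;
}

/* O(1): the length is stored, not searched for. */
static size_t string_length(const String_t *s) {
    return s->length;
}
```

Because the length is stored explicitly, such a string can also contain embedded null bytes, which a plain C string cannot represent.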
If you are just getting started with C, instead of tackling something bigger, I think this would be a good exercise for learning the language.