In ISO C, calling the standard function strlen with a NULL pointer is Undefined Behaviour. This doesn't require or guarantee a crash; it merely means a crash is one possibility. (And so are demons flying out of your nose.) UB means literally anything can happen (without violating the standard).
Obviously in practice the set of things which can actually happen is usually not that large, and of course most OSes don't even let you map the zero page at all. (And most C/C++ implementations use the bit-pattern 0 as the object-representation for nullptr / NULL, same as the C/C++ source-level 0, even though, fun fact, that's not required.)
So most strlen implementations simply start by loading the first byte. (Or, if the load won't cross a page boundary, by loading the first 16 bytes to branchlessly check for zero with SSE2. The Q&A "Is it safe to read past the end of a buffer within the same page on x86 and x64?" has some discussion of glibc's x86-64 asm strlen implementations.)
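As a rough illustration of the two ideas above, here is a minimal C sketch. The names simple_strlen and load16_stays_in_page are hypothetical, and the 4 KiB page size is an assumption (it is the base page size on x86-64). This is not glibc's code. The page check only computes whether a 16-byte load would stay inside one page; it doesn't perform the over-read, which is itself UB in ISO C and is something the hand-written asm versions do.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page size: 4 KiB, the base page size on x86-64. */
#define PAGE_SIZE 4096u

/* Byte-at-a-time strlen. The very first load dereferences s, so a NULL
   argument typically faults right here; there is no explicit check. */
size_t simple_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Returns 1 if a 16-byte load starting at p stays within a single page,
   0 if it would cross into the next page. */
int load16_stays_in_page(const void *p)
{
    uintptr_t offset = (uintptr_t)p & (PAGE_SIZE - 1);
    return offset <= PAGE_SIZE - 16;
}
```

A vectorized implementation would use a check like this to decide whether an unaligned 16-byte first load is safe, falling back to a different strategy near the end of a page.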
If the caller wants to avoid crashing on NULL pointers, check before calling.
It's perfectly valid to write a function without a nullptr check, as long as that part of the contract is made clear to people writing code that calls it.
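A minimal sketch of what such a contract might look like in C. The functions count_spaces and spaces_in_optional are made up for illustration, and treating a missing string as having zero spaces is just one caller's policy, not something the callee decides.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the ' ' characters in s.
   Precondition: s points to a NUL-terminated string. Passing NULL is a
   contract violation and is not checked in release builds. */
size_t count_spaces(const char *s)
{
    assert(s != NULL);  /* debug aid only; compiled out when NDEBUG is defined */
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (*s == ' ')
            n++;
    }
    return n;
}

/* A caller that might hold NULL does the check itself and picks its own
   fallback (here: no string means zero spaces). */
size_t spaces_in_optional(const char *maybe_null)
{
    return maybe_null != NULL ? count_spaces(maybe_null) : 0;
}
```

Callers that already know their pointer is valid call count_spaces directly and pay nothing for the check.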
If you're writing by hand in asm (presumably for performance reasons), then yes, you should avoid extra sanity checks that aren't needed, unless there's some useful behaviour that you could actually factor out of several callers into a wrapper for strlen (e.g. one that falls through into the normal unchecked strlen for non-NULL, so it's just a couple of extra instructions ahead of the normal strlen label).
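In C rather than asm, that kind of wrapper could look roughly like this. strlen_or_zero is a hypothetical name, and returning 0 for NULL is only one possible policy, useful only if several callers really want it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: treats a NULL pointer like an empty string.
   Non-NULL inputs go straight to the normal unchecked strlen. */
size_t strlen_or_zero(const char *s)
{
    if (s == NULL)
        return 0;
    return strlen(s);
}
```

Code that can never see NULL keeps calling plain strlen, so the check costs nothing on that path.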
Besides, what could you return that would be any use to your caller for a NULL input? 0 implies that it's safe to read ptr[0] and find a '\0'. Maybe that's ok for some callers. size_t is an unsigned type so every other possible value is positive and also a valid size.
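To make the size_t point concrete, here is a small sketch with a hypothetical strlen_or_sentinel that tries to signal NULL with -1. The conversion to size_t wraps to SIZE_MAX, which an unsuspecting caller would treat as an enormous length rather than an error.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: tries to report NULL with an in-band "error" value.
   (size_t)-1 is SIZE_MAX, a positive value like any other length. */
size_t strlen_or_sentinel(const char *s)
{
    return s != NULL ? strlen(s) : (size_t)-1;
}
```

A caller would have to compare against SIZE_MAX explicitly, which is no less work than checking the pointer for NULL before the call.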