38

Why is strlen() not checking for NULL?

If I call strlen(NULL), the program segfaults.

I'm trying to understand the rationale behind it (if any).

hari
  • 1
    Please note that about 10 years ago, strlen and other string functions did check for null strings before processing, but this was removed because most programmers explicitly checked these pointers anyway, and it was pointless checking it twice. – Owl Mar 05 '17 at 16:05

6 Answers

39

The rationale behind it is simple: how can you check the length of something that does not exist?

Also, unlike in "managed" languages, there is no expectation that the runtime system will handle invalid data or data structures correctly. (This type of issue is exactly why more "modern" languages are more popular for applications that are not computation-heavy or that have less demanding performance requirements.)

A standard template in C would look like this:

 size_t someStrLen;  // strlen returns size_t, not int

 if (someStr != NULL)  // or if (someStr)
    someStrLen = strlen(someStr);
 else
 {
    // handle error.
 }
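
As a minimal sketch of that template wrapped in a function: the name `length_or_error` and the choice of -1 as the error value are assumptions for illustration (the -1 follows the suggestion in the comments below), not anything the standard library provides.

```c
#include <string.h>

/* Hypothetical helper: returns the string length, or -1 if s is a null
   pointer. The result is a long so that -1 can signal "no string". */
long length_or_error(const char *s)
{
    if (s != NULL)              /* or: if (s) */
        return (long)strlen(s);
    else
        return -1;              /* handle error: there is nothing to measure */
}
```

A check like this is worth having where the pointer comes from a source that is documented to return NULL on failure, such as `getenv` or `malloc`.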
Hogan
  • Thanks Hogan. Does that mean, the caller should check for NULL before passing the object to strlen()? something like: if (object) len =strlen(object) else len=-1 – hari Apr 26 '11 at 20:41
  • 2
    "Managed"... That's right. Imagine every function being very paranoid and checking for every possible mistake. Printf storing meta-information for every argument in the list, every math operation checking for overflow etc. That's managed. –  Apr 26 '11 at 20:42
  • @hari - yes, that means you need to check that your pointer is valid before calling `strlen`. – John Bode Apr 26 '11 at 20:55
  • @hari - yes exactly, I've updated the answer with a template. – Hogan Apr 26 '11 at 20:58
  • Actually you can only add the check when it's needed, e.g: some programs of mine have these checks when reading config files, but the "guts" of the program only deal with strings that do exist. – ninjalj Apr 26 '11 at 21:29
  • 1
    I take exception to the "standard template". If `someStr` is supposed to point to a string, it should never be a null pointer when this point in the program is reached. Some people use null pointers as a special "empty" value, but this is not a universal convention and I would say it does a lot more harm than good... – R.. GitHub STOP HELPING ICE Feb 05 '12 at 01:20
  • 2
    @R I guess we are not in agreement about what "standard template" means. Maybe you would prefer "useful pattern"? If you feel better with this term, I'm fine with it. – Hogan Feb 05 '12 at 18:44
  • 2
    In C11, there is `strnlen_s(str, strsz)` that returns zero if str is a null pointer. – jfs Oct 28 '17 at 13:52
  • 1
    @jfs it does more than that: it also limits the maximum size returned. But you make a good point; this is clearly the better choice for a robust program. – Hogan Oct 29 '17 at 14:08
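
`strnlen_s` belongs to the optional Annex K of C11, and many implementations do not ship it. A rough portable sketch of the behaviour described in the two comments above might look like the following; `bounded_strlen` is a hypothetical name, and this is not the Annex K function itself.

```c
#include <stddef.h>

/* Hypothetical stand-in for the behaviour described above:
   returns 0 for a null pointer, and never reports more than maxlen. */
size_t bounded_strlen(const char *s, size_t maxlen)
{
    size_t n = 0;

    if (s == NULL)
        return 0;

    /* Stop at the terminator or at the limit, whichever comes first. */
    while (n < maxlen && s[n] != '\0')
        n++;

    return n;
}
```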
26

The portion of the language standard that defines the string handling library states that, unless specified otherwise for the specific function, any pointer arguments must have valid values.

The philosophy behind the design of the C standard library is that the programmer is ultimately in the best position to know whether a run-time check really needs to be performed. Back in the days when your total system memory was measured in kilobytes, the overhead of performing an unnecessary run-time check could be pretty painful. So the C standard library doesn't bother doing any of those checks; it assumes that the programmer has already done them if they're really necessary. If you know you will never pass a bad pointer value to strlen (for example, because you're passing in a string literal or a locally allocated array), then there's no need to clutter up the resulting binary with an unnecessary check against NULL.
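
One way this plays out in practice, as a sketch only: check once where untrusted pointers enter the program, and let internal code call strlen without further tests. The names `joined_length` and `total_length` are invented for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Internal helper: the precondition is that a and b point to valid strings,
   so no null checks are repeated here. */
static size_t total_length(const char *a, const char *b)
{
    return strlen(a) + strlen(b);
}

/* Hypothetical entry point: the arguments may come from outside,
   so they are validated once, at the boundary.
   Returns 0 on success and -1 if any pointer is null. */
int joined_length(const char *a, const char *b, size_t *out)
{
    if (a == NULL || b == NULL || out == NULL)
        return -1;

    *out = total_length(a, b);
    return 0;
}
```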

John Bode
6

The standard does not require it, so implementations simply avoid a test and a potentially expensive jump.

ninjalj
5

A little macro to help your grief:

#define strlens(s) ((s) == NULL ? 0 : strlen(s))
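
A function-based alternative to the macro, offered as a sketch: it evaluates its argument exactly once (the macro evaluates `s` twice) and gets normal type checking. The name `strlen_or_zero` is made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: treats a null pointer as an empty string. */
static inline size_t strlen_or_zero(const char *s)
{
    return s != NULL ? strlen(s) : 0;
}
```

Note that with either version a null pointer and an empty string become indistinguishable to the caller, which may or may not be what you want.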
Stoian Ivanov
3

Three significant reasons:

  • The standard library and the C language are designed assuming that the programmer knows what he is doing, so a null pointer isn't treated as an edge case, but rather as a programmer's mistake that results in undefined behaviour;

  • It incurs runtime overhead - calling strlen thousands of times and always doing str != NULL is not reasonable unless you assume the programmer cannot be trusted;

  • It adds to the code size - it might only be a few instructions, but if you adopt this principle and do it everywhere it can inflate your code significantly.
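
If you want the mistake caught during development without paying for the check in release builds, one common approach is an `assert`. This is a sketch, and `checked_strlen` is a hypothetical wrapper name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: a null pointer is treated as a programming error.
   The assert is compiled out when NDEBUG is defined, so release builds
   pay neither the runtime nor the code-size cost of the check. */
size_t checked_strlen(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}
```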

Blagovest Buyukliev
  • 4
    Some standard C functions do check for `NULL` inputs, so the first reason is bogus. The third reason is also bogus because putting a few extra checks in the library adds less to code size (on a typical, non-embedded platform) than all the checks inserted in client code. – Fred Foo Apr 26 '11 at 21:02
  • @larsmans: reason one wasn't an ultimate statement but rather an attempt to describe the prevailing mindset in C programming; reason three makes sense when you are sure that the pointer can't be `NULL` in the client code and such a check acts more like an `assert` statement. – Blagovest Buyukliev Apr 26 '11 at 21:21
  • @larsmans: oh, but most functions that check for `NULL` are on "newer" parts of the standard (e.g: `mb*`, `wc*`), aren't they? – ninjalj Apr 26 '11 at 21:25
  • 1
    @ninjalj: And checking for NULL is actually the biggest flaw in the wc/mb interfaces. A common need with these functions is to process a single byte/character at a time, and performing multiple useless null pointer checks on each call can easily double the time spent in them. – R.. GitHub STOP HELPING ICE Feb 05 '12 at 01:22
  • @R..: sure, I was just pointing out that the existence of those functions doesn't really constitute a counter-example of Blagovest's first point. – ninjalj Feb 14 '12 at 23:52
  • Second reason is a bit questionable too. For modern CPUs, comparing value passed to a routine to zero takes near zero time because it usually can be done at the same clock with other operations. Yes, adding a branching can affect performance, but I suspect that doing this in complex user code is worse than in a relatively small library function. – Rodion Melnikov Jun 19 '21 at 21:09
1
size_t strlen ( const char * str );

http://www.cplusplus.com/reference/clibrary/cstring/strlen/

strlen takes a pointer to a character array as its parameter; a null pointer is not a valid argument to this function.

Casey Flynn