10

Following is the most popular implementation of strcpy in traditional systems. Why are dest and src not checked for NULL at the start? I once heard that in the old days memory was limited, so shorter code was always preferred. Would you implement strcpy and other similar functions with NULL pointer checks at the start nowadays? Why not?

char *strcpy(char *dest, const char *src)
{
   char *save = dest;
   while(*dest++ = *src++);
   return save;
}
user436748
  • It may be safer in general. But that also means experienced developers have to pay the cost of safety (that they do not need) just so that inexperienced developers do not snarf up. – Martin York Sep 01 '10 at 09:09
  • Experienced developers have to pay the cost of safety (that they do not think they need)... (FTFY). – Brian Hooper Sep 01 '10 at 10:21
  • @Brian Hooper - no, if you're using C, you should know exactly what you need. My embedded code never, ever needs `NULL` checks on `strcpy` because all buffers are statically allocated and used directly. There is absolutely no way I will ever pass `NULL` to `strcpy`. So why would I want to pay the price? There's no "do not think I need" about it. – detly Sep 01 '10 at 11:39
  • Not crashing on NULL pointers is not safety unless that's what's specified. Handling that case when it's not specified means passing off the problem to another function - which might have unexpected consequences. The only safe program is a terminated one. –  Sep 01 '10 at 12:53
  • Useless `NULL` checks in functions that do not assign special meaning to `NULL` arguments are a bane of bad C libraries. They lock you into added waste and encourage bad coders to toss `NULL` pointers around as if they were a universally-valid "empty string" or something. – R.. GitHub STOP HELPING ICE Sep 01 '10 at 13:10
  • @detly, perhaps you are right. The point I was half-seriously trying to make was that although inexperienced developers snarf up, so do experienced developers and the difference is chiefly that experienced developers know in advance that they are going to do so, and take steps to mitigate the damage it causes. – Brian Hooper Sep 01 '10 at 15:27
  • @Brian Hooper - I concede that is a reasonable point to make :) – detly Sep 02 '10 at 02:41
  • On some systems NULL could actually be a valid address. – jdt May 13 '17 at 06:52

9 Answers

20

NULL is a bad pointer, but so is (char*)0x1. Should it also check for that? In my opinion (I don't know the definitive reason why), sanity checks in such a low-level operation are uncalled for. strcpy() is so fundamental that it should be treated something like an asm instruction, and you should do your own sanity checks in the caller if needed. Just my 2 cents :)

tenfour
  • I agree: low level routines should be implemented for efficiency and high level routines should add security when applicable. – Matthieu M. Sep 01 '10 at 11:51
  • +1 for pointing out that `NULL` is only one example of the 99.9% of pointer space that's likely also invalid. – R.. GitHub STOP HELPING ICE Sep 01 '10 at 13:12
  • What makes you say (char*)0x1 is necessarily a bad pointer? In C99, the null pointer *is* a special case in that it "is guaranteed to compare unequal to a pointer to any object or function." (6.3.2.3). – JeremyP Sep 01 '10 at 14:36
  • I'm just illustrating a point, no need to be pedantic. On my machine, 0x1 is a bad pointer. Will you also criticize R for his inaccurate statistic of 99.9% as well? :P – tenfour Sep 01 '10 at 15:03
  • If your program's size in memory is 4 megs on a 32-bit machine, then 99.9% of possible pointers are invalid -- and that's assuming `char` pointers with no alignment restrictions. Change that to `int` and the threshold goes up by 4x. And of course if you're on a 64-bit machine, 99.999999% of pointer values will be invalid in the vast majority of programs. – R.. GitHub STOP HELPING ICE Feb 01 '11 at 01:55
15

There are no sanity checks because one of the most important underlying ideologies of C is that the developer supplies the sanity. When you assume that the developer is sane, you end up with a language that can be used to do just about anything, anywhere.

This is not an explicitly stated goal — it's quite possible for someone to come up with an implementation that does check for this, and more. Maybe they have. But I doubt that many people used to C would clamour to use it, since they'd need to put the checks in anyway if there was any chance that their code would be ported to a more usual implementation.

detly
11

The whole C language is written with the motto "We'll behave correctly provided the programmer knows what he's doing." The programmer is expected to know to make all the checks he needs to make. It's not just checking for NULL, it's ensuring that dest points to enough allocated memory to hold src, it's checking the return value of fopen to make sure the file really did open successfully, knowing when memcpy is safe and when memmove is required, and so on.

Getting strcpy to check for NULL won't change the language paradigm. You will still need to ensure that dest points to enough space -- and this is something that strcpy can't check for without changing the interface. You will also need to ensure that src is '\0'-terminated, which again strcpy can't possibly check.
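The caller-side checks this answer describes might be sketched like this (`copy_demo` is a hypothetical helper written for illustration, not part of any library):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper showing the checks C expects the *caller* to
 * make: size the destination from the source before calling strcpy,
 * and check that the allocation actually succeeded. */
int copy_demo(const char *src)
{
    char *dest = malloc(strlen(src) + 1);  /* +1 for the '\0' */
    if (dest == NULL)
        return -1;                         /* allocation failed */
    strcpy(dest, src);                     /* now guaranteed to fit */
    puts(dest);
    free(dest);
    return 0;
}
```

Note that none of these checks could live inside strcpy itself: its interface carries no size information and no room for an error result.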

There are some C standard library functions which do check for NULL: for example, free(NULL) is always safe. But in general, C expects you to know what you're doing.

[C++ generally eschews the <cstring> library in favour of std::string and friends.]

Philip Potter
6
  1. It's usually better for the library to let the caller decide what it wants the failure semantics to be. What would you have strcpy do if either argument is NULL? Silently do nothing? Fail an assert (which isn't an option in non-debug builds)?

  2. It's easier to opt-in than it is to opt-out. It's trivial to write your own wrapper around strcpy that validates the inputs and to use that instead. If, however, the library did this itself, you would have no way of choosing not to perform those checks short of re-implementing strcpy. (For example, you might already know that the arguments you pass to strcpy aren't NULL, and it might be something you care about if you're calling it in a tight loop or are concerned about minimizing power usage.) In general, it's better to err on the side of granting more freedom (even if that freedom comes with additional responsibility).
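The opt-in wrapper idea can be sketched as follows; `checked_strcpy` and its return-NULL failure policy are illustrative choices, not a standard API:

```c
#include <stddef.h>
#include <string.h>

/* Illustrative opt-in wrapper: validates its arguments and picks an
 * explicit failure policy (here: return NULL). Callers who already
 * know their pointers are valid keep calling strcpy directly and pay
 * nothing for checks they don't need. */
char *checked_strcpy(char *dest, const char *src)
{
    if (dest == NULL || src == NULL)
        return NULL;            /* chosen failure semantics */
    return strcpy(dest, src);
}
```

Because the wrapper lives in caller code, each project gets to pick its own failure semantics (return NULL, assert, log, abort) instead of inheriting one from the library.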

jamesdlin
  • +1 for custom wrapper that implements your error handling policy. (Though I probably wouldn't wrap strcpy individually. I use a StrOnBuf class that wraps the core character buffer manipulation routines, and can be configured to truncate silently, truncate with debug assert, or throw). – peterchen Sep 01 '10 at 10:52
3

The most likely reason is: Because strcpy is not specified to work with NULL inputs (i.e. its behaviour in this case is undefined).

So, what should a library implementer choose to do if a NULL is passed in? I would argue that the best thing to do is to let the application crash. Think of it this way: A crash is a fairly obvious sign that something has gone wrong... silently ignoring a NULL input, on the other hand, may mask a bug that will be much harder to detect.
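One hedged way to get that obvious, early failure in debug builds is an assert at the call site (`caller_copy` is an illustrative name, not a library function):

```c
#include <assert.h>
#include <string.h>

/* Illustrative call-site check: assert makes a NULL argument fail
 * loudly and immediately in debug builds; with NDEBUG defined the
 * check compiles away and strcpy's behaviour on NULL stays undefined. */
char *caller_copy(char *dest, const char *src)
{
    assert(dest != NULL && src != NULL);
    return strcpy(dest, src);
}
```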

Martin B
  • No. `strcpy` on NULL input is undefined behaviour, which may crash, or it may *silently do the right thing*. You certainly can't rely on a runtime error from using `strcpy` with NULL. – Philip Potter Sep 01 '10 at 08:51
  • It might do, but the reality is that it won't. – Puppy Sep 01 '10 at 09:09
  • @Philip: Good point -- I've edited the answer to remove the erroneous statement that "the correct thing to do is to crash" (but I would still argue that it's the _best_ thing to do). – Martin B Sep 01 '10 at 09:18
2

NULL checks were not implemented because C's earliest targets supported strong memory protections. When a process attempted to read from or write to NULL, the memory controller would signal the CPU that an out-of-range memory access was attempted (segmentation violation), and the kernel would kill the offending process.

This was an alright answer, because code attempting to read from or write to a NULL pointer is broken; the only answer is to re-write the code to check return values from malloc(3) and friends and take corrective action. By the time you're trying to use pointers to unallocated memory, it is too late to make a correct decision about how to fix the situation.

sarnold
0

You should think of the C standard library functions as the thinnest possible additional layer of abstraction above the assembly code that you don't want to churn out to get your stuff over the door. Everything beyond that, like error checking, is your responsibility.

Johann Gerell
0

In my view, any function you define has a pre-condition and a post-condition. Ensuring the preconditions hold should never be the function's job; it is the caller's. Following is a precondition for using strcpy, taken from the man page.

The strcpy() function copies the string pointed to by src (including the terminating '\0' character) to the array pointed to by dest. The strings may not overlap, and the destination string dest must be large enough to receive the copy.

Now if the precondition is not met then things might be undefined.

As for whether I would include a NULL check in my strcpy now: I would rather provide a separate safe_strcpy. Giving safety the priority there, I would definitely include NULL checks and handle overflow conditions, and the precondition would be modified accordingly.
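Such a safe_strcpy might look like the sketch below; the name, the size parameter, and the -1 error convention are assumptions (loosely following the BSD strlcpy idea), not a standard interface:

```c
#include <stddef.h>

/* Illustrative safe_strcpy: checks for NULL, always '\0'-terminates,
 * and truncates rather than overflowing a destination of known size.
 * Returns the number of characters copied (excluding the '\0'), or
 * -1 if an argument is NULL or the destination has zero size. */
long safe_strcpy(char *dest, const char *src, size_t dest_size)
{
    size_t i;

    if (dest == NULL || src == NULL || dest_size == 0)
        return -1;
    for (i = 0; i + 1 < dest_size && src[i] != '\0'; i++)
        dest[i] = src[i];
    dest[i] = '\0';             /* always terminate */
    return (long)i;
}
```

The extra size parameter is exactly the interface change the other answers note that plain strcpy cannot make.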

aeh
0

There are simply no error semantics defined for it. In particular, there is no way for strcpy to return an error value. C99 simply states:

The strcpy function returns the value of s1.

So a conforming implementation wouldn't even have a way to return the information that something went wrong. So why bother with it?

All this is deliberate, I think, since most compilers replace strcpy directly with very efficient assembly. Error checks are up to the caller.

Jens Gustedt
  • Since behavior is undefined if `NULL` is passed, `strcpy` could conceivably return something other than `s1` when `s1` is `NULL`. Or it could fail to return at all (crash or infinite loop). – R.. GitHub STOP HELPING ICE Sep 01 '10 at 12:56
  • What if you are the first one and were asked to design strcpy from scratch and also write the C99 standards youself. Would you change it to return some error value? – user436748 Sep 01 '10 at 14:39
  • @user436748: unfortunately this is purely hypothetical, but here are three options. For a "high level" design I would just require it to return `NULL` on error, to set `errno` with an indication, and also that the original data is unchanged in such a case. This is done in several other places, but not here for `strcpy`. If I were to design it as "low level" I'd go for "just do the right thing", but I would in addition require it to produce a segfault in case one of the pointers is `NULL`. Then, as you asked, for real C99 you could go for `char* strcpy(char s1[static 1], char const s2[static 1]);` – Jens Gustedt Sep 01 '10 at 15:04
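The C99 declaration in the last comment can be sketched like this; `my_strcpy` is an illustrative name, and note that a compiler is permitted, but not required, to warn when a literal NULL is passed to a `[static 1]` parameter:

```c
/* C99 "static 1" array parameter declaration: the caller promises
 * that each pointer refers to at least one valid element, so passing
 * NULL violates the declaration and compilers may diagnose it. */
char *my_strcpy(char dest[static 1], const char src[static 1])
{
    char *save = dest;
    while ((*dest++ = *src++))
        ;                       /* copy up to and including the '\0' */
    return save;
}
```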