96

I've read on Stack Overflow that some C functions are "obsolete" or "should be avoided". Can you please give me some examples of this kind of function and the reason why?

What alternatives to those functions exist?

Can we use them safely - any good practices?

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
Andrei Ciobanu
  • 12,500
  • 24
  • 85
  • 118
  • 3
    I believe it is a very good thing to know. Some functions from C should really be avoided and used *only* for educational purposes. – INS Apr 02 '10 at 09:04
  • @Felix, it would be easier to edit one single (marked correct) answer in the future. It really is up in the air regarding answers that will likely change over time. Who knows, maybe Jeff will provide a 'janitor' badge to people who keep answers current in the next few years. – Tim Post Apr 02 '10 at 09:37
  • 1
    @Tim Post: Ok I will delete my comments. – Felix Kling Apr 02 '10 at 09:44
  • FWIW, `snprintf()` was not in the C90 standard. I believe it was added with C99, one of the better things about the new standard. – David Thornley Apr 02 '10 at 16:27
  • 3
    I would not use `strncpy()` as a general replacement for `strcpy()`, and I wouldn't use `strncat()` ever because it has the most unintuitive interface imaginable - do YOU know what the length parameter specifies? – Jonathan Leffler Apr 04 '10 at 01:00
  • 2
    Using strncpy and strncat is almost always a mistake. They certainly should not be used in place of strcpy and strcat!! scanf and sprintf are also perfectly usable if you know how to use them... – R.. GitHub STOP HELPING ICE Jul 03 '10 at 20:39
  • You should not use any variant of strcat. You should know the length of your string (and your buffer) and should pass str+len as the first parameter to whatever function is going to concatenate to your string. – jmucchiello Oct 21 '16 at 15:43
  • 1
    Rather than posting a summary in your question, you should post it as a separate answer to the question... or just delete it. Particularly since you have picked up a lot of incorrect opinions from several present, incorrect answers. I have posted a better summary with authoritative references. Note that Microsoft is not a valid reference, but rather the opposite - they are widely infamous for their misuse of the C language and the inability of their products to follow standard C. – Lundin Oct 04 '17 at 11:42

13 Answers13

60

Deprecated Functions
Insecure
A perfect example of such a function is gets(), because there is no way to tell it how big the destination buffer is. Consequently, any program that reads input using gets() has a buffer overflow vulnerability. For similar reasons, one should use strncpy() in place of strcpy() and strncat() in place of strcat().

Yet some more examples include the tmpfile() and mktemp() function due to potential security issues with overwriting temporary files and which are superseded by the more secure mkstemp() function.

Non-Reentrant
Other examples include gethostbyaddr() and gethostbyname() which are non-reentrant (and, therefore, not guaranteed to be threadsafe) and have been superseded by the reentrant getaddrinfo() and freeaddrinfo().

You may be noticing a pattern here... either lack of security (possibly by failing to include enough information in the signature to possibly implement it securely) or non-reentrance are common sources of deprecation.

Outdated, Non-Portable
Some other functions simply become deprecated because they duplicate functionality and are not as portable as other variants. For example, bzero() is deprecated in favor of memset().

Thread Safety and Reentrance
You asked, in your post, about thread safety and reentrance. There is a slight difference. A function is reentrant if it does not use any shared, mutable state. So, for example, if all the information it needs is passed into the function, and any buffers needed are also passed into the function (rather than shared by all calls to the function), then it is reentrant. That means that different threads, by using independent parameters, do not risk accidentally sharing state. Reentrancy is a stronger guarantee than thread safety. A function is thread safe if it can be used by multiple threads concurrently. A function is thread safe if:

  • It is reentrant (i.e. it does not share any state between calls), or:
  • It is non-reentrant, but it uses synchronization/locking as needed for shared state.

In general, in the Single UNIX Specification and IEEE 1003.1 (i.e. "POSIX"), any function which is not guaranteed to be reentrant is not guaranteed to be thread safe. So, in other words, only functions which are guaranteed to be reentrant may be portably used in multithreaded applications (without external locking). That does not mean, however, that implementations of these standards cannot choose to make a non-reentrant function threadsafe. For example, Linux frequently adds synchronization to non-reentrant functions in order to add a guarantee (beyond that of the Single UNIX Specification) of threadsafety.

Strings (and Memory Buffers, in General)
You also asked if there is some fundamental flaw with strings/arrays. Some might argue that this is the case, but I would argue that no, there is no fundamental flaw in the language. C and C++ require you to pass the length/capacity of an array separately (it is not a ".length" property as in some other languages). This is not a flaw, per-se. Any C and C++ developer can write correct code simply by passing the length as a parameter where needed. The problem is that several APIs that required this information failed to specify it as a parameter. Or assumed that some MAX_BUFFER_SIZE constant would be used. Such APIs have now been deprecated and replaced by alternative APIs that allow the array/buffer/string sizes to be specified.

Scanf (In Answer to Your Last Question)
Personally, I use the C++ iostreams library (std::cin, std::cout, the << and >> operators, std::getline, std::istringstream, std::ostringstream, etc.), so I do not typically deal with that. If I were forced to use pure C, though, I would personally just use fgetc() or getchar() in combination with strtol(), strtoul(), etc. and parse things manually, since I'm not a huge fan of varargs or format strings. That said, to the best of my knowledge, there is no problem with [f]scanf(), [f]printf(), etc. so long as you craft the format strings yourself, you never pass arbitrary format strings or allow user input to be used as format strings, and you use the formatting macros defined in <inttypes.h> where appropriate. (Note, snprintf() should be used in place of sprintf(), but that has to do with failing to specify the size of the destination buffer and not the use of format strings). I should also point out that, in C++, boost::format provides printf-like formatting without varargs.

Michael Aaron Safyan
  • 93,612
  • 16
  • 138
  • 200
  • Is gethostbyaddr, gethostbyname from the standard library ? – Andrei Ciobanu Apr 02 '10 at 08:35
  • @nomemory, no; they are defined in the Single UNIX Specification. – Michael Aaron Safyan Apr 02 '10 at 08:38
  • 4
    "Deprecated" is a strong word, which has a specific meaning when discussing the C++ Standard. In that sense, gets(), strcpy() etc. are not deprecated. –  Apr 02 '10 at 09:34
  • @Neil, "deprecated" is a generic programming term derived from "depreciated" and simply means held or looked on with contempt. In that sense, I believe I am using the terms correctly. – Michael Aaron Safyan Apr 02 '10 at 09:37
  • 4
    Just so long as you distinguish between "deprecated by the C standard", "deprecated by Michael Aaron Safyan", and "deprecated by person or persons unknown who hopefully know what they're talking about [citation needed]". The question *is* about preferred coding style, not about the C standard, so the second two are appropriate. But like Neil I required a double-take before realising that your statements did not intend to imply the first meaning. – Steve Jessop Apr 02 '10 at 11:46
  • @Steve, fair enough. I only use the term when the Single UNIX Specification's informative section recommends against its use and suggests using an alternative function. – Michael Aaron Safyan Apr 02 '10 at 15:50
  • @Steve, for example, with bzero, it says "The memset() function is preferred over this function." – Michael Aaron Safyan Apr 02 '10 at 15:51
  • Lastly, I'd like to point out that the original question did not specify the ISO C standard library... and so my answer includes many functions that are commonly used in C but not from the standard library. – Michael Aaron Safyan Apr 02 '10 at 15:53
  • Sure, in the case of the POSIX functions, if the POSIX specification deprecates them, then they "are deprecated", and it advises against using certain standard C functions *on POSIX systems*. `gets` is deprecated in the latest C99 draft, `strcpy` isn't. In C++, deprecation is meaningless: "deprecated" is defined to mean "not guaranteed to be part of the Standard in future revisions", but C++0x is removing a feature which was not deprecated in C++03. So non-deprecated features aren't guaranteed not to be removed either. Hence IMO the need to resolve the ambiguity of "is deprecated". – Steve Jessop Apr 02 '10 at 16:58
  • 12
    `strncpy` should generally be avoided as well. It doesn't do what most programmers assumes it does. It does not guarantee termination (leading to buffer overruns), and it pads shorter strings (possibly degrading performance in some cases). – Adrian McCarthy Apr 02 '10 at 22:20
  • 2
    @Adrian: I agree with you - neither `strncpy()` nor the even worse `strncat()` is a sensible replacement for the n-less variants. – Jonathan Leffler Apr 04 '10 at 01:01
  • @Adrian, strncpy() and strcat() actually *do NUL-terminate strings*, but _only if_ the destination buffer has remaining space. The non-portable strlcpy() and strlcat(), while they will forcibly terminate the strings, also lead to truncations and detecting them is not entirely intuitive. Moreover, since buffers should be dynamically allocated (to allow for arbitrary input sizes), it is really not that difficult to ensure that buffers are always one more character in length than the data (in order to provide space for the NUL-terminator). – Michael Aaron Safyan Apr 04 '10 at 03:08
  • @Michael: I didn't say `strncpy` doesn't terminate; I said it doesn't _guarantee_ termination, which is true. Many programmers believe that it does guarantee termination, and that leads to buffer overruns, which is why many shops ban it from their code. – Adrian McCarthy Apr 05 '10 at 15:34
  • 7
    This answer spreads nonsense dogma about "replace strcpy with strncpy - I don't know why but Microsoft tells me to." strncpy was never intended to be a safe version of strcpy! In face it is far more unsafe. See [Why are strlcpy and strlcat considered insecure?](https://stackoverflow.com/questions/2114896/why-are-strlcpy-and-strlcat-considered-insecure). – Lundin Oct 04 '17 at 07:02
  • There's much better replacements for the string functions now: `getline()`, `getdelim()`, `strdup()`, `asprintf()`, and `open_memstream()`. These all share the trait that they themselves call `malloc()` to back the resulting string, allocating it to fit the actual data that's available. This guarantees two things: 1. No buffer overruns possible, and 2. no **truncations** possible. This means that you need much less error prone error handling code to use these functions compared to the "safe" variants like `strncpy()` (`strncpy()` may actually leave the result without a terminating null byte!). – cmaster - reinstate monica Nov 14 '17 at 12:56
25

Once again people are repeating, mantra-like, the ludicrous assertion that the "n" version of str functions are safe versions.

If that was what they were intended for then they would always null terminate the strings.

The "n" versions of the functions were written for use with fixed length fields (such as directory entries in early file systems) where the nul terminator is only required if the string does not fill the field. This is also the reason why the functions have strange side effects that are pointlessly inefficient if just used as replacements - take strncpy() for example:

If the array pointed to by s2 is a string that is shorter than n bytes, null bytes are appended to the copy in the array pointed to by s1, until n bytes in all are written.

As buffers allocated to handle filenames are typically 4kbytes this can lead to a massive deterioration in performance.

If you want "supposedly" safe versions then obtain - or write your own - strl routines (strlcpy, strlcat etc) which always nul terminate the strings and don't have side effects. Please note though that these aren't really safe as they can silently truncate the string - this is rarely the best course of action in any real-world program. There are occasions where this is OK but there are also many circumstances where it could lead to catastrophic results (e.g. printing out medical prescriptions).

Dipstick
  • 9,854
  • 2
  • 30
  • 30
  • 2
    You are right about `strncpy()`, but wrong about `strncat()`. `strncat()` was not designed for use with fixed-length fields - it was actually designed to be a `strcat()` that limits the number of characters concatenated. It is quite easy to use this as a "safe `strcat()`" by keeping track of the space remaining in the buffer when doing multiple concatenations, and even easier to use it as a "safe `strcpy()`" (by setting the first character of the destination buffer to `'\0'` before calling it). `strncat()` *always* terminates the destination string, and it does not write extra `'\0'`s. – caf Apr 02 '10 at 10:52
  • 2
    @caf - yes but strncat() is totally useless as it takes as a parameter the maximum length to be copied - not the length of the destination buffer. To prevent buffer overflow you need to know the length of the current destination string, and if you know that why would you use strncat() - which has to work it out the dest length again - rather than just strlcat() the source string to the end of the dest string. – Dipstick Apr 02 '10 at 11:24
  • 1
    That still doesn't change the fact that you imply that `strncat()` doesn't always nul-terminate the destination, and that it was designed for use with fixed-length fields, both of which are wrong. – caf Apr 02 '10 at 12:52
  • 2
    @chrisharris: `strncat()` will work properly regardless of the length of the source string, while `strcat()` won't. The problem with `strlcat()` here is that it's not a standard C function. – David Thornley Apr 02 '10 at 16:24
  • @caf - you are correct about strncat(). I've never actually used it because - as I pointed out above - it is totally useless. It should therefore still be avoided. – Dipstick Apr 02 '10 at 17:28
  • @caf, @David, not only is "strlcat()" not a standard C function, but it is also not in IEEE Std 1003.1 or in the Single UNIX Specification. It is only in BSD-based OSs. – Michael Aaron Safyan Apr 04 '10 at 03:12
  • +1 for giving a correct explanation of the idiotic strn* functions. – R.. GitHub STOP HELPING ICE Jul 03 '10 at 20:42
21

Several answers here suggest using strncat() over strcat(); I'd suggest that strncat() (and strncpy()) should also be avoided. It has problems that make it difficult to use correctly and lead to bugs:

  • the length parameter to strncat() is related to (but not quite exactly - see the 3rd point) the maximum number of characters that can be copied to the destination rather than the size of the destination buffer. This makes strncat() more difficult to use than it should be, particularly if multiple items will be concatenated to the destination.
  • it can be difficult to determine if the result was truncated (which may or may not be important)
  • it's easy to have an off-by-one error. As the C99 standard notes, "Thus, the maximum number of characters that can end up in the array pointed to by s1 is strlen(s1)+n+1" for a call that looks like strncat( s1, s2, n)

strncpy() also has an issue that can result in bugs you try to use it in an intuitive way - it doesn't guarantee that the destination is null terminated. To ensure that you have to make sure you specifically handle that corner case by dropping a '\0' in the buffer's last location yourself (at least in certain situations).

I'd suggest using something like OpenBSD's strlcat() and strlcpy() (though I know that some people dislike those functions; I believe they're far easier to use safely than strncat()/strncpy()).

Here's a little of what Todd Miller and Theo de Raadt had to say about problems with strncat() and strncpy():

There are several problems encountered when strncpy() and strncat() are used as safe versions of strcpy() and strcat(). Both functions deal with NUL-termination and the length parameter in different and non-intuitive ways that confuse even experienced programmers. They also provide no easy way to detect when truncation occurs. ... Of all these issues, the confusion caused by the length parameters and the related issue of NUL-termination are most important. When we audited the OpenBSD source tree for potential security holes we found rampant misuse of strncpy() and strncat(). While not all of these resulted in exploitable security holes, they made it clear that the rules for using strncpy() and strncat() in safe string operations are widely misunderstood.

OpenBSD's security audit found that bugs with these functions were "rampant". Unlike gets(), these functions can be used safely, but in practice there are a lot of problems because the interface is confusing, unintuitive and difficult to use correctly. I know that Microsoft has also done analysis (though I don't know how much of their data they may have published), and as a result have banned (or at least very strongly discouraged - the 'ban' might not be absolute) the use of strncat() and strncpy() (among other functions).

Some links with more information:

Michael Burr
  • 333,147
  • 50
  • 533
  • 760
  • 1
    It's far better to keep track of the length of strings outside the string itself. Like that, concatenating two strings(-with-length) safely becomes a matter of a simple calculation (to get the required buffer size) a possible reallocation, and a `memmove()`. (Well, you can use `memcpy()` in the normal case when the strings are independent.) – Donal Fellows Apr 02 '10 at 10:02
  • 1
    Your second point is dead wrong - `strncat()` *always* terminates the destination string. – caf Apr 02 '10 at 10:48
  • 1
    Your second point is wrong for `strncat()`. However, it is correct for `strncpy()`, which has some other problems. `strncat()` is a reasonable replacement for `strcat()`, but `strncpy()` is not a reasonable replacement for `strcpy()`. – David Thornley Apr 02 '10 at 16:19
  • 2
    caf and David are 100% right about my claim that `strncat()` doesn't always null terminate. I was confusing the behaviors of `strncat()` and `strncpy()` a bit (another reason they're functions to avoid - they have names that imply similar behaviors, but in fact behave differently in important ways...). I've amended my answer to correct this as well as add additional information. – Michael Burr Apr 02 '10 at 20:26
  • A bigger problem with strncat and strcat is that they return a useless pointer to the start of the destination string, rather than a pointer to the trailing zero byte that was written. A safe and efficient string-concatenation function should accept pointers to the beginning and end of the destination buffer, and return a pointer to the new trailing null byte. – supercat Mar 27 '15 at 21:39
  • 1
    Note that `char str[N] = ""; strncat(str, "long string", sizeof(str));` is a buffer overflow if N is not big enough. The `strncat()` function is too easy to misuse; it should not be used. If you can use `strncat()` safely, you could have used `memmove()` or `memcpy()` instead (and those would be more efficient). – Jonathan Leffler May 04 '18 at 15:16
  • @JonathanLeffler: How often does code not know the length of the destination buffer, nor the string that's presently in it, but nonetheless know how many bytes may safely be added to that string without overrunning the buffer? In how many of those situations would code then not care about the length of the destination string? I find myself hard pressed to imagine any situation where I would find `strncat` useful. – supercat Aug 06 '20 at 21:55
  • @supercat — to be able to use `strncat()` safely, you have to have enough information available to you that you could use `memmove()` instead (and it would be more efficient to do so as `memmove()` would not have to scan the leading string). In other words, you're right — there is no good use case for `strncat()`. – Jonathan Leffler Aug 06 '20 at 22:58
  • @JonathanLeffler: I could design a number of useful safe string concatenation and other functions, but the library ones are pretty horrid. Actually, the only obstacle to doing strings better in general is the lack of any means of producing an in-line static const literal of any form other than a zero-terminated string. – supercat Aug 06 '20 at 23:01
15

Standard library functions that should never be used:

setjmp.h

  • setjmp(). Together with longjmp(), these functions are widely recogniced as incredibly dangerous to use: they lead to spaghetti programming, they come with numerous forms of undefined behavior, they can cause unintended side-effects in the program environment, such as affecting values stored on the stack. References: MISRA-C:2012 rule 21.4, CERT C MSC22-C.
  • longjmp(). See setjmp().

stdio.h

  • gets(). The function has been removed from the C language (as per C11), as it was unsafe as per design. The function was already flagged as obsolete in C99. Use fgets() instead. References: ISO 9899:2011 K.3.5.4.1, also see note 404.

stdlib.h

  • atoi() family of functions. These have no error handling but invoke undefined behavior whenever errors occur. Completely superfluous functions that can be replaced with the strtol() family of functions. References: MISRA-C:2012 rule 21.7.

string.h

  • strncat(). Has an awkward interface that are often misused. It is mostly a superfluous function. Also see remarks for strncpy().
  • strncpy(). The intention of this function was never to be a safer version of strcpy(). Its sole purpose was always to handle an ancient string format on Unix systems, and that it got included in the standard library is a known mistake. This function is dangerous because it may leave the string without null termination and programmers are known to often use it incorrectly. References: Why are strlcpy and strlcat considered insecure?, with a more detailed explanation here: Is strcpy dangerous and what should be used instead?.

Standard library functions that should be used with caution:

assert.h

  • assert(). Comes with overhead and should generally not be used in production code. It is better to use an application-specific error handler which displays errors but does not necessarily close down the whole program.

signal.h

stdarg.h

  • va_arg() family of functions. The presence of variable-length functions in a C program is almost always an indication of poor program design. Should be avoided unless you have very specific requirements.

stdio.h
Generally, this whole library is not recommended for production code, as it comes with numerous cases of poorly-defined behavior and poor type safety.

  • fflush(). Perfectly fine to use for output streams. Invokes undefined behavior if used for input streams.
  • gets_s(). Safe version of gets() included in C11 bounds-checking interface. It is preferred to use fgets() instead, as per C standard recommendation. References: ISO 9899:2011 K.3.5.4.1.
  • printf() family of functions. Resource heavy functions that come with lots of undefined behavior and poor type safety. sprintf() also has vulnerabilities. These functions should be avoided in production code. References: MISRA-C:2012 rule 21.6.
  • scanf() family of functions. See remarks about printf(). Also, - scanf() is vulnerable to buffer overruns if not used correctly. fgets() is preferred to use when possible. References: CERT C INT05-C, MISRA-C:2012 rule 21.6.
  • tmpfile() family of functions. Comes with various vulnerability issues. References: CERT C FIO21-C.

stdlib.h

  • malloc() family of functions. Perfectly fine to use in hosted systems, though be aware of well-known issues in C90 and therefore don't cast the result. The malloc() family of functions should never be used in freestanding applications. References: MISRA-C:2012 rule 21.3.

    Also note that realloc() is dangerous in case you overwrite the old pointer with the result of realloc(). In case the function fails, you create a leak.

  • system(). Comes with lots of overhead and although portable, it is often better to use system-specific API functions instead. Comes with various poorly-defined behavior. References: CERT C ENV33-C.

string.h

  • strcat(). See remarks for strcpy().
  • strcpy(). Perfectly fine to use, unless the size of the data to be copied is unknown or larger than the destination buffer. If no check of the incoming data size is done, there may be buffer overruns. Which is no fault of strcpy() itself, but of the calling application - that strcpy() is unsafe is mostly a myth created by Microsoft.
  • strtok(). Alters the caller string and uses internal state variables, which could make it unsafe in a multi-threaded environment.
Lundin
  • 195,001
  • 40
  • 254
  • 396
  • I suggest recommending `static_assert()` in place of `assert()`, if condition can be resolved at compile time. Also, `sprintf()` can almost always be replaced `snprintf()` which is a little bit safer. – user694733 Nov 14 '17 at 12:50
  • 2
    Using `strtok()` in a function A means that (a) the function cannot call any other function that also uses `strtok()` while A is using it, and (b) means that no function that calls A can be using `strtok()` when it calls A. In other words, using `strtok()` poisons the call chain; it cannot be used safely in library code because it must document that it uses `strtok()` to prevent other users of `strtok()` from calling the library code. – Jonathan Leffler May 04 '18 at 15:23
  • The `strncpy` is the proper function to use when writing a zero-*padded* string buffer with data taken from either a zero-terminated string or a zero-padded buffer whose size is at least as large as the destination. Zero-padded buffers aren't terribly common, but they're not exactly obscure either. – supercat Aug 06 '20 at 21:52
  • What would you recommend as a viable replacement for the *`printf()` family of functions*? – 303 Nov 09 '21 at 00:41
  • I think the advice against `setjmp` is too strong; I have occasionally used it myself, and it does its job well if you know how to avoid its pitfalls. The advice against `system`, on the other hand, is too weak; to me, it belongs in the ‘never use’ zone. Also, MISRA C is snake oil driven by overzealous UB panic (it recommends against using `offsetof`, [among other things](http://www.knosof.co.uk/misracom.html)), and is rendered in large part obsolete by better static analysis and runtime instrumentation, so I am wary of citing it as an authoritative reference. – user3840170 Apr 22 '22 at 10:06
  • @user3840170 MISRA dropped the rule against offsetof in 2012. You seem to be speaking of the 18 year old version from 2004. It is certainly not "snake oil"... have you ever been involved in any safety-related project? – Lundin Apr 22 '22 at 11:04
  • Not explicitly, but if you like arguments from authority, John Regehr is not very impressed with MISRA C either. I might have cited more up-to-date criticism, if MISRA weren’t even more paranoid about their precious copyrightses than about standing more than half a kilometre away from undefined behaviour. Last I checked, it was a mix of rules that at best, are only relevant to low-memory embedded environments and at worst, are non-actionable or actively harmful nonsense. It is a bad workaround for not having a static analyser. – user3840170 Apr 22 '22 at 14:08
  • @user3840170 You sure have a lot of strong opinions about a document which you have never read... There's no better argument than "because I say so", or "John says so" right? Stating actual reasons when making an argument is overrated and might cause people to take what you say seriously. – Lundin Apr 22 '22 at 14:14
8

Some people would claim that strcpy and strcat should be avoided, in favor of strncpy and strncat. This is somewhat subjective, in my opinion.

They should definitely be avoided when dealing with user input - no doubt here.

In code "far" from the user, when you just know the buffers are long enough, strcpy and strcat may be a bit more efficient because computing the n to pass to their cousins may be superfluous.

Eli Bendersky
  • 263,248
  • 89
  • 350
  • 412
  • 4
    And if available, better yet `strlcat` and `strlcpy`, since the 'n' version does not guarantee NULL termination of the destination string. – Dan Andreatta Apr 02 '10 at 08:42
  • Sounds like premature optimization to me. But IIRC, `strncpy()` will write write exactly `n` bytes, using nul characters if necessary. As Dan, using a safe version is the best choice IMO. – Bastien Léonard Apr 02 '10 at 08:47
  • @Dan, these functions are unfortunately non-standard, and hence don't belong to this discussion – Eli Bendersky Apr 02 '10 at 08:59
  • @Eli: but the fact that `strncat()` can be difficult to use correctly and safely (and should be avoided) is on topic. – Michael Burr Apr 02 '10 at 09:05
  • @Michael: I wouldn't say it's difficult to use correctly and safely. With care, it works just fine – Eli Bendersky Apr 02 '10 at 09:26
  • I don't think it matters whether it's user input or not, it depends on your coding style. If you are correctly calculating required buffer sizes before using `strcpy`, then replacing it with `strncpy` will not improve your code. If you belong to the "oh, 1024 will be enough" school of string-handling, then `strncpy` will improve your code. User input might be malicious, whereas trusted input will not be, but if there are errors in your code then (a) trusted input can trip them too, and (b) *there can be errors even using strncpy*, because you still have to pass the right length in. – Steve Jessop Apr 02 '10 at 11:35
  • So the classic strncpy/strncat error is `char buf[128]; strncpy(buf, "http://", 128); strncat(buf, url, 128);`. You should have subtracted the length of the first string from 128. If you're careful enough to do that, then you're careful enough to have just computed `strlen("http://") + strlen(url) + 1` in the first place. And likewise with ensuring that strings (`url` in particular in this example) actually do get nul-terminated. – Steve Jessop Apr 02 '10 at 11:40
  • @Dan - you're right, that's exactly what I wrote in my answer. But to get it right with user input is so hard that it's better not to use it at all there. – Eli Bendersky Apr 02 '10 at 12:37
  • @Eli: I've updated my answer (http://stackoverflow.com/questions/2565727/what-are-the-c-functions-from-the-standard-library-that-must-should-be-avoided/2565831#2565831) with more information about why `strncat()` should be avoided (even if it's possible to use correctly). – Michael Burr Apr 02 '10 at 20:28
  • On many platforms, if the length of the content to be copied is known, memcpy will be faster--in some cases by a factor of two or more--than strcpy or any variations thereof. In many cases, using strlen followed by memcpy wouldn't be much worse than using strcpy, and if the length could be used more than once the time savings in multiple memcpy operations will usually pay for the cost of calling strlen once. – supercat Mar 27 '15 at 21:46
7

Avoid

  • strtok for multithreaded programs as its not thread-safe.
  • gets as it could cause buffer overflow
N 1.1
  • 12,418
  • 6
  • 43
  • 61
codaddict
  • 445,704
  • 82
  • 492
  • 529
  • 2
    They are slightly different cases. You can use strtok safely if you know your program isn't multithreaded or you somehow lock access to it, but you can't use gets safely so should never use it. – jcoder Apr 02 '10 at 08:30
  • 9
    The problem with `strtok()` goes a little beyond thread safety - it's not safe even in a single threaded program unless you know for certain that no functions your code might call while using `strtok()` don't also use it (or they'll mess up `strtok()`'s state right from under you). In fact, most compilers that target multi-threaded platforms take care of `strtok()`'s potential problems as far as threads go by using thread-local storage for `strtok()`'s static data. But that still doesn't fix the problem of other functions using it while you are (in the same thread). – Michael Burr Apr 02 '10 at 08:51
  • Indeed and I'll shut up now as I don't want to encourage anyone to use strtok as it does suffer from many problems. I just wanted to point out that it's different from gets though as it *is* possible to use it safely whereas it's impossible to use gets safely. – jcoder Apr 02 '10 at 09:07
  • @Jimmy: There are often nonstandard library extensions, or you could write your own. The big problem with `strtok()` is that it keeps precisely one buffer to work in, so most good replacements require keeping a buffer value around and passing it in. – David Thornley Apr 02 '10 at 16:17
  • `strcspn` does most of what you need -- find the next token separator. You can reimplement a sane variant of `strtok` with it. – bluss Jun 10 '14 at 22:48
  • @Jimmy: POSIX provides `strtok_r()`; Microsoft supplies `strtok_s()` — apart from spelling, they're equivalent and safe to use in library code (they won't break some other function's use of `strtok()`, or `strtok_r()`/`strtok_s()`). The C11 standard has an optional `strtok_s()` that differs from Microsoft's. Fortunately, it is not widely implemented. – Jonathan Leffler May 04 '18 at 15:19
6

It is probably worth adding again that strncpy() is not the general-purpose replacement for strcpy() that it's name might suggest. It is designed for fixed-length fields that don't need a nul-terminator (it was originally designed for use with UNIX directory entries, but can be useful for things like encryption key fields).

It is easy, however, to use strncat() as a replacement for strcpy():

if (dest_size > 0)
{
    dest[0] = '\0';
    strncat(dest, source, dest_size - 1);
}

(The if test can obviously be dropped in the common case, where you know that dest_size is definitely nonzero).

caf
  • 233,326
  • 40
  • 323
  • 462
6

Also check out Microsoft's list of banned APIs. These are APIs (including many already listed here) that are banned from Microsoft code because they are often misused and lead to security problems.

You may not agree with all of them, but they are all worth considering. They add an API to the list when its misuse has led to a number of security bugs.

Adrian McCarthy
  • 45,555
  • 16
  • 123
  • 175
  • 2
    Note, however, that the recommended replacement functions are almost all Microsoft-specific. If you follow this, you're effectively making your code non-portable and Microsoft-only. [And NO, the Microsoft implementation of C's Annex K functions is not portable:](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1967.htm#impementations): "Microsoft Visual Studio implements an early version of the APIs. However, the implementation is incomplete and conforms neither to C11 nor to the original TR 24731-1. ..." – Andrew Henle Jan 20 '22 at 14:37
  • 2
    (cont) "As a result of the numerous deviations from the specification the Microsoft implementation cannot be considered conforming or portable." – Andrew Henle Jan 20 '22 at 14:38
4

It is very hard to use scanf safely. Good use of scanf can avoid buffer overflows, but you are still vulnerable to undefined behavior when reading numbers that don't fit in the requested type. In most cases, fgets followed by self-parsing (using sscanf, strchr, etc.) is a better option.

But I wouldn't say "avoid scanf all the time". scanf has its uses. As an example, let's say you want to read user input in a char array that's 10 bytes long. You want to remove the trailing newline, if any. If the user enters more than 9 characters before a newline, you want to store the first 9 characters in the buffer and discard everything until the next newline. You can do:

char buf[10];
scanf("%9[^\n]%*[^\n]", buf));
getchar();

Once you get used to this idiom, it's shorter and in some ways cleaner than:

char buf[10];
if (fgets(buf, sizeof buf, stdin) != NULL) {
    char *nl;
    if ((nl = strrchr(buf, '\n')) == NULL) {
        int c;
        while ((c = getchar()) != EOF && c != '\n') {
            ;
        }
    } else {
        *nl = 0;
    }
}
Alok Singhal
  • 93,253
  • 21
  • 125
  • 158
  • But because the `scanf` family does not support specifying the size dynamically (unlike the `printf` family with the `*` specifier), it's harder to do what you showed. You need to separately prepare the formatting string at runtime. – Aykhan Hagverdili Jul 28 '22 at 12:14
3

Don't forget about sprintf - it is the cause of many problems. This is true because the alternative, snprintf has sometimes different implementations which can make you code unportable.

  1. linux: http://linux.die.net/man/3/snprintf

  2. windows: http://msdn.microsoft.com/en-us/library/2ts7cx93%28VS.71%29.aspx

In case 1 (linux) the return value is the amount of data needed to store the entire buffer (if it is smaller than the size of the given buffer then the output was truncated)

In case 2 (windows) the return value is a negative number in case the output is truncated.

Generally you should avoid functions that are not:

  1. buffer overflow safe (a lot of functions are already mentioned in here)

  2. thread safe/not reentrant (strtok for example)

In the manual of each functions you should search for keywords like: safe, sync, async, thread, buffer, bugs

INS
  • 10,594
  • 7
  • 58
  • 89
  • 2
    `_sprintf()` was created by Microsoft before the standard `snprintf()` arrived, IIRC. [´StringCbPrintf()´](http://msdn.microsoft.com/en-us/library/ms647510%28VS.85%29.aspx) is quite similar to `snprintf()` though. – Bastien Léonard Apr 02 '10 at 09:37
  • you can use `sprintf` somehow safely in some cases: `sprintf(buffer,"%10s",input);` limits the copied nb bytes to 10 (if `buffer` is `char buffer[11]` it's safe even if the data may wind up truncated. – Jean-François Fabre Jan 19 '18 at 15:19
2

Almost any function that deals with NUL terminated strings is potentially unsafe. If you are receiving data from the outside world and manipulating it via the str*() functions then you set yourself up for catastrophe

rep_movsd
  • 6,675
  • 4
  • 30
  • 34
1

In all the string-copy/move scenarios - strcat(), strncat(), strcpy(), strncpy(), etc. - things go much better (safer) if a couple simple heuristics are enforced:

   1. Always NUL-fill your buffer(s) before adding data.
   2. Declare character-buffers as [SIZE+1], with a macro-constant.

For example, given:

#define   BUFSIZE   10
char      Buffer[BUFSIZE+1] = { 0x00 };  /* The compiler NUL-fills the rest */

we can use code like:

memset(Buffer,0x00,sizeof(Buffer));
strncpy(Buffer,BUFSIZE,"12345678901234567890");

relatively safely. The memset() should appear before the strncpy(), even though we initialized Buffer at compile-time, because we don't know what garbage other code placed into it before our function was called. The strncpy() will truncate the copied data to "1234567890", and will not NUL-terminate it. However, since we have already NUL-filled the entire buffer - sizeof(Buffer), rather than BUFSIZE - there is guaranteed to be a final "out-of-scope" terminating NUL anyway, as long as we constrain our writes using the BUFSIZE constant, instead of sizeof(Buffer).

Buffer and BUFSIZE likewise work fine for snprintf():

memset(Buffer,0x00,sizeof(Buffer));
if(snprintf(Buffer,BUFIZE,"Data: %s","Too much data") > BUFSIZE) {
    /* Do some error-handling */
}   /* If using MFC, you need if(... < 0), instead */

Even though snprintf() specifically writes only BUFIZE-1 characters, followed by NUL, this works safely. So we "waste" an extraneous NUL byte at the end of Buffer...we prevent both buffer-overflow and unterminated string conditions, for a pretty small memory-cost.

My call on strcat() and strncat() is more hard-line: don't use them. It is difficult to use strcat() safely, and the API for strncat() is so counter-intuitive that the effort needed to use it properly negates any benefit. I propose the following drop-in:

#define strncat(target,source,bufsize) snprintf(target,source,"%s%s",target,source)

It is tempting to create a strcat() drop-in, but not a good idea:

#define strcat(target,source) snprintf(target,sizeof(target),"%s%s",target,source)

because target may be a pointer (thus sizeof() does not return the information we need). I don't have a good "universal" solution to instances of strcat() in your code.

A problem I frequently encounter from "strFunc()-aware" programmers is an attempt to protect against buffer-overflows by using strlen(). This is fine if the contents are guaranteed to be NUL-terminated. Otherwise, strlen() itself can cause a buffer-overrun error (usually leading to a segmentation violation or other core-dump situation), before you ever reach the "problematic" code you are trying to protect.

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
TLR
  • 19
  • 2
-2

atoi is not thread safe. I use strtol instead, per recommendation from the man page.

Fred
  • 449
  • 1
  • 5
  • 13
  • 5
    This sounds like it applies to one particular implementation. There's no reason why `strtol()` would be thread-safe and `atoi()` wouldn't be. – David Thornley Apr 02 '10 at 16:26
  • 2
    The recommendation to use strtol has nothing to do with thread safety, but with error handling. Not sure what man page you got that info from either - I can't find any recommendation below `man atoi` (there should be, though). – Lundin May 04 '18 at 13:13