23

I can see many sprintf's used in my applications for copying a string.

I have a character array:

char myarray[10];
const char *str = "mystring";

Now if I want want to copy the string str into myarray, is is better to use:

sprintf(myarray, "%s", str);

or

strncpy(myarray, str, 8);

?

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Vijay
  • 65,327
  • 90
  • 227
  • 319
  • 5
    Don't forget to add a '\0' after calling strncpy. It might not do it by itself. – Thomas Padron-McCarthy Sep 05 '12 at 06:08
  • 3
    Better in what dimension? Performance - then benchmark! Though strncpy should be faster - it does less. Remember to add \0 – Drakosha Sep 05 '12 at 06:11
  • [This blog post](http://blog.liw.fi/posts/strncpy/) explains the differences pretty well. The biggest difference to most people is that `snprintf` tells you how many characters it wrote, and you don't have to worry about missing null terminators when there isn't enough space. – verdesmarald Sep 05 '12 at 06:11
  • then is strcpy better than strncpy since it also copies the terminating null character in the source string? – Vijay Sep 05 '12 at 06:11
  • Yes,better in performance and usability! – Vijay Sep 05 '12 at 06:12
  • @sarathi only if you're positive a buffer overflow is not possible. – obataku Sep 05 '12 at 06:12
  • You can only use `strcpy()` safely if you know how long the source and target strings are. If you know the length of the source, you can use `memmove()` or `memcpy()` instead of `strcpy()`. Note that `strncpy()` can be very inefficient if the length you specify as the third argument is very much greater than the string length, because `strncpy()` null pads the output to full length. And, as others pointed out, `strncpy()` does not guarantee that the output is null terminated. – Jonathan Leffler Sep 05 '12 at 06:17
  • 3
    The reason `strncpy` might not add the terminating `'\0'` character is because it was originally made to put text into small fixed-length fields that didn't need terminating if completely filled. – Some programmer dude Sep 05 '12 at 06:33

4 Answers4

54

Neither should be used, at all.

  1. sprintf is dangerous, deprecated, and superseded by snprintf. The only way to use the old sprintf safely with string inputs is to either measure their length before calling sprintf, which is ugly and error-prone, or by adding a field precision specifier (e.g. %.8s or %.*s with an extra integer argument for the size limit). This is also ugly and error-prone, especially if more than one %s specifier is involved.

  2. strncpy is also dangerous. It is not a buffer-size-limited version of strcpy. It's a function for copying characters into a fixed-length, null-padded (as opposed to null-terminated) array, where the source may be either a C string or a fixed-length character array at least the size of the destination. Its intended use was for legacy unix directory tables, database entries, etc. that worked with fixed-size text fields and did not want to waste even a single byte on disk or in memory for null termination. It can be misused as a buffer-size-limited strcpy, but doing so is harmful for two reasons. First of all, it fails to null terminate if the whole buffer is used for string data (i.e. if the source string length is at least as long as the dest buffer). You can add the termination back yourself, but this is ugly and error-prone. And second, strncpy always pads the full destination buffer with null bytes when the source string is shorter than the output buffer. This is simply a waste of time.

So what should you use instead?

Some people like the BSD strlcpy function. Semantically, it's identical to snprintf(dest, destsize, "%s", source) except that the return value is size_t and it does not impose an artificial INT_MAX limit on string length. However, most popular non-BSD systems lack strlcpy, and it's easy to make dangerous errors writing your own, so if you want to use it, you should obtain a safe, known-working version from a trustworthy source.

My preference is to simply use snprintf for any nontrivial string construction, and strlen+memcpy for some trivial cases that have been measured to be performance-critical. If you get in a habit of using this idiom correctly, it becomes almost impossible to accidentally write code with string-related vulnerabilities.

R.. GitHub STOP HELPING ICE
  • 208,859
  • 35
  • 376
  • 711
  • +1. I was going to argue that `strncpy` isn't dangerous because it's perfectly suited to its intended use but I see that it's so commonly misunderstood, misused and abused that it's simpler to just say that it's dangerous. – CB Bailey Sep 05 '12 at 07:07
  • 6
    I think the "intended use" of `strncpy` is mostly a concept that has outlived its usefulness... – R.. GitHub STOP HELPING ICE Sep 05 '12 at 07:09
4

The different versions of printf/scanf are incredibly slow functions, for the following reasons:

  • They use variable argument lists, which makes parameter passing more complex. This is done through various obscure macros and pointers. All the arguments have to be parsed in runtime to determine their types, which adds extra overhead code. (VA lists is also quite a redundant feature of the language, and dangerous as well, as it has farweaker typing than plain parameter passing.)

  • They must handle a lot of complex formatting and all different types supported. This adds plenty of overhead to the function as well. Since all type evaluations are done in runtime, the compiler cannot optimize away parts of the function that are never used. So if you only wanted to print integers with printf(), you will get support for float numbers, complex arithmetic, string handling etc etc linked to your program, as complete waste of space.

  • Functions like strcpy() and particularly memcpy() on the other hand, are heavily optimized by the compiler, often implemented in inline assemble for maximum performance.

Some measurements I once made on barebone 16-bit low-end microcontrollers are included below.

As a rule of thumb, you should never use stdio.h in any form of production code. It is to be considered as a debugging/testing library. MISRA-C:2004 bans stdio.h in production code.

EDIT

Replaced subjective numbers with facts:

Measurements of strcpy versus sprintf on target Freescale HCS12, compiler Freescale Codewarrior 5.1. Using C90 implementation of sprintf, C99 would be more ineffective yet. All optimizations enabled. The following code was tested:

  const char str[] = "Hello, world";
  char buf[100];

  strcpy(buf, str);
  sprintf(buf, "%s", str);

Execution time, including parameter shuffling on/off call stack:

strcpy   43 instructions
sprintf  467 instructions

Program/ROM space allocated:

strcpy   56 bytes
sprintf  1488 bytes

RAM/stack space allocated:

strcpy   0 bytes
sprintf  15 bytes

Number of internal function calls:

strcpy   0
sprintf  9

Function call stack depth:

strcpy   0 (inlined)
sprintf  3 
Lundin
  • 195,001
  • 40
  • 254
  • 396
  • 6
    This answer is full of bad advice and misinformation. There's absolutely no justification for why stdio should not be used in production code, and in fact I would argue that `snprintf` is **the only** string function from the standard library you should ever use in security-conscious code that handles strings. The performance costs have been highly exaggerated (aside from constant overhead time, it should not take any longer than `strlen`/`memcpy` and probably even uses these internally), as have the dangers (any modern compiler will flag format string mismatches for you with warnings). – R.. GitHub STOP HELPING ICE Sep 05 '12 at 06:40
  • @R.. Regarding safety and production code, I did cite a very well-known and widely recognized authority for critical applications, sharing this belief, namely MISRA-C:2004 rule 20.9. If you think otherwise, your subjective opinion as a random internet person does not hold much weight, cite another authority supporting your belief. – Lundin Sep 05 '12 at 06:53
  • That's a standard for writing embedded-systems code for use in automobile ECUs and such. It's not relevant to general-purpose C programming, and I would go so far as to say the systems it's relevant to probably have no business working with strings at all... – R.. GitHub STOP HELPING ICE Sep 05 '12 at 06:57
  • @R.. It is used for mission critical programming of any nature, it has expanded far beyond the automotive branch. It applies to all programs where bugs are not desired and it certainly seems to be the most widely-acknowledged C coding standard at the moment. Though feel free to cite other authorities. What is the coding standard that you are personally using say about stdio? – Lundin Sep 05 '12 at 07:29
  • @R.. As for performance, my answer has been edited with measurements made on a quite code-effective C90 platform. – Lundin Sep 05 '12 at 07:30
  • 1
    And your measurements show minimal cost, as expected. The only reason the cost naively looks nontrivial is that your strings are so short. If you tried a 1k or 20k string, you'd find the performance almost the same. – R.. GitHub STOP HELPING ICE Sep 05 '12 at 07:33
  • As for MISRA-C, all I can say is that unless you're applying it to problems where string handling, numeric base conversion/formatting as strings, etc. is not necessary, the rule against stdio is just fundamentally misguided. Replicating this functionality without introducing dangerous errors is highly non-trivial, and all that banning stdio will do when this functionality is needed is encourage programmers to write their own buggy code to use in place of stdio... – R.. GitHub STOP HELPING ICE Sep 05 '12 at 07:38
  • @R.. That will only make a difference to the number of instructions. All other overhead such as program size remains static no matter the length of the string – Lundin Sep 05 '12 at 07:39
  • As for writing your own buggy code... if a programmer can't manage to write such a trivial thing as their own strcpy/memcpy function, they should probably look for another profession. – Lundin Sep 05 '12 at 07:42
  • Outside of embedded systems, almost nobody is using static binaries these days. Even if they are, 1488 bytes is not a significant price; you'd have a hard time duplicating the functionality in the same size. Anyway, I think that's a low estimate caused by an incomplete implementation; a correct printf implementation requires at least 3-4k, assuming you want to do floating point correctly. But even at that cost, it's pretty trivial. – R.. GitHub STOP HELPING ICE Sep 05 '12 at 07:42
  • 1
    @Lundin: I'm not talking about writing your own `strcpy`. I'm talking about assembling strings by putting together multiple strings, or numeric inputs together with strings. Of course you can do it yourself with `strlen` and `memcpy`, but every time you do it, there's a nontrivial probability of making a mistake (off-by-one errors, typos, etc.) that would not exist if you used a standard, safe idiom. Also, the calling code generally becomes smaller using `snprintf` compared to a string of `strlen` and `memcpy` calls. – R.. GitHub STOP HELPING ICE Sep 05 '12 at 07:45
  • 3
    +1 for atleast making a good attempt in explaining the facts and spending much time in commenting too. – Vijay Sep 05 '12 at 13:51
1

I would not use sprintf just to copy a string. It's overkill, and someone who reads that code would certainly stop and wonder why I did that, and if they (or I) are missing something.

Thomas Padron-McCarthy
  • 27,232
  • 8
  • 51
  • 75
0

There is one way to use sprintf() (or if being paranoid, snprintf() ) to do a "safe" string copy, that truncates instead of overflowing the field or leaving it un-NUL-terminated.

That is to use the "*" format character as "string precision" as follows:

So:

char dest_buff[32];
....
sprintf(dest_buff, "%.*s", sizeof(dest_buff) - 1, unknown_string);

This places the contents of unknown_string into dest_buff allowing space for the terminating NUL.

MikeW
  • 5,504
  • 1
  • 34
  • 29