4

This might be somewhat pointless, but I'm curious what you guys think about it. I'm iterating over a string with pointers and want to pull a short substring out of it (placing the substring into a pre-allocated temporary array). Are there any reasons to use assignment over strncpy, or vice versa? I.e.:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int main()
{   char orig[]  = "Hello. I am looking for Molly.";

    /* Strings to store the copies
     * Pretend that strings had some prior value, ensure null-termination */
    char cpy1[4] = "huh";   /* the literal already supplies the '\0' */
    char cpy2[4] = "huh";

    /* Pointer to simulate iteration over a string */
    char *startptr = orig + 2;
    int length = 3;
    int i;

    /* Using strncpy */
    strncpy(cpy1, startptr, length);

    /* Using assignment operator */
    for (i = 0; i < length; i++)
    {   cpy2[i] = *(startptr + i); 
    }   

    /* Display Results */
    printf("strncpy result:\n");
    printf("%s\n\n", cpy1);
    printf("loop result:\n");
    printf("%s\n", cpy2);
}   

It seems to me that strncpy is both less typing and more easily readable, but I've seen people advocate looping instead. Is there a difference? Does it even matter? Assume that this is for small values of length (0 < length < 5), and null-termination is assured.

Refs: "Strings in c, how to get subString"; "How to get substring in C"; "Difference between strncpy and memcpy?"

groundlar
  • strncpy() is always wrong. Avoid it until you find a reason to use it. – wildplasser Sep 05 '12 at 14:33
  • Thanks for the helpful comment. It's always nice when someone offers good advice and explains their reasoning. – groundlar Sep 05 '12 at 14:38
  • Please read the description in the manpage for strncpy(). Ask yourself which of its "features" you actually want. Then ask yourself which of these features you actually really don't want. – wildplasser Sep 05 '12 at 14:41
  • I did. I wanted to pull a 3-char substring out of the original string without trying to copy the rest of the string or messing with the final character, the '\0' null termination value. So I had a choice between strncpy, strlcpy, and iteration (and probably some others that I don't know of). Since I explicitly knew all of the sizes, I thought strncpy wouldn't be a problem. I don't see any features here that are a problem. – groundlar Sep 05 '12 at 14:50
  • In the case where you know all the sizes, `memcpy(cpy1, startptr, length);` does exactly the right thing (which in this special case is *exactly* the same as your strncpy). It also informs the human reader that you know what you're doing (and that you don't want a nul-terminator, because you rely on the existing one). In the case where `(strlen(2nd argument) < length)`, both would fail in their own particular way. – wildplasser Sep 05 '12 at 14:58
  • Right. I'll take a look at the source code myself later, but so you think that `memcpy` is really better in terms of optimization? Also for the other case, where the original string is longer, there's really not a good solution that I've found... As far as I can tell, that's just something that you should avoid in C. – groundlar Sep 05 '12 at 15:11
  • Optimisation is barely relevant here. On modern platforms, basic stuff like this is totally dependent on the speed of the memory bus / caches. And the memory / cache footprint is the same for all scenarios. (Again: except for the `strncpy(a,b,c)` case with `(c > strlen(b))`.) – wildplasser Sep 05 '12 at 19:47

4 Answers

4

strncpy(char *dst, const char *src, size_t len) has two peculiar properties:

  • if (strlen(src) >= len) : the resulting string will not be nul-terminated.
  • if (strlen(src) < len) : the end of the string will be filled/padded with '\0'.

The first property forces you to actually check whether (strlen(src) >= len) and act appropriately (or to brutally set the final character to nul with dst[len-1] = '\0';, as @Gilles does in his answer). The other property is not particularly dangerous, but it can waste a lot of cycles. Imagine:

char buff[10000];
strncpy(buff, "Hello!", sizeof buff);

which touches 10000 bytes, where only 7 need to be touched.
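Both properties can be observed directly. The following is only a minimal sketch; the helper name strncpy_terminates is made up for illustration and is not part of any library:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: run strncpy(dst, src, len) and report whether
 * the first len bytes of dst contain a '\0' afterwards.
 * Returns 1 if a terminator is present, 0 if it is missing. */
int strncpy_terminates(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    strncpy(dst, src, len);
    return memchr(dst, '\0', len) != NULL;
}
```

With a 4-byte destination and the source "Hello" this returns 0 (the buffer holds 'H','e','l','l' and no terminator). With an 8-byte destination and the source "Hi" it returns 1, and the six bytes after "Hi" have all been overwritten with '\0'.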

My advice:

  • A: if you know the sizes, just do memcpy(dst,src,len); dst[len] = 0;
  • B: if you don't know the sizes, get them somehow (using strlen and/or sizeof and/or the allocated size for dynamically allocated memory). Then: goto A above.

Since safe operation of the strncpy() version already requires knowing the sizes (and checking them!), the memcpy() version is no more complex or dangerous than the strncpy() version. (Technically it is even marginally faster, because memcpy() does not have to check for the '\0' byte.)
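Applied to the substring case from the question, advice A might look roughly like this. It is a sketch under assumptions: copy_substring and its parameter names are invented here, and the asserts stand in for whatever size checks the calling code already does:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy exactly len chars starting at src into dst
 * and add the terminator. The caller must guarantee that src has at
 * least len chars before its '\0' and that dst has room for len + 1. */
void copy_substring(char *dst, size_t dst_size, const char *src, size_t len)
{
    assert(len < dst_size);                  /* room for the '\0' */
    assert(memchr(src, '\0', len) == NULL);  /* src is long enough */
    memcpy(dst, src, len);
    dst[len] = '\0';
}
```

For the question's data, copy_substring(cpy1, sizeof cpy1, orig + 2, 3) leaves "llo" in cpy1. Because the terminator is written explicitly, the result does not depend on what the destination held before.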

wildplasser
3

While this may seem counter-intuitive, there are more optimized ways to copy a string than by using the assignment operator in a loop. For instance, IA-32 provides the REP prefix for MOVS, STOS, CMPS etc for string handling, and these can be much faster than a loop that copies one char at a time. The implementation of strncpy or strcpy may choose to use such hardware-optimized code to achieve better performance.

Nathan Fellman
  • So in short, generally try to use built-in methods because they're smarter than me? ;) Gotcha. I was suspicious this might be the case (that strncpy might be able to optimize under the hood), but I'm a total newbie to CS in general so I wanted to ask the experts. Thanks! – groundlar Sep 05 '12 at 14:37
  • Take a look at gnu's libc source code. The optimised code tries to read/write int-wide objects, and does a hell of a job to get all the alignment right. – wildplasser Sep 05 '12 at 14:44
  • @wildplasser , so what are your suggestions? So far you've said "avoid strncpy," which while possibly illuminating, does not propose an alternative solution. – groundlar Sep 05 '12 at 14:52
  • Maybe I should add an answer, then. – wildplasser Sep 05 '12 at 14:59
  • BTW: the REP/REPZ prefixed opcodes were fast for the 8086. Things changed with the 286. After the 386 these opcodes have no advantage over plain loops, since memory bandwidth will always be the bottleneck for simple operations (and almost every operation is simple, nowadays). Also: compilers will not like these instructions because of the implicit use of SI, DI and CX. – wildplasser Aug 13 '14 at 12:07
  • @wildplasser: I know that Intel has done a lot of work to make the built-in rep strings instructions the fastest way to copy a block of memory from one location to the other. They are much faster than plain loops because they can use hardware that isn't available to simple instructions to effectively execute many iterations at a time - as many as 64 iterations per clock cycle. – Nathan Fellman Aug 13 '14 at 12:58
1

As long as you know your lengths are "in range" and everything is correctly nul terminated, then strncpy is better.

If you need to get length checks etc in there, looping could be more convenient.

John3136
0

A loop with assignment is a bad idea because you're reinventing the wheel. You might make a mistake, and your code is likely to be less efficient than the code in the standard library (some processors have optimized instructions for memory copies, and optimized implementations usually at least copy word by word if possible).

However, note that strncpy is not a well-rounded wheel. In particular, if the string is too long, it does not append a null byte to the destination. The BSD function strlcpy is better designed, but not available everywhere. Even strlcpy is not a panacea: you need to get the buffer size right, and be aware that it might truncate the string.

A portable way to copy a string, with truncation if the string is too long, is to call strncpy and always add the terminating null byte. If the buffer is an array:

char buffer[BUFFER_SIZE];
strncpy(buffer, source, sizeof(buffer)-1);
buffer[sizeof(buffer)-1] = 0;

If the buffer is given by a pointer and size:

strncpy(buf, source, buffer_size-1);
buf[buffer_size-1] = 0;
Gilles 'SO- stop being evil'
  • Thanks for the info! I've read a bunch about strncpy and its possible "misbehavior," but here I'm certain of null-termination. I'll keep that in mind if I ever don't explicitly know the size of my buffers, though! – groundlar Sep 05 '12 at 14:42
  • @surfreak If you already know the size of the source string, and you know that it fits in the destination buffer, you can use `strcpy`. Preferably with an `assert` or a comment reminding the reader (and perhaps the runtime system) of the size requirements. – Gilles 'SO- stop being evil' Sep 05 '12 at 14:46
  • Yes, but here I was pulling a substring, and to my knowledge there's no way to do that with `strncpy` if the substring terminates before the end of the source string... Or is there? – groundlar Sep 05 '12 at 14:54
  • @surfreak Oh, right. For a substring, if you already know that it fits in the destination, you can use `memcpy` and add the terminating null byte. If you don't know whether it fits, use `strlcpy` if available, otherwise `strncpy` (and add the terminating null byte). – Gilles 'SO- stop being evil' Sep 05 '12 at 14:56
  • So now the question is between memcpy and strncpy... Does THAT matter? Either way I have to add the null byte. – groundlar Sep 05 '12 at 15:05