18

In my Windows/Visual C environment there is a large number of alternatives for doing the same basic string manipulation tasks.

For example, for doing a string copy I could use:

  • strcpy, the ANSI C standard library function (CRT)
  • lstrcpy, the version included in kernel32.dll
  • StrCpy, from the Shell Lightweight Utility library
  • StringCchCopy/StringCbCopy, from a "safe string" library
  • strcpy_s, security enhanced version of CRT

While I understand that all these alternatives exist for historical reasons, can I just choose a consistent set of functions for new code? And which one? Or should I choose the most appropriate function case by case?

lornova
  • 10
    @egrunin, -1 if I could on a comment. This is a C question, stick to it. – Jens Gustedt Nov 15 '10 at 15:12
  • I asked a legitimate and potentially relevant question, and then explained why I was asking it. – egrunin Nov 15 '10 at 15:18
  • 7
    "Why aren't you using C++?" is never a valid answer to a C question, and as a comment it's at best argumentative/trollish. I agree with Jens 110%. – R.. GitHub STOP HELPING ICE Nov 15 '10 at 15:34
  • 1
    the c vs. c++ debate should continue at http://stackoverflow.com/questions/649789/why-artificially-limit-your-code-to-c not in a question about string functions – Nick Van Brunt Nov 15 '10 at 15:40
  • 1
    @R.: Hey, I'm not like one of those people whose response to every Windows question is "why aren't you using Linux." When I choose to program in C rather than C++ (and I sometimes do), it's an application-dependent decision. The questioner is relatively new to the environment who might not know what questions to ask himself. Sheesh, tough crowd. – egrunin Nov 16 '10 at 15:42
  • @egrunin: what do you mean with "the questioner is relatively new to the environment"? I'm developing on Windows with either C or C++ since at least 1994. This question was a strictly C one. – lornova May 07 '12 at 08:02

6 Answers

14

First of all, let's review pros and cons of each function set:

ANSI C standard library function (CRT)

Functions like strcpy are the one and only choice if you are developing portable C code. Even in a Windows-only project, it might be wise to keep portable code separate from OS-dependent code.
These functions often have assembly-level optimizations and are therefore very fast.
There are some drawbacks:

  • they have many limitations and therefore often you still have to call functions from other libraries or provide your own versions
  • there are some archaisms like the infamous strncpy
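To illustrate the strncpy archaism mentioned above, here is a minimal sketch. The helper name copy_with_strncpy is hypothetical (it is not part of any library); it only shows the two surprises of strncpy: it pads the rest of the buffer with zeros when the source is short, and it writes no terminator at all when the source is too long.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a standard function): copy src into dst with
 * strncpy, then force termination. Returns 1 if src fit, 0 if truncated. */
static int copy_with_strncpy(char *dst, size_t dstsize, const char *src)
{
    if (dstsize == 0)
        return 0;
    /* strncpy writes exactly dstsize-1 bytes here: it pads with '\0' when
     * src is shorter, and writes no terminator when src has dstsize-1 or
     * more characters. */
    strncpy(dst, src, dstsize - 1);
    dst[dstsize - 1] = '\0';   /* the terminator strncpy may have skipped */
    return strlen(src) < dstsize;
}
```

Without the explicit dst[dstsize - 1] = '\0' line, a too-long source would leave dst unterminated, which is why strncpy is rarely the right tool for a bounded string copy.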

Kernel32 string functions

Functions like lstrcpy are exported by kernel32 and should be used only when trying to avoid any dependency on the CRT. You might want to do that for two reasons:

  • avoiding the CRT payload for an ultra lightweight executable (unusual these days but not in the 90s!)
  • avoiding initialization issues (if you launch a thread with CreateThread instead of _beginthread).

Moreover, the kernel32 function could be better optimized than the CRT version: when your executable runs on Windows 12 optimized for a Core i13, kernel32 could use an assembly-optimized version.

Shell Lightweight Utility Functions

The same considerations made for the kernel32 functions apply here, with the added value of some more complex functions. However, I doubt that they are actively maintained and I would just skip them.

StrSafe Functions

The StringCchCopy/StringCbCopy functions are usually my personal choice: they are very well designed, powerful, and surprisingly fast (I also remember a whitepaper that compared performance of these functions to the CRT equivalents).

Security-Enhanced CRT functions

These functions have the undoubted benefit of being very similar to their ANSI C equivalents, so porting legacy code is a piece of cake. I especially like the template-based version (of course, available only when compiling as C++). I really hope that they will eventually be standardized. Unfortunately they have a number of drawbacks:

  • although a proposed standard, they have been basically rejected by the non-Windows community (probably just because they came from Microsoft)
  • when they fail, they don't just return an error code but invoke an invalid parameter handler

Conclusions

While my personal favorite for Windows development is the StrSafe library, my advice is to use the ANSI C functions whenever possible, as portable code is always a good thing.

In real life, I developed a personalized portable library, with prototypes similar to the Security-Enhanced CRT functions (including the powerful template-based technique), that relies on the StrSafe library on Windows and on the ANSI C functions on other platforms.

lornova
Wizard
  • The C standard now includes safer string functions. Check out [ISO/IEC TR 24731, Bounds-checking interfaces](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1225.pdf). It was a normative reference, and it has been adopted. – jww Nov 14 '18 at 17:25
  • You are also missing the safer string functions used on BSD's and OS X, like `strlcpy` and `strlcat`. – jww Nov 14 '18 at 17:31
4

My personal preference, for both new and existing projects, is the StringCchCopy/StringCbCopy versions from the safe string library. I find these functions to be very consistent and flexible overall. And they were designed from the ground up with safety / security in mind.

JaredPar
3

I'd answer this question slightly differently. Do you want to have portable code or not? If you want to be portable, you cannot rely on anything but strcpy, strncpy, or the standard wide character "string" handling functions.

Then if your code just has to run under Windows you can use the "safe string" variants.

If you want to be portable and still want some extra safety, then you should look at cross-platform libraries such as glib or libapr, or other "safe string libraries" such as SafeStrLibrary.

Ajay
Friedrich
  • you could also use strcat or strncat after setting the first byte of the target to 0. That is portable, allow buffer checking and avoid the efficiency problems of strncpy null padding ? – kriss Nov 16 '10 at 11:08
  • You are right of course. If he wants to be portable he has to stick to the "standard" functions. – Friedrich Nov 17 '10 at 05:37
1

I would suggest using functions from the standard library, or functions from cross-platform libraries.

Alan Haggai Alavi
0

I would stick to one. I would pick whichever one is in the most useful library, in case you need to use more of it, and I would stay away from the kernel32.dll one as it's Windows-only.

But these are just tips; it's a subjective question.

J V
0

Among those choices, I would simply use strcpy. At least strcpy_s and lstrcpy are cruft that should never be used. It's possibly worthwhile to investigate those independently written library functions, but I'd be hesitant to throw around nonstandard library code as a panacea for string safety.

If you're using strcpy, you need to be sure your string fits in the destination buffer. If you just allocated it with size at least strlen(source)+1, you're fine as long as the source string is not simultaneously subject to modification by another thread. Otherwise you need to test if it fits in the buffer. You can use interfaces like snprintf or strlcpy (nonstandard BSD function, but easy to copy an implementation) which will truncate strings that don't fit in your destination buffer, but then you really need to evaluate whether string truncation could lead to vulnerabilities in itself. I think a much better approach when testing whether the source string fits is to make a new allocation or return an error status rather than performing blind truncation.
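As a minimal sketch of the "return an error status rather than truncate" approach described above: the helper name copy_if_fits is hypothetical, and this is just one way to write it using only the standard library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits entirely,
 * terminator included. Returns 0 on success, -1 if it does not fit
 * (in which case dst is left untouched). */
static int copy_if_fits(char *dst, size_t dstsize, const char *src)
{
    size_t len = strlen(src);

    if (len >= dstsize)         /* need len + 1 bytes for the terminator */
        return -1;
    memcpy(dst, src, len + 1);  /* length is known, so memcpy is enough */
    return 0;
}
```

The caller then decides what a failure means (allocate a bigger buffer, report an error) instead of silently continuing with a truncated string.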

If you'll be doing a lot of string concatenation/assembly, you really should write all your code to manage the length and current position as you go. Instead of:

strcpy(out, str1);
strcat(out, str2);
strcat(out, str3);
...

You should be doing something like:

size_t l, n = outsize;
char *s = out;

l = strlen(str1);
if (l>=n) goto error;
strcpy(s, str1);
s += l;
n -= l;

l = strlen(str2);
if (l>=n) goto error;
strcpy(s, str2);
s += l;
n -= l;

...

Alternatively you could avoid modifying the pointer by keeping a current index i of type size_t and using out+i, or you could avoid the use of size variables by keeping a pointer to the end of the buffer and doing things like if (l>=end-s) goto error;.

Note that, whichever approach you choose, the redundancy can be condensed by writing your own (simple) functions that take pointers to the position/size variable and call the standard library, for instance something like:

if (!my_strcpy(&s, &n, str1)) goto error;
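For what it's worth, here is one possible shape for such a my_strcpy. The answer does not define it, so the exact signature and return convention below are assumptions: *pos is the current write position, *left is the space remaining there (terminator included), and the function returns nonzero on success.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of a my_strcpy matching the call above (signature assumed).
 * Appends src at *pos if it fits in *left bytes, then advances *pos to
 * the new terminator and shrinks *left. Returns 1 on success, 0 if src
 * does not fit (nothing is written in that case). */
static int my_strcpy(char **pos, size_t *left, const char *src)
{
    size_t len = strlen(src);

    if (len >= *left)            /* no room for len bytes plus '\0' */
        return 0;
    memcpy(*pos, src, len + 1);  /* copy the string and its terminator */
    *pos += len;                 /* next append overwrites the '\0' */
    *left -= len;
    return 1;
}
```

Because the position is carried along between calls, each append costs only the length of the piece being added, and the output stays terminated after every successful call.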

Avoiding strcat also has performance benefits; see Schlemiel the Painter's algorithm.

Finally, you should note that a good 75% of the string copying and assembly people perform in C is utterly useless. My theory is that the people doing it come from backgrounds in script languages where putting together strings is what you do all the time, but in C it's not useful that often. In many cases, you can get by with never copying strings at all, using the original copies instead, and get much better performance and simpler code at the same time. I'm reminded of a recent SO question where OP was using regexec to match a regular expression, then copying out the result just to print it, something like:

char *tmp = malloc(match.end-match.start+1);
memcpy(tmp, src+match.start, match.end-match.start);
tmp[match.end-match.start] = 0;
printf("%s\n", tmp);
free(tmp);

The same thing can be accomplished with:

printf("%.*s\n", (int)(match.end-match.start), src+match.start);

No allocations, no cleanup, no error cases (the original code crashed if malloc failed).
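A self-contained sketch of the same idea, assuming the POSIX <regex.h> interface (so not available out of the box with Visual C). The helper name print_first_match is hypothetical, and the real regmatch_t fields are rm_so and rm_eo rather than the start/end used in the snippets above.

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper: print the first match of pattern in src to out
 * without copying it. Returns the match length, or -1 if there is no
 * match or the pattern fails to compile. */
static int print_first_match(FILE *out, const char *pattern, const char *src)
{
    regex_t re;
    regmatch_t m;
    int len = -1;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, src, 1, &m, 0) == 0) {
        /* the precision argument of %.*s must be an int */
        len = (int)(m.rm_eo - m.rm_so);
        /* the precision limits how many bytes are read from src + rm_so */
        fprintf(out, "%.*s\n", len, src + m.rm_so);
    }
    regfree(&re);
    return len;
}
```

The match is printed straight out of the original buffer, so there is no temporary string to allocate, terminate, or free.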

R.. GitHub STOP HELPING ICE
  • 1
    Sorry, I had to downvote as this long answer, although very interesting, after the first sentence goes completely off topic... – lornova Nov 15 '10 at 16:39
  • I don't see how a detailed answer on safe string usage is off-topic unless you're restricting the topic purely to which function to use. **How** you use the functions is more important than which one you use, from a security standpoint. – R.. GitHub STOP HELPING ICE Nov 15 '10 at 16:49
  • The question was very clear: should I choose a single function set and use it extensively? And which one? Or should I choose every time the better solution? Your essay on safe strings is interesting but off topic. – lornova Nov 15 '10 at 16:52
  • 1
    @R..: as soon as you compute `l = strlen(str1)` why would you want to call the (comparatively) slow `strcpy(s, str1)` instead of the faster `memcpy(s, str1, l)` ? – kriss Nov 15 '10 at 21:17
  • Fair enough. I tend to assume people implement `strcpy` efficiently, but the `while (*dst++=*src++);` implementation seems to be more common than you'd expect. :-) – R.. GitHub STOP HELPING ICE Nov 15 '10 at 23:13
  • 1
    I tend to agree with @kriss - when you start scrupulously tracking lengths and checking for truncation, you soon realise that you have all the information you need to use `memcpy()` everywhere. – caf Nov 16 '10 at 02:12