Most C programmers are familiar with the strdup function. Many of them will take it for granted, yet it is not part of the C Standard (neither C89, C99 nor C11). It is part of POSIX and may not be available on all environments. Indeed Microsoft insisted on renaming it _strdup, adding to confusion.

It is rather easy to define it this way (in C):

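#include <stdlib.h>   /* malloc */
#include <string.h>   /* strlen, memcpy */

char *strdup(const char *s) {
    size_t size = strlen(s) + 1;   /* +1 for the terminating null byte */
    char *p = malloc(size);
    if (p) {
        memcpy(p, s, size);        /* copies the terminator as well */
    }
    return p;                      /* NULL if the allocation failed */
}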

But even savvy programmers can easily get it wrong.
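
As a hedged illustration of where hand-written versions tend to go wrong, the sketch below annotates the usual pitfalls. The name `dup_string` is hypothetical, chosen so it cannot clash with a system-provided `strdup`. Returning NULL for a null argument is an assumption of this sketch, not something POSIX specifies.

```c
#include <stdlib.h>   /* malloc */
#include <string.h>   /* strlen, memcpy */

/* Hypothetical name, chosen so it cannot clash with a system strdup. */
char *dup_string(const char *s) {
    if (s == NULL) {
        return NULL;              /* assumption of this sketch, not POSIX behaviour */
    }
    size_t size = strlen(s) + 1;  /* pitfall: forgetting the +1 for the terminator */
    char *p = malloc(size);
    if (p == NULL) {
        return NULL;              /* pitfall: using p without checking malloc */
    }
    memcpy(p, s, size);           /* pitfall: copying only strlen(s) bytes */
    return p;                     /* the caller must release this with free() */
}
```

Always defining the helper under a private name like this also sidesteps the question of detecting whether the platform already provides `strdup`.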

Furthermore, redefining the function only on systems that do not have it proves a bit complicated as explained here: strdup() function

Why not include such useful, widely supported functions in revised editions of the C Standard? Many new functions were added to the C standard library in C99, so what is the rationale for not including strdup?

chqrlie
  • 1
    @AlterMann: `malloc` and friends have always been part of the C Standard. `aligned_alloc` was added in C11, `malloc` is mentioned on 11 pages in the C11 standard, can you explain what you mean? – chqrlie Oct 05 '15 at 08:48
  • [Why is malloc() harmful in embedded systems?](https://www.quora.com/Why-is-malloc-harmful-in-embedded-systems) – David Ranieri Oct 05 '15 at 08:51
  • I think this is off-topic for Stack Overflow... – LPs Oct 05 '15 at 08:52
  • 5
    See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n704.htm and http://open-std.org/JTC1/SC22/WG14/www/docs/n718.htm – dyp Oct 05 '15 at 08:54
  • I am not asking for opinion, but for an explanation of the standardization process and possibly for a recount of historical discussions in the C Standard committee. – chqrlie Oct 05 '15 at 08:54
  • Thanks for the pointers, I feel less lonely... but what does `CV 1/1/2 -- Failed` mean? – chqrlie Oct 05 '15 at 08:58
  • 6
    @AlterMann: not everyone is programming for embedded systems. If local rules ban memory allocation via `malloc` they should obviously also ban `strdup`. – chqrlie Oct 05 '15 at 08:59
  • CV = WG14 consensus vote, 1/1/2 = 1 For, 1 Opposed, 2 Abstain. I'm not familiar enough with the standardization process to provide further information, though. – dyp Oct 05 '15 at 08:59
  • 1
    It is sad that the evolution of the C language be determined by such a small group, 4 people, half of which not seeming to care. – chqrlie Oct 05 '15 at 09:01
  • What would be achieved from making `strdup()` "standard"? It's there on POSIX, and it's there (albeit under a different name) on Windows. If it isn't there, it's dead-easy to define yourself. I don't really see where you are coming from with this one. `alloca()` isn't standard, either...?!? – DevSolar Oct 05 '15 at 09:06
  • 1
    @DevSolar Standardization doesn't only add new features, it also records existing practice to make it portable and long-term stable. @chqrlie An "abstain" vote doesn't imply they don't care, they might have seen both good and bad aspects of `strdup`. – dyp Oct 05 '15 at 09:09
  • 1
    @DevSolar: Making it standard improves future portability, avoids dead-easy local definitions done wrong and so many ill fated work arounds by newbie programmers. – chqrlie Oct 05 '15 at 09:11
  • @dyp: hence not *seeming* to care. The decision is not motivated, so we are at a loss trying to make sense of it. Maybe the actual decision makers will read this and enlighten us. – chqrlie Oct 05 '15 at 09:13
  • 1
    @chqrlie: It doesn't mean 4 people: US (the FV numbers meaning For/Opposed/Abstain/Absent/Total, so US was the for vote), UK, Canada, Denmark. See the participants mentioned in 1.2. – cremno Oct 05 '15 at 09:41
  • I've searched for related discussions, and there seems to exist a bias against the "hidden" memory allocation done by `strdup()`. C's memory **alloc**ation functions are `malloc()`, `calloc()`, `realloc()`, all residing in `<stdlib.h>`. `strdup()` allocates memory but is not named accordingly, and resides in `<string.h>`. -- Apparently (I have no first-hand quote for this!) the sentiment was, "this cannot be done in a way that is clean / consistent with the standard, and the existing solution of having it as a POSIX extension works well enough, so let's leave it at that". – DevSolar Oct 05 '15 at 09:42
  • If you feel strongly about that, you can propose it for inclusion again of course. ;-) – DevSolar Oct 05 '15 at 09:44
  • A search also yielded [this comp.std.c thread](https://groups.google.com/forum/#!topic/comp.std.c/pMaEU_8Rb7w) about this meeting. It's a bit long though and Google Groups is slow (for me). `strdup()` is also defined by the dynamic allocation technical report (which sadly wasn't part of C11 unlike the bounds-checking one). – cremno Oct 05 '15 at 09:49
  • 1
    @cremno: Good pointer, thank you! So the rationale seems to have been: *strdup() lost on the grounds that it would be the *ONLY* function other than *alloc() in the entire library whose return could be sanely passed to free(), and this is surprising.* A lame argument IMHO, easily defeated by adding `aprintf` at the same time;-) – chqrlie Oct 05 '15 at 10:00
  • 1
    @chqrlie, there is no such `aprintf` in the standard. – Jens Gustedt Oct 05 '15 at 10:05
  • 2
    @JensGustedt I think he was suggesting that if we were to add both `strdup` and `aprintf`, then it could no longer be argued that `strdup` would be the only function not ending in `alloc` which requires `free` ing the result – M.M Oct 05 '15 at 10:17

1 Answer

The quoted link in the comments (http://open-std.org/JTC1/SC22/WG14/www/docs/n718.htm) gives an explanation about what is "wrong" about having strdup in the standard library:

> The major issue was the desirability of adding a function to the standard library which allocates heap memory automatically for the user.

Basically, the C language and its standard library try their best not to make assumptions about how the user allocates and uses memory.
They provide a few facilities, among them the stack and the heap.

While malloc/free are standardized for dynamic memory allocation, they are by no means the only way to do so, because dynamic memory management is a very complicated topic and the default allocation strategy might not be desirable for all kinds of applications.

There are, for example, a few independent libraries such as jemalloc, which emphasizes low fragmentation and concurrency, or even full-fledged garbage collectors such as the Boehm-Demers-Weiser conservative garbage collector. These libraries offer malloc/free implementations that are meant to be used exclusively, in place of the standard *alloc and free functions from <stdlib.h>, without breaking compatibility with the rest of the C standard library.

So if strdup was made standard, it would effectively be disqualified from being used by code using third-party memory management functions (it must be noted that the aforementioned jemalloc library does provide an implementation of strdup to avoid this problem).
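
To make the allocator concern concrete, here is a minimal sketch of a duplicate function that takes the allocator as a parameter instead of hard-wiring `malloc`. The names `alloc_fn`, `dup_string_with` and `counting_alloc` are all hypothetical. `counting_alloc` merely stands in for a third-party allocator and is not how jemalloc or the Boehm collector is actually wired in.

```c
#include <stddef.h>   /* size_t */
#include <stdlib.h>   /* malloc */
#include <string.h>   /* strlen, memcpy */

/* Hypothetical allocator signature, deliberately matching malloc. */
typedef void *(*alloc_fn)(size_t);

/* Hypothetical: duplicate s using the caller's allocator rather than malloc. */
char *dup_string_with(const char *s, alloc_fn alloc) {
    size_t size = strlen(s) + 1;   /* include the terminating null byte */
    char *p = alloc(size);
    if (p != NULL) {
        memcpy(p, s, size);
    }
    return p;                      /* must be released by the matching deallocator */
}

/* Stand-in for a third-party allocator: counts calls, then defers to malloc. */
static size_t counting_alloc_calls = 0;

static void *counting_alloc(size_t size) {
    counting_alloc_calls++;
    return malloc(size);
}
```

Calling `dup_string_with(s, counting_alloc)` routes the allocation through the chosen allocator, which is the guarantee a hard-wired standard `strdup` could not give to code that manages memory some other way.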

More generally speaking, while strdup certainly is a practical function, it suffers from a lack of clarity in its semantics. It is a function declared in the <string.h> header, but calling it obliges the caller to later free the returned buffer by calling the free function from the <stdlib.h> header. So, is it a string function or a memory function?
Leaving it in the POSIX standard seems to be the most reasonable solution to avoid making the C standard library less clear.
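
For contrast, here is a hedged sketch of the caller-allocates style that functions such as `strcpy` and `memcpy` in `<string.h>` follow: the function only copies, and the caller owns both the allocation and the matching `free`. `copy_string` is a hypothetical name, not a standard or POSIX function.

```c
#include <stddef.h>   /* size_t */
#include <string.h>   /* strlen, memcpy */

/*
 * Hypothetical helper: copies src into dst only if it fits, and always
 * returns the number of bytes required, terminator included. It never
 * allocates, so there is no hidden memory for the caller to free.
 */
size_t copy_string(char *dst, size_t dstsize, const char *src) {
    size_t need = strlen(src) + 1;
    if (dst != NULL && need <= dstsize) {
        memcpy(dst, src, need);    /* copies the terminator as well */
    }
    return need;                   /* dst is left untouched if it was too small */
}
```

A caller would typically ask for the size with `copy_string(NULL, 0, s)`, obtain a buffer however it likes, and then call the function again to fill it, keeping the allocation and the release at the same level of the code.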

SirDarius
  • 1
    I understand your arguments, but I feel this answer is not very convincing: `strdup` is widely used to the point that many professional C programmers believe it to be part of the Standard (ask your own developers). All alternative implementations of the `malloc` family of functions supply a replacement for `strdup` as well, the Boehm gc or any other would not be defeated by `strdup` being made standard. The header issue is a non-issue, at worst `strdup` could be defined in both `<string.h>` and `<stdlib.h>` as is already the case of `NULL` defined in multiple standard header files. – chqrlie Oct 05 '15 at 13:43
  • @chqrlie as an exception that might work, but imagine if functions using similar allocation patterns started getting standardized, it would mean that every alternative malloc implementation would have to ship their own version of these functions, and eventually ship their own version of the C standard library. That would be a maintenance nightmare. – SirDarius Oct 05 '15 at 13:48
  • 1
    For completeness, `NULL` is defined in `<locale.h>`, `<stddef.h>`, `<stdio.h>`, `<stdlib.h>`, `<string.h>`, `<time.h>`, `<wchar.h>`... – chqrlie Oct 05 '15 at 13:49
  • but in the case of `strdup`, it **is** already the case. It would actually simplify things if `strdup` was included in the Standard, as so many people already assume. – chqrlie Oct 05 '15 at 13:51
  • 3
    In my defense, given the mixed results of the vote mentioned earlier, I cannot pretend to make a more convincing case against strdup than the standard committee themselves. To be honest, I have used strdup in my code before but I can't say I like the function very much, because I prefer when malloc and free are located at the same level of abstraction. It is just too easy to forget to call free when you have not called malloc yourself. – SirDarius Oct 05 '15 at 14:19
  • Note that the only functions in the standard C library that do memory allocation are the memory allocation functions: `malloc()`, `realloc()`, `calloc()`, `aligned_alloc()` and `free()`. All the other functions operate without requiring memory allocation per se (though the standard I/O functions typically allocate memory for streams via `malloc()`, it is not required by the standard). The [TR 24731-2 Dynamic Allocation Functions](http://www.open-std.org/jtc1/sc22/wg14/www/projects#24731-2) would alter that, and would define `strdup()` in `<string.h>`. – Jonathan Leffler Apr 11 '16 at 01:35
  • 5
    Note that `<string.h>` contains `strerror()`, but you need `<errno.h>` to get hold of `errno` values or macros which you might then pass to `strerror()`. The suggestion that `strdup()` fits uncomfortably in `<string.h>` is only of marginal relevance, though there is some justice to that. – Jonathan Leffler Apr 11 '16 at 01:37
  • I shall accept your answer as the document in reference provides the necessary clues for the committee's *rationale*. Yet I am in total disagreement with their thinking and I consider the header file argument to be very weak. – chqrlie Nov 28 '16 at 22:13