8

In my current project, I am coding according to the C11 standard (building with gcc -std=c11) and needed something like strnlen (a "safe" version of strlen which returns the length of a 0-terminated string, but only up to a given maximum). So I looked it up (e.g. https://en.cppreference.com/w/c/string/byte/strlen) and it seems the C11 standard mentions such a function, but with the name strnlen_s.

Hence I went with strnlen_s, but this turned out to be undefined when including string.h. On the other hand, strnlen is defined, so my current solution is to use strnlen with a remark that the standard name seems to be strnlen_s but that this is not defined by GCC.

The question is: am I correct to assume that strnlen is the most portable name to use or what could I do for the code to be most portable/standard?

Note: Microsoft (https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/strnlen-strnlen-s) implements both functions with the distinction that strnlen_s checks if the string pointer is NULL and returns 0 in that case while strnlen has no such check.
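The difference described in the note can be sketched as a small wrapper. This is only an illustration: `checked_strnlen` is a hypothetical name, and the code mirrors just the NULL check mentioned above, not the full Annex K or Microsoft semantics.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: behaves like strnlen, but returns 0 when s is a
   null pointer, mirroring the NULL check described for strnlen_s. */
size_t checked_strnlen(const char *s, size_t maxsize)
{
    if (s == NULL) {
        return 0;
    }
    /* Search at most maxsize bytes for the terminator. */
    const char *end = memchr(s, '\0', maxsize);
    return end ? (size_t)(end - s) : maxsize;
}
```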

nielsen
  • 2
    In 2021, [UTF-8 is everywhere](http://utf8everwhere.org/), so you could use [GNU libunistring](https://www.gnu.org/software/libunistring) or [Glib](https://en.wikipedia.org/wiki/GLib) – Basile Starynkevitch Feb 24 '21 at 07:37
  • 1
    @BasileStarynkevitch This is a good point for the many applications that need to be UTF-aware. I am UTF-aware but my current application does not need to be. By the way, there is a "y" missing in your link to "UTF-8 is everywhere". – nielsen Feb 24 '21 at 08:05
  • 1
    Yeah, the proper site is https://utf8everywhere.org/ – tripleee Feb 24 '21 at 09:48

4 Answers

11

The question is: am I correct to assume that strnlen is the most portable name to use or what could I do for the code to be most portable/standard?

No, it isn't portable at all. It was never part of C. It is included in POSIX, which doesn't mean much.

I would imagine the function doesn't exist in the standard because it's superfluous when we already have memchr(str, '\0', max);.

strnlen_s is part of the optional bounds-checking interface in C11 annex K. This whole chapter turned out to be a huge fiasco and barely any compiler implements it. Microsoft has similarly named functions, but they are sometimes not compatible. So I would assume that all _s functions are completely non-portable.

So use neither of these; use memchr or strlen instead.


EDIT

In case you must implement strnlen yourself for some reason, then this is what I'd recommend:

#include <string.h>

size_t strnlength (const char* s, size_t n) 
{ 
  const char* found = memchr(s, '\0', n); 
  return found ? (size_t)(found-s) : n; 
}
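As a usage sketch for the fixed-size buffer case raised in the comments (a field that is only 0-terminated when the string is shorter than the buffer), something like the following might work. The `struct record` type and `record_name_length` are hypothetical names introduced only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Same helper as above: length of s, but never more than n. */
size_t strnlength(const char *s, size_t n)
{
    const char *found = memchr(s, '\0', n);
    return found ? (size_t)(found - s) : n;
}

/* Hypothetical record whose name field is 0-terminated only when the
   name is shorter than the buffer. */
struct record {
    char name[8];
};

/* Length of the name, capped at the size of the buffer. */
size_t record_name_length(const struct record *r)
{
    return strnlength(r->name, sizeof r->name);
}
```

Passing `sizeof r->name` as the limit means the search never reads past the field, whether or not a terminator is present.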
Lundin
  • `memchr` is a good option. It is not so convenient in a case like mine (a string has a fixed-size buffer and it is only required to be 0-terminated if it is shorter than the buffer size). Probably my most portable option is to write my own "`strnlen`", possibly wrapping a call to `memchr`. – nielsen Feb 24 '21 at 08:20
  • What was the reason for compilers to not implement the `_s` functions? Why was it optional? – Aykhan Hagverdili Feb 24 '21 at 08:32
  • @nielsen Or simply don't wrap it and keep the code readable instead. It's not like it's such a huge effort to write `char* result = memchr(str, '\0', n); if(result){ ptrdiff_t index = result - str; }`. You need to check the result no matter what you do. – Lundin Feb 24 '21 at 08:45
  • 2
    @AyxanHaqverdili See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1967.htm. – Lundin Feb 24 '21 at 08:47
  • @Lundin Right, but in my case it is a bit more. Something like: `char* result = memchr(str, '\0', n); if(result) {size = result - str;} else {size = n;}` or `size = (result) ? result - str : n;`. And I need it in several places. In any case, that is a question of preferences and trade-offs. Luckily, it is not very complicated. – nielsen Feb 24 '21 at 08:54
2

strnlen_s() is specified in Annex K of the C Standard starting at version C11. This Annex is not widely implemented and even Microsoft's implementation is not fully conformant with the specified version. The semantics are contorted especially regarding error handling. I would recommend not using it.

strnlen() is a simple function specified in POSIX.1-2008 and available on many platforms. It is easy to implement on platforms that do not provide it:

#include <string.h>

size_t strnlen(const char *s, size_t n) {
    size_t i;
    for (i = 0; i < n && s[i] != '\0'; i++)
        continue;
    return i;
}
chqrlie
  • But if you define a function with this name, it'll cause conflict on systems that do have `strnlen`. So, maybe name it differently – Aykhan Hagverdili Feb 24 '21 at 09:13
  • 1
    Or as noted in comments to another answer, implement it as: `size_t strnlen (const char* s, size_t n) { char* found = memchr(s, '\0', n); return found ? (size_t)(found-s) : n; }`. And this snippet is actually library quality code. – Lundin Feb 24 '21 at 09:20
1

The question is: am I correct to assume that strnlen is the most portable name to use or what could I do for the code to be most portable/standard?

For C, strnlen is OK as the name is not reserved. It is not part of the standard, so OK for you to add.

POSIX reserves str...(), so you might want to use another name.

strnlen_s collides with K.3.7.4.4 The strnlen_s function and has a controversial history that you might not want your code tied into. Avoid naming your function strnlen_s().


To avoid name collisions with common libraries, I would give any added function two names: a formal, less-likely-to-collide name and a macro:

size_t nielsen_strnlen(const char *s, size_t maxsize);
#define slength nielsen_strnlen

Or simply go directly with something less likely to collide.

size_t nstrnlen(const char *s, size_t maxsize);

Deeper: OP appears to want to use a popular function that is outside the standard C library (or current version), but might be available when code is ported to other systems. OP wants to provide a use-my-code-if-not-available function.

Careful where you tread.

I would use a macro (or a wrapper function)

#if ON_SYSTEM_WITH_strnlen
  #define slength strnlen
#else
  #define slength nielsen_strnlen
#endif   

... and then use calls to slength().

Problems come up when OP's version of the code does not behave exactly like the desired one (today and tomorrow), or when, because the function is not standard, implementations differ a little. To mitigate, consider a macro or function wrapper indirection.
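Put together, the indirection could look roughly like the sketch below. It assumes the build system defines a macro when the platform provides strnlen; `HAVE_STRNLEN` is a hypothetical stand-in for the `ON_SYSTEM_WITH_strnlen` test above, and the fallback body is just one possible implementation.

```c
/* Assumption: HAVE_STRNLEN is defined by the build system (for example
   by a configure-time check) when the platform provides strnlen. */
#ifdef HAVE_STRNLEN
#define _POSIX_C_SOURCE 200809L /* ask POSIX headers to declare strnlen */
#endif

#include <stddef.h>
#include <string.h>

#ifdef HAVE_STRNLEN
#define slength strnlen
#else
/* Fallback under a name that is unlikely to collide with any library. */
static size_t nielsen_strnlen(const char *s, size_t maxsize)
{
    const char *end = memchr(s, '\0', maxsize);
    return end ? (size_t)(end - s) : maxsize;
}
#define slength nielsen_strnlen
#endif
```

Callers only ever write `slength(buf, sizeof buf)`, so switching between the system function and the fallback stays a one-place decision.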


Side issue: Parameter order and a potential new principle to the "original principles" of C.

size_t foo1(const char *s, size_t maxsize);

// arranged such that the size of an array appears before the array. 
size_t foo2(size_t maxsize, const char *s);
size_t foo3(size_t maxsize, const char s[maxsize]); 
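A minimal sketch of what a definition in the foo3 style might look like follows. The body is an assumption added for illustration; note also that variably modified parameter types are an optional feature in C11 (unavailable when `__STDC_NO_VLA__` is defined).

```c
#include <stddef.h>
#include <string.h>

/* The size comes first so the array parameter can refer to it.
   s is still adjusted to a pointer; the bound mainly documents intent
   and lets some compilers diagnose mismatched calls. */
size_t foo3(size_t maxsize, const char s[maxsize])
{
    const char *end = memchr(s, '\0', maxsize);
    return end ? (size_t)(end - s) : maxsize;
}
```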
Björn Lindqvist
chux - Reinstate Monica
  • You make some good points, but my original question was about using the function which would most likely be available in other environments. It is only after the discussion that I realized the best option might be to implement it myself (to be sure it would always be available). In that case, I fully agree that the name should be chosen to avoid collisions as you explain very well. – nielsen Feb 24 '21 at 13:46
0

string is the C++ header and string.h is the C header (at least with gcc). strlen_s (afaik) is a Microsoft extension to the C library. You are right, strlen would be the more standard. You could also use memchr if you need a byte count. To @Basile's point, if you need a count of characters, you need something that is UTF-8 aware.

Allan Wind