39

std::strlen doesn't handle c strings that are not \0 terminated. Is there a safe version of it?

PS I know that in c++ std::string should be used instead of c strings, but in this case my string is stored in a shared memory.

EDIT

Ok, I need to add some explanation.

My application is getting a string from a shared memory (which is of some length), therefore it could be represented as an array of characters. If there is a bug in the library writing this string, then the string would not be zero terminated, and the strlen could fail.

BЈовић
  • 19
    ...so what _does_ terminate the string? If there is no string terminator and there is no other way of inferring the length of the string from the string itself, you need to store the length in a separate variable somewhere. – Aasmund Eldhuset May 09 '11 at 10:14
  • 14
    If you don't know the length, and you have no way of knowing the length, then you cannot determine the length. – Lightness Races in Orbit May 09 '11 at 10:17
  • 3
    How long is a piece of string? Twice the distance from one end to the middle. – johnsyweb May 09 '11 at 10:20
  • 1
    Modify the library so there won't be bugs. If the other program may crash in the middle of writing the string, modify the protocol so there will be a flag set when it completes the modification successfully. You can't know the length of a corrupted string. – Yakov Galka May 09 '11 at 10:30
  • @ybungalobill I do not have access to it. – BЈовић May 09 '11 at 10:33
  • @VJo see edit in my answer, hope that it helps. – Dennis May 09 '11 at 10:36
  • Is there a reason why you can't just set the last byte of your known-size shared memory buffer to `'\0'`? – Damon Aug 27 '13 at 14:01
  • 4
    A `char` array that isn't `'\0'` terminated is not a string. – Keith Thompson Dec 23 '13 at 04:38
  • @YakovGalka you can't, but you can limit the damage if you get something that wasn't what you expected. Sometimes a communication partner, whether communicated with by shared memory or over a network or whatever, may not be entirely trusted, so you need to verify that things remain within sane limits. – plugwash Nov 03 '20 at 01:43

12 Answers

20

You've added that the string is in shared memory. That's guaranteed readable, and of fixed size. You can therefore use size_t MaxPossibleSize = startOfSharedMemory + sizeOfSharedMemory - input; strnlen(input, MaxPossibleSize) (mind the extra n in strnlen).

This will return MaxPossibleSize if there's no \0 in the shared memory following input, or the string length if there is. (The maximal possible string length is of course MaxPossibleSize-1, in case the last byte of shared memory is the first \0)
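A minimal sketch of that calculation, assuming you know where the shared memory region starts and how big it is. The helper name bounded_length and its parameters are made up for illustration, and strnlen is POSIX (also available on Windows) rather than standard C++:

```cpp
#include <cassert>
#include <cstddef>
#include <string.h>  // strnlen is POSIX (also provided by Microsoft), not ISO C++

// Hypothetical helper: length of the string at `input`, never reading past
// the end of the region [region_start, region_start + region_size).
// Returns the number of bytes left in the region if no '\0' is found.
std::size_t bounded_length(const char* region_start, std::size_t region_size,
                           const char* input)
{
    // input must point into the region (or one past its end)
    assert(input >= region_start && input <= region_start + region_size);
    const std::size_t max_possible_size =
        static_cast<std::size_t>(region_start + region_size - input);
    return strnlen(input, max_possible_size);
}
```

If the return value equals the number of bytes left in the region, no terminator was found and the data should be treated as corrupt rather than as a string.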

MSalters
13

C strings that are not null-terminated are not C strings, they are simply arrays of characters, and there is no way of finding their length.

  • 2
    Ok, but is there an alternative to std::strlen that is safe? – BЈовић May 09 '11 at 10:31
  • 10
    @unapersson: given that the user means "safe" in the unorthodox meaning of the word "safe" used by "safe" string functions like `strlcpy`, what you say is not true. Well, it's true but not relevant since the questioner doesn't ask how to find the "length" of something without a nul terminator, he asks how to find length if it has one, and not crash if it doesn't. One might know the buffer length, but not know whether it contains a nul byte, and it *is* possible to find out which and (if it's a string) the length. – Steve Jessop May 09 '11 at 11:21
  • @Steve What is this strlcpy of which you speak? And my answer was to the original question. –  May 09 '11 at 11:28
  • 2
    The user? The questioner, I mean. `strlcpy` is a BSD function, I use it as an example of the only kind of thing I have *ever* known people to be talking about when they talk about "safe" string functions in C - something that takes a length and avoids working outside that bound. – Steve Jessop May 09 '11 at 11:28
  • @unaperson ... and it completely missed regarding the modified question, where I explained what I am looking for. @Steve another example is sprintf – BЈовић May 09 '11 at 11:40
10

If you define a c-string as

const char* cowSays = "moo";

then you automatically get the '\0' at the end and strlen would return 3. If you define it like:

char iDoThis[1024] = {0};

you get an empty buffer (an array of characters, all of which are null characters). You can then fill it with what you like as long as you don't overrun the buffer length. At the start strlen would return 0, and once you have written something you would also get the correct number from strlen.
You could also do this:

char uhoh[100];
int len = strlen(uhoh);

but that would be bad, because you have no idea what is in that array. strlen might hit a null character inside the array, or it might not and keep reading past the end. The point is that the null character is the standard way to mark that the string is finished.
Not having a null character means by definition that the string is not finished. Changing that will break the paradigm of how the string works. What you want to do is make up your own rules. C++ will let you do that, but you will have to write a lot of code yourself.

EDIT From your newly added info, what you want to do is loop over the array and check for the null character by hand. You should also do some validation if you are expecting ASCII characters only (especially if you are expecting alpha-numeric characters). This assumes that you know the maximum size. If you do not need to validate the content of the string then you could use one of the strnlen family of functions: http://msdn.microsoft.com/en-us/library/z50ty2zh%28v=vs.80%29.aspx
http://linux.about.com/library/cmd/blcmdl3_strnlen.htm
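The hand-written loop with validation could look roughly like this. It is only a sketch: the name checked_length is invented here, and it assumes you want alphanumeric characters only and know the maximum size of the buffer.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Hypothetical helper: scans at most max_len bytes of buf.
// On success stores the string length in *len and returns true.
// Returns false if no '\0' is found within max_len bytes, or if a
// character before the terminator is not alphanumeric.
bool checked_length(const char* buf, std::size_t max_len, std::size_t* len)
{
    assert(buf != nullptr && len != nullptr);
    for (std::size_t i = 0; i < max_len; ++i) {
        if (buf[i] == '\0') {
            *len = i;
            return true;
        }
        // Cast to unsigned char: passing a negative char to isalnum is undefined.
        if (!std::isalnum(static_cast<unsigned char>(buf[i])))
            return false;
    }
    return false;  // ran out of buffer without seeing a terminator
}
```

Unlike strnlen, this reports a missing terminator as a failure instead of returning the maximum, so the caller cannot mistake a full buffer of garbage for a valid string.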

Dennis
  • 8
    @VJo: since `strnlen` isn't standard C or C++ you might prefer `memchr` (with a check for null and a pointer subtraction). Or you might not mind given that `strnlen` is in Windows and Posix. – Steve Jessop May 09 '11 at 11:27
  • 1
    @Steve I didn't know it is not standard, but since it is posix, it is good enough for me (I am using linux). I guess it is also good enough for people programming on windows, since it is there – BЈовић May 09 '11 at 11:38
8
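#include <string.h>

/* Returns the length of str, or max_len if no '\0' occurs in the first max_len bytes. */
size_t safe_strlen(const char *str, size_t max_len)
{
    const char * end = (const char *)memchr(str, '\0', max_len);
    if (end == NULL)
        return max_len;
    else
        return end - str;
}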
Andrew W. Phillips
  • 1
    You could rename the function to match this: http://linux.about.com/library/cmd/blcmdl3_strnlen.htm – harper Jun 15 '15 at 16:17
8

Yes, since C11, as part of the optional bounds-checking interfaces (Annex K):

size_t strnlen_s( const char *str, size_t strsz );

Located in <string.h>

Sergei Krivonos
3

Get a better library, or verify the one you have. If you can't trust your library to do what it says it will, then how the h%^&l do you expect your program to?

That said, assuming you know the length of the buffer the string resides in, what about

buffer[sizeof(buffer) - 1] = 0;
x = strlen(buffer);

Make the buffer bigger than needed and you can then test the lib:

assert(x < sizeof(buffer) - 1);
    
mattnz
  • 3
    Well, the guy that wrote that library is not here anymore, and was very sloppy. I found one bug in it that caused strlen to fail. Anyway, strnlen is doing what I need – BЈовић May 10 '11 at 06:38
1

C11 includes "safe" functions such as strnlen_s. strnlen_s takes an extra maximum length argument (a size_t). This argument is returned if a null character isn't found after checking that many characters. If a null pointer is provided, it returns zero.

size_t strnlen_s(const char *, size_t);

While part of C11, it is recommended that you check that your compiler supports these bounds-checking "safe" functions via its definition of __STDC_LIB_EXT1__. Furthermore, a user must also set another macro, __STDC_WANT_LIB_EXT1__, to 1, before including string.h, if they intend to use such functions. See here for some Stack Overflow commentary on the origins of these functions, and here for C++ documentation.
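The macro dance described above might be wrapped like this. This is a sketch, not a recommendation: the wrapper name length_or_max is made up, and the fallback branch is my own memchr-based stand-in for libraries that do not advertise Annex K.

```cpp
// Must be defined before <string.h> is included to request the Annex K functions.
#define __STDC_WANT_LIB_EXT1__ 1
#include <string.h>
#include <cstddef>

// Hypothetical wrapper: uses strnlen_s where the library advertises Annex K,
// and an equivalent memchr-based scan everywhere else.
std::size_t length_or_max(const char* str, std::size_t max_len)
{
#ifdef __STDC_LIB_EXT1__
    return strnlen_s(str, max_len);
#else
    if (str == nullptr)
        return 0;  // mirrors strnlen_s, which returns 0 for a null pointer
    const void* end = memchr(str, '\0', max_len);
    return end ? static_cast<std::size_t>(static_cast<const char*>(end) - str)
               : max_len;
#endif
}
```

Many common C libraries do not define __STDC_LIB_EXT1__, so in practice the fallback branch is often the one that gets compiled.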

GCC and Clang also support the POSIX function strnlen, and provide it within string.h. Microsoft also provides strnlen, which can likewise be found within string.h.

user2023370
0

A simple solution:

buff[BUFF_SIZE - 1] = '\0';

Of course, this will not tell you whether the string was originally exactly BUFF_SIZE-1 long or was just not terminated, so you need extra logic for that.
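One possible shape for that extra logic, assuming the buffer is writable and its size is known: look for a terminator first, then force one. The helper name terminate_and_check is invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: forces termination of buff (buff_size bytes) and
// reports whether a terminator was already present before we wrote one.
bool terminate_and_check(char* buff, std::size_t buff_size)
{
    assert(buff != nullptr && buff_size > 0);
    const bool was_terminated = std::memchr(buff, '\0', buff_size) != nullptr;
    buff[buff_size - 1] = '\0';  // after this, strlen(buff) is always safe
    return was_terminated;
}
```

A false return means the last byte was overwritten and the contents were probably truncated or corrupt, so the caller can decide whether to use the string at all.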

NoSenseEtAl
0

How about this portable nugget:

int safeStrlen(const char *buf, int max)
{
   int i;
   /* check the bound first so buf[max] is never read */
   for(i=0; i<max && buf[i]; i++){}
   return i;
}
hotplasma
0

As Neil Butterworth already said in his answer above: C-Strings which are not terminated by a \0 character, are no C-Strings!

The only chance you have is to write an immutable adaptor, or something that creates a valid copy of the C-string with a terminating \0 character. Of course, if the input is wrong and a C-string is defined like:

char cstring[3] = {'1','2','3'};

it will indeed result in unexpected behavior, because there can be something like 123@4x\0 in memory. So the result of strlen(), for example, is now 6 and not 3 as expected.

The following approach shows how to create a safe C-String in any case:

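#include <cstring>

// Copies at most maxLen characters into a new, always-terminated buffer.
// The caller owns the result and must release it with delete[].
char *createSafeCString(const char cStringToCheck[], std::size_t maxLen) {
    // Find the terminator without reading past maxLen bytes
    // (a plain strlen here would overrun an unterminated input)
    const void *end = std::memchr(cStringToCheck, '\0', maxLen);
    std::size_t size = end
        ? static_cast<std::size_t>(static_cast<const char *>(end) - cStringToCheck)
        : maxLen;
    // Allocate the new array on the heap, with room for the terminator
    char *pszCString = new char[size + 1];
    // Copy data from one char array to the new
    std::memcpy(pszCString, cStringToCheck, size);
    // Set last character to the \0 termination character
    pszCString[size] = '\0';
    return pszCString;
}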

This ensures that manipulating the C-string does not write into memory belonging to something else.

But this is not what you wanted, I know. There is no other way to obtain the length of a char array without termination. This isn't even an approach. It just ensures that things keep working even if the user (or developer) passes in *****.

Pwnstar
0

You will need to encode your string. For example:

struct string
{
    size_t len;
    char *data;
} __attribute__((packed));

You can then accept any array of characters if you know the first sizeof(size_t) bytes of the shared memory location are the size of the char array. It gets tricky when you want to chain arrays this way.

It's better to trust your other end to terminate its strings, or roll your own strlen that does not go outside the boundaries of the shared memory segment (provided you know at least the size of that segment).
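Reading such a length prefix still needs a sanity check, since a buggy writer can put garbage there too. A rough sketch, assuming a layout where the size_t length is immediately followed by the characters (this layout and the name read_prefixed are illustrative, not something the library in the question defines):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical layout: a size_t length immediately followed by the characters.
// Returns a pointer to the characters and stores a validated length in *len,
// or returns nullptr if the prefix claims more bytes than the segment holds.
const char* read_prefixed(const unsigned char* segment, std::size_t segment_size,
                          std::size_t* len)
{
    assert(segment != nullptr && len != nullptr);
    if (segment_size < sizeof(std::size_t))
        return nullptr;  // not even room for the prefix
    std::size_t claimed;
    std::memcpy(&claimed, segment, sizeof claimed);  // avoids alignment problems
    if (claimed > segment_size - sizeof(std::size_t))
        return nullptr;  // corrupted or hostile length
    *len = claimed;
    return reinterpret_cast<const char*>(segment + sizeof(std::size_t));
}
```

The returned characters are not necessarily null-terminated, so they should be used together with the stored length (for example to construct a std::string) rather than passed to strlen.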

Mel
0

If you need to get the size of shared memory, try to use

// get memory size
struct shmid_ds shm_info;
size_t shm_size;
int shm_rc;
if((shm_rc = shmctl(shmid, IPC_STAT, &shm_info)) < 0)
    exit(101);
shm_size = shm_info.shm_segsz;

If you are sure the string is null-terminated and fills the whole segment, you can use shm_size - 1 instead of calling strlen. Otherwise you can null-terminate it with data[shm_size - 1] = '\0'; and then use strlen(data);

edo888