0

I've written the following code to understand better how strnlen behaves:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    char bufferOnStack[10]={'a','b','c','d','e','f','g','h','i','j'};
    char *bufferOnHeap = (char *) malloc(10);

    bufferOnHeap[ 0]='a';
    bufferOnHeap[ 1]='b';
    bufferOnHeap[ 2]='c';
    bufferOnHeap[ 3]='d';
    bufferOnHeap[ 4]='e';
    bufferOnHeap[ 5]='f';
    bufferOnHeap[ 6]='g';
    bufferOnHeap[ 7]='h';
    bufferOnHeap[ 8]='i';
    bufferOnHeap[ 9]='j';

    int lengthOnStack = strnlen(bufferOnStack,39);
    int lengthOnHeap  = strnlen(bufferOnHeap, 39);

    printf("lengthOnStack = %d\n",lengthOnStack);
    printf("lengthOnHeap  = %d\n",lengthOnHeap);

    return 0;
}

Note the deliberate lack of null termination in both buffers. According to the documentation, it seems that the lengths should both be 39:

RETURN VALUE The strnlen() function returns strlen(s), if that is less than maxlen, or maxlen if there is no null terminating ('\0') among the first maxlen characters pointed to by s.

Here's my compile line:

$ gcc ./main_08.c -o main

And the output:

$ ./main
lengthOnStack = 10
lengthOnHeap  = 10

What's going on here? Thanks!

moooeeeep
OrenIshShalom
  • 7
    It's undefined behavior and you are just lucky that there is a zero byte right behind your array? – moooeeeep Jun 06 '18 at 11:16
  • 4
    Neither of those buffers *have* "first 39 characters", because they're only of size 10. You're observing undefined behaviour from reading past the buffers' end. – Quentin Jun 06 '18 at 11:17
  • How do you know that there is definitely no '\0' at the location after the last character in your buffer? The result is IMHO UB. – cwschmidt Jun 06 '18 at 11:17
  • `for ( int i = 0; i < 10; i++ ) bufferOnHeap[ i ] = 'a' + i;` - instead of 10 similar lines. – i486 Jun 06 '18 at 11:30

4 Answers

3

Firstly, don't cast malloc.

Secondly, you are reading past the end of your arrays. The contents of the memory beyond your array bounds are indeterminate, and reading them is undefined behaviour, so there is no guarantee that a given byte is not zero; in this instance, it is!

In general, this kind of behaviour is sloppy - see this answer for a good summary of the potential consequences
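For comparison, here is a rough sketch of the same experiment with maxlen set to the actual buffer size, so strnlen never looks outside the arrays. The names fill_letters, stack_length and heap_length are made up for this illustration and are not part of the original program.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strnlen() when compiling with strict -std=c11 */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: fill buf[0..n-1] with 'a', 'b', ... and no '\0'. */
static void fill_letters(char *buf, size_t n)
{
    assert(buf != NULL);
    for (size_t i = 0; i < n; i++)
        buf[i] = (char)('a' + i);
}

/* Measure an unterminated stack buffer; maxlen equals the array size. */
size_t stack_length(void)
{
    char bufferOnStack[10];
    fill_letters(bufferOnStack, sizeof bufferOnStack);
    return strnlen(bufferOnStack, sizeof bufferOnStack);
}

/* Same for a heap buffer; returns 0 if the allocation fails. */
size_t heap_length(void)
{
    const size_t size = 10;
    char *bufferOnHeap = malloc(size);   /* no cast needed in C */
    if (bufferOnHeap == NULL)
        return 0;
    fill_letters(bufferOnHeap, size);
    size_t len = strnlen(bufferOnHeap, size);
    free(bufferOnHeap);
    return len;
}
```

Both functions return 10 here as well, but this time the result is guaranteed rather than an accident of whatever happens to sit after the buffers.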

ACascarino
  • when I use gdb, I see that indeed 0 is placed after bufferOnStack. Is this some kind of stack protection against buffer overflows? – OrenIshShalom Jun 06 '18 at 11:45
  • 1
    @OrenIshShalom It's probably padding and it's zero-initialized because you were lucky. Don't expect it to be there. It's outside the memory region you are allowed to walk over. – moooeeeep Jun 06 '18 at 12:20
3

First of all, strnlen() is not defined by the C standard; it's a POSIX function.

That being said, read the documentation carefully:

The strnlen() function returns the number of bytes in the string pointed to by s, excluding the terminating null byte ('\0'), but at most maxlen. In doing this, strnlen() looks only at the first maxlen bytes at s and never beyond s+maxlen.

So that means that, when calling the function, you need to make sure that for the value you provide for maxlen, either the array contains a null byte within its bounds or array indexing is valid up to [maxlen - 1], i.e. the array has at least maxlen elements in it.

Otherwise, while accessing the string, you'll venture into memory locations that are not allocated to you (out-of-bounds array access), thereby invoking undefined behaviour.

Remember, this function calculates the length of a string, bounded above by a value (maxlen). That implies that an array without a terminator must be at least as large as the bound, not the other way around.
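One way to keep the bound honest is to clamp it to the real capacity before calling strnlen. This is only a sketch; bounded_len and its capacity parameter are hypothetical names introduced here, not part of POSIX.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strnlen() when compiling with strict -std=c11 */
#include <assert.h>
#include <string.h>

/*
 * Hypothetical wrapper: the caller states how many bytes of buf are really
 * readable (capacity), and the requested bound is clamped to that value,
 * so strnlen() is never allowed to examine bytes outside the array.
 */
size_t bounded_len(const char *buf, size_t capacity, size_t maxlen)
{
    assert(buf != NULL);
    size_t limit = maxlen < capacity ? maxlen : capacity;
    return strnlen(buf, limit);
}
```

With the asker's 10-byte unterminated array, bounded_len(bufferOnStack, sizeof bufferOnStack, 39) returns 10 without touching anything past the array, while a short terminated string such as "hello" still reports 5.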


[Footnote]:

By definition, a string is null-terminated.

Quoting C11, chapter §7.1.1, Definitions of terms

A string is a contiguous sequence of characters terminated by and including the first null character. [...]

Toby Speight
Sourav Ghosh
  • I think the wording in the man page is a bit ambiguous when it says "never beyond `s+maxlen`", but I think it really means _never at or beyond `s+maxlen`_. There is no reason for it to access `s[maxlen]`. – Ian Abbott Jun 06 '18 at 11:42
  • @IanAbbott Is that so? I thought the length is taken, considering the null as the +1. – Sourav Ghosh Jun 06 '18 at 11:45
  • Well before that bit it does say "**`strnlen()`** looks only at the first `maxlen` characters ...". – Ian Abbott Jun 06 '18 at 11:46
  • @IanAbbott Ah..now I see, Well, I cannot edit the quote, but edited my answer. :) – Sourav Ghosh Jun 06 '18 at 11:49
  • It's more explicit in the SUSv4 documentation: "The _strnlen()_ function shall never examine more than _maxlen_ bytes of the array pointed to by _s_." – Ian Abbott Jun 06 '18 at 12:07
  • "... need to make sure, for the value you provide for maxlen, the array idexing is valid for [maxlen -1] for the supplied string, i.e, the string has at least maxlen elements in it." --> Hmm I would expect `strnlen("hello",39);` to be valid, yet does not meet that condition. – chux - Reinstate Monica Jun 06 '18 at 14:31
  • If able, provide a link to the first citation. – chux - Reinstate Monica Jun 06 '18 at 14:32
  • Note that with `strnlen(const char *s, size_t maxlen)`, `s` need not point to a _string_. A character array will do as in OP's case. – chux - Reinstate Monica Jun 06 '18 at 14:40
1

Your question is roughly equivalent to the following:

I know that a burglar alarm is supposed to prevent your house from getting robbed. This morning when I left the house, I turned off the burglar alarm. Sometime during the day when I was away, a burglar broke in and stole my stuff. How did this happen?

Or to this:

I know you can use the cruise control on your car to help you avoid getting speeding tickets. Yesterday I was driving on a road where the speed limit was 65. I set the cruise control to 95. A cop pulled me over and I got a speeding ticket. How did this happen?

Actually, those aren't quite right. Here's a more contrived analogy:

I live in a house with a 10 yard long driveway to the street. I have trained my dog to fetch my newspaper. One day I made sure there were no newspapers on the driveway. I put my dog on a 39 yard leash, and I told him to fetch the newspaper. I expected him to go to the end of the leash, 39 yards away. But instead, he only went 10 yards, then stopped. How did this happen?

And of course there are many answers. Perhaps, when your dog got to the end of your newspaper-free driveway, right away he found someone else's newspaper in the gutter. Or perhaps, when the leash failed to stop him at the end of the driveway and he continued into the street, he got run over by a car.

The point of putting your dog on a leash is to restrict him to a safe area -- in this case, your property, that you control. If you put him on such a long leash that he can go off into the street, or into the woods, you're kind of defeating the purpose of controlling him by putting him on a leash.


Similarly, the whole point of strnlen is to behave gracefully if, within the buffer you have defined, there is no null character for strnlen to find.

The problem with non-null-terminated strings is that functions like strlen (which blindly search for null terminators) sail off the end and rummage blindly around in undefined memory, desperately trying to find the terminator. For example, if you say

char non_null_terminated_string[3] = "abc";
int len = strlen(non_null_terminated_string);

the behavior is undefined, because strlen sails off the end. One way to fix this is to use strnlen:

char non_null_terminated_string[3] = "abc";
int len = strnlen(non_null_terminated_string, 3);

But if you hand a bigger number to strnlen, it defeats the whole purpose. You're back wondering what will happen when strnlen sails off the end, and there's no way to answer that.
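A typical place where the leash is the right length is a fixed-width field that may or may not be terminated. The following is a minimal sketch under that assumption; field_to_string is a name invented for this example.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strnlen() when compiling with strict -std=c11 */
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: copy a fixed-width, possibly unterminated field
 * into dst as a proper null-terminated string, truncating if dst is too
 * small. Returns the number of characters copied (excluding the '\0').
 */
size_t field_to_string(char *dst, size_t dst_size,
                       const char *field, size_t field_size)
{
    assert(dst != NULL && field != NULL && dst_size > 0);
    size_t len = strnlen(field, field_size);  /* never reads past the field */
    if (len > dst_size - 1)
        len = dst_size - 1;                   /* leave room for the terminator */
    memcpy(dst, field, len);
    dst[len] = '\0';
    return len;
}
```

Because the bound passed to strnlen is exactly the size of the field, the result is well defined whether or not the field contains a null character.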

Steve Summit
  • Nice analogy! I'd suggest to drop the first two attempts though. They don't really help to drive home the point... – moooeeeep Jun 10 '18 at 19:08
0

What happens when ... "Undefined behaviour (UB)"?

“When the compiler encounters [a given undefined construct] it is legal for it to make demons fly out of your nose”

Your heading is actually not UB, since calling strnlen("hi", 5) is perfectly legal, but the specifics of your question show it is indeed UB...

strlen expects a string, i.e. a nul-terminated char sequence, and strnlen needs either a nul terminator or at least maxlen readable chars. Providing your non-nul-terminated 10-char array to the function with a maxlen of 39 is UB.

What happens in your case is that the function reads the first 10 chars, finds no '\0', and since it hasn't reached the maxlen bound yet it continues to read further, thereby invoking UB (reading unallocated memory). It could be that your compiler took the liberty of ending your array with '\0', or it could be that the '\0' was there before... the possibilities are limited only by the compiler designers.
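To make that concrete, here is a naive sketch of what a strnlen-like function does. It is illustrative only: my_strnlen is a made-up name, and a real libc version may be implemented and optimized differently.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative stand-in for strnlen(): scan until a '\0' is found or
 * maxlen bytes have been examined, whichever comes first. Nothing here
 * knows how large the caller's array really is; maxlen is the only limit.
 */
size_t my_strnlen(const char *s, size_t maxlen)
{
    assert(s != NULL);
    size_t i = 0;
    while (i < maxlen && s[i] != '\0')
        i++;
    return i;
}
```

With a 10-byte array and maxlen of 39, the loop is free to read s[10], s[11] and so on, which is exactly the out-of-bounds access described above. Whether it stops at index 10 depends entirely on what happens to be stored there.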

ACascarino
CIsForCookies