
I'm very new to C, and I'm not understanding this behavior. Upon printing the length of this empty array I get 3 instead of 0.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct entry entry;

struct entry{
   char arr[16];
};

int main(){
  entry a;
  printf("%d\n",strlen(a.arr));
  return 0;
}

What am I not understanding here?

Flow-MH
  • You've not initialized the variable, so the array can contain anything; it need not be null terminated at all. Also, the correct format for printing `strlen(a.arr)` is `%zu` since `strlen()` returns `size_t`. – Jonathan Leffler Apr 17 '16 at 06:29
  • A safe way to initialize it would be `a.arr[0] = 0;` that way you have a zero length string. – totoro Apr 17 '16 at 07:05

5 Answers


The statement `entry a;` does not initialize the struct, so its contents are indeterminate (in practice, whatever garbage was already in that memory). Therefore, there's no guarantee that `strlen` on any of its members will return anything sensible. In fact, it might even crash the program, or worse.

Rufflewind

There is no such thing as an "empty array" in C. Your array `char arr[16];` always contains 16 bytes. Because it is an uninitialized local variable, each char has an indeterminate value. In addition, if none of these values happens to be 0, `strlen` will read outside the array and your code will have undefined behaviour.

Additionally, `strlen` returns `size_t`, and using `%d` to print it has undefined behaviour too; you must use `%zu`, where the `z` says that the corresponding argument is `size_t`.

(If by happenstance you're using the MSVC++ "C" compiler, do note that it might not support %zu. Get a real C compiler and C standard library instead.)

  • Isn't `size_t` unsigned? I think you have to use `%zu`. And there is a dummy, famous and widespread operating system whose `printf()` function is unable to recognize `%z`. – jdarthenay Apr 17 '16 at 06:39
  • @jdarthenay added a notice about the famous and well-spread vir.. er operating system. – Antti Haapala -- Слава Україні Apr 17 '16 at 06:49
  • It's not a problem with the MSVC compiler, it's a runtime problem. You can solve it using MinGW-w64 because there is an option to replace the printf family of functions with custom MinGW functions, see [here](http://stackoverflow.com/a/25677110/5845470) – jdarthenay Apr 17 '16 at 06:50
  • @jdarthenay complier is a nice typo when we're talking about MSVC incomplier. However there are now some claims that VS 2015 would actually support `%z`? – Antti Haapala -- Слава Україні Apr 17 '16 at 06:53
  • Not sure about this, but I can tell you that if you use `printf("%lld");` in a Windows program, your program will work in Windows Seven but not in Windows XP, because this is a Windows DLL problem that was solved starting with Windows Vista, if I remember correctly. – jdarthenay Apr 17 '16 at 06:56

Here's the source code to a simple implementation of strlen():

size_t strlen(const char *str)
{
    const char *s;
    for (s = str; *s; ++s);
    return(s - str);
}

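Wait, you mean there's source code to strlen()? Why yes. The standard library functions are ordinary code, and many of them can be written in C itself, although real implementations are often more heavily optimized than this one.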

This function starts at the memory address specified by str. It then uses a for loop to start at that address and move forward, byte by byte, until it reaches a zero. How does the for loop do that? First it assigns str to s. Then it checks the value s points to. If it's zero (i.e. if *s evaluates to zero), the for loop is done. If that value is not zero, the s pointer is incremented and the zero check is repeated, over and over, until it finds a zero.

Finally, the final value of s minus the original pointer you passed in (that is, the distance the s pointer has moved) is the result of strlen().

In other words, strlen() just walks through memory until it finds the next zero character, and it returns the number of characters from that point to the original pointer.

But, what if it doesn't find a zero? Does it stop? Nope. It will just trudge on and on until it finds a zero or the program crashes.

That is why strlen() is so confusing, and why it's a source of many critical bugs in modern software. This doesn't mean you can't use it, but it does mean you must be very, very careful to make sure that whatever you pass in is a null-terminated string (i.e. a sequence of zero or more non-zero characters, followed by a zero character).

Remember also that in C, you basically have no idea what memory contains when you allocate it or set it aside. If you want it to be all zeros, then you need to make sure to fill it with zeros yourself!

Anyway, the answer to your question involves the use of the memset() function. You'll have to pass memset() the pointer to the beginning of your array, the length of that array, and the value to fill it with (in your case, zero of course!)

johnwbyrd

`a` is never initialized, so calling strlen() on its contents leads to undefined behavior.

C "strings" are '\0' terminated arrays of char. So strlen() will scan memory from the given address until it either finds a '\0' or causes a segmentation fault.

jdarthenay

What am I not understanding here?

Perhaps the misunderstanding is that auto variables, such as:

entry a;

are assigned memory from the process' stack. The pre-existing content of that stack memory is not zeroed-out for your benefit. Hence the value(s) of the elements of a, which will also be located on the process stack, will not be initially zeroed-out for your benefit. Rather, the entire content of a and its elements (including .arr) will contain bizarre and perhaps unexpected values.

C programmers learn to initialize auto variables by zeroing them out, or initializing them with a desirable value.

For example, the question code might do this as follows:

int main(){
  entry a = 
    {
    .arr[0] = 0
    };

...
} 

Or:

int main(){
  entry a;

  memset(&a, 0, sizeof(a));

...
}

Mahonri Moriancumer