
I just started playing with C, and I have this:

char str_arr[2][3] = {"gou", "ram"};
printf("%s / %s / %s", str_arr, str_arr[0], str_arr[1]);

which prints:

gouram / gouram / ram

and

char str_arr[2][4] = {"gou", "ram"};
printf("%s / %s / %s", str_arr, str_arr[0], str_arr[1]);

prints:

gou / gou / ram

I really don't understand. The 4 is the maximum size, yet the difference in output makes no sense to me.

//Edit Just wanted to say that this helped me a lot, it may be a dumb question for most of you, but for me it was not, I just got into memory allocation and more advanced stuff. Thank you SO!

Martzy
  • It makes perfect sense once you know how strings & arrays work in C. What do you know about either of these? – Scott Hunter Apr 18 '22 at 15:46
  • Common undefined behaviour. Which book are you reading? – autistic Apr 18 '22 at 15:47
  • I'm watching CS50 on Youtube. Any book you would recommend to straighten it? – Martzy Apr 18 '22 at 16:06
  • @Martzy: In my opinion, the CS50 course does a very good job of explaining strings and the meaning of the null terminating character. Regarding books, you may want to take a look at [this question](https://stackoverflow.com/questions/562303/the-definitive-c-book-guide-and-list). – Andreas Wenzel Apr 18 '22 at 16:07
  • I know, but I watched it last night first time, for me, I need to get over something multiple times, different sources are better, more ways to get it explained, since it's a new thing. I knew what null character does, but I did not think about it in this case. As I said, now I'm just scratching, I come from the web(PHP), which of course is way simpler. There if I try to print an array it won't work at all, which in C it does, in a strange way I see. I'm relating to PHP to make sense. – Martzy Apr 18 '22 at 16:11
  • @Martzy: If you learn from CS50, you should be aware that this course first tries to hide the true nature of strings from you, by using the `typedef` `string` instead of `char *`. Only in about week 4 of the course is the true nature of strings and pointers revealed to you. Overall, I have a very good impression of the course. – Andreas Wenzel Apr 18 '22 at 16:15
  • Now I'm seeing lesson 3 which did that, removed the string definition and got to char *. – Martzy Apr 18 '22 at 16:18
  • @Martzy: Note that only the person whose post you attach a comment to will automatically be notified of your comment, unless you explicitly write the person's name by using the `@` syntax in your comment. Press the "Help" button while writing a comment for further information. Your previous comment seems to have been intended for me, but I was not notified of it, because you did not write my name using the `@` syntax. If you do not notify people whose comment you are replying to, you risk that person not noticing your comment. – Andreas Wenzel Apr 18 '22 at 16:28
  • @AndreasWenzel I see, I did not know that. I thought everyone who left a comment would get a notification. – Martzy Apr 18 '22 at 16:29
  • @Martzy: No, that is only the case if they are following the post to which the comment is attached. That is what the "Follow" button is for. Unless you click that button, you will not be notified of comments to that post (unless that post belongs to you). – Andreas Wenzel Apr 18 '22 at 16:31
  • @autistic I guess you are mostly right, yet if this is what you want, won't be undefined behaviour, since if you run this you'll get same result every time. I just wanted to know why it does what it does. – Martzy Apr 18 '22 at 19:33
  • @Martzy [undefined behavior](https://port70.net/~nsz/c/c11/n1570.html#3.4.3p1) is "behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements". Notice how the distinction isn't whether "you'll get the same result every time"; it's whether you get the same result on every machine. If you like writing non-portable, non-compliant code, fine, but please don't work on anything mission critical until you've read K&R2e and done the exercises. We've too many assuming they can guess when you could've researched. – autistic Apr 19 '22 at 21:05
  • P.S. Yes, K&R2e covers this frequently asked question. Any good book will, as does the C standard. See also, [the definition for "string"](https://port70.net/~nsz/c/c11/n1570.html#7.1.1p1). Read more and guess less. – autistic Apr 19 '22 at 21:08

1 Answer


Passing str_arr to the function printf with the %s format specifier invokes undefined behavior. The %s specifier requires an argument of type char *. The expression str_arr is not a char *, and it does not decay to one either: it decays to a pointer to its first row, of type char (*)[3] in the first example and char (*)[4] in the second. The expression str_arr[0], on the other hand, is an array of char and does decay to a char *.

In the first example

char str_arr[2][3] = {"gou", "ram"};

passing str_arr[0] will also invoke undefined behavior, for a different reason:

The %s format specifier requires its argument to be a pointer to a valid string, i.e. a pointer to a sequence of characters terminated by a null character. However, neither str_arr[0] nor str_arr[1] is terminated by a null character, because there is no room for one.

However, when you write

char str_arr[2][4] = {"gou", "ram"};

there is room for a terminating null character, and both str_arr[0] and str_arr[1] will have one after initialization, so the behavior of the program is well-defined when passing these sub-arrays to the function printf (i.e. there is no undefined behavior).

Andreas Wenzel
  • Nice, so the problem is that the size was too small to have null at the end, which would concat them. – Martzy Apr 18 '22 at 16:01
  • @Martzy: In your case, the undefined behavior resulted in the strings being concatenated. However, you cannot rely on that. Undefined behavior is, by definition, unpredictable. – Andreas Wenzel Apr 18 '22 at 16:05
  • I understand now, yes, just in my case did that. Thank you. – Martzy Apr 18 '22 at 16:07
  • @AndreasWenzel: The strings are literally concatenated; arrays are defined to store elements contiguously, so putting characters in the separate elements of `char str_arr[2][3]` necessarily concatenates them in memory, and aliasing any object, including the entire array, through a `char *` is defined, and passing `str_arr[0]` passes such a `char *`. The only thing preventing `printf("%s", str_arr[0])` from being defined to print “gouram” is the lack of a terminating null character. Had the array been defined `char str_arr[3][3]`, there would be a terminating null character. – Eric Postpischil Apr 18 '22 at 16:27
  • @EricPostpischil I think last time you wanted to say [2][4] instead of [3][3] – Martzy Apr 18 '22 at 17:40
  • @Martzy: No, I am making the point that if the array is defined with `char str_arr[3][3] = {"gou", "ram"};`, then the bytes of the array, from the start, will be “gouram” terminated by a null character (with two more after that), since the elements not explicitly initialized will be initialized to zero. – Eric Postpischil Apr 18 '22 at 17:42
  • @EricPostpischil I see, that makes sense. – Martzy Apr 18 '22 at 17:43
  • @AndreasWenzel UB is not *by definition* unpredictable, since [the definition](https://port70.net/~nsz/c/c11/n1570.html#3.4.3p1) doesn't mention predictability in the slightest. Your mistake there is relying upon Wikipedia. For future reference, if you can cite from the C standard (the document I linked to), please do that instead. Alternatively, POSIX/OpenGroup manpages strive to follow ISO C. – autistic Apr 19 '22 at 21:15
  • So there are a number of points to iterate here, Andreas already mentioned the representation of `char *` and `char (*)[n]` aren't required to be compatible, there's no pointer conversion when calling a variadic function. He also mentioned the lack of space for NUL termination; in my view that's the main point because the other points are merely theoretical and not frequently asked questions about practical issues. But mostly it's important to see this "concatenation" as undefined, too, due to 6.5.6p8 (quoted below); your code accesses outside of bounds of `str_arr[0]` – autistic Apr 20 '22 at 21:06
  • This is probably ideal as a quote to open the answer, [6.5.6p8](https://port70.net/~nsz/c/c11/n1570.html#6.5.6p8) reads: *"If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary \* operator that is evaluated."* ... and so it stands to reason that `str_arr[0][3]` is invalid in context of the first example of code. – autistic Apr 20 '22 at 21:10