
I am new to C and have been experimenting with different ways to initialize arrays of chars, following the approaches explained here. I found one difference I cannot explain from that previous thread or the other resources I've been learning from. I stopped in gdb at a breakpoint just after the following lines:

char myCharArray1[] = "foo";
char myCharArray2[] = "bar";
char myCharMultiArray[2][10] = {myCharArray1, myCharArray2};
char myCharMultiArrayLiteral[2][10] = {"foo", "bar"};

In gdb I notice the following:

ptype myCharMultiArray
type = char [2][10]
ptype myCharMultiArrayLiteral
type = char [2][10]
ptype myCharMultiArray[0]
type = char [10]
ptype myCharMultiArrayLiteral[0]
type = char [10]
info locals
myCharArray1 = "foo"
myCharArray2 = "bar"
myCharMultiArray = {"\364\360\000", "\000\000\000"}
myCharMultiArrayLiteral = {"foo", "bar"}

Why do the contents of myCharMultiArray and myCharMultiArrayLiteral differ? Where do the numbers in myCharMultiArray \364\360 even come from?

If I were to try to explain why this is happening from what I've read so far, I'd guess it has something to do with one of the following ideas:

  1. I'm inadvertently trying to modify a string literal
  2. myCharArray1 and myCharArray2 are not actually of type char [4] (despite what gdb tells me); they are just pointers to the first character in the string literals (i.e. the addresses where the 'f' and 'b' are stored, respectively).
  3. Creating the new char array myCharMultiArray requires memory at an address unrelated to where myCharArray1 or myCharArray2 are stored, and the syntax char myCharMultiArray[2][10] = {myCharArray1, myCharArray2}; is actually trying to move the myCharArray1 and myCharArray2 data rather than copy it, which is not possible for some reason I don't yet quite grasp.

Adding links to relevant topics (I still can't find a duplicate).

E_net4
topher217
  • What does your compiler tell you about this line? `char myCharMultiArray[2][10] = {myCharArray1, myCharArray2};` Does it show a warning about "making integer value from pointer of different size" or similar? – Gerhardh Jan 19 '23 at 06:59
  • I get a warning of "initialization of 'char' from 'char *' makes integer from pointer without a cast [-Wint-conversion]" (let's see if my escape characters worked for markdown). – topher217 Jan 19 '23 at 07:02
  • Your option 2 is quite close. They are arrays as gdb shows you. But in many cases, if you use the name of an array, it automatically decays to a pointer to first element. That means you provide 2 addresses to initialize your array. And as a result you try to store the least significant byte of these addresses in your `char` array. – Gerhardh Jan 19 '23 at 07:03
  • @Gerhardh what would be a good way to verify this? If I use `p &myCharArray1` in gdb I get `(char (*)[4]) 0x20041ff4`. So this address is in hex and the numbers I'm seeing in the `myCharMultiArray` (i.e. `\364\360`) are octal? decimal? Just trying to figure out how I would check that for my sanity. – topher217 Jan 19 '23 at 07:08
  • In a string literal, a \ followed by digits indicates an octal value. That means what you are seeing are the values `0xF4, 0xF0`, which are the least significant bytes of `&myCharArray1, &myCharArray2` – Gerhardh Jan 19 '23 at 07:15
  • Also, do you know of any way to verify when the array decays to a pointer to the first element via gdb or other tools? I'm sure this may seem obvious to someone working with C for a while, but for someone new to this, I'm hoping for some tool I can use to show what an object is every step of the way so I can observe these automatic changes. – topher217 Jan 19 '23 at 07:15
  • The standard demands that it decays in that situation. I don't know how you could make gdb show what you want. The only reason why it works with string literals is because the standard explicitly mentions it as a way to initialize a `char[]`. It also won't work if you try to assign a string literal to an array in any other situation than during initialization. – Gerhardh Jan 19 '23 at 07:16
  • @Gerhardh great! That is very helpful. These things are not so search engine friendly. Finally, is there any valid way of using `myCharArray1` and `myCharArray2` to define the values of `myCharMultiArray` other than creating everything as pointers (which seems to be the better option anyway)? I just want to know the limitations of the language, so I'd like to know whether I can somehow reuse the strings in `myCharArray1` for defining values in some other array. Regardless, if you'd like to wrap up everything you've mentioned so far in an answer, I'm quite happy to mark it correct (unless someone finds a duplicate). – topher217 Jan 19 '23 at 07:20
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/251259/discussion-between-gerhardh-and-topher217). – Gerhardh Jan 19 '23 at 07:20
  • @topher217 Why do you seek to maintain a wrong and flawed perspective? The compiler issued its appraisal of the statement saying "this is bad". Take its advice and fix the code and then look at it with a debugger... There's no "understanding" to be found in "understanding" corrupt code. – Fe2O3 Jan 19 '23 at 07:21
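The octal arithmetic from the comments above can be checked with a small sketch. The helper names `low_byte` and `octal_escape` are hypothetical, and the code assumes the platform provides `uintptr_t` (it is optional in the standard) and that converting a pointer to it keeps the address bits, which is implementation-defined but typical.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: the least significant byte of an address.
   This mirrors what the invalid implicit pointer-to-char conversion
   typically keeps; the explicit casts make the conversion valid
   (though still implementation-defined). */
static unsigned char low_byte(const void *p)
{
    return (unsigned char)((uintptr_t)p & 0xFFu);
}

/* Hypothetical helper: format a byte the way gdb shows non-printable
   chars, i.e. a backslash followed by three octal digits.
   out must have room for 5 bytes including the terminator. */
static void octal_escape(unsigned char byte, char out[5])
{
    snprintf(out, 5, "\\%03o", (unsigned)byte);
}
```

Passing `low_byte(myCharArray1)` to `octal_escape` on the asker's target would be expected to reproduce the `\364` seen in gdb, since `&myCharArray1` was reported as `0x20041ff4`.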

1 Answer


First of all please use this: What compiler options are recommended for beginners learning C?

With those options, a compiler like gcc will tell you that the code is invalid C:

error: initialization of 'char' from 'char *' makes integer from pointer without a cast [-Wint-conversion]

You are debugging invalid C, so there's not much point in trying to make sense of whatever the compiler let through in "lax mode" just because you compiled with non-standard compiler settings.

Specifically, your problem boils down to the fact that C does not allow us to copy arrays during initialization/assignment. Why it was designed like that is a long story and there's not much in the way of a rationale, just accept that C doesn't allow it. Your first arrays are of type char[4] indeed but decay into a char* when used in most expressions (hence "from char*" in the compiler error).
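Since the initializer cannot copy an existing array, one way to reuse the contents of `myCharArray1` and `myCharArray2` is to copy them at run time instead. The following is a minimal sketch, not the only option; `fill_rows`, `ROWS` and `COLS` are hypothetical names introduced for illustration.

```c
#include <stdio.h>

#define ROWS 2
#define COLS 10

/* Hypothetical helper: copy two existing strings into a 2x10 char array.
   snprintf truncates overlong input and always null-terminates when the
   buffer size is nonzero, so each row stays a valid string. */
static void fill_rows(char dst[ROWS][COLS], const char *first, const char *second)
{
    snprintf(dst[0], COLS, "%s", first);
    snprintf(dst[1], COLS, "%s", second);
}
```

With `char myCharMultiArray[2][10] = {{0}};` declared first, calling `fill_rows(myCharMultiArray, myCharArray1, myCharArray2);` copies the bytes. The array arguments decay to pointers here too, but this time a pointer is exactly what the function expects.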

A char array initializer list expects char items as initializers, hence "error: initialization of 'char'". Even if it is a 2D array, it still wants char initializers, but allows for nested braces which is good practice. So you'd do
{ {'f', 'o', 'o', '\0'}, {'b', 'a', 'r', '\0'} } and that's correct C, although a pain to write out like that. Therefore string literals are allowed as an alternative form of initialization.

In case of string literals, yes they are by themselves to be regarded as arrays, but the array initialization rules in C mention string literals as a special case with special initialization rules.

char myCharMultiArrayLiteral[2][10] = {"foo", "bar"}; is therefore fine.
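As a quick check that the two forms are equivalent, here is a small sketch comparing the brace-initialized array with the literal-initialized one byte by byte. The function name `initializers_match` is hypothetical.

```c
#include <string.h>

/* Returns 1 when both initializer styles produce identical bytes.
   Elements without an explicit initializer are zero-initialized in
   both arrays, so the whole 2x10 block can be compared. */
static int initializers_match(void)
{
    char braces[2][10]  = { {'f', 'o', 'o', '\0'}, {'b', 'a', 'r', '\0'} };
    char literal[2][10] = { "foo", "bar" };
    return memcmp(braces, literal, sizeof braces) == 0;
}
```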

Lundin
  • Thanks, I'll give those compiler options a try! I wasn't ignoring the warnings, but didn't know how to interpret them based on what gdb was telling me (i.e. the discrepancy between `char [4]` and `char *`). I had guessed something along those lines with my theory #2, but didn't know how to verify it. If not via gdb, do you know of any good way to dive deeper into what the compiler is doing at each stage such that I can verify/know when these kinds of decays occur? Or somehow show the process from when `myCharArray1` goes from `char [4]` to `char *`? – topher217 Jan 19 '23 at 08:00
  • Also just curious if the compiler options you mentioned would give a more straightforward explanation as to what was invalid (i.e. what you said "C does not allow us to copy arrays during initialization/assignment."). – topher217 Jan 19 '23 at 08:02
  • @topher217 Regarding the first question, you just have to learn. Arrays "decay" into pointers to the first element whenever used in an expression or used as parameters to a function. In fact the C standard itself is pretty helpful here (6.3.2.1/3): "Except when it is the operand of the `sizeof` operator, or the unary `&` operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object" – Lundin Jan 19 '23 at 08:29
  • @topher217 Regarding the second question, not really although you could try to switch compiler between gcc and clang if you don't understand the compiler messages from one of them. clang says: "error: incompatible pointer to integer conversion initializing 'char' with an expression of type 'char[4]'" Which is perhaps slightly easier to understand. Also "what is the meaning of this compiler message" is probably a good question to ask on SO. Since as you can tell, compilers sometimes assume that the reader has quite in-depth C knowledge when presenting the error messages. – Lundin Jan 19 '23 at 08:32
  • It is perfectly normal for the average C programmer to treat error messages as "bug here: fix!" without having a clue what all the language-lawyer stuff like "scalars" and "lvalues" that the compiler is yapping about actually means. Very few (if any) compilers on the market are designed for the purpose of being easy to use products suitable for productive development. Rather, compiler manufacturers tend to be fixated on efficient optimizations above all else. Or in the case of gcc, fixated on all the prestige invested in the mildly useful often harmful GNU C dialect. – Lundin Jan 19 '23 at 08:39
  • Very informative, thank you. I'll be sure to catch up on some light reading by going through the whole standard someday :P. – topher217 Jan 19 '23 at 08:51
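On the question raised in the comments of how to observe the decay, one possible sketch uses `sizeof` and the C11 `_Generic` selection. It assumes a compiler that applies array-to-pointer conversion to the controlling expression of `_Generic` (recent gcc and clang do, in line with later clarifications to the standard). The macro `TYPE_NAME` and the three functions are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical macro: names the type of an expression after the usual
   conversions. Assumes the controlling expression of _Generic decays
   arrays to pointers, as recent gcc and clang do. */
#define TYPE_NAME(x) _Generic((x),      \
    char *:      "char *",              \
    char (*)[4]: "char (*)[4]",         \
    default:     "other")

/* Used as a plain expression, the array decays to a pointer. */
static const char *decayed_type(void)
{
    char myCharArray1[] = "foo";
    return TYPE_NAME(myCharArray1);
}

/* Under unary &, no decay happens: the result points to the whole array. */
static const char *address_type(void)
{
    char myCharArray1[] = "foo";
    return TYPE_NAME(&myCharArray1);
}

/* Under sizeof, no decay happens either: this is the array size,
   'f', 'o', 'o' plus the terminating null. */
static size_t array_size(void)
{
    char myCharArray1[] = "foo";
    return sizeof myCharArray1;
}
```

The two exceptions exercised here, `sizeof` and unary `&`, are the ones listed in the 6.3.2.1/3 quote above, and the `char (*)[4]` result matches what gdb printed for `p &myCharArray1`.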