Strings are a painful learning experience in C. It's quite unlike higher-level languages.
First off, to answer your explicit question, str[i]
is the value of the i-th element in an array. If str
points to the array of characters "Hello"
, then str[1]
is the value "e". str + i
, on the other hand, is a pointer to the i-th element in an array. In the same situation, where str
points to the array of characters "Hello"
, str + 1
is a pointer to the "e" in that array. str[i]
and *(str + i)
are identical in every way. In fact, the spec defines a[b]
to be (*(a + b))
, with all of the behaviors that come with it! (As an aside, C still supports a very ancient notation i[str]
which is exactly the same as str[i]
because addition of pointers and integers is commutative. You should never use the backwards form, but it's worth noting that when I say they are defined to be the same, I mean it!)
Note that I have been very careful to avoid the word "string," and focused on "character arrays" instead. C doesn't technically have a string type. This is important here because you can't do easy things like std::string str[5]
(which is a valid C++ notation to create an array of 5 strings) to get a variable length string. You have to make sure you have memory for it. char *str[5]
creates an array of 5 char*
, but doesn't create any arrays of characters to write to. This is why your code is failing. What is actually happening is each element of str
is a pointer to an unspecified address (garbage memory from whatever was there before the variable was created), and scanf
is trying to assign into that (nonexistent) array. Bad things happen when you write to somewhere random in memory!
There are two solutions to this. One is to use Serve Lauijssen's approach using malloc. Just please please please please please remember to use free()
to deallocate that memory. In nearly any real program, you will not want to leak memory, and using free
is a very important habit to get into early on. You should also make sure malloc
did not return NULL. That's another one of those habits. It virtually never returns NULL on a desktop, but it can. On embedded platforms, it can easily happen. Just check it! (And the fact that I had to be reminded of this in the comments suggests I failed to get into the habit early!)
The other approach is to create a multidimensional array of characters. You can use the syntax char str[5][80]
to create a 5x80 array of characters.
The exact behavior is a bit of a doozy, but you will find it just happens to behave the way you think it should in your case. You can just use the syntax above, and keep moving. However, you should eventually circle back to understand how this works and what is going on underneath.
C handles the access to these multidimensional arrays in a left to right manner, and it "flattens" the array. In the case of char str[5][80]
, this will create an array of 400 characters in memory. str[0]
will be a char [80]
(an 80 character array) which is the first 80 characters in that swath of memory. str[1]
will be the next swath of 80 characters, and so on. C will decay an array into a pointer implicitly, so when scanf
expects a char*
, it will automatically convert the char [80]
that is the value of str[i]
into a char*
that points at the first character of the array. Phew!
Now, all that explicit "here's what's actually going on" stuff aside, you'll find this does what you want. char str[5][80]
will allocate 400 characters of memory, laid out in 5 groups of 80. str[i]
will (almost) always turn into a char*
pointing at the start of the i-th group of characters. Then scanf
has a valid pointer to an array of characters to fill in. Because C "strings" are null-terminated, meaning they end at the first null (character 0
aka '\0'
) rather than at the end of the memory allocated for them, the extra unused space in the character array simply won't matter.
Again, sorry it's so long. This is a source of confusion for basically every C programmer that ever graced the surface of this earth. I have yet to meet a C programmer who was not initially confused by pointers, much less by how C handles arrays.
Three other details:
- I recommend changing the name from
str
to strs
. It doesn't affect how the code runs, but if you are treating an object as an array, it tends to be more readable if you use plurals. If I were reading code, strs[i]
looks like the i-th string in strs
, while str[i]
looks like the i-th character in a string.
- As Bodo points out in comments, using things like
scanf("%79s", str[i])
to make sure you don't read too many characters is highly, highly, highly desirable. Later on, you will be plagued by memory corruption if you don't ingrain this habit early. A great many of the exploits you read about in major systems are "buffer overruns," where an attacker gets to write too many characters into a buffer and does something malicious with the extra data as it spills over into whatever happens to be next in memory. I'm quite sure you aren't worried about an attacker using your code maliciously at this point in your C career, but it will be a big deal later.
- Eventually you will write code where you really do need a
char**
, that is a pointer to a pointer to a character. The multidimensional array approach won't actually work on that day. When I come across this, I have to create two arrays. The first is char buffer[400]
which is my "backing" buffer that holds the characters, and the second is char* strs[5]
, which holds my strings. Then I have to do strs[0] = buffer + (0 * 80); strs[1] = buffer + (1 * 80);
and so on. You don't need this here, but I have needed it in more advanced code.
- If you do this, you can also follow the suggestion in the comments to make a
static char backing[400]
. This reserves a block of 400 bytes for the whole lifetime of the program, rather than on the stack each time the function is called. In general I'd recommend avoiding this, but I include it for completeness. There are some embedded software situations where you will want to use this due to platform limitations. However, this is terribly broken in multithreaded situations, which is why many of the standard C functions that relied on statically allocated memory now have a variant ending in _r
which is re-entrant and threadsafe.
- There is also alloca, which allocates on the stack, though it is not part of standard C.