Short answer
You cannot define a string with unknown size, but you can make the string larger and larger as needed. All string and I/O functions need an already allocated buffer to work with, so you have to manage the allocation yourself and be careful not to exceed the allocated capacity.
However, when you have a constant upper bound on the string length, just allocate a string of that length (plus one byte for the terminating \0) and you'll be safe. Resizing is more complex, so avoid it if you don't need it.
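As a minimal sketch of the fixed upper bound approach: the limit of 80 characters and the names MAX_LINE and read_bounded_line are assumptions made for illustration, not something from the question. fgets never writes more than the size it is given, so the buffer cannot be overrun.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical upper bound chosen for illustration: 80 characters + '\0'. */
#define MAX_LINE 81

/* Reads at most MAX_LINE - 1 characters of one line from `in` into `buf`
 * (which must hold MAX_LINE bytes) and strips the trailing newline, if any.
 * Returns 1 on success, 0 on end of input or read error. */
int read_bounded_line(FILE *in, char buf[MAX_LINE])
{
    if (fgets(buf, MAX_LINE, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* cut at the newline, if one was read */
    return 1;
}
```

A caller would declare char line[MAX_LINE]; and call read_bounded_line(stdin, line). Lines longer than the bound are split across successive calls rather than overflowing the array.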
What is a string in C
A string is a part of a char array, terminated by \0. String functions stop reading the array when they encounter the first \0 character. Remember that strlen returns the number of characters before the first \0, so even if the array ends immediately after the first \0, the string length is strictly less than the length of the underlying array; in that case the array length is strlen + 1. This is important when allocating a string: always allocate space for the terminating \0!
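The strlen + 1 rule can be sketched with a small copy helper. The name copy_string is hypothetical; the point is only the size passed to malloc.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of `src`, or NULL if allocation fails.
 * The caller must free() the result. `copy_string` is a hypothetical name. */
char *copy_string(const char *src)
{
    size_t len = strlen(src);       /* characters before the first '\0' */
    char *copy = malloc(len + 1);   /* + 1 for the terminating '\0' */
    if (copy == NULL)
        return NULL;
    memcpy(copy, src, len + 1);     /* copies the '\0' as well */
    return copy;
}
```

Allocating only strlen(src) bytes here would leave no room for the terminator, and the copy would write one byte past the end of the allocation.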
E.g.
char w[7] = "Hello";
is the same as
char w[7] = {'H', 'e', 'l', 'l', 'o', '\0', '\0'};
When used as a string, the first \0 marks the end of the string and anything after it is ignored (not read by string functions). Even if you overwrite the last element of the example char array with a printable character (e.g. w[6] = '!'; resulting in {'H', 'e', 'l', 'l', 'o', '\0', '!'}), puts(w); will print Hello (not Hello! or anything similar).
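The same effect can be checked with strlen instead of puts. This is only a sketch; the function name length_after_overwrite is made up for the example.

```c
#include <string.h>

/* Returns the string length of the example array after its last element
 * has been overwritten. `length_after_overwrite` is a hypothetical name. */
size_t length_after_overwrite(void)
{
    char w[7] = "Hello";   /* {'H', 'e', 'l', 'l', 'o', '\0', '\0'} */
    w[6] = '!';            /* {'H', 'e', 'l', 'l', 'o', '\0', '!'}  */
    return strlen(w);      /* stops at w[5], the first '\0' */
}
```

The function returns 5: the '!' sits in the array, but it is not part of the string.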
When manipulating strings as char arrays, always make sure a \0 terminates the string; otherwise string functions read past the end of the array into memory that may not be allocated, which is undefined behavior and often ends in a segfault.
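One place where the terminator is easy to lose is strncpy, which does not write a \0 when the source is at least as long as the given limit. A hedged sketch of a copy that always terminates (copy_truncated is a hypothetical helper, not a standard function):

```c
#include <string.h>

/* Copies at most dst_size - 1 characters of `src` into `dst` and always
 * terminates the result. `copy_truncated` is a hypothetical helper name. */
void copy_truncated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return;                        /* no room even for the '\0' */
    strncpy(dst, src, dst_size - 1);   /* may leave dst unterminated... */
    dst[dst_size - 1] = '\0';          /* ...so terminate it explicitly */
}
```

With char d[4]; a call copy_truncated(d, sizeof d, "Hello"); leaves the string Hel in d, safely terminated.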
Why string with unknown size cannot be defined
As I already wrote, a string is part of an array of char. Each array must have a fixed size. You can use just a part of it (effectively making it smaller), but it has to be allocated, and the allocator (malloc, calloc) needs to know how much memory is wanted.
If you use an array as if it were bigger than allocated, your program is likely to crash with a segfault in the better case. If you are extremely unlucky, your program will not crash and will just use the memory right after the array, producing weird results.
You can omit the array length if it can be inferred from the initializer: char w[] = "Hello"; gives the same result as char w[6] = "Hello";. However, this will not help you, because the initializer is fixed at compile time and you need to change the string length dynamically at run time.
How to simulate arbitrary length string
To handle strings of unlimited length, you can create an array of a fixed length and, every time it becomes too short, allocate a new array twice as long, copy the original contents to the new one, and free the old one. You can use realloc to do this work for you, with the additional benefit of higher speed when the array does not need to be moved and can simply be extended in place.
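The doubling strategy could look roughly like this. It is a sketch under assumptions: the GrowString type, the gs_ function names and the initial capacity of 8 are all invented for illustration, and overflow of the capacity computation is not handled.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string: `data` is always '\0'-terminated,
 * `len` excludes the terminator, `cap` is the allocated size in bytes. */
typedef struct {
    char  *data;
    size_t len;
    size_t cap;
} GrowString;

/* Starts with a small buffer holding the empty string. Returns 1 on success. */
int gs_init(GrowString *s)
{
    s->cap = 8;                      /* arbitrary initial capacity */
    s->len = 0;
    s->data = malloc(s->cap);
    if (s->data == NULL)
        return 0;
    s->data[0] = '\0';
    return 1;
}

/* Appends `text`, doubling the capacity until it fits. Returns 1 on success;
 * on failure the string is left unchanged and still valid. */
int gs_append(GrowString *s, const char *text)
{
    size_t add = strlen(text);
    size_t needed = s->len + add + 1;        /* + 1 for the '\0' */
    if (needed > s->cap) {
        size_t new_cap = s->cap;
        while (new_cap < needed)
            new_cap *= 2;
        char *tmp = realloc(s->data, new_cap);
        if (tmp == NULL)                     /* old block is still allocated */
            return 0;
        s->data = tmp;
        s->cap = new_cap;
    }
    memcpy(s->data + s->len, text, add + 1); /* copies the '\0' too */
    s->len += add;
    return 1;
}

/* Releases the buffer. */
void gs_free(GrowString *s)
{
    free(s->data);
    s->data = NULL;
    s->len = s->cap = 0;
}
```

Typical use is gs_init, any number of gs_append calls, then gs_free. The result of realloc is assigned to a temporary first because on failure realloc returns NULL and leaves the original block allocated; writing it straight into s->data would leak that block.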