4

Please ignore the Japanese there.
I am trying to count the length of a string read from stdin using the following code, but it does not work as expected:

#include <stdio.h>

int main(int argc, const char *argv[]) {
    char str[100];

    printf("文字列を入力してください:");  // Please enter a string:
    fgets(str,99,stdin);

    int n = 0;
    while (str[n++] != '\0');
    printf("文字列の長さは%dです\n", n);  // The length of the string is %d\n
    return 0; 
}

For example, if I enter glacious, I get n=10, whereas I expected n=8.
I understand that n++ increments n after str[n++] != '\0' is evaluated, and that \0 is the terminator appended to every string. But somehow the result doesn't make sense to me. I know I can make this work for my purpose by adding n-=2 at the end, but I really want to understand what's going on here. Many thanks in advance!

chqrlie
Mike Chen
    Welcome to SO! Seems correct to me, you count 1 for every character, the newline and the null terminator. `glacious\n\0` => 10. – ggorlen Dec 31 '20 at 21:39
  • 1
    To further expand on the comment by @ggorlen - remember that `n` is incremented even when the `nul` character was found. Also, remember that `fgets` returns the input *including* the 'signal' newline character (if there's space for it). – Adrian Mole Dec 31 '20 at 21:43
  • 2
    _Side note:_ For `fgets`, it will subtract one from the length to make room for the EOS char. So, you can give the full size of the buffer (e.g. `100` instead of `99`). More idiomatic: `fgets(str,sizeof(str),stdin);` – Craig Estey Dec 31 '20 at 21:45
  • 1
    If you want us to ignore the Japanese, why include it at all? – klutt Dec 31 '20 at 21:46
  • @ggorlen Thanks for such a warm welcome! This is indeed my first interaction on SO. And, omg, I totally forgot about the `\n`. I was thinking I got the while loop's logic wrong. Your answer is very helpful, thank you! – Mike Chen Dec 31 '20 at 21:51
  • @klutt Good point I thought about it but I guess I'm just too lazy to re-type them in English.. – Mike Chen Dec 31 '20 at 21:53
  • @MikeChen You might want to read about how to create a [mre]. Don't tell us what to ignore. Create a snippet where everything is needed. But apart from that detail, your post is fine. – klutt Dec 31 '20 at 21:55
  • 3
    @MikeChen - you should invest some time in learning to use whatever debugger is available on your system/development environment so you can step through the code and figure this kind of thing out for yourself. I think you'll find that learning things for yourself teaches you more than having someone tell you how things work. – Bob Jarvis - Слава Україні Dec 31 '20 at 21:59
  • @klutt This is my first time asking a question on SO but ya I'd heed your advice for future posts. – Mike Chen Dec 31 '20 at 22:34
  • @MikeChen Yes, I understand it's your first post. That's why I'm teaching you how to ask questions here. Welcome to SO btw. :) – klutt Dec 31 '20 at 22:37
  • 1
    If you want the length of the string, not including the `'\n'`, you can use: `n = 0; while (str[n] != '\n' && str[n] != '\0') n++;` In this case to remove the `'\n'`, you can simply use `str[n] = 0;` after you exit the loop. – David C. Rankin Jan 01 '21 at 00:50

3 Answers

5

"I attempt to count the length of the string entered from stdin"..."I know I can make this work for my purpose by adding n-=2 at the end, but I really want to understand what's going on here. "

Documentation for fgets() includes the following:

"...reads a line from the specified stream and stores it into the string pointed to by str. It stops when either (n-1) characters are read, the newline character is read, or the end-of-file is reached, whichever comes first."

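The call in the question ignores the return value of fgets() and passes a size smaller than the buffer. Ignoring the return value means a failed read goes undetected, and the loop would then scan an uninitialized array, which is undefined behavior. Passing 99 is not an error in itself, but it wastes one byte, because fgets() already reserves room for the null terminator. To address these issues, change this: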

fgets(str,99,stdin); 

to, for example, this:

if( fgets (str, sizeof str, stdin) != NULL ) 
{    
     ...
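
A complete version of this check might look like the sketch below. The helper name read_line and its FILE * parameter are assumptions made for illustration and do not appear in the question; the point is only that the buffer is used as a string when fgets() returns non-NULL and not otherwise.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the original post): reads one line from
   `in` into `buf` and reports whether the read succeeded. */
int read_line(FILE *in, char *buf, size_t size) {
    assert(in != NULL && buf != NULL && size > 0);
    if (fgets(buf, (int)size, in) == NULL) {
        /* End of file or read error: buf must not be treated as a string. */
        return 0;
    }
    /* buf is null-terminated here; a trailing '\n' may still be present. */
    return 1;
}
```

In the question's program this could be called as if (!read_line(stdin, str, sizeof str)) return 1; before the counting loop.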

Now for the loop itself. Given the user input "glacious", str looks like this in memory:

|g|l|a|c|i|o|u|s|\n|\0|?|...|?|
 0 1 2 3 4 5 6 7 8  9 10    99 

int n = 0;
while(str[n++] != '\0');

iterations:

    n at start                       n at finish
  • 1st: n==0, str[0] (g) != \0, n++, n==1
  • 2nd: n==1, str[1] (l) != \0, n++, n==2
  • 3rd: n==2, str[2] (a) != \0, n++, n==3
  • 4th: n==3, str[3] (c) != \0, n++, n==4
  • 5th: n==4, str[4] (i) != \0, n++, n==5
  • 6th: n==5, str[5] (o) != \0, n++, n==6
  • 7th: n==6, str[6] (u) != \0, n++, n==7
  • 8th: n==7, str[7] (s) != \0, n++, n==8
  • 9th: n==8, str[8] (\n) != \0, n++, n==9
  • 10th: n==9, str[9] (\0) == \0, n++, n==10

This trace shows the state at every iteration, including the final post-increment of n, which brings its total to 10 for an input that appears to be only 8 characters long. The \n and the final post-increment (on the \0 test) account for the extra 2. In summary, the fix is simply to adjust your expectations to account for every character in the buffer, including the ones you do not see.
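
The effect of the post-increment can be isolated in a small sketch. Both function names below are hypothetical and chosen only for this illustration: the first mirrors the loop from the question, the second moves the increment into the loop body so that the failing test on \0 no longer bumps the counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the loop in the question:
   counts every byte tested, including the terminating '\0'. */
size_t count_with_post_increment(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    while (s[n++] != '\0')
        ;  /* n is incremented even on the final, failing test */
    return n;
}

/* Same scan with the increment moved into the body:
   stops at '\0' without counting it. */
size_t count_without_terminator(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

For the buffer "glacious\n" the first function returns 10 and the second returns 9; the remaining difference from 8 is the newline stored by fgets().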

Note that the value of n counted this way is not the string length of str; the idiomatic function for that, strlen(), yields 9. Given the definition of a C string, the following shows the result of each way of measuring str, assuming this initialization:

char str[100] = {0}; 

and that the contents of str are "glacious\n" (the null terminator is implied):

// the counting loop discussed above     // yields n == 10
size_t len = strlen(str);                // yields len == 9
// after stripping the newline with strcspn()
str[strcspn(str, "\n")] = 0;
len = strlen(str);                       // yields len == 8
size_t size = sizeof str;                // yields size == 100

As an aside, if the goal is to count the number of characters entered, and an alternative approach is acceptable, consider simplifying the method...

Replacing this section:

char str[100];

printf("文字列を入力してください:");
fgets(str,99,stdin);

int n = 0;
while(str[n++] != '\0');
printf("文字列の長さは%dです\n", n);
return 0; 

with this version, which leaves the loop on reading \n (the newline character) or EOF (a negative int macro defined in <stdio.h>), gives the number of characters entered, not counting the newline:

int count = 0;
printf("Please enter a string:");
int c = fgetc(stdin);
while(( c != '\n') && (c != EOF))
{
    count++; //only increments when c meets criteria
    fputc(c, stdout);
    c = fgetc(stdin);
}
printf("\n\nThe length of the string is: %d\n", count);
return 0; 
ryyker
  • `n++` doesn't exclude `'\0'` either, hny – alex01011 Dec 31 '20 at 22:38
  • 1
    I upvoted, I meant to add that with `n++` he is also counting the NULL terminator , and Happy New Year – alex01011 Dec 31 '20 at 22:58
  • 1
    I did not downvote, but your remark about initializing the array is not as pertinent as telling the OP to test the return value of `fgets()`. Furthermore the C Standard does not seem to guarantee that no bytes are changed by `fgets()` in the destination array beyond the null terminator in a successful call. – chqrlie Jan 01 '21 at 00:17
  • 1
    Thank you ryyker you are being really helpful :) I actually just did the same thing to convince myself about the `n`'s value(writing out `n` 's value and incrementation in order for each iteration). I originally thought because this is a really simple question maybe no one even bothers to answer it. But I learned much more about string in C now. – Mike Chen Jan 01 '21 at 00:53
4

If fgets() encounters a newline character in stdin, it will write it to the end of str before the null terminator, and because you're using a post-increment operator in the condition expression of your while loop, you are also including the null terminator in your total count. This accounts for the difference of 2 from your expected value n.

Consider using strcspn(), found in the <string.h> header, so that you can compute the length of str up until the first encountered \n, or until the null terminator if none is found:

size_t n = strcspn(str, "\n");
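
Wrapped in a function, this might look like the sketch below. The name line_length is an assumption made for illustration; strcspn() itself is the standard <string.h> function, which returns the length of the initial segment of its first argument containing none of the characters in its second.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: length of the text before the first '\n',
   or the full string length when there is no newline. */
size_t line_length(const char *s) {
    assert(s != NULL);
    return strcspn(s, "\n");
}
```

The result is the same whether or not fgets() stored a newline, so input that ends at end-of-file without a trailing \n is handled too.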
Patrick Roberts
  • 1
    *`fgets()` will write the newline character...* `fgets()` won't write anything into the stream - try closing the stream with `CTRL-D` or similar - the input read won't end with a newline. `fgets()` just doesn't strip the newline if it's there. – Andrew Henle Dec 31 '20 at 22:27
  • @AndrewHenle I never said that `fgets()` would write anything into the stream, I said that _`fgets()` will write the newline character at the end of `str`_, but I updated my explanation to help clarify any confusion you had. – Patrick Roberts Jan 01 '21 at 00:28
  • @AndrewHenle: Hardware-oriented people may think of a processor writing data to memory. So `foo[i] = '\n';` writes a newline character into `foo`. – Eric Postpischil Jan 01 '21 at 01:10
2

The loop while(str[n++] != '\0'); counts all bytes read by fgets(), including the newline and the null terminator because n is incremented at every test including the last one that evaluates to false.

Also note that fgets(str, 99, stdin); should be fgets(str, 100, stdin); or better if (fgets(str, sizeof str, stdin) == NULL) return 1; to avoid undefined behavior in case of unexpected end of file (empty file redirected as input stream).

Modified version:

#include <stdio.h>
#include <string.h>

int main(int argc, const char *argv[]) {
    char str[100];

    printf("文字列を入力してください:");  // Please enter a string:
    if (!fgets(str, sizeof str, stdin))
        return 1;

    str[strcspn(str, "\n")] = '\0';  // strip the trailing newline if any

    int n;
    for (n = 0; str[n] != '\0'; n++)
        continue;

    printf("文字列の長さは%dです\n", n);  // The length of the string is %d\n
    return 0; 
}
chqrlie