I have a program written in C that should count the words in a given file. It does count words, but the result is not exact. This happens when the file has multiple lines, so I guess it goes wrong when it reads \n. Can you explain why the count is wrong? Here is my code:

int in_word = 1;
while ((ch = fgetc(fp)) != EOF) {
    if (ch == ' ' || ch == '\t' || ch == '\0' || ch == '\n') {
        if (in_word) {
            in_word = 0;
            word_count++;
        }
    } else {
        in_word = 1;
    }
}

If I start with in_word = 0 instead, the count is the correct value minus 2. Can you explain why?

The file: hello hello hello hello hello hello hello (\n here) hi hi hi hi hi hi

Output: 12 words. Correct is: 13.

kaylum
  • Consider your algorithm when multiple *consecutive* whitespace characters are introduced (like, say, a space and a newline. hmmmm). Try it on paper. And fyi, your algorithm is broken on inception if the starting input *leads* with consecutive whitespace chars. That alone should tell you there's a problem. – WhozCraig Jul 03 '21 at 07:47
  • As WhozCraig said, your code won't work when there are multiple consecutive whitespace characters. To fix this, you can check whether the last character you read was a whitespace character and increase `word_count` accordingly. – JASLP doesn't support the IES Jul 03 '21 at 07:51
  • Instead of someone else explaining to you why your code is wrong, you will probably understand better if you run your own program line by line in a [debugger](https://stackoverflow.com/q/25385173/12149471) while monitoring the values of all variables. That way, you can see exactly what your program is doing and what is going wrong. – Andreas Wenzel Jul 03 '21 at 07:54
  • Also, you should remove `ch=='\0'` and increment `word_count` once more after the while loop. Note that `'\0'!=EOF`. – JASLP doesn't support the IES Jul 03 '21 at 07:58
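
The suggestions in the comments above can be sketched as a small change to the loop from the question. This is only an illustration, not the asker's final code: the function name `count_words_patched` is hypothetical, `in_word` starts at 0 so leading whitespace is not counted as a word, the `'\0'` test is dropped, and a word still open at end of file is counted after the loop.

```c
#include <stdio.h>

/* Hypothetical helper: the question's loop with the changes from the comments. */
int count_words_patched(FILE *fp) {
    int word_count = 0;
    int in_word = 0;   /* start outside a word, so leading whitespace adds nothing */
    int ch;

    while ((ch = fgetc(fp)) != EOF) {
        if (ch == ' ' || ch == '\t' || ch == '\n') {
            if (in_word) {
                in_word = 0;
                word_count++;   /* whitespace closes the current word */
            }
        } else {
            in_word = 1;
        }
    }
    if (in_word) {
        word_count++;   /* the file ended in the middle of a word */
    }
    return word_count;
}
```

It keeps the question's own delimiter list, so a carriage return from a Windows line ending is still treated as part of a word here; `isspace` from `<ctype.h>` would cover that case.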

1 Answer

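Your loop only counts a word when it reads the whitespace character that ends it, so the last word is missed if the file does not end with whitespace. That would explain getting 12 instead of 13 for your sample file. It also overcounts if the file starts with whitespace characters, because in_word starts at 1. The code below handles leading, trailing, and multiple consecutive whitespace characters and may help you: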

#include <stdio.h>
#include <ctype.h>

int main(void) {
    FILE *fp = fopen("file.txt", "rb");
    if (fp == NULL) { // fopen returns NULL if the file cannot be opened
        perror("file.txt");
        return 1;
    }
    int word_count = 0;
    int first = 0;    // is 0 until the first non-whitespace character is read
    int lastchar = 0; // 1 if the last character read was whitespace (after the first word began), 0 otherwise
    int ch = 0;
    while ((ch = fgetc(fp)) != EOF) {
        if (isspace(ch)) {
            if (first != 0) {
                lastchar = 1;
            }
        } else {
            if (lastchar == 1) { // a new word starts after whitespace, so the previous word is complete
                word_count++;
            }
            first = 1;
            lastchar = 0;
        }
    }
    if (first == 1) { // count the final word, whether or not whitespace follows it
        word_count++;
    }
    printf("%d\n", word_count);
    fclose(fp);
    return 0;
}
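
The same idea can be written more compactly by counting where each word starts instead of where it ends. This is a minimal sketch, not a replacement for the code above: the function name `count_word_starts` is an assumption made for illustration, and it takes an already opened `FILE *` so the caller decides which file to read.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: counts words by detecting the first character of each word. */
int count_word_starts(FILE *fp) {
    int word_count = 0;
    int in_word = 0;   /* 1 while the characters being read belong to a word */
    int ch;

    while ((ch = fgetc(fp)) != EOF) {
        if (isspace(ch)) {
            in_word = 0;            /* any whitespace leaves the current word */
        } else if (!in_word) {
            in_word = 1;            /* first non-whitespace character of a new word */
            word_count++;
        }
    }
    return word_count;
}
```

Because a word is counted as soon as its first character is read, nothing needs to be added after the loop, and leading, trailing, or repeated whitespace does not change the result. A caller would open the file with `fopen`, check the result for `NULL`, pass the stream to the function, and close it with `fclose`.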