I am trying to write a C program that counts how many lines/entries there are in a given data file. I used the code below and it works fine (taken from: What is the easiest way to count the newlines in an ASCII file?)

#include <stdio.h>

int main(void)
{
    FILE *correlations = fopen("correlations.dat", "r");
    int c;                              /* int (not char) so EOF can be detected */
    unsigned long newline_count = 0;

    /* stop early if the file could not be opened */
    if (correlations == NULL) {
        perror("correlations.dat");
        return 1;
    }

    /* count the newline characters */
    while ((c = fgetc(correlations)) != EOF) {
        if (c == '\n')
            newline_count++;
    }

    fclose(correlations);
    printf("%lu newline characters\n", newline_count);
    return 0;
}

But I was wondering if there was a way to change this bit

if (c == '\n')
    newline_count++;

into something else so that it also works when the data looks like

1.0

2.0

3.0

(each entry is followed by a blank line) instead of

1.0
2.0
3.0

How do I get it to differentiate between a line that contains data (a character, string, or number) and a line that is just a newline? I tried %s, but it didn't work. I am trying this out first on a small file with only 3 entries, but later I will be using a very big file with blank lines between the entries, so I'm wondering how to tell them apart. Or should I just divide newline_count by 2 to get the number of entries?

Maheen Siddiqui

1 Answer

You can keep a flag that records whether you have seen at least one non-whitespace character since the last \n, and increment the line counter only when that flag is set to 1:

/* isspace() requires #include <ctype.h> at the top of the file */
unsigned int sawNonSpace = 0;
while ( (c=fgetc(correlations)) != EOF ) {
    if ( c == '\n' ) {
        // Adds 1 if the line had content, 0 if it was blank
        newline_count += sawNonSpace;
        // Reset the non-whitespace flag
        sawNonSpace = 0;
    } else if (!isspace(c)) {
        // The next time we see `\n`, we'll add `1`
        sawNonSpace = 1;
    }
}
// The last line may lack '\n' - count it anyway if it had content
newline_count += sawNonSpace;

Dividing the count by two is not reliable, unless you are guaranteed to have double spacing in all your files.
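For reference, here is a minimal sketch of the same flag idea wrapped in a function so it can be reused on any open stream. The name count_nonblank_lines is hypothetical (it is not from the question), and the sketch assumes a "blank" line is one containing only whitespace as defined by isspace.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: counts the lines in fp that contain at least
   one non-whitespace character. Reads from the current position to EOF. */
unsigned long count_nonblank_lines(FILE *fp)
{
    int c;                       /* int, not char, so EOF can be detected */
    int saw_non_space = 0;       /* 1 once the current line has content */
    unsigned long count = 0;

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n') {
            count += saw_non_space;   /* adds 0 for a blank line */
            saw_non_space = 0;        /* start fresh for the next line */
        } else if (!isspace(c)) {
            saw_non_space = 1;
        }
    }
    /* the last line may lack a trailing '\n' */
    return count + saw_non_space;
}
```

You would call it with the FILE * returned by fopen (after checking that it is not NULL) and print the result with %lu. Because blank lines contribute zero, it gives the same answer for single-spaced and double-spaced files.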

Sergey Kalinichenko
  • Perhaps it's late and my mind just isn't working properly, but although it runs perfectly, for some reason I can't seem to understand it completely. If I have 1.0 'space' 2.0, it starts reading the file and the first thing it comes across is 1.0. I was running it in the debugger to watch the execution: when it begins it checks if(c=='\n'), then skips to else if(!isspace(c)) and executes that. Then it starts the while loop again and does the same thing again, and on the 4th time it executes the if(c=='\n') branch and increments newline_count. I'm a bit confused, is it just.. – Maheen Siddiqui Aug 20 '13 at 00:54
  • @MaheenSiddiqui The loop goes character by character. On each run of the loop the code visits the first `if`, the second `if`, or skips them both. When it sees `\n`, it visits the first `if`, adds `1` or `0` to the count depending on whether or not it had seen a non-space since the last `\n`, and clears out the non-space flag. When it sees a non-space, it sets the flag, so that the next time there's a `\n` the counter could be incremented. When it sees a space, it does nothing, so the flag remains what it was before. If there are only spaces on a line, the next `\n` adds zero to the count. – Sergey Kalinichenko Aug 20 '13 at 01:19
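
If stepping through in a debugger is confusing, a rough way to watch the character-by-character behavior described above is to print the flag and the count after every character. This is only an illustrative sketch: trace_line_count is a hypothetical name, and it walks an in-memory string instead of a file so the input is easy to vary.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical tracing helper: applies the same flag logic to a string
   and prints the state after each character. Returns the final count. */
unsigned long trace_line_count(const char *text)
{
    unsigned long count = 0;
    int flag = 0;                          /* non-space seen on this line? */

    for (const char *p = text; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;

        if (c == '\n') {
            count += flag;                 /* first branch: add 1 or 0 */
            flag = 0;                      /* and clear the flag */
        } else if (!isspace(c)) {
            flag = 1;                      /* second branch: remember content */
        }                                  /* other whitespace: do nothing */

        if (c == '\n')
            printf("'\\n'");
        else
            printf("'%c' ", c);
        printf(" flag=%d count=%lu\n", flag, count);
    }
    return count + flag;                   /* last line may lack '\n' */
}
```

Calling trace_line_count("1.0\n\n2.0\n") returns 2: the flag is set by '1', the first '\n' adds it to the count and clears it, and the second '\n' (the blank line) adds zero.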