
I want to ignore/skip the comments in a text file when I use fgets.

The problem is that I can only skip a comment if # is the first character of a line. Comments start with # in my text file, but some # characters in my file.txt are not the first character of a line, like so:

#Paths
A B #Path between A and B.
D C #Path between C and D.

A is my first node and B is my second node; when a # appears, I want to ignore the rest of the text up to the next line. My next nodes should then be D and C, and so on. I can only use "r" in the fopen function. I have tried fgets, but it reads line by line, and fgetc doesn't help either.

    bool ignore_comments(const char *s)
    {
        int i = 0;
        while (s[i] && isspace(s[i])) i++;
        return (i >= 0 && s[i] == '#');
    }
    FILE *file;
    char ch[BUFSIZE];
    file = fopen("e.txt", "r");
    if (file == NULL) {
        printf("Error\n");
        fprintf(stderr, "ERROR: No file input\n");
        exit(EXIT_FAILURE);
    }
    while(fgets(ch, BUFSIZE, file) != NULL)
    {
              if (line_is_comment(ch)) {
                        // Ignore comment lines.
                        continue;
                printf("%c",*ch);
                }
     fscanf(file, "%40[0-9a-zA-Z]s", ch);
....
}
  • It's unclear to me whether you want to skip the line `A B #Path between A and B.` or you want that line changed into just `A B ` ? – Support Ukraine May 20 '19 at 18:27
  • I only want to read A B and skip the line when a # comes – Stefan Andersson May 20 '19 at 19:24
  • Regarding `fscanf(file, "%40[0-9a-zA-Z]s", ch);`: the letter 's' is already among the characters accepted by the `%[...]` scanset, so it is consumed there. The trailing `s` in the format is a stray literal, not part of the conversion, so the call to `fscanf()` as posted is not valid – user3629249 May 20 '19 at 23:47

3 Answers


The method names also differ, but is this version what you need? Ignore my rough `line_is_comment` function, left over from the first version, unless you want to experiment with it ;-)

Extended test input:

#Paths
A B #Path between A and B.
D C #Path between C and D.
E F
G H

Output:

 rest of line read
AB rest of line read
DC rest of line read
EF rest of line read
GH rest of line read
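#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Unused leftover from the first version: truncates the line at '#'. */
bool line_is_comment(char *s)
{
    char *commentPos = strchr(s, '#'); /* plain C, so no const_cast */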
    if(commentPos != NULL) {
        *commentPos = 0; // cut-off chars after comment
        //return true; // or false then to accept the line
        return commentPos == s;
    }
    return false;
}

#define BUFSIZE 50

int main()
{
    FILE *file;
    char ch[BUFSIZE];
    file = fopen("e.txt", "r");
    if (file == NULL) {
        printf("Error\n");
        fprintf(stderr, "ERROR: No file input\n");
        exit(EXIT_FAILURE);
    }
    int x;
    while(!feof(file)) {
        x = fscanf(file, "%40[0-9a-zA-Z]", ch); /* stray trailing 's' dropped from the format */
        if(x == 0) {
            ch[0] = fgetc(file);
            if(ch[0] == '#' || ch[0] == '\n') {
                if(ch[0] != '\n') fgets(ch, BUFSIZE, file);
                printf(" rest of line read\n");
            }
        } else if(x<0) break;
        else {
                 printf("%c",*ch); // continue with ... undisclosed part here
            }
    }

    return 0;
}
Jonathan Leffler
Jan
  • [while(!feof(file)) is always wrong](https://stackoverflow.com/questions/5431941/why-is-while-feoffile-always-wrong) – user3629249 May 20 '19 at 23:39
  • OT: regarding `printf("Error\n"); fprintf(stderr, "ERROR: No file input\n");`: when a C library function returns an error indication, the program should output both your own error message and the reason the system gives for the error. Suggest `perror( "fopen failed" );` – user3629249 May 20 '19 at 23:41
  • Regarding `#include <iostream>`: `iostream` is a C++ header file, not a C header file, and the OP tagged the question as a C question – user3629249 May 20 '19 at 23:44
  • See [`while (!feof(file))` is always wrong](https://stackoverflow.com/questions/5431941/while-feof-file-is-always-wrong) — though here it is almost OK. That's very eccentric indentation (and spacing) of `else if (x < 0) break;`. – Jonathan Leffler May 21 '19 at 00:24
  • It would be nice to mark some answer as accepted if you think it was helpful (check mark under voting buttons) ;-) http://www.cplusplus.com/reference/cstdio/feof/ I know example linux has everything different or opposite including -1 end of file returned as unsigned type, default signed char in some embedded environments, etc., but at least sometimes it should work according to that reference (also tested in VS 2008 Express/ W 7) ;-) Btw, is there anyone using strictly oldschool "wild" C unless having an obscure compiler (anyway left the rest from original) ? – Jan May 21 '19 at 04:49
  • Sorry for the header - it was in another answer, but here it came from default VS (C++) project and left after cleaning. Anyway core content is while loop and as original was without headers, suppose he can copy/paste only part of interest... And nice to know there is this C format comment. – Jan May 21 '19 at 05:05
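
The comments above suggest driving the loop by the return value of the read call rather than by `feof()`. A minimal sketch of that idea, assuming the same token format as the answer (up to 40 alphanumeric characters); the helper name `next_token` is hypothetical and not part of the original answer:

```c
#include <stdio.h>

/* Hypothetical helper (not from the original answer): reads the next
   alphanumeric token of up to 40 characters from fp into tok, skipping
   '#' comments. Returns 1 when a token was read, 0 at end of input. */
int next_token(FILE *fp, char tok[41])
{
    for (;;) {
        int n = fscanf(fp, "%40[0-9a-zA-Z]", tok);
        if (n == 1)
            return 1;               /* got a token */
        if (n == EOF)
            return 0;               /* end of file or read error */

        /* n == 0: the next character is not alphanumeric, so consume it */
        int c = fgetc(fp);
        if (c == EOF)
            return 0;
        if (c == '#') {
            /* discard the rest of the comment, including the newline */
            while ((c = fgetc(fp)) != EOF && c != '\n')
                ;
        }
    }
}
```

A caller could then loop with `while (next_token(file, ch)) { ... }`, so the loop ends as soon as a read fails, and could report a failed `fopen` with `perror("fopen failed")` as suggested above.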

The following proposed code:

  1. performs the desired functionality
  2. compiles cleanly
  3. properly checks for errors
  4. uses a state machine based on the flag `InComment`

And now, the proposed code:

#include <stdio.h>
#include <stdlib.h>

int main( void )
{
    int InComment = 0;

    FILE *fp = fopen( "file.txt", "r" );
    if( !fp )
    {
        perror( "fopen to read -file.txt- failed" );
        exit( EXIT_FAILURE );
    }

    int ch;

    while( (ch = fgetc(fp)) != EOF )
    {
        if( ch == '#' )
        {
            InComment = 1;
        }

        else if( ch == '\n' )
        {
            InComment = 0;
            fputc( ch, stdout );
        }

        else if( !InComment )
        {
            fputc( ch, stdout );
        }
    }
    fclose( fp );
}
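
If the same filtering is needed on more than one stream, the state machine can be pulled out of `main`. The following is only a sketch under that assumption; the function name `strip_comments` and its two-stream signature are hypothetical, not part of the answer above:

```c
#include <stdio.h>

/* Hypothetical wrapper around the same InComment state machine:
   copies `in` to `out`, dropping everything from '#' up to (but not
   including) the end of the line. */
void strip_comments(FILE *in, FILE *out)
{
    int in_comment = 0;
    int ch;

    while ((ch = fgetc(in)) != EOF) {
        if (ch == '\n') {
            in_comment = 0;      /* a newline always ends a comment */
            fputc(ch, out);
        } else if (ch == '#') {
            in_comment = 1;      /* start skipping characters */
        } else if (!in_comment) {
            fputc(ch, out);
        }
    }
}
```

Calling `strip_comments(fp, stdout)` after a successful `fopen` would behave like the loop above; a line that holds only a comment still produces an empty output line.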
user3629249

You can also make use of strcspn to trim all comments (and if not present, trim the line-endings from your buffer) in a single simple call. Where you would normally trim the line-ending from the buffer read by fgets() with:

        ch[strcspn (ch, "\r\n")] = 0;  /* trim line-ending */

You can simply add the "#" character to your reject list and nul-terminate there if a comment is present. That would reduce the complete task of removing comments beginning with '#' and outputting the newly formatted line to:

    while (fgets (ch, BUFSIZE, fp)) {   /* read every line */
        ch[strcspn (ch, "#\r\n")] = 0;  /* trim comment or line-ending */
        puts (ch);                      /* output line w/o comment */
    }

Here is a short example that takes the file to read as the first argument to the program (or reads from stdin by default if no argument is given):

#include <stdio.h>
#include <string.h>

#define BUFSIZE 1024    /* if you need a constant, #define one (or more) */

int main (int argc, char **argv) {

    char ch[BUFSIZE];
    /* use filename provided as 1st argument (stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }

    while (fgets (ch, BUFSIZE, fp)) {   /* read every line */
        ch[strcspn (ch, "#\r\n")] = 0;  /* trim comment or line-ending */
        puts (ch);                      /* output line w/o comment */
    }

    if (fp != stdin) fclose (fp);       /* close file if not stdin */

    return 0;
}

Example Input File

Borrowing Tom's example file :)

$ cat dat/comments_file.txt
#Paths
A B #Path between A and B.
D C #Path between C and D.
E F
G H

Example Use/Output

$ ./bin/comments_remove <dat/comments_file.txt

A B
D C
E F
G H
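
Since the question is ultimately about reading the two node names, the trimmed line can be handed straight to `sscanf`. This is a rough sketch, assuming node names are whitespace-separated and at most 40 characters long; the helper `parse_edge` is a hypothetical name introduced here for illustration:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: trims a '#' comment or line ending from `line`
   in place, then tries to read two whitespace-separated node names.
   Returns 1 if both names were found, 0 otherwise (for example for
   blank or comment-only lines). */
int parse_edge(char *line, char from[41], char to[41])
{
    line[strcspn(line, "#\r\n")] = '\0';   /* trim comment or line ending */
    return sscanf(line, "%40s %40s", from, to) == 2;
}
```

Inside the `fgets` loop, `if (parse_edge(ch, a, b))` would then give the pair for each path line and skip lines such as `#Paths`, with `a` and `b` declared as `char[41]` buffers.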

Look things over and let me know if you have further questions.

David C. Rankin