71

I have been working on a small exercise for my CIS class and am very confused by the methods C uses to read from a file. All that I really need to do is read through a file line by line and use the information gathered from each line to do a few manipulations. I tried using the getline method and others with no luck. My code is currently as follows:

int main(char *argc, char* argv[]){
      const char *filename = argv[0];
      FILE *file = fopen(filename, "r");
      char *line = NULL;

      while(!feof(file)){
        sscanf(line, filename, "%s");
        printf("%s\n", line);
      }
    return 1;
}

Right now I am getting a seg fault with the sscanf method and I am not sure why. I am a total C noob and just wondering if there was some big picture thing that I was missing. Thanks

Dan Bradbury
  • 2,071
  • 2
  • 21
  • 27

4 Answers4

156

So many problems in so few lines. I probably forget some:

  • argv[0] is the program name, not the first argument;
  • if you want to read in a variable, you have to allocate its memory
  • one never loops on feof, one loops on an IO function until it fails, feof then serves to determinate the reason of failure,
  • sscanf is there to parse a line, if you want to parse a file, use fscanf,
  • "%s" will stop at the first space as a format for the ?scanf family
  • to read a line, the standard function is fgets,
  • returning 1 from main means failure

So

#include <stdio.h>

int main(int argc, char* argv[])
{
    char const* const fileName = argv[1]; /* should check that argc > 1 */
    FILE* file = fopen(fileName, "r"); /* should check the result */
    char line[256];

    while (fgets(line, sizeof(line), file)) {
        /* note that fgets don't strip the terminating \n, checking its
           presence would allow to handle lines longer that sizeof(line) */
        printf("%s", line); 
    }
    /* may check feof here to make a difference between eof and io failure -- network
       timeout for instance */

    fclose(file);

    return 0;
}
bzeaman
  • 1,128
  • 11
  • 28
AProgrammer
  • 51,233
  • 8
  • 91
  • 143
  • 29
    don't forget `fclose(file)` before return. – vivisidea Feb 18 '14 at 09:46
  • 8
    the `fclose(file)` is actually not necessary, since it's happening in `main` and it'll automatically close all opened file buffers. – Leandros Aug 09 '15 at 13:52
  • 20
    @Leandros it's always better to be safe, than sorry! – vallentin Jul 29 '16 at 21:26
  • 2
    Still good to have for beginners, because sometimes it is necessary even at the end of main. `FILE*` objects are buffered in C, so if data was being written to a file and `fclose` wasn't called some of the data might not get flushed. – rovaughn Sep 01 '16 at 17:20
  • 1
    Hi, @alecRN: Are you sure? AFAIK, buffered output on a stream is flushed automatically when the program terminates by calling exit (see: https://www.gnu.org/software/libc/manual/html_node/Flushing-Buffers.html ), and the OS will decides when to flush(can call fsync). There's an implicit call to exit_group at the end of the execution, you can see it using strace and nm. I suppose it's not added by gcc because there's no such symbol, probably it's added by the runtime. Even _exit closes the open file descriptors. Anyway I agree with you that it's a good habit to close open file explicitly /Ángel – Angel Jan 10 '17 at 09:02
  • Since when and why "not loop on feof()"? I never saw that "best practice" before... – BitTickler Sep 24 '17 at 20:25
  • 1
    @BitTickler, I was confused about feof() as well. Until I accidentially spelled a file name wrong and the program ended up in an infinite loop. – KANJICODER Mar 03 '19 at 21:16
9

To read a line from a file, you should use the fgets function: It reads a string from the specified file up to either a newline character or EOF.

The use of sscanf in your code would not work at all, as you use filename as your format string for reading from line into a constant string literal %s.

The reason for SEGV is that you write into the non-allocated memory pointed to by line.

Abrixas2
  • 3,207
  • 1
  • 20
  • 22
6

In addition to the other answers, on a recent C library (Posix 2008 compliant), you could use getline. See this answer (to a related question).

Community
  • 1
  • 1
Basile Starynkevitch
  • 223,805
  • 18
  • 296
  • 547
5

Say you're dealing with some other delimiter, such as a \t tab, instead of a \n newline.

A more general approach to delimiters is the use of getc(), which grabs one character at a time.

Note that getc() returns an int, so that we can test for equality with EOF.

Secondly, we define an array line[BUFFER_MAX_LENGTH] of type char, in order to store up to BUFFER_MAX_LENGTH-1 characters on the stack (we have to save that last character for a \0 terminator character).

Use of an array avoids the need to use malloc and free to create a character pointer of the right length on the heap.

#define BUFFER_MAX_LENGTH 1024

int main(int argc, char* argv[])
{
    FILE *file = NULL;
    char line[BUFFER_MAX_LENGTH];
    int tempChar;
    unsigned int tempCharIdx = 0U;

    if (argc == 2)
         file = fopen(argv[1], "r");
    else {
         fprintf(stderr, "error: wrong number of arguments\n"
                         "usage: %s textfile\n", argv[0]);
         return EXIT_FAILURE;
    }

    if (!file) {
         fprintf(stderr, "error: could not open textfile: %s\n", argv[1]);
         return EXIT_FAILURE;
    }

    /* get a character from the file pointer */
    while(tempChar = fgetc(file))
    {
        /* avoid buffer overflow error */
        if (tempCharIdx == BUFFER_MAX_LENGTH) {
            fprintf(stderr, "error: line is too long. increase BUFFER_MAX_LENGTH.\n");
            return EXIT_FAILURE;
        }

        /* test character value */
        if (tempChar == EOF) {
            line[tempCharIdx] = '\0';
            fprintf(stdout, "%s\n", line);
            break;
        }
        else if (tempChar == '\n') {
            line[tempCharIdx] = '\0';
            tempCharIdx = 0U;
            fprintf(stdout, "%s\n", line);
            continue;
        }
        else
            line[tempCharIdx++] = (char)tempChar;
    }

    return EXIT_SUCCESS;
}

If you must use a char *, then you can still use this code, but you strdup() the line[] array, once it is filled up with a line's worth of input. You must free this duplicated string once you're done with it, or you'll get a memory leak:

#define BUFFER_MAX_LENGTH 1024

int main(int argc, char* argv[])
{
    FILE *file = NULL;
    char line[BUFFER_MAX_LENGTH];
    int tempChar;
    unsigned int tempCharIdx = 0U;
    char *dynamicLine = NULL;

    if (argc == 2)
         file = fopen(argv[1], "r");
    else {
         fprintf(stderr, "error: wrong number of arguments\n"
                         "usage: %s textfile\n", argv[0]);
         return EXIT_FAILURE;
    }

    if (!file) {
         fprintf(stderr, "error: could not open textfile: %s\n", argv[1]);
         return EXIT_FAILURE;
    }

    while(tempChar = fgetc(file))
    {
        /* avoid buffer overflow error */
        if (tempCharIdx == BUFFER_MAX_LENGTH) {
            fprintf(stderr, "error: line is too long. increase BUFFER_MAX_LENGTH.\n");
            return EXIT_FAILURE;
        }

        /* test character value */
        if (tempChar == EOF) {
            line[tempCharIdx] = '\0';
            dynamicLine = strdup(line);
            fprintf(stdout, "%s\n", dynamicLine);
            free(dynamicLine);
            dynamicLine = NULL;
            break;
        }
        else if (tempChar == '\n') {
            line[tempCharIdx] = '\0';
            tempCharIdx = 0U;
            dynamicLine = strdup(line);
            fprintf(stdout, "%s\n", dynamicLine);
            free(dynamicLine);
            dynamicLine = NULL;
            continue;
        }
        else
            line[tempCharIdx++] = (char)tempChar;
    }

    return EXIT_SUCCESS;
}
Alex Reynolds
  • 95,983
  • 54
  • 240
  • 345
  • 1
    I'll down-vote any `while(!feof(file))` even the case occurring once it a blue moon where it isn't damageable (Note that here it will probably never been true, there is a break to leave the loop in that case, `while (true)` would work as well.) There are too many who think it is the correct idiom. – AProgrammer Feb 09 '12 at 14:18
  • I had no idea that was an issue. I'd honestly like to learn more about this. What are the problems with that usage? – Alex Reynolds Feb 10 '12 at 01:38
  • There are at lot of questions where this came up, http://stackoverflow.com/questions/5431941/while-feof-file-is-always-wrong for instance. – AProgrammer Feb 10 '12 at 10:22
  • 2
    Okay, I fixed the loops. Thanks for the pointer. I learn something new every day. – Alex Reynolds Feb 10 '12 at 22:00