I have a comma-delimited list of boats and their specs that I need to read into a struct. Each line describes a different boat along with its specs, so I have to read the file line by line.

Sample Input File (the file I'll be using has over 20 lines):

pontoon,Crest,Carribean RS 230 SLC,1,Suzuki,115,Blue,26,134595.00,135945.00,1,200,0,250,450,450,0
fishing,Key West,239 FS,1,Mercury,250,Orange,24,86430.00,87630.00,0,0,250,200,500,250,0
sport boat,Tahoe,T16,1,Yamaha,300,Yellow,22,26895.00,27745.00,0,250,0,0,350,250,0

I have a linked-list node type, watercraft_t:

typedef struct watercraft {
    char type[15];     // e.g. pontoon, sport boat, sailboat, fishing, 
                       //      canoe, kayak, jetski, etc.
    char make[20];
    char model[30];
    int propulsion;    // 0 = none; 1 = outBoard; 2 = inBoard; 
    char engine[15];   // Suzuki, Yamaha, etc.
    int hp;             // horse power  
    char color[25];
    int length;        // feet
    double base_price;
    double total_price;
    accessories_t extras;
    struct watercraft *next;
} watercraft_t;

My main function opens the file and stores it in a pointer:

FILE * fp = fopen(argv[1], "r"); // Opens file got from command line arg

This file is then passed to a function that should parse exactly 1 line and then return that node to be placed inside of a linked list.

 // Create watercrafts from the info in file
watercraft_t *new_waterCraft( FILE *inFile )
{
    watercraft_t *newNode;

    newNode = (watercraft_t*)malloc(sizeof(watercraft_t));

    fscanf(inFile, "%s %s %s %d %s %d %s %d %lf %lf", newNode->type, newNode->make, newNode->model, &(newNode->propulsion), newNode->engine, &(newNode->hp), newNode->color, &(newNode->length), &(newNode->base_price), &(newNode->total_price));

    return newNode;
}

When calling a function to print just the type of each boat this is the result:

1. pontoon,Crest,CRS
2. SLC,1,Suzuki,11fishing,Key
3. SLC,1,Suzuki,11fishing,Key
4. SLC,1,Suzuki,11fishing,Key
5. SLC,1,Suzuki,11fishing,Key
6. SLC,1,Suzuki,11fishing,Key
7. SLC,1,Suzuki,11fishing,Key
8. SLC,1,Suzuki,11fishing,Key
9. SLC,1,Suzuki,11fishing,Key
10. SLC,1,Suzuki,11fishing,Key
11. SLC,1,Suzuki,11fishing,Key
12. SLC,1,Suzuki,11fishing,Key
13. SLC,1,Suzuki,11fishing,Key
14. SLC,1,Suzuki,11fishing,Key
15. SLC,1,Suzuki,11fishing,Key
16. SLC,1,Suzuki,11fishing,Key
17. SLC,1,Suzuki,11fishing,Key

I've narrowed the issue down to how I'm reading the values from the file with fscanf.

The first thing I attempted was to use %*c in between all of my placeholders but after running that, my output looked exactly the same. The next thing I realized is that I won't be able to use fscanf because the text file will have whitespace that needs to be read.

My next thought was to use fgets but I don't think I will be able to use that either because I'm not sure how many characters will have to be read each time. I just need it to stop reading at the end of the line while separating the values by the comma.

I've been searching for answers for a few hours now but nothing has seemed to work so far.

anastaciu
Drake Owen

3 Answers

With %s, input is consumed until whitespace (a space, tab, or newline) is found. For the first line of your file, fscanf therefore stores "pontoon,Crest,Carribean" in type and stops at the space. That token is 23 characters long, so it also overflows the 15-byte type buffer, which is undefined behavior.
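A minimal sketch of this behavior, using sscanf on a string instead of fscanf on a file (both follow the same conversion rules). The helper name first_token and the 64-byte buffer are assumptions made for illustration; they are not part of the question's code:

```c
#include <stdio.h>

/* Hypothetical helper: copies what a single %s conversion extracts from
 * `line` into `out`, which must hold at least 64 bytes.
 * Returns 1 if a token was read, 0 otherwise. */
int first_token(const char *line, char out[64])
{
    /* %63s stops at the first whitespace character and never writes
     * more than 63 characters plus the terminator. */
    return sscanf(line, "%63s", out) == 1;
}
```

Called on the first sample line, this leaves "pontoon,Crest,Carribean" in out: the commas are treated as ordinary characters and only the space ends the token.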

The fscanf format string must match the layout of each line in the file, including the commas, so you need something like this:

" %14[^,], %19[^,], %29[^,], %d , %14[^,], %d , %24[^,], %d , %lf , %lf /*...*/"

(Note the space at the beginning of the format string: it skips leftover whitespace from the previous read, such as the newline that ends the preceding line.)

The %[^,] specifier makes fscanf read until a comma is found or the width limit is reached. Unlike %s, it also accepts spaces. Using %14[^,] additionally avoids undefined behavior from a buffer overflow, because it limits the read to 14 characters plus the null terminator, which matches the 15-byte buffer.

Reading each line with fgets is also a good idea. You can then convert the values with sscanf, which works the same way as fscanf but reads from a string.

I would advise you to check the return value of *scanf to make sure that the expected number of fields was read.
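As a rough sketch, the fgets plus sscanf combination with a return-value check could look like this. The names boat_t, parse_boat and read_boat are hypothetical, boat_t is a reduced stand-in for watercraft_t (the extras and next members are left out because accessories_t is not shown in the question), and the 256-byte line buffer is an assumption about the maximum line length:

```c
#include <stdio.h>

/* Reduced, hypothetical stand-in for watercraft_t: first ten fields only. */
typedef struct {
    char type[15];
    char make[20];
    char model[30];
    int propulsion;
    char engine[15];
    int hp;
    char color[25];
    int length;
    double base_price;
    double total_price;
} boat_t;

/* Parses one CSV line into *b.
 * Returns 1 if all ten fields were converted, 0 otherwise. */
int parse_boat(const char *line, boat_t *b)
{
    int n = sscanf(line,
                   " %14[^,], %19[^,], %29[^,], %d , %14[^,], %d , %24[^,], %d , %lf , %lf",
                   b->type, b->make, b->model, &b->propulsion,
                   b->engine, &b->hp, b->color, &b->length,
                   &b->base_price, &b->total_price);
    return n == 10;   /* sscanf reports how many conversions succeeded */
}

/* Reads the next line from fp and parses it.
 * Returns 1 on success, 0 on a malformed line, -1 at end of file.
 * Assumes no line is longer than 255 characters. */
int read_boat(FILE *fp, boat_t *b)
{
    char line[256];
    if (fgets(line, sizeof line, fp) == NULL)
        return -1;
    return parse_boat(line, b);
}
```

Because fgets consumes the whole line before any conversion happens, a malformed line cannot leave the stream stuck in the middle of a record, which is what produces the repeated output shown in the question.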

anastaciu

My next thought was to use fgets but I don't think I will be able to use that either because I'm not sure how many characters will have to be read each time. I just need it to stop reading at the end of the line while separating the values by the comma.

That is not a bad approach. You can measure the length of each line first and then allocate a buffer of exactly the right size:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    FILE *fp = fopen("in.txt", "r");

    while(1) {
        int length = 0;
        int ch;
        long offset= ftell(fp);

        // Calculate length of next line
        while((ch = fgetc(fp)) != '\n' && ch != EOF)
            length++;
        
        // Go back to beginning of line
        fseek(fp, offset, SEEK_SET);

        // If EOF, it's the last line, and if the length is zero, we're done
        if(ch == EOF && length == 0)
            break;
    
        // Allocate space for line plus BOTH \n AND \0
        char *buffer = malloc(length+2);

        // Read the line
        fgets(buffer, length+2, fp);

        // Do something with the buffer, for instance with sscanf

        // Cleanup
        free(buffer);
        if(ch == EOF) break;
    }

    fclose(fp);
    return 0;
}

I have omitted all error checking to keep the code short.
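A different technique that avoids scanning each line twice is to keep calling fgets and grow the buffer until the newline shows up. The following is only a sketch; the helper name read_line and the initial capacity of 64 bytes are arbitrary choices made for illustration:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line of any length from fp.
 * Returns a malloc'd string without the trailing newline (caller frees),
 * or NULL at end of file or on allocation failure. */
char *read_line(FILE *fp)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    /* Each fgets call appends at most cap - len - 1 characters. */
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';      /* complete line: drop the newline */
            return buf;
        }
        /* No newline yet: double the buffer and read the next chunk. */
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (len == 0) {                   /* nothing was read: end of file */
        free(buf);
        return NULL;
    }
    return buf;                       /* last line had no newline */
}
```

The returned string can be passed straight to sscanf and must be released with free once it has been parsed.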

klutt

I wouldn't recommend the fscanf approach: it is error-prone and inflexible.

Here is an example that parses a CSV file:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    #define LINE_MAX 1024
    char line_buf[LINE_MAX];
    char *line = line_buf;

    char *delim;
    int sep = ',';

    #define FIELD_MAX 128
    char field[FIELD_MAX];

    if (argc < 2) {
        fprintf(stderr, "usage: %s file.csv\n", argv[0]);
        return EXIT_FAILURE;
    }

    FILE *fp = fopen(argv[1], "r");
    if (fp == NULL) {
        perror(argv[1]);
        return EXIT_FAILURE;
    }

    //foreach line
    while ((line = fgets(line_buf, LINE_MAX, fp))) {

        //strip the trailing newline, if any, so it never ends up in a field
        line[strcspn(line, "\r\n")] = '\0';

        //iterate over the line; line_end points to the terminating '\0'
        for (char *line_end = line + strlen(line); line < line_end;) {

            //search for a separator
            delim = strchr(line, sep);

            //strchr returns NULL if no separator was found
            if (delim == NULL) {
                //we set delim to the end of line,
                //because we want to process the remaining chars (field)
                delim = line_end;
            }

            //the first character of field is at 'line'
            //the last character of field is at delim - 1
            //delim points to the separator

            //e.g.
            size_t len = delim - line;
            if (len >= FIELD_MAX) {
                //truncate fields that do not fit into the buffer
                len = FIELD_MAX - 1;
            }
            memcpy(field, line, len);
            field[len] = '\0';
            printf("%s ", field);
            //end of example

            //set the position to the next character after the separator
            line = delim + 1;

        }

        printf("\n");

    }

    fclose(fp);
    return EXIT_SUCCESS;
}

Note: the trailing newline is stripped before the line is split, so the last field is complete even when the final line of the file does not end with a newline.

Erdal Küçük