I'm trying to read a CSV file line by line and split each line into its values at the comma delimiter. Once that works, the goal is to pass the resulting 2D array to a sophisticated model in C as its input. For this I'm using getline() and strtok().

I'm new to C, and I've spent weeks getting to this point, so please don't suggest different functions unless absolutely necessary. Below is what I have so far, with the warnings I'm getting noted where they occur. Could anyone help me figure out why this code won't produce the array? I think it may just be a pointer issue, but I've tried everything I can and it's not working.

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ARRAYLENGTH 9
#define ARRAYWIDTH 7

float data[ARRAYLENGTH][ARRAYWIDTH];

int main(void) {

  char *line = NULL;
  size_t len = 0;
  ssize_t read;

  FILE *fp;
  fp=fopen("airvariablesSend.csv", "r");

  if(fp == NULL){
    printf("cannot open file\n\n");
    return -1;
  }

  int k , l;
  char **token; //for parsing through line using strtok()

  char comma = ',';
  const char *SEARCH = &comma; //delimiter for csv 
  char *todata; 

  for (l=0; l< ARRAYLENGTH +1; l++){ 
    while ((read = getline(&line, &len, fp)) != -1) {

      //Getline() automatically resizes when you set pointer to NULL and 
      //len to 0 before you loop
      //Here, the delimiting character is the newline character in this 
      //form. Newline character included in csv file

      //printf("Retrieved line of length %zu :\n", read);
      printf("%s", line);

      //The first line prints fine here. Now once I try to parse through 
      //I start getting warnings:

      for(k = 0; k < ARRAYWIDTH; k++) { //repeats for max number of columns

        token = &line;
        while (strtok(token, SEARCH)){

          //I'm getting this warning for the while loop line:
          //warning: passing argument 1 of `strtok' from incompatible pointer type

          fscanf(&todata, "%f", token);

          //I'm getting these warnings for fscanf. I'm doing this because
          //my final values in the array to be floats to put in the  
          //model      

          //warning: passing argument 1 of `fscanf' from incompatible pointer type
          //warning: format `%f' expects type `float *', but argument 3 has type 
          // 'char **'  

          todata = &data[l][k];

          //And finally, I'm getting this warning telling me everything is 
          //incompatible.
          //warning: assignment from incompatible pointer type. 

          printf("%f", data[l][k]);
        }

      }

    }
  }       

  free(line);
  //Free memory allocated by getline()
  fclose(fp);
  //Close the file.
  exit(EXIT_SUCCESS);
  return 0;
}
Nisse Engström
  • `fscanf` takes a `FILE*` as its first argument. You are passing a `char**` to it. – Eugene Sh. Mar 23 '15 at 20:16
  • Also, no need for `strtok`, just use `strtof` or `strtod` to read the floating point values, using the value returned in `endptr` to advance to the read of the next number on successive calls. There are a number of examples on SO. – David C. Rankin Mar 23 '15 at 20:25
  • 'read' is the name of a POSIX library function (declared in unistd.h), so it is best not to reuse it as a variable name. – user3629249 Mar 25 '15 at 03:37
  • The loop is controlled by 'while (strtok(token, SEARCH)){' on every iteration. However, the first parameter of strtok should be a pointer to the string only on the first call, and NULL on all following calls for that string. – user3629249 Mar 25 '15 at 03:39
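
The points raised in these comments can be put together in a small sketch. It is only an illustration under assumptions: the helper name `split_floats` and its parameters are hypothetical and not part of the question's code. It shows the calling pattern the comments describe: `strtok()` gets the `char *` line on the first call and `NULL` afterwards, the delimiter is a NUL-terminated string rather than the address of a single `char`, and `sscanf()` (the string counterpart of `fscanf()`) writes straight into a `float`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the question): splits `line` at commas and
 * converts up to `max` fields to float. Returns the number of fields stored.
 * Note that strtok() modifies `line` in place. */
size_t split_floats(char *line, float *out, size_t max)
{
    assert(line != NULL && out != NULL);

    /* The delimiter must be a NUL-terminated string, not the address of a
     * single char, or strtok() reads past the end of it. */
    const char *delims = ",\r\n";
    size_t n = 0;

    /* First call: pass the string itself (char *, not char **). */
    char *tok = strtok(line, delims);
    while (tok != NULL && n < max) {
        /* sscanf() reads from a string; fscanf() would need a FILE *. */
        if (sscanf(tok, "%f", &out[n]) != 1)
            break;                      /* field is not a number */
        n++;
        /* Later calls: pass NULL to continue in the same string. */
        tok = strtok(NULL, delims);
    }
    return n;
}
```

Called once per line returned by `getline()`, with `out` pointing at one row of the 2D array, this would fill that row and report how many columns were found.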

2 Answers

This example shows how to use the function strtok(): call it once on the line, pch = strtok (line," ,");, then call it with NULL in the while loop, pch = strtok (NULL," ,");. The function sscanf() may be used to parse the string. It is similar to fscanf() for files, but be careful if you need to call it many times on the same string, since every call starts again from the beginning of the string it is given.
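
As a hedged illustration of the caveat about calling sscanf() repeatedly on one string: the `%n` conversion reports how many characters a call consumed, so the caller can advance the pointer itself. The helper name `scan_floats` below is hypothetical, and this is a sketch of one possible approach rather than the code used in this answer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parses comma-separated floats from `s` with sscanf(),
 * using %n to learn how many characters each call consumed.
 * Returns the number of values stored in `out`. */
size_t scan_floats(const char *s, float *out, size_t max)
{
    assert(s != NULL && out != NULL);

    size_t n = 0;
    int used = 0;

    /* %n does not count toward sscanf()'s return value, so 1 means
     * "one float converted". */
    while (n < max && sscanf(s, "%f%n", &out[n], &used) == 1) {
        n++;
        s += used;          /* step past the number just read */
        if (*s != ',')      /* anything but a comma ends the row */
            break;
        s++;                /* step past the comma */
    }
    return n;
}
```

Unlike strtok(), this leaves the input string unmodified, which can be convenient when the line is still needed afterwards.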

The bound of the outer for loop needs to change as well: it must be for (l=0; l< ARRAYLENGTH; l++), because looping up to ARRAYLENGTH + 1 would index one row past the end of data.

I suggest the following code, starting from yours:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ARRAYLENGTH 9
#define ARRAYWIDTH 7

float data[ARRAYLENGTH][ARRAYWIDTH];

int main(void) {

    char *line = NULL;
    size_t len = 0;
    ssize_t read;

    FILE *fp;
    fp=fopen("bla.csv", "r");

    if(fp == NULL){
        printf("cannot open file\n\n");
        return -1;
    }

    int k , l;
    //char **token; //for parsing through line using strtok()

    for (l=0; l< ARRAYLENGTH ; l++){
        /* read exactly one line per row; stop early at end of file */
        if ((read = getline(&line, &len, fp)) == -1) break;
        printf("%s", line);
        /* first call takes the line; later calls take NULL to continue */
        char *pch = strtok (line,",");
        for(k = 0; k < ARRAYWIDTH && pch != NULL; k++) {
            if(sscanf(pch, "%f",&data[l][k])!=1){printf("bad file\n");exit(1);}
            printf("%f\n", data[l][k]);
            pch = strtok (NULL, ","); //delimiter for csv
        }
    }
    if (line){
        free(line);
    }
    fclose(fp);
    return 0;
}
francis
  • Thank you @francis. My output is still a little wonky and it's repeating some values but this is 10x better than what I had before. At least I have individual values as output and not just the line! – spacedancechimp Mar 23 '15 at 20:54
  • Welcome to Stack Overflow! The sample code in my answer prints both the line and the obtained values. If you add a description of a failing case by editing your question, I may be able to help... EDIT: I think I saw the problem; I will edit my answer. – francis Mar 23 '15 at 21:01
Using getline:

While strtok is fine, it is unnecessary when converting the values directly to numbers with strtof, strtol, or the like. Unless you are using the values as strings, you will have to call a conversion routine (and do appropriate error checking) anyway. The conversion routines already set an end pointer for you that can be used to step through the input. The point being: why use two functions to accomplish what one was intended to do to begin with? The following makes use of getline and strtof:

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

#define ARRAYLENGTH 9
#define ARRAYWIDTH 7

int main (void) {

    char *line = NULL;      /* initialize ALL variables */
    size_t len = 0;
    ssize_t read = 0;
    float data[ARRAYLENGTH][ARRAYWIDTH] = {{0},{0}};
    size_t al = 0;          /* array length counter     */
    size_t aw = 0;          /* array width counter      */
    FILE *fp = NULL;

    if (!(fp = fopen ("airvariablesSend.csv", "r"))) {
        fprintf (stderr, "error: file open failed.\n");
        return 1;  /* do not return -1 to the shell */
    }

    while ((read = getline (&line, &len, fp)) != -1)
    {
        char *p = line;     /* pointer to line      */
        char *ep = NULL;    /* end pointer (strtof) */

        /* strip trailing '\n' or '\r' 
         * not req'd here, but good habit 
         */
        while (read > 0 && (line[read-1] == '\n' || line[read-1] == '\r'))
            line[--read] = 0;

        errno = 0;
        aw = 0;
        while (errno == 0)
        {
            /* parse/convert each number in line    */
            data[al][aw] = strtof (p, &ep);

            /* note: overflow/underflow checks omitted */
            /* if valid conversion to number */
            if (errno == 0 && p != ep)
            {
                aw++;                   /* increment index      */
                if (aw == ARRAYWIDTH)   /* check for full row   */
                    break;
                if (!ep) break;         /* check for end of str */
            }

            /* skip delimiters/move pointer to next (-) or digit;
             * '0' and '9' are digits, so the comparisons must be strict */
            while (*ep && *ep != '-' && (*ep < '0' || *ep > '9')) ep++;
            if (*ep)
                p = ep;
            else
                break;
        }

        al++;
        if (al == ARRAYLENGTH)          /* check length full    */
            break;
    }   

    if (line) free(line);
    if (fp) fclose(fp);

    printf ("\nArray Contents:\n\n");
    for (al = 0; al < ARRAYLENGTH; al++) {
        for (aw = 0; aw < ARRAYWIDTH; aw++)
            printf (" %8.2f", data[al][aw]);
        printf ("\n");
    }

    printf ("\n");

    exit(EXIT_SUCCESS);
}

Note: _GNU_SOURCE and string.h are unnecessary for this code, but have been left in case they are needed in the remainder of your code.
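
The end-pointer idea can also be isolated in a small helper. The following is a minimal sketch under assumptions, not the code from this answer: the name `parse_row` and its return convention are hypothetical. Instead of scanning forward for the next digit or minus sign, it lets strtof() report where the number ended and then steps over exactly one comma, so values that begin with `0`, `9`, `.` or `-` are all handled the same way.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: converts one comma-separated line into `row`.
 * Returns the number of values stored, or -1 on a malformed field. */
int parse_row(const char *line, float *row, int width)
{
    assert(line != NULL && row != NULL);

    const char *p = line;
    int n = 0;

    while (n < width) {
        char *ep = NULL;
        errno = 0;
        float v = strtof(p, &ep);      /* skips leading whitespace itself */
        if (ep == p || errno == ERANGE)
            return -1;                 /* nothing converted, or out of range */
        row[n++] = v;
        if (*ep != ',')                /* end of line (or unexpected char) */
            break;
        p = ep + 1;                    /* step over exactly one comma */
    }
    return n;
}
```

In the getline loop, a call such as `parse_row(line, data[al], ARRAYWIDTH)` would fill one row; comparing the result with ARRAYWIDTH tells the caller whether the row was complete.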

Input

$ cat airvariablesSend.csv

-1.21,2.30,3.41,4.52,5.63,6.74,7.85
1.21,-2.30,3.41,4.52,5.63,6.74,7.85
1.21,2.30,-3.41,4.52,5.63,6.74,7.85
1.21,2.30,3.41,-4.52,5.63,6.74,7.85
1.21,2.30,3.41,4.52,-5.63,6.74,7.85
1.21,2.30,3.41,4.52,5.63,-6.74,7.85
1.21,2.30,3.41,4.52,5.63,6.74,-7.85
1.21,2.30,3.41,4.52,5.63,-6.74,7.85
1.21,2.30,3.41,4.52,-5.63,6.74,7.85

Output

$ ./bin/getlinefloatcsv

Array Contents:

    -1.21     2.30     3.41     4.52     5.63     6.74     7.85
     1.21    -2.30     3.41     4.52     5.63     6.74     7.85
     1.21     2.30    -3.41     4.52     5.63     6.74     7.85
     1.21     2.30     3.41    -4.52     5.63     6.74     7.85
     1.21     2.30     3.41     4.52    -5.63     6.74     7.85
     1.21     2.30     3.41     4.52     5.63    -6.74     7.85
     1.21     2.30     3.41     4.52     5.63     6.74    -7.85
     1.21     2.30     3.41     4.52     5.63    -6.74     7.85
     1.21     2.30     3.41     4.52    -5.63     6.74     7.85

Using fscanf Only:

Of course, if your intention was to use fscanf and do away with getline, then your input routine reduces to:

#include <stdio.h>
#include <stdlib.h>

#define ARRAYLENGTH 9
#define ARRAYWIDTH 7

int main (void) {

    float data[ARRAYLENGTH][ARRAYWIDTH] = {{0},{0}};
    size_t al = 0;          /* array length counter     */
    size_t aw = 0;          /* array width counter      */
    FILE *fp = NULL;

    if (!(fp = fopen ("airvariablesSend.csv", "r"))) {
        fprintf (stderr, "error: file open failed.\n");
        return 1;  /* do not return -1 to the shell */
    }

    for (al =0; al < ARRAYLENGTH; al++)
        fscanf (fp, "%f,%f,%f,%f,%f,%f,%f", &data[al][0], &data[al][1], 
                &data[al][2], &data[al][3], &data[al][4], &data[al][5], &data[al][6]);

    if (fp) fclose(fp);

    printf ("\nArray Contents:\n\n");
    for (al = 0; al < ARRAYLENGTH; al++) {
        for (aw = 0; aw < ARRAYWIDTH; aw++)
            printf (" %8.2f", data[al][aw]);
        printf ("\n");
    }

    printf ("\n");

    exit(EXIT_SUCCESS);
}

However, note that using fscanf is far less flexible than using either getline or fgets. It relies on the input format string matching the data exactly to prevent a matching failure. While this is fine in some cases, where flexibility is needed, reading a line at a time with getline or fgets is the better choice. (All it takes is a stray character to torpedo the fscanf conversion.)
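
One way to notice such a matching failure is to check the value fscanf returns, which is the number of fields it converted (or EOF at end of input). The sketch below is an illustration only; the helper name `read_row` and the macro `ROW_WIDTH` are assumptions introduced here, with `ROW_WIDTH` meant to mirror ARRAYWIDTH.

```c
#include <assert.h>
#include <stdio.h>

#define ROW_WIDTH 7   /* assumed to match ARRAYWIDTH in the answer */

/* Hypothetical helper: reads one 7-column row with fscanf() and returns
 * how many fields were converted (EOF if input ends before any field). */
int read_row(FILE *fp, float row[ROW_WIDTH])
{
    assert(fp != NULL && row != NULL);
    return fscanf(fp, "%f,%f,%f,%f,%f,%f,%f",
                  &row[0], &row[1], &row[2], &row[3],
                  &row[4], &row[5], &row[6]);
}
```

A caller would treat any result other than ROW_WIDTH as a short or malformed row and stop reading, rather than continuing with partly filled data.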

David C. Rankin
  • Thank you so much @David C. Rankin ! All my values printed out perfectly as they are in the csv. using the 'strtof()' function. I used that and 'getline()' because I will have >3000 rows for my real data set. – spacedancechimp Mar 24 '15 at 13:52
  • Glad I could help. `getline` is the recommended choice for all line input. Just note that it is a POSIX function rather than standard C, so it isn't 100% portable to all compilers; some of the MS compilers have problems. However, nothing else will automatically allocate for you and return the number of characters actually read. One other **note:** if reading with `getline` (having it allocate) and passing the line to functions that alter the string (e.g. `strtok`, etc.), **make a copy of the line BEFORE** passing it on whenever you still need the original text. `strtok` overwrites the delimiters but leaves the pointer alone, so `free(line)` stays valid; the crash risk comes from functions such as `strsep` that advance the pointer you later pass to `free`. `:p` – David C. Rankin Mar 24 '15 at 14:25
  • Also, if my answer was helpful, you can **accept** the answer by clicking next to the number at the top-left. If somebody else's was more helpful, then accept theirs. – David C. Rankin Mar 24 '15 at 14:27
  • Okay great. Thanks! Yes your answer was the most helpful, I will definitely accept it but I have one more question before I do and the question goes unnoticed. @David C. Rankin , Is there a way to keep the negative sign attached to some of my values using the `getline` and `strtof` example? I tried changing the `len`, `read`, `al` and `aw` variables to `int` type but that didn't change the negatives back to negative. Is this possible here? – spacedancechimp Mar 24 '15 at 15:22
  • and I get this error: pointer targets in passing argument 2 of ‘getline’ differ in signedness – spacedancechimp Mar 24 '15 at 15:23
  • That generally means there is an `int` where there should be a `size_t` or `unsigned int`. (it is generally a **warning**, not **error**) Meaning, you are passing, or attempting to compare a `signed` with `unsigned` number. Make sure `argument 2` for `getline` is `size_t`. What/where is this error? **AND** (What compiler are you using??) Not a MS compiler is it? – David C. Rankin Mar 24 '15 at 15:29
  • The error occurs on the `getline` while loop. Yes I changed `len` and `read` and `al` and `aw` to `int`. I need the types to be signed because I have negative values in my data. – spacedancechimp Mar 24 '15 at 15:34
  • Oh, oh, **no, no, no, no**, don't change the variable types. I'll fix the code. It is skipping the `-` signs `:p` – David C. Rankin Mar 24 '15 at 15:35
  • Note the change to the delimiter-skipping `while` line in the `getline` example: it just adds `*ep != '-'`, which means don't skip the `-` signs `:p` – David C. Rankin Mar 24 '15 at 15:41
  • The OP asked not to change the functions used, so -1. – user3629249 Mar 25 '15 at 04:00
  • You idiot! Do you not see the extended conversation above your comment with the OP. Grow up and learn to read. This kind of BS is not tolerated on SO. – David C. Rankin Mar 25 '15 at 05:54
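
The copy-before-tokenizing advice from the comments can be sketched as follows. This is an assumption-laden illustration: the helper name `count_fields` is hypothetical, and it simply counts fields so that the effect on the original line is easy to check. Because strtok() writes '\0' over each delimiter it finds, working on a private copy leaves the buffer owned by getline untouched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts comma-separated fields without touching
 * `line`, by tokenizing a private copy. Returns 0 if allocation fails. */
size_t count_fields(const char *line)
{
    assert(line != NULL);

    size_t len = strlen(line) + 1;  /* include the terminating '\0' */
    char *copy = malloc(len);
    if (copy == NULL)
        return 0;
    memcpy(copy, line, len);        /* strtok() will write into this copy */

    size_t n = 0;
    for (char *tok = strtok(copy, ",\r\n"); tok != NULL;
         tok = strtok(NULL, ",\r\n"))
        n++;

    free(copy);                     /* free the copy, not the original */
    return n;
}
```

The pointer returned by getline is never modified here, so the usual `free(line)` after the read loop remains correct.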