-1

I am trying to read in a file that contains numbers separated by commas and store them in an array without the commas.

For example: processes.txt contains

0,1,3
1,0,5
2,9,8
3,10,6

And an array called numbers should look like:

0 1 3 1 0 5 2 9 8 3 10 6

The code I have so far is:

FILE *fp1;
char c; //declaration of characters

fp1=fopen(argv[1],"r"); //opening the file



int list[300];


c=fgetc(fp1); //taking character from fp1 pointer or file
int i=0,number,num=0;

while(c!=EOF){ //iterate until end of file
    if (isdigit(c)){ //if it is digit
        sscanf(&c,"%d",&number); //changing character to number (c)
        num=(num*10)+number;

    }
    else if (c==',' || c=='\n') { //if it is new line or ,then it will store the number in list
        list[i]=num;
        num=0;
        i++;

    }

    c=fgetc(fp1);

}

But this has problems when a number has two digits. Does anyone have a better solution? Thank you!

Jabberwocky
ANT12
  • 3
    Please note that [`fgetc`](https://en.cppreference.com/w/c/io/fgetc) return an **`int`**. This is actually very important for your `EOF` check. – Some programmer dude Mar 22 '19 at 14:10
  • 3
    Change `c` to an `int`, otherwise `while(c!=EOF)` will not work. `sscanf(&c,"%d",&number);` should be `number = c - '0';`. – mch Mar 22 '19 at 14:10
  • 2
    Furthermore note that `&c` is not a null-terminated string. You can't use it as such (for example as a source in `sscanf`). – Some programmer dude Mar 22 '19 at 14:11
  • A very common mistake. It must be an int because otherwise EOF would have to be in the range 0 to 255, which are all valid char values. – Neil Mar 22 '19 at 14:12
  • Should be `while(c != EOF && i < 300)` for security reasons :) – Igor Galczak Mar 22 '19 at 14:20
  • Look at the function `strtok` (I prefer to use the reentrant version `strtok_r`) it should be easier. – Chromz Mar 22 '19 at 14:25
  • Note: a line may not be empty and must have at least one number (otherwise an entry will be made in the array for the empty line). – Paul Ogilvie Mar 22 '19 at 14:37
  • 2
    [Why must the variable used to hold getchar's return value be declared as int?](https://stackoverflow.com/q/18013167/995714) – phuclv Mar 22 '19 at 14:48
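
Putting the comments above together, a corrected character-by-character loop might look like the sketch below. The helper name `parse_digits` and its signature are made up for illustration. It assumes non-negative numbers that fit in an `int`, and unlike the original code it treats any non-digit as a separator, which also avoids storing a spurious entry for an empty line.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper (not from the question): reads non-negative integers
 * separated by any non-digit characters from fp into list, storing at most
 * max values. Returns how many values were stored. */
size_t parse_digits(FILE *fp, int *list, size_t max)
{
    size_t count = 0;
    int num = 0;
    int in_number = 0;  /* set once at least one digit has been seen */
    int c;              /* int, not char, so EOF can be told apart */

    while (count < max && (c = fgetc(fp)) != EOF) {
        if (isdigit(c)) {
            num = num * 10 + (c - '0');  /* digit character to its value */
            in_number = 1;
        } else if (in_number) {
            /* the first non-digit after a digit ends the current number */
            list[count++] = num;
            num = 0;
            in_number = 0;
        }
    }
    /* the file may end without a trailing newline */
    if (in_number && count < max)
        list[count++] = num;
    return count;
}
```

With the question's variables this would be called as `parse_digits(fp1, list, 300)`. Overflow of `num` is not detected in this sketch.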

3 Answers

2

For the data shown with no space before the commas, you could simply use:

while (fscanf(fp1, "%d,", &num) == 1 && i < 300)
    list[i++] = num;

This will read the comma after the number if there is one, silently ignoring when there isn't one. If there might be white space before the commas in the data, add a blank before the comma in the format string. The test on i prevents you writing outside the bounds of the list array. The ++ operator comes into its own here.
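
For context, here is a minimal sketch of that loop wrapped in a function. The name `read_numbers` and its parameters are assumptions for illustration, not part of the answer. The only deliberate change is that the bounds test comes first, so a value is not consumed from the file when there is no room left to store it.

```c
#include <stdio.h>

/* Hypothetical wrapper around the fscanf loop: reads up to max integers
 * from fp into list and returns how many were stored. */
size_t read_numbers(FILE *fp, int *list, size_t max)
{
    size_t count = 0;
    int num;

    /* "%d" skips leading white space (including newlines) and reads a
     * number; "," then consumes a following comma if one is present. */
    while (count < max && fscanf(fp, "%d,", &num) == 1)
        list[count++] = num;
    return count;
}
```

After `size_t n = read_numbers(fp1, list, 300);` the first `n` entries of `list` hold the values in file order.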

Jonathan Leffler
0

First, fgetc returns an int, so c needs to be an int.

Other than that, I would use a slightly different approach. I admit that it is slightly overcomplicated. However, this approach may be useful if you have several different types of fields that require different actions, as in a parser. For your specific problem, I recommend Jonathan Leffler's answer.

int c=fgetc(f);

while(c!=EOF && i<300) {
    if(isdigit(c)) {
        fseek(f, -1, SEEK_CUR);
        if(fscanf(f, "%d", &list[i++]) != 1) {
            // Handle error
        }
    }
    c=fgetc(f);
}

Here I don't care about commas and newlines. I take ANYTHING other than a digit as a separator. What I do is basically this:

read next byte
if byte is digit:
     back one byte in the file
     read number, regardless of length
else continue

The added condition i<300 is for security reasons: it stops the loop before it can write past the end of list. If you really want to check that nothing other than commas and newlines appears between the numbers (I did not get the impression that you found that important), you could easily add an else if (c == ... to handle the error.

Note that you should always check the return value for functions like sscanf, fscanf, scanf etc. Actually, you should also do that for fseek. In this situation it's not as important since this code is very unlikely to fail for that reason, so I left it out for readability. But in production code you SHOULD check it.

klutt
  • If `argv[1]` is `/dev/tty`, the seek won't work. You could use `ungetc()` reliably. I'm not convinced that the 'character at a time' approach is sensible when it can be done so succinctly with `fscanf()` — as in my answer. – Jonathan Leffler Mar 22 '19 at 14:50
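
A minimal sketch of the `ungetc()` variant suggested in the comment above, assuming the same "any non-digit is a separator" rule as the answer. The function name `read_numbers_ungetc` is hypothetical. The return values of `ungetc` and `fscanf` are checked here, and the loop simply stops on failure.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: like the fseek-based loop, but pushes the digit
 * back with ungetc so it also works on non-seekable streams. */
size_t read_numbers_ungetc(FILE *f, int *list, size_t max)
{
    size_t count = 0;
    int c;

    while (count < max && (c = fgetc(f)) != EOF) {
        if (isdigit(c)) {
            /* one character of pushback is guaranteed by the C standard */
            if (ungetc(c, f) == EOF)
                break;
            /* read the whole number, however many digits it has */
            if (fscanf(f, "%d", &list[count]) != 1)
                break;
            count++;
        }
        /* anything that is not a digit is skipped as a separator */
    }
    return count;
}
```

Because a minus sign is not a digit, it is skipped as a separator, so this sketch only handles non-negative numbers.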
0

My solution is to read the whole line first and then parse it with strtok_r, using a comma as the delimiter. strtok_r is a POSIX function; if you need strictly standard C, use strtok instead.

A naive implementation of readline would be something like this:

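#include <stdio.h>
#include <stdlib.h>

// Returns the next line without its '\n', or NULL at end of file or on
// allocation failure. The caller must free the returned string.
static char *readline(FILE *file)
{
    // Start with room for the terminating '\0' only
    char *line = malloc(sizeof(char));
    if (line == NULL) {
        return NULL;
    }
    int index = 0;
    int c = fgetc(file);
    if (c == EOF) {
        free(line);
        return NULL;
    }
    while (c != EOF && c != '\n') {
        line[index++] = c;
        // Grow by one byte so there is always room for one more character
        char *l = realloc(line, (index + 1) * sizeof(char));
        if (l == NULL) {
            free(line);
            return NULL;
        }
        line = l;
        c = fgetc(file);
    }
    line[index] = '\0';
    return line;
}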

Then you just need to parse the whole line with strtok_r, so you would end with something like this:

#include <errno.h>
#include <string.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        return 1;
    }
    FILE *file = fopen(argv[1], "re");
    int list[300];
    if (file == NULL) {
        return 1;
    }
    char *line;
    int numc = 0;
    // Stop reading once the list is full
    while (numc < 300 && (line = readline(file)) != NULL) {
        char *saveptr;
        // Get the first token
        char *tok = strtok_r(line, ",", &saveptr);
        // Now start parsing the whole line
        while (tok != NULL && numc < 300) {
            // Convert the token to a long if possible
            char *end;
            errno = 0;
            long num = strtol(tok, &end, 10);
            if (end == tok || errno != 0) {
                // Handle no conversion or an out-of-range value
                // ...
                // ...
            }
            list[numc++] = (int) num;
            // Get next token
            tok = strtok_r(NULL, ",", &saveptr);
        }
        free(line);
    }
    fclose(file);
    return 0;
}

And for printing the whole list just use a for loop:

for (int i = 0; i < numc; i++) {
    printf("%d ", list[i]);
}
printf("\n");
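
For the portable route mentioned at the start of this answer, here is a minimal sketch that uses only standard C: `fgets` into a fixed buffer, `strtok`, and `strtol` with an end pointer. The name `parse_lines` and the 256-byte buffer are assumptions for illustration. A line longer than the buffer would be read in pieces, which could split a number in two.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads comma-separated integers line by line from fp
 * into list, storing at most max values. Returns how many were stored.
 * Assumes every line, including its newline, fits in the buffer. */
size_t parse_lines(FILE *fp, int *list, size_t max)
{
    char line[256];
    size_t count = 0;

    while (count < max && fgets(line, sizeof line, fp) != NULL) {
        /* Split on commas and line endings; strtok skips empty fields. */
        for (char *tok = strtok(line, ",\r\n");
             tok != NULL && count < max;
             tok = strtok(NULL, ",\r\n")) {
            char *end;
            errno = 0;
            long value = strtol(tok, &end, 10);
            /* Skip tokens that are not entirely a number in int range. */
            if (end == tok || *end != '\0' || errno == ERANGE ||
                value < INT_MIN || value > INT_MAX)
                continue;
            list[count++] = (int)value;
        }
    }
    return count;
}
```

Skipping bad tokens is one possible policy; reporting an error to the caller would be just as reasonable.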
Chromz