I'm currently trying to parse an input file in C to extract variables.

The input file looks like:

% this is a comment
x 3
% another comment
y 5.0
% one last comment
z 4

x, y and z are predefined variables in my C program. My goal is to parse this file so that the int x has the value 3, y has the value 5.0, and z has the value 4. Any line starting with % is ignored.

I've managed to do this using fgets and sscanf - here is the attempt:

while (!feof(inputFile)) {
    fgets(ch,500,inputFile);
    if (sscanf(ch, "x %d", &x)) {
        printf("x: %d\n", x);
    } else if (sscanf(ch, "y %lf", &y)) {
        printf("y: %lf\n", y);
    } else if (sscanf(ch, "z %d", &z)) {
        printf("z: %d\n", z);
    }
}

This prints the desired results. However, I'm now trying to use fgets and strtok instead, because I don't think I can get the above code to work with a matrix. For example, suppose the input file contained the following (in this case a will also be predefined in my C file):

a
1 -1 3
2 1 6
9 3 0

I would like to store those values in a 3x3 matrix, which I don't think is possible using sscanf (especially if the matrix dimensionality is changeable; however, it will always be an n * n matrix). My attempt using fgets and strtok is:

while (!feof(inputFile)) {
    fgets(ch,500,inputFile);
    token = strtok(ch, " ");
    while (token) {
        if (strcmp(token, "x")) {
            printf("%s", "x found");
            // Here is my problem - how do I look at the next token where the value for x is stored

        }
        token = strtok(NULL, " ");
        break;
    }
    break;
}

The problem in my code is stated in a comment. I've been thinking about this for a while, trying various things. I think the difficulty comes from understanding how strtok works - initially I was trying to store every token into an array.

Any help is appreciated in helping me work out how I can replicate my existing code to use strtok instead of sscanf so I can then work on parsing matrices.

I know there are a few parsing questions out there, but I've seen none that tackle how to parse a matrix like this.

Thanks

BLUEPIXY
  • Start by replacing `while (!feof(inputFile)) { fgets(ch,500,inputFile);` with `while (fgets(ch,500,inputFile)) {` (there is probably a duplicate post somewhere explaining why `while (!feof()) {...}` is always wrong). – wildplasser Oct 05 '15 at 23:09
  • Does it effectively do the same thing? I believe I tried the latter initially and got some weird results - however it's likely I messed something up. Will do this, thanks. –  Oct 05 '15 at 23:10
  • You know what the return value of `sscanf()` means, don't you? What do you mean 3 x 3 matrix, can you show an example? – Iharob Al Asimi Oct 05 '15 at 23:12
  • From my understanding, it returns the number of items matched - but I've seen it used the way I've used it in my code. –  Oct 05 '15 at 23:14
  • [while-feof-file-is-always-wrong](http://stackoverflow.com/a/5432517/3386109) is the post wildplasser is referring to. Take a look at `strtol` and `strtod` for a solution to your problem. – user3386109 Oct 05 '15 at 23:15
  • And be careful with the `break;`s, Eugene! – wildplasser Oct 05 '15 at 23:16
  • @user3386109 - thanks. Regarding strtol and strtod, let me work out if I'm missing something that can help me with my problem. –  Oct 05 '15 at 23:17

1 Answer

Start by breaking the line into an array of tokens. Then, based on the first token, you can use either strtol or strtod to convert the remaining tokens. The following code demonstrates how to break the input lines into an array of tokens:

char line[500];
char *token[200];
int tokenCount = 0;
while ( fgets( line, sizeof line, fp ) != NULL )
{
    token[0] = strtok( line, " " );
    if ( token[0] == NULL || strcmp( token[0], "%" ) == 0 )   // skip lines with no tokens, and comments
        continue;

    // break the line into tokens
    int i;
    for ( i = 1; i < 200; i++ )
        if ( (token[i] = strtok( NULL, " " )) == NULL )
            break;
    tokenCount = i;

    // output the token array
    for ( i = 1; i < tokenCount; i++ )
        printf( "%s: %s\n", token[0], token[i] );
}

The for loop steps the array index i from 1 to 199. strtok( NULL, " " ) extracts the next token from the line, and returns a pointer to that token (or returns NULL if there are no more tokens). The returned value is stored in token[i]. If the return value is NULL, the loop breaks.

user3386109
  • Thanks for this. Just wondering if you can explain how the second bit of the code (`//break the line into tokens`) works? –  Oct 06 '15 at 00:46
  • @F.Tahir Added an explanation at the end, hope that helps. – user3386109 Oct 06 '15 at 00:57
  • Okay, so basically we take in a line from a file, break it into an array of strings, do something to it, and go to the next line in the file. If this is correct, is the old array values just written over? –  Oct 06 '15 at 01:06
  • Yes, both the input line and the array of token pointers will be overwritten for each line. But the expectation is that you will convert the tokens to numbers and store those numbers elsewhere (or use them before moving on to the next line). – user3386109 Oct 06 '15 at 01:10
  • Just for clarification - the line `token[0] = strtok( line, " " );` - does this store the entire line into the first index of the array? And then we split this up –  Oct 06 '15 at 01:20
  • The first time you call `strtok`, you pass a pointer to the whole line, and `strtok` returns a pointer to the first token. From then on, you pass `NULL` to `strtok` and it returns a pointer to the next token. So, for example, the line `token[0] = strtok(line," ");` will set `token[0]` to `"x"`, `"y"`, `"z"` or `"%"`. – user3386109 Oct 06 '15 at 01:25
  • For some reason, if I use `if (strcmp(token[0], "a") == 0) { printf("%s\n", "matrix detected"); }`, this if statement is never triggered even though a matrix `a` is declared in the input file. My understanding (after testing) is that any variable that's declared on its own line will be ignored, i.e. `a 1` will be captured, but `a \n 1` won't be –  Oct 06 '15 at 19:09
  • This is because I split on whitespace, and `a` has no whitespace and thus isn't stored as a token. What are some potential solutions to this? –  Oct 06 '15 at 19:18
  • @F.Tahir The `"a"` should still be stored in token[0], even with no whitespace following it. I suggest starting a new question to discuss the parsing of the array. That way you can show the code that you're trying. I'll take a look in a few hours. – user3386109 Oct 06 '15 at 19:27
  • The problem is because the delimiter is " " so I thought the a wouldn't be stored. The output also shows no "a" being printed. Currently doing some debugging. –  Oct 06 '15 at 19:33
  • Wondering if you've had a chance to look at this? –  Oct 06 '15 at 23:52
  • @F.Tahir Looking at the code in my answer, the reason that `"a"` would not be printed is because `i` starts at 1 and the `tokenCount` would also be 1. So nothing would print, even though `token[0]` would be `"a"`. But that's assuming that you're using the exact same print loop that I put in the answer. – user3386109 Oct 07 '15 at 01:26
  • @F.Tahir The best thing to do is start a new question, so that you can post your latest code. – user3386109 Oct 07 '15 at 01:27
  • I will start a new question tomorrow if I don't get this working. If token[0] = "a", why doesn't `strcmp(token[0], "a") == 0` work? –  Oct 07 '15 at 01:55
  • It *should* work, so there's something else going on in the code. That's why I need to see the code. – user3386109 Oct 07 '15 at 01:58
  • http://pastebin.com/yrgvsTS6 - let me know if this is sufficient enough or if you really want me to make a new topic. I've tried a few different things such as splitting on new line etc and none seem to flag the `else if (strcmp(token[0], "a") == 0) {` condition –  Oct 07 '15 at 02:12
  • If I change the delim to `" \n"` as opposed to `" "`, this triggers the flag condition, but unsure how I'd get it to store in a 2d array. –  Oct 07 '15 at 02:30