-1

I have code that reads multiple files and prints certain values from each one. After a few files have been read, my while loop stops and the program reports a segmentation fault ...

Here is my code:

int main () {

    const char s[2] = ",";
    const char s2[2] = ":";

    char var1[] = "fiftyTwoWeekHigh\"";
    char *fiftyhigh;
    char *fiftyhigh2;
    char *fiftyhigh_token;
    char *fiftyhigh2_token;
   
    char var2[] = "fiftyTwoWeekLow\"";
    char *fiftylow;
    char *fiftylow2;
    char *fiftylow_token;
    char *fiftylow2_token;

    char var3[] = "regularMarketPrice\"";
    char *price;
    char *price2;
    char *price_token;
    char *price2_token;
   
    FILE *fp;
    char* data = "./data/";
    char* json = ".json";
    char line[MAX_LINES];
    char line2[MAX_LINES];
    int len;
    char* fichier = "./data/indices.txt";

    fp = fopen(fichier, "r");

    if (fp == NULL){
        printf("Impossible d'ouvrir le fichier %s", fichier);
        return 1;
    }

    while (fgets(line, sizeof(line), fp) != NULL) {
        char fname[10000];
        len = strlen(line);
        if (line[len-1] == '\n') {
            line[len-1] = 0;
        }
        
        int ret = snprintf(fname, sizeof(fname), "%s%s%s", data, line, json);
        if (ret < 0) {
            abort();
        }
        printf("%s\n", fname);
        
        FILE* f = fopen(fname, "r");

        while ( fgets( line2, MAX_LINES, f ) != NULL ) {
            fiftyhigh = strstr(line2, var1);
            fiftyhigh_token = strtok(fiftyhigh, s);
            fiftyhigh2 = strstr(fiftyhigh_token, s2);
            fiftyhigh2_token = strtok(fiftyhigh2, s2);
            printf("%s\n", fiftyhigh2_token);

            fiftylow = strstr(line2, var2);
            fiftylow_token = strtok(fiftylow, s);
            fiftylow2 = strstr(fiftylow_token, s2);
            fiftylow2_token = strtok(fiftylow2, s2);
            printf("%s\n", fiftylow2_token);

            price = strstr(line2, var3);
            price_token = strtok(price, s);
            price2 = strstr(price_token, s2);
            price2_token = strtok(price2, s2);
            printf("%s\n", price2_token);
        
            //printf("\n%s\t%s\t%s\t%s\t%s", line, calculcx(fiftyhigh2_token, price2_token, fiftylow2_token), "DIV-1", price2_token, "test");
            
        }
        fclose(f);
    }
    fclose(fp);
    return 0;
}

And the output is:

./data/k.json
13.59
5.31
8.7
./data/BCE.json
60.14
46.03
56.74
./data/BNS.json
80.16
46.38
78.73
./data/BLU.json
16.68
2.7
Segmentation fault

It looks like my program stops because it can't reach a certain piece of data in a certain file... Is there a way to allocate more memory? My MAX_LINES is already set to 6000.

Aquaaa
  • What is the input? Make sure the returned pointers are not `NULL` before passing them to the next steps. – MikeCAT Mar 21 '21 at 14:20
  • The input is a directory of files. My program reads each .json file and prints 3 values from it (I have to take the values that sit between commas, after the ":"). – Aquaaa Mar 21 '21 at 14:23
  • If you pass a return value from `fiftylow = strstr(line2, var2);` to `strtok()` which is `NULL` you are fooling `strtok()` to believe that it is a continuation of its splitting process. – Weather Vane Mar 21 '21 at 14:26
  • `strtok` does not make copies of the token. Every `fgets` overwrites the line and your tokens. Use `malloc` and `strcpy`. – Paul Ogilvie Mar 21 '21 at 14:26
  • `strdup()` is handy if you have access to it, but importantly don't set arbitrary array limits you might smash through. This code is full of rampant duplication, which makes it a lot harder to understand than it should be. – tadman Mar 21 '21 at 14:26
  • If you're using JSON, where's your JSON parser? Do use a library to do it properly. This code is *extremely* brittle. – tadman Mar 21 '21 at 14:28
  • It's also worth noting that if you're not stuck using C, this kind of stuff is absolutely effortless in something like Node.js. It's also pretty easy in Ruby, Python, or many other scripting languages that have an easy to use JSON library built-in. – tadman Mar 21 '21 at 14:29
  • I can't use a JSON parser... I know... I have to do it with the standard libraries. – Aquaaa Mar 21 '21 at 14:30
  • You'd have to post at least the content of `./data/BLU.json`. – Armali Mar 21 '21 at 14:44
  • @Armali The content of this JSON file is too long to post here, but every JSON file is on one line and every file has these keys. So in BLU.json: "fiftyTwoWeekHigh":16.68,"fiftyTwoWeekLow":2.7,"regularMarketPrice":5.23 So it seems it doesn't work with regularMarketPrice, but it works for the two other files. – Aquaaa Mar 21 '21 at 14:48
  • Not more than 6000 characters have been read from the file. Why do you think 6000 characters were _too long to post here_? – Armali Mar 21 '21 at 15:18
  • Even if I change my MAX_LINES variable to like 30000, it stops at the same place. It's like it can't reach more than 3 files. – Aquaaa Mar 21 '21 at 15:21
  • That's irrelevant; if the error occurs when _MAX_LINES is already set at 6000_, posting 6000 characters (if there are that many at all) is enough. – Armali Mar 21 '21 at 15:23
  • What is `MAX_LINES`? Where is it defined? – Paul Ogilvie Mar 21 '21 at 15:41
  • `strtok` may return null if it could not find a token. You don't check for that. Any following function calls may now fail. – Paul Ogilvie Mar 21 '21 at 15:44
  • @PaulOgilvie MAX_LINES is a #define at the top of my program, before the main function. I tried adding some if conditions to check whether my token is null, but it gives the same result as before. – Aquaaa Mar 21 '21 at 15:47

2 Answers

0
Did you mean '\0'?
if (line[len-1] == '\n') {
  line[len-1] = 0;
}

I advise you to use gdb to see where the segfault occurs and why. I don't think you need to allocate more memory. The segfault may happen because there is no more data to find and you still print the result.

Use if(price2_token!=NULL) printf("%s\n", price2_token); for example.
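The same check is worth applying to every intermediate pointer, not only the one that is printed, since each call in the chain receives the result of the previous one. A minimal sketch, assuming the same strstr/strtok chain as in the question; extract_checked is a hypothetical helper name, not something from the original program:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring the question's strstr/strtok chain.
   It stops at the first step that fails instead of passing NULL on.
   Note that it modifies `line`, exactly as the original code does. */
static const char *extract_checked(char *line, const char *key)
{
    char *field = strstr(line, key);     /* locate the field name */
    if (field == NULL)
        return NULL;

    char *token = strtok(field, ",");    /* cut at the next comma */
    if (token == NULL)
        return NULL;

    char *colon = strstr(token, ":");    /* locate the separator */
    if (colon == NULL)
        return NULL;

    return strtok(colon, ":");           /* may itself be NULL */
}
```

The caller then only has to test one return value before calling printf.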

Tendocat
  • It is meant to be newline as written, although not a robust way of [Removing trailing newline character from fgets() input](https://stackoverflow.com/questions/2693776/removing-trailing-newline-character-from-fgets-input) Note that `0` and `'\0'` are the same: an `int` with value 0 although the latter is considered to be clearer in the context of a `char` array. – Weather Vane Mar 21 '21 at 14:46
  • I tried, but nothing changes... It is weird: when there are more than 3 files in my directory, it shows a segmentation fault. – Aquaaa Mar 21 '21 at 15:18
0

I'm assuming that the lines in your file look something like this:

{"fiftyTwoWeekLow":32,"fiftyTwoWeekHigh":100, ... }

In other words it's some kind of JSON format. I'm assuming that the line starts with '{' so each line is a JSON object.

You read that line into line2, which now contains:

{"fiftyTwoWeekLow":32,"fiftyTwoWeekHigh":100, ... }\0

Note the \0 at the end that terminates the string. Note also that "fiftyTwoWeekLow" comes first, which turns out to be really important.

Now let's trace through the code here:

fiftyhigh = strstr(line2, var1);
fiftyhigh_token = strtok(fiftyhigh, s);

First you call strstr to find the position of "fiftyTwoWeekHigh". This will return a pointer to the position of that field name in the line. Then you call strtok to find the comma that separates this value from the next. I think that this is where things start to go wrong. After the call to strtok, line2 looks like this:

{"fiftyTwoWeekLow":32,"fiftyTwoWeekHigh":100\0 ... }\0

Note that strtok has modified the string: the comma has been replaced with \0. That's so you can use the returned pointer fiftyhigh_token as a string without seeing all the stuff that came after the comma.
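To see this effect in isolation, here is a small sketch. The helper name visible_length_after_cut is made up for illustration, and the sample line is an assumption about what the data looks like:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical demo: finds `key` in `line`, lets strtok cut at the next
   comma, and returns how much of `line` is still visible as a C string. */
static size_t visible_length_after_cut(char *line, const char *key)
{
    char *field = strstr(line, key);
    if (field != NULL)
        strtok(field, ",");   /* overwrites that comma with '\0' */
    return strlen(line);      /* everything past the cut is now hidden */
}
```

Calling it on a buffer holding {"fiftyTwoWeekLow":32,"fiftyTwoWeekHigh":100,"x":1} with the key fiftyTwoWeekHigh" leaves a string that ends right after 100, so a later strstr on the same buffer can no longer find "x".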

fiftyhigh2 = strstr(fiftyhigh_token, s2);
fiftyhigh2_token = strtok(fiftyhigh2, s2);
printf("%s\n", fiftyhigh2_token);

Next you look for the colon and then call strtok with a pointer to the colon. Since the delimiter you're passing to strtok is the colon, strtok skips the leading colon and returns the next token. Because the string we're looking at ends after "100" and contains no more colons, that token is the rest of the string: in other words, the number.

So you've gotten your number, but probably not in the way you expected? There was really no point in the second call to strtok since (assuming the JSON was well-formed) the position of "100" was just fiftyhigh2+1.
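A sketch of that simpler approach, assuming the values are plain numbers as in the sample line. It never writes to the line, so the order of the fields stops mattering. read_number is a hypothetical helper name, and using strtod to convert the text is my choice rather than something from the question:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parses the number that follows `key` and its colon.
   Returns 1 and stores the value in *out on success, 0 otherwise.
   `line` is only read, never modified. */
static int read_number(const char *line, const char *key, double *out)
{
    const char *field = strstr(line, key);
    if (field == NULL)
        return 0;                         /* key not in this line */

    const char *colon = strchr(field, ':');
    if (colon == NULL)
        return 0;                         /* no separator after the key */

    char *end;
    double value = strtod(colon + 1, &end);
    if (end == colon + 1)
        return 0;                         /* nothing numeric after ':' */

    *out = value;
    return 1;
}
```

The key is passed with its closing quote, for example fiftyTwoWeekHigh", the same way var1 is defined in the question.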

Now we try to find "fiftyTwoWeekLow":

fiftylow = strstr(line2, var2);
fiftylow_token = strtok(fiftylow, s);
fiftylow2 = strstr(fiftylow_token, s2);
fiftylow2_token = strtok(fiftylow2, s2);
printf("%s\n", fiftylow2_token);

This is basically the same process, and after you call strtok, line2 looks like this:

{"fiftyTwoWeekLow":32\0"fiftyTwoWeekHigh":100\0 ... }\0

Note that you're only able to find "fiftyTwoWeekLow" because it comes before "fiftyTwoWeekHigh" in the line. If it had come after, then you'd have been unable to find it due to the \0 added after "fiftyTwoWeekHigh" earlier. In that case, strstr would have returned NULL. Passing NULL to strtok tells it to continue the previous scan, which is already exhausted, so strtok would return NULL as well, and passing that NULL to strstr is undefined behavior, which in practice usually shows up as a seg fault.

So the code is really sensitive to the order in which the fields appear in the line, and it's probably failing because some of your lines have the fields in a different order. Or maybe some fields are just missing from some lines, which would have the same effect.

If you're parsing JSON, you should really use a library designed for that purpose. But if you really want to use strtok then you should:

  1. Read line2.
  2. Call strtok(line2, ",") once, then repeatedly call strtok(NULL, ",") in a loop until it returns null. This will break up the line into tokens that each look like "someField":100.
  3. Isolate the field name and value from each of these tokens (just call strchr(token, ':') to find the value). Do not call strtok here, because it will change the internal state of strtok and you won't be able to use strtok(NULL, ",") to continue processing the line.
  4. Test the field name, and depending on its value, set an appropriate variable. In other words, if it's the "fiftyTwoWeekLow" field, set a variable called fiftyTwoWeekLow. You don't have to bother to strip off the quotes, just include them in the string you're comparing with.
  5. Once you've processed all the tokens (strtok returns NULL), do something with the variables you set.
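The steps above can be sketched roughly as follows. This assumes a flat, one-line object whose values are plain numbers. The struct quote type and the parse_quote name are made up for the example, and skipping a leading '{' by hand is just one way to deal with the first token:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the three values the question prints. */
struct quote {
    double high, low, price;
    int have_high, have_low, have_price;
};

/* Splits `line` on commas (modifying it) and fills `q` from every field
   it recognises. Returns the number of recognised fields. */
static int parse_quote(char *line, struct quote *q)
{
    int found = 0;
    *q = (struct quote){0};

    for (char *tok = strtok(line, ","); tok != NULL; tok = strtok(NULL, ",")) {
        while (*tok == '{' || *tok == ' ')
            tok++;                        /* drop a leading brace or spaces */

        char *colon = strchr(tok, ':');   /* strchr, not strtok (step 3) */
        if (colon == NULL)
            continue;
        *colon = '\0';                    /* tok is now the quoted name */
        const char *value = colon + 1;

        if (strcmp(tok, "\"fiftyTwoWeekHigh\"") == 0) {
            q->high = strtod(value, NULL);
            q->have_high = 1;
            found++;
        } else if (strcmp(tok, "\"fiftyTwoWeekLow\"") == 0) {
            q->low = strtod(value, NULL);
            q->have_low = 1;
            found++;
        } else if (strcmp(tok, "\"regularMarketPrice\"") == 0) {
            q->price = strtod(value, NULL);
            q->have_price = 1;
            found++;
        }
    }
    return found;
}
```

Because each field is matched by name, the result does not depend on the order in which the fields appear, and a missing field simply leaves its have_ flag at 0.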

You may be able to pass ",{}" as the delimiter to strtok in order to get rid of any open and close curly braces that surround the line. Or you could look for them in each token and ignore them if they appear.

You could also pass "\"{},:" as the delimiter to strtok. This would cause strtok to emit an alternating sequence of field names and values. You could call strtok once to get the field name, again to get the value, then test the field name and do something with the value.
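A sketch of that alternating name/value approach, again assuming a flat object with numeric values only. An empty string value, or one containing a delimiter character, would throw the alternation off. lookup_number is a hypothetical name:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: walks `line` as name, value, name, value, ...
   and returns 1 with the number in *out when `name` is found.
   Modifies `line`, as any strtok-based code does. */
static int lookup_number(char *line, const char *name, double *out)
{
    static const char delims[] = "\"{},:";

    char *key = strtok(line, delims);
    while (key != NULL) {
        char *value = strtok(NULL, delims);
        if (value == NULL)
            break;                        /* name without a value */
        if (strcmp(key, name) == 0) {
            *out = strtod(value, NULL);
            return 1;
        }
        key = strtok(NULL, delims);
    }
    return 0;
}
```

Here the name is passed without quotes (for example fiftyTwoWeekHigh), because the quote character is one of the delimiters and is stripped by strtok.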

Using strtok is a pretty primitive way of parsing JSON, but it will work as long as your JSON only contains simple field names and numbers and doesn't include any strings that themselves contain delimiter characters.

Willis Blackburn