
I have two structs, where following is a node in a linked list:

typedef struct following{
    char nick[6];
    int last_message;
    bool first_time;
    struct following *next;
}following;

typedef struct user{                     
    char nick[6];
    char name[26];
    int n_messages;
    int n_following;
    int n_followers;
    following *arr_following;
    following *arr_unfollowed;
}user;

I have to fill the user struct by reading from a file like this:

zsq4r Pseu Donym 3 1 2;zero7 2 true!
zero7 James Bond 4 3 3;zsq4r 3 true!zero7 4 false!MrPym 1 true!
MrPym A Perfect Spy 1 3 1;zsq4r 3 true!zero7 4 true!AlecS 1 true!
AlecS He Who Came from the Cold 1 0 1;

The content before the ";" fills the user struct, and each group terminated by "!" after it fills one following node.

Note: the "second" element of each line of the file will be the name of the user, which can go up to 25 chars and can be separated by white space. For example, "He Who Came from the Cold" is a valid name.

I tried to fill them like this:

void read_from_file(hashtable *active_users, FILE *fp_active){
    const char *delimiter1 = "!";
    const char *delimiter2 = ";";
    char *last_token;
    char buffer[1540];
    while(fgets(buffer, 1540, fp_active)) {
        user *new_user = malloc(sizeof(user));
        last_token = strtok( buffer, delimiter2);
        while( last_token != NULL ){
            sscanf(last_token,"%s %[^\n] %d %d %d", new_user->nick, new_user->name, &new_user->n_messages, &new_user->n_following,
                   &new_user->n_followers);
            last_token = strtok( NULL, delimiter1);
        }
        insert(active_users, new_user);
    }
}

Although the "last_token" variable holds the correct part of the line on each iteration, I can't find a way to fill both structs, since sscanf only fills part of the user struct.

Any help would be appreciated.

MiguelD
  • @melpomene it was just to save space and avoid making the question larger, i can change that if needed... – MiguelD May 31 '18 at 16:49
  • Please, do it... – Scheff's Cat May 31 '18 at 16:50
  • @user3121023 maybe this %26[^\n] will be better? – MiguelD May 31 '18 at 17:07
  • @user3121023 i need to pick up the name of the user that can be separated by whitespace and i never know how long it will be. – MiguelD May 31 '18 at 17:12
  • @user3121023 yes the name of the user can contain numbers, the only restriction is the max size of 25 chars – MiguelD May 31 '18 at 17:15
  • @user3121023 my problem is, after separating the content of the struct user from the content of the struct following, how can i fill both, since i can only hold one part at a time in the last_token string – MiguelD May 31 '18 at 17:26
  • You're going to have to hope that no-one's name ever contains a digit, judging from the data. It is formatted sloppily, which is depressingly common. It would be better presented as a full CSV or similar format, or with fixed-width fields, or ... or almost anything other than what's shown. – Jonathan Leffler May 31 '18 at 17:47
  • @JonathanLeffler the file is written by me with other function, i can change its format if there is a better one that will make reading it easier – MiguelD May 31 '18 at 17:50
  • I'd argue that a newline makes an adequate end delimiter. Why not use semicolons to delimit each field? `zsq4r;Pseu Donym;3;1;2;zero7;2;true`, though I see you have a list of triples at the end. Maybe adopt a leaf from JSON: `zsq4r;Pseu Donym;3;1;2[zero7;2;true]` where you can use other characters than `[]` to surround the data, and one set of `[]` per triple. Basically, you design the format so that it is easy to parse using the tools you want to use to parse it. You should probably read whole lines and then parse with [`sscanf()`](http://stackoverflow.com/questions/3975236) or ad hoc. – Jonathan Leffler May 31 '18 at 17:57
  • @JonathanLeffler i will adapt the file to those formats and see what i can do from there, thanks for the tip – MiguelD May 31 '18 at 18:00
  • Do consider going with CSV, though the variable number of triples at the end complicates the matter. There are CSV libraries available for formatting and scanning data. I suspect you wouldn't be encouraged to use such a library, though. JSON is less sensible because it is harder to format and using a library to handle JSON is pretty much a _sine qua non._ – Jonathan Leffler May 31 '18 at 18:04
  • @JonathanLeffler one more question: considering that i use ";" to separate the items, how would i receive the integers in the sscanf? since i have to apply %[^;] and it doesn't work for int – MiguelD May 31 '18 at 18:12
  • `sscanf(last_token,"%s %[^\n] %d %d %d" ...` is strange as there will never be anything scanned from a _line_ after `"%s %[^\n] `. Try `sscanf(last_token,"%s %[^0-9] %d %d %d"` – chux - Reinstate Monica May 31 '18 at 18:29
  • @chux yes it was something re-used from another situation, i noticed it doesnt work for this case – MiguelD May 31 '18 at 18:34
  • You'll have to read the `true` and `false` values as strings (using `"…; %[^;];…"` style scan sets) and convert to boolean afterwards. You would use `"…;%d ;…"` to capture the integers, where the space before the second semicolon allows spaces after the number, just in case (but you could probably omit that and it would probably still work OK). – Jonathan Leffler May 31 '18 at 19:19
  • @JonathanLeffler thanks it worked, i was missing the ";" after the %[^;] – MiguelD May 31 '18 at 19:26
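
The scan-set advice from the comments can be sketched as below. This is a minimal illustration, not code from the thread: the `triple` struct and the `parse_triple` function are hypothetical names, and the input is assumed to be a single `nick;number;true` or `nick;number;false` group with the surrounding brackets already removed.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flat record, used only for this illustration. */
typedef struct {
    char nick[6];
    int last_message;
    bool first_time;
} triple;

/* Parses "nick;number;true" or "nick;number;false".
   Returns 1 on success, 0 if the text does not match. */
int parse_triple(const char *text, triple *out)
{
    char flag[6];   /* room for "false" plus the terminator */

    /* %5[^;] reads at most 5 characters that are not ';', so it cannot
       overflow nick[6]; the literal ';' must come next. %d reads the
       integer, and the boolean is read as a lowercase word. */
    if (sscanf(text, " %5[^;]; %d ; %5[a-z]",
               out->nick, &out->last_message, flag) != 3)
        return 0;

    /* Convert the word to a bool afterwards, rejecting anything else. */
    if (strcmp(flag, "true") == 0)
        out->first_time = true;
    else if (strcmp(flag, "false") == 0)
        out->first_time = false;
    else
        return 0;
    return 1;
}
```

Checking that sscanf returned 3 is what catches an over-long nick or a non-numeric field: the conversion stops early and the count comes back short.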

1 Answer


Based on the tips from the comments, I managed to solve my problem by changing the file format to this:

zero7;James Bond;2;1;0[MrPym;1;true](zero7;0;false)
MrPym;A Perfect Spy;1;0;1
zsq4r;Pseu Donym;3;1;2[zero7;2;true]
zero7;James Bond;4;3;3[zsq4r;3;true][zero7;3;false][MrPym;1;true]
MrPym;A Perfect Spy;1;3;1[zsq4r;3;true][zero7;4;true][AlecS;1;true]
AlecS;He Who Came from the Cold;1;0;1

I then used the following code to extract the information into the two structs:

void read_from_file(hashtable *active_users, hashtable *inactive_users, FILE *fp_active, FILE *fp_inactive){
    char nick[6];
    int last_message;
    char m_bool[6];
    char *token;
    char buffer[1540];
    char buffer2[1540];
    while(fgets(buffer, sizeof buffer, fp_active)) {
        /* strtok modifies its input, so keep a copy for the second pass */
        strcpy(buffer2, buffer);
        user *new_user = malloc(sizeof(user));
        if(new_user == NULL)
            return;
        new_user->arr_following = NULL;
        new_user->arr_unfollowed = NULL;
        /* the text before the first "[" is the user record */
        token = strtok(buffer, "[");
        /* field widths keep nick and name inside their arrays */
        if(token == NULL || sscanf(token, "%5[^;]; %25[^;]; %d; %d; %d", new_user->nick, new_user->name, &new_user->n_messages,
               &new_user->n_following, &new_user->n_followers) != 5){
            free(new_user);   /* blank or malformed line: skip it */
            continue;
        }
        /* first pass: every later token starts right after a "[" */
        token = strtok(NULL, "[");
        while(token != NULL){
            if(sscanf(token, " %5[^;]; %d; %5s", nick, &last_message, m_bool) == 3)
                add(&new_user->arr_following, nick, last_message, strcmp(m_bool, "true]") == 0);
            token = strtok(NULL, "[");
        }
        /* second pass on the copy: skip the text before the first "(",
           then every token starts right after a "(" */
        token = strtok(buffer2, "(");
        token = strtok(NULL, "(");
        while(token != NULL){
            if(sscanf(token, " %5[^;]; %d; %5s", nick, &last_message, m_bool) == 3)
                add(&new_user->arr_unfollowed, nick, last_message, strcmp(m_bool, "true)") == 0);
            token = strtok(NULL, "(");
        }
        insert2(active_users, new_user);
    }
}

I had to tokenize each line twice because there are three kinds of delimiters: ; [ and (
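
As a possible alternative to tokenizing twice, the bracketed entries could be walked in a single pass with strpbrk, which finds the next "[" or "(" without modifying the line. The sketch below is only an illustration under assumptions: `entry_fn`, `for_each_entry` and `tally` are hypothetical names that do not appear in the code above, and user names are assumed never to contain "[" or "(".

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: receives one parsed entry.
   is_unfollowed is true for "(...)" entries, false for "[...]" entries. */
typedef void (*entry_fn)(const char *nick, int last_message, bool first_time,
                         bool is_unfollowed, void *ctx);

/* Walks every "[nick;n;flag]" and "(nick;n;flag)" entry in line, in order,
   without modifying line. Returns the number of entries passed to on_entry. */
int for_each_entry(const char *line, entry_fn on_entry, void *ctx)
{
    int count = 0;
    const char *p = strpbrk(line, "[(");

    while (p != NULL) {
        char nick[6];
        char flag[6];
        int last_message;
        bool is_unfollowed = (*p == '(');

        /* p + 1 skips the opening bracket; %5[a-z] stops at the closing one */
        if (sscanf(p + 1, " %5[^;]; %d; %5[a-z]", nick, &last_message, flag) == 3) {
            on_entry(nick, last_message, strcmp(flag, "true") == 0,
                     is_unfollowed, ctx);
            count++;
        }
        p = strpbrk(p + 1, "[(");
    }
    return count;
}

/* Example callback: tallies entries by kind into an int[2] passed as ctx. */
void tally(const char *nick, int last_message, bool first_time,
           bool is_unfollowed, void *ctx)
{
    (void)nick; (void)last_message; (void)first_time;
    ((int *)ctx)[is_unfollowed ? 1 : 0]++;
}
```

In the real program the callback would presumably call add() on arr_following or arr_unfollowed depending on is_unfollowed. Because the line is never modified, no second buffer copy is needed.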

MiguelD