-1

Doing some project for school, I encountered the following problem:

sscanf reads different value that the one expected.

I want to read something like this:

1 0 185336079 0 0 168231418 -256 0 255 1
2 0 185336079 -256 0 168231418 -256 0 255 2
3 0 185336079 -256 0 168231418 -256 0 255 3
4 0 185336079 0 0 0 0 0 255 4

This is the code i currently use:

FILE *fd = open_fd("/proc/firewall", "r");

while ((read = getline(&line, &len, fd)) != -1) {
    sscanf(line, "%d %c %d %d %d %d %d %d %c %c\n", &num,
        &rule_u.inbound_outbound,
        &rule_u.source_ip,
        &rule_u.source_netmask,
        &rule_u.source_port,
        &rule_u.destination_ip,
        &rule_u.destination_netmask,
        &rule_u.destination_port,
        &rule_u.protocol,
        &rule_u.action);

    printf("scanf read rule action : %c\n", rule_u.action);
    printf("sscanf whole line:\n%s\n",line);
    convert_rule_from_u();
    print_rule();
}

The output the above code generates is the following:

scanf read rule action : 5
sscanf whole line:
1 0 185336079 0 0 168231418 -256 0 255 1

scanf read rule action : 5
sscanf whole line:
2 0 185336079 -256 0 168231418 -256 0 255 2

scanf read rule action : 5
sscanf whole line:
3 0 185336079 -256 0 168231418 -256 0 255 3

scanf read rule action : 5
sscanf whole line:
4 0 185336079 0 0 0 0 0 255 4

The expected output should be the following:

scanf read rule action : 1
sscanf whole line:
1 0 185336079 0 0 168231418 -256 0 255 1

scanf read rule action : 2
sscanf whole line:
2 0 185336079 -256 0 168231418 -256 0 255 2

scanf read rule action : 3
sscanf whole line:
3 0 185336079 -256 0 168231418 -256 0 255 3

scanf read rule action : 4
sscanf whole line:
4 0 185336079 0 0 0 0 0 255 4

rule_u struct

typedef struct rule_struct_u {
    unsigned char inbound_outbound;
    unsigned int source_ip;
    unsigned int source_netmask;
    unsigned int source_port;
    unsigned int destination_ip;
    unsigned int destination_netmask;
    unsigned int destination_port;
    unsigned char protocol;
    unsigned char action;
} rule_struct_u;

What am I doing wrong?

  • 1
    what are the types? – information_interchange Jan 08 '18 at 02:46
  • added in edit, thanks! – Dimitrie Mititelu Jan 08 '18 at 02:49
  • 1
    What's the weird behavior? I am too lazy to study all this output when you just tell me where to look for the problem. – nicomp Jan 08 '18 at 02:52
  • 1
    `%c` reads a single character. In this case the ASCII values of `'2'` and `'5'` for `protocol` and `action` respectively. – jodag Jan 08 '18 at 02:54
  • 1
    type was unsigned char and i was using %c to read, worked most of the time, except 1 case, now i found out that i should have used %hhu, and now it works properly! i was so frustrated, looked at this problem for over 2h. I am retarded. Thanks guys! – Dimitrie Mititelu Jan 08 '18 at 02:56
  • Trailing white space in a `scanf()` format string is a UX disaster. – Jonathan Leffler Jan 08 '18 at 03:01
  • However, you're also running into the problem that `%c` at the end is not reading all of 255 — you need a `%d` (or maybe `%hhd` or `%hhu`) for the second last column, and should review all the other %c columns; maybe they should be numbers too? (See [`scanf()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/scanf.html)) (The trailing newline is also a disaster; it doesn't mean what you think it means. Don't do it!) – Jonathan Leffler Jan 08 '18 at 03:04
  • @JonathanLeffler thanks for looking over my problem so you say it should be like this: `sscanf(line, "%d%hhu%d%d%d%d%d%d%hhu%hhu",...` is it ok? i think it read the whole 255 before, i was playing around with it, and it was working. or at least looked like it – Dimitrie Mititelu Jan 08 '18 at 03:10
  • 2
    The spaces in between the conversion specifications are OK; they make it more readable. The newline at the end of the string means "read white space until you get to something that isn't white space", which is OK if the input is from a file but is a disaster if you're ever going to type the data by hand. The last but one field on your sample data is demonstrably not a single character. You probably want a number that fits in an `unsigned char` so `%hhu` is appropriate. The other two `%c` columns only hold zeros in the data, but if they could hold numbers bigger than 9, you need `%hhu`. – Jonathan Leffler Jan 08 '18 at 03:13
  • 1
    I'll note that your structure wastes a fair amount of space. Your one byte field at the start has 3 bytes padding after it, and there are two more bytes of padding at the end. Consider moving the three `unsigned char` fields adjacent to each other to save 4 bytes per structure (you'd still have 1 padding byte, but that's an improvement on 5). Think about whether that matters. The 'inbound/outbound' column might well be 0 or 1, so the `%c` might well be OK for that (but note that the values will be `'0'` and `'1'`, not `0` and `1`!). You know the data (or need to know the data) — your call. – Jonathan Leffler Jan 08 '18 at 03:17
  • Thanks for the complete anwser, i would upvote it if possible! Didn't think about the padding at all :( (i don't always work in c) thanks for pointing that out! Have a nice day! ^^ – Dimitrie Mititelu Jan 08 '18 at 03:21
  • 1
    Note that file means 'disk file'; pipes, FIFOs and sockets will be disasters too, as well as terminals. The `scanf()` won't return until the next message appears on the input; it needs something other than a newline or space or tab to say "that's all". That probably isn't what you want. – Jonathan Leffler Jan 08 '18 at 03:22
  • @chux: OK. It is still not a good idea, but yeah. Can you deduplicate? Or do I have to? – Jonathan Leffler Jan 08 '18 at 04:17
  • @DimitrieMititelu Coding `printf("scanf read rule action : %c\n", rule_u.action);` added confusion. Code is not using `scanf()`, but `sscanf()`. Title, question and code should refer to the true function in question. – chux - Reinstate Monica Jan 08 '18 at 04:28
  • @DimitrieMititelu The "The output of ### line is this:" is still unclear. It this the output of your code or the expected output? What exactly about your output that is not as expected? – chux - Reinstate Monica Jan 08 '18 at 04:30
  • @DimitrieMititelu `printf("sscanf whole line:\n%s\n",line);` is also is odd. `getline()` is reading `line`, not `sscanf()`. – chux - Reinstate Monica Jan 08 '18 at 04:34
  • 1
    Some general tips: when you ask a question like this you should explain what output you expect and why. Also, why don't you print out the other elements that are read in? It would likely help debug the problem. It would also likely help you debug what's going on if you had some input that has more differences in the fields - it might help show what's being read into `rule_u.action`. – Michael Burr Jan 08 '18 at 04:34
  • sorry guys, will try better in the future :( thanks for the help! – Dimitrie Mititelu Jan 08 '18 at 05:51

3 Answers3

2

Code has trouble at the end of each line in trying to read "255 0" with format specifiers "%c %c".

This scans and saves characters '2' and and first '5'. OP wanted to save as integers with a desired result of 255 and 0.

Use "%hhu %hhu" to read numeric text in and save as integers into unsigned char.

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
1

In general, using trailing white space in a format string for the scanf() family of functions is a bad idea — see Trailing blank in scanf() format string for more details. If the function is sscanf() or one of its relatives, reading from a string rather than a file stream, then it isn't a catastrophe.

However, you're also running into the problem that the second last%c at the end is not reading all of 255 — you need a %d (or maybe %hhd or %hhu) for the second last column, and should review all the other %c columns; maybe they should be numbers too?

So you say it should be like this: sscanf(line, "%d%hhu%d%d%d%d%d%d%hhu%hhu",... — is it ok? I think it read the whole 255 before; I was playing around with it, and it was working, or at least looked like it.

The spaces in between the conversion specifications are OK; they make it more readable. The newline at the end of the string means "read white space until you get to something that isn't white space", which is OK if the input is from a (disk) file or a string but is a disaster if the input is coming from a terminal, pipe, FIFO, socket. A call to scanf() won't return until the next message appears on the input; it needs something other than a newline or space or tab to say "that's all". That probably isn't what you want.

The last but one field on your sample data is demonstrably not a single character. You probably want a number that fits in an unsigned char so %hhu is appropriate. The other two %c columns only hold zeros in the data, but if they could hold numbers bigger than 9, you need %hhu.

The 'inbound/outbound' column might well be 0 or 1, so the %c might well be OK for that (but note that the values will be '0' and '1', not 0 and 1!). You know the data (or need to know the data) — your call.

I'll note that your structure wastes a fair amount of space. Your one byte field at the start has 3 bytes padding after it, and there are two more bytes of padding at the end. Consider moving the three unsigned char fields adjacent to each other to save 4 bytes per structure (you'd still have 1 padding byte, but that's an improvement on 5). Think about whether that matters (the chances are it is not critical).

Also, you should test the return value from sscanf() — or scanf() — to ensure that you got the correct input.

if (sscanf(line, "%d %hhu %d %d %d %d %d %d %hhu %hu", &num,
    &rule_u.inbound_outbound,
    &rule_u.source_ip,
    &rule_u.source_netmask,
    &rule_u.source_port,
    &rule_u.destination_ip,
    &rule_u.destination_netmask,
    &rule_u.destination_port,
    &rule_u.protocol,
    &rule_u.action) != 10)
{
    …handle error…do not pass go, do not collect $200…
}
Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
0

Your penultimate conversion specifier is %c, which only wants a single character. Your input has 3: "255" (or some such).

Hence, you are getting off-track, and each line is reading one more item from the last line than the prior iteration.

Dúthomhas
  • 8,200
  • 2
  • 17
  • 39