
I'm slowly continuing to learn C, and I now have a task: read data from a file and sort it.

File data:

House naming 1 30 300
House naming 2 45 450
.......
House naming 10 5 120

The first value is the house name, which can be anything, e.g. Empire state building

The second value is: House address (I chose only integer values)

The third value is: Age of a building

The fourth value is: Kilowatt-hour/year

The program must read data from a file -> print it out -> sort it (how? see below) -> print it out again, sorted.

Sorting:

  • kwh < 200 - sustainable building,
  • kwh < 300 && age < 40 - needs renovation,
  • kwh > 300 && age > 40 - set for demolition.

Here's the code:

#include <stdio.h>
#include <stdlib.h>
#include "input.h"

int main(void) {
    int kwh;
    int age;
    char building[SIZE];
    int addr;
    char buff[SIZE];
    FILE *fi;

    // opening the files and checking if it succeeded
    fi = fopen(F_INPUT, "r");
    if (fi == NULL) {
        printf("Error opening input file \"%s\"", F_INPUT);
        exit(EXIT_INPUT_FAIL);
    }
    while (fgets(buff, sizeof(buff), fi) != NULL) {
        sscanf(buff, "%s %d %d %d", building, &addr,&age,&kwh);
        if (kwh < 200) {
            puts(buff);
            printf("Sustainable\n");
        } else
        if (kwh < 300 && age < 40) {
            puts(buff);
            printf("Needs renovation\n");
        } else
        if (kwh > 300 && age > 40) {
            puts(buff);
            printf("IN DEMOLITION LIST\n");
        }
    }
    /* close the files when they're not needed anymore */
    fclose(fi);
    return 0;
}

I've combined a few steps to make it a bit easier: the program reads the data and outputs each building already marked as 1) Sustainable, 2) Needs renovation, or 3) Set for demolition.

The problem is somewhere in the while loop, and I think it's in the sscanf call. As I understand it, sscanf should parse each line read from the file as: string, integer, integer, integer (compare the format string with the input file). The program reads the file and outputs the data, but marks all buildings as sustainable.

What would you suggest I read more carefully, or what approach is better for reading a name made of several words?

Output:

House naming 1 30 300
Sustainable
House naming 2 45 450
Sustainable
........
House naming 10 5 120
Sustainable
chqrlie
  • Note that format specifier `%s` stops at the first whitespace, so it can't be used to read, as in your example, "Empire state building". One solution could be to break the input string into an array of token pointers with `strtok`, extract integers from the last three, and build a new string from the first ones remaining. – Weather Vane Mar 08 '20 at 21:56
  • Your `sscanf` is not working as you expect it to. Always check the return values of functions, in this case `sscanf`. And invest time into learning how to use a debugger. Also useful: [How to debug small programs](https://ericlippert.com/2014/03/05/how-to-debug-small-programs/) – kaylum Mar 08 '20 at 21:57
  • `sscanf` with `%s` reads until the first whitespace encountered. If the name field can have multiple words (including numbers?) it can be rather daunting. Try splitting into multiple space-delimited fields; then the final 3 fields are the numbers, and the first n-3 comprise the name. – FredK Mar 08 '20 at 21:57
  • Can "House naming" include digits? like "Stdio 54"? – chux - Reinstate Monica Mar 08 '20 at 22:52
  • @chux-ReinstateMonica Yea, it can include digits and can not. If I am right, the task says nothing about namings(choose on your own). – Alexey Kozlov Mar 09 '20 at 21:03
  • @WeatherVane @kaylum Why in this case `sscanf` reads perfectly with whitespaces? https://www.tutorialspoint.com/c_standard_library/c_function_sscanf.htm – Alexey Kozlov Mar 09 '20 at 21:12
  • Note that in the example on the page you linked, there are 4 format specifiers, and 4 text sequences in the input separated by whitespace. If there were **five** text sequences, for example `"Saturday Sunday March 25 1989"` it would not work properly. Format `%s` stops **at the first whitespace**. The best thing you can do IMO is to write a short test program to explore how `scanf` behaves, and with no other purpose. – Weather Vane Mar 09 '20 at 21:36
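
The token-splitting idea from the comments above could look roughly like the sketch below. It is only an illustration: `parse_tokens`, `MAX_TOKENS` and the parameter names are made up here, it assumes names are separated from the numbers by whitespace, and it does not check that the numbers fit in an `int`.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_TOKENS 32  /* assumed upper bound on words per line */

/* Hypothetical helper: splits `line` in place on whitespace, converts the
   last three tokens to integers and joins the remaining ones into `name`.
   Returns 1 on success, 0 if the line does not fit the expected shape. */
int parse_tokens(char *line, char *name, size_t name_size,
                 int *addr, int *age, int *kwh) {
    char *tok[MAX_TOKENS];
    int n = 0;

    if (name_size == 0)
        return 0;
    for (char *t = strtok(line, " \t\r\n"); t != NULL;
         t = strtok(NULL, " \t\r\n")) {
        if (n == MAX_TOKENS)
            return 0;           /* too many words */
        tok[n++] = t;
    }
    if (n < 4)
        return 0;               /* need a name word and 3 numbers */

    int *out[3] = { addr, age, kwh };
    for (int i = 0; i < 3; i++) {
        char *end;
        long v = strtol(tok[n - 3 + i], &end, 10);
        if (end == tok[n - 3 + i] || *end != '\0')
            return 0;           /* token is not a plain integer */
        *out[i] = (int)v;
    }

    /* rebuild the name from the first n - 3 tokens */
    size_t used = 0;
    name[0] = '\0';
    for (int i = 0; i < n - 3; i++) {
        size_t len = strlen(tok[i]);
        if (used + len + (i > 0) + 1 > name_size)
            return 0;           /* name does not fit */
        if (i > 0)
            name[used++] = ' ';
        memcpy(name + used, tok[i], len + 1);
        used += len;
    }
    return 1;
}
```

Because the numbers are taken from the end of the line, a name such as "Stdio 54" is handled too: only the last three tokens are treated as integers. Note that `strtok` modifies `line`, so print the original line before parsing if it is still needed.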

2 Answers


Your problem is tricky to solve with sscanf() because there is no explicit separator between the house name and the 3 numeric fields. %s is inappropriate: it parses a single word. In your program, sscanf() actually fails to convert the numbers and returns 1 for all lines, leading to undefined behavior when you compare the numeric values that are actually uninitialized.
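
A quick way to see that failure is to look at the return value of sscanf() for a sample line. The helper below is a hypothetical test function (the name `count_converted` is made up for this illustration); it uses the same conversions as the question, with a width added to `%s` to bound the write.

```c
#include <stdio.h>

/* Returns how many fields sscanf() converted using the question's format. */
int count_converted(const char *line) {
    char word[100];
    int addr = 0, age = 0, kwh = 0;

    return sscanf(line, "%99s %d %d %d", word, &addr, &age, &kwh);
}
```

With "House naming 1 30 300" the call returns 1: `%s` consumes "House", then the first `%d` meets "naming" and the conversion stops. With a one-word name such as "House 1 30 300" it returns 4.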

Here is a modified version using the %[ conversion specification:

#include <stdio.h>
#include <stdlib.h>

#define F_INPUT  "input.txt"
#define EXIT_INPUT_FAIL  1

int main(void) {
    char buff[256];
    char building[100];
    int addr, age, kwh;
    FILE *fi;

    // opening the files and checking if it succeeded
    fi = fopen(F_INPUT, "r");
    if (fi == NULL) {
        fprintf(stderr, "Error opening input file \"%s\"\n", F_INPUT);
        exit(EXIT_INPUT_FAIL);
    }
    while (fgets(buff, sizeof(buff), fi) != NULL) {
        /* parse the building name up to and excluding any digit,
           then accept 3 integral numbers for the address, age and power */
        if (sscanf(buff, "%99[^0-9]%d%d%d", building, &addr, &age, &kwh) != 4) {
            printf("parsing error: %s", buff);
            continue;
        }
        if (kwh < 200) {
            puts(buff);
            printf("Sustainable\n");
        } else
        if (kwh < 300 && age < 40) {
            puts(buff);
            printf("Needs renovation\n");
        } else
        if (kwh > 300 && age > 40) {
            puts(buff);
            printf("IN DEMOLITION LIST\n");
        }
        // allocate structure with building details and append it to the list or array of buildings
    }
    /* close the files when they're not needed anymore */
    fclose(fi);
    // sort the list or array and print it
    // free the list or array
    return 0;
}
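
The steps left as comments in the code above (store each building, then sort and print) might be sketched as follows. Everything here is an assumption made for illustration: the `struct building` layout, the `enum category` names, and the choice to order by category first and by consumption second. The task itself only names the three categories.

```c
#include <stdlib.h>

/* Hypothetical record type; field names are assumptions. */
struct building {
    char name[100];
    int addr, age, kwh;
};

enum category { SUSTAINABLE, NEEDS_RENOVATION, DEMOLITION, UNCLASSIFIED };

/* Same tests, in the same order, as the if/else chain above. */
enum category classify(const struct building *b) {
    if (b->kwh < 200) return SUSTAINABLE;
    if (b->kwh < 300 && b->age < 40) return NEEDS_RENOVATION;
    if (b->kwh > 300 && b->age > 40) return DEMOLITION;
    return UNCLASSIFIED;   /* e.g. kwh == 300, which no rule covers */
}

/* qsort() comparator: order by category, then by consumption (assumed). */
int compare_buildings(const void *pa, const void *pb) {
    const struct building *a = pa, *b = pb;
    int ca = classify(a), cb = classify(b);

    if (ca != cb)
        return (ca > cb) - (ca < cb);
    return (a->kwh > b->kwh) - (a->kwh < b->kwh);
}
```

With the parsed records stored in an array `list` of `count` elements, the sort is a single call: `qsort(list, count, sizeof list[0], compare_buildings);`. The `UNCLASSIFIED` case matters with the sample data, since a line with kwh equal to 300 matches none of the three rules.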
chqrlie
  • I've found this example on Tutorials Point, where `sscanf` reads with whitespaces. It's pretty much the same as mine, isn't it? https://www.tutorialspoint.com/c_standard_library/c_function_sscanf.htm – Alexey Kozlov Mar 09 '20 at 21:09
  • @AlexeyKozlov: the example in this help page reads 2 words with 2 `%s` conversion specifications. It is possible to include spaces with the `%[...]` conversion specification but you need a separator... I just updated my answer to use this approach, stopping on digits. – chqrlie Mar 10 '20 at 06:44

Reading the line from the file into a string via fgets() is a good first step as OP has done.

Can "House naming" include digits? like "Stdio 54"?
Yea, it can include digits and can not. If I am right, the task says nothing about namings.

The next part is tricky as there is not a unique separator between the house name and the following 3 integers.

One approach would be to find the 3 trailing integers and then treat the remaining leading text as the house name.

  while (fgets(buff, sizeof(buff), fi) != NULL) {
    int address, age, power;
    char *p = buff + strlen(buff);  // start at the end of buff (needs <string.h>)
    p = reverse_scan_int(buff, p, &power);
    p = reverse_scan_int(buff, p, &age);
    p = reverse_scan_int(buff, p, &address);
    if (p) {
      *p = '\0';
      trim(buff);  // remove leading and trailing white-space
      printf("house_name:'%s', address:%d age:%d power:%d\n", buff, address,
          age, power);
    } else {
      printf("Failed to parse '%s'\n", buff);
    }
  }

Now all we need is reverse_scan_int(). Sample untested code idea:

#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

// Scan one integer backwards from `one_past` toward `begin`.
// Returns a pointer to the start of the integer, or NULL on failure.
char *reverse_scan_int(const char *begin, char *one_past, int *i) {
  if (one_past == NULL) {
    return NULL;
  }
  char *p = one_past;
  // Skip trailing whitespace; is...() functions need unsigned char values
  while (p > begin && isspace((unsigned char) p[-1])) {
    p--;
  }
  // Skip digits;
  bool digit_found = false;
  while (p > begin && isdigit((unsigned char) p[-1])) {
    p--;
    digit_found = true;
  }
  if (!digit_found)
    return NULL;
  // Accept an optional sign
  if (p > begin && (p[-1] == '-' || p[-1] == '+')) {
    p--;
  }
  *i = atoi(p); // More robust code would use `strtol()`, leave for OP.
  return p;
}

There are lots of ways to trim a string including this one.
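
For completeness, one possible `trim()` that fits the call in the loop above is sketched below. It is an assumption about what such a helper could look like, not the version the sentence above refers to; it edits the string in place and returns it.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical trim(): removes leading and trailing white-space in place
   and returns `s`. */
char *trim(char *s) {
    size_t len = strlen(s);

    /* drop trailing white-space by shortening the logical length */
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        len--;

    /* find the first non-space character */
    size_t start = 0;
    while (start < len && isspace((unsigned char)s[start]))
        start++;

    /* shift the kept part to the front and terminate it */
    memmove(s, s + start, len - start);
    s[len - start] = '\0';
    return s;
}
```

`memmove()` is used rather than `memcpy()` because the source and destination ranges overlap.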

chux - Reinstate Monica
  • What benefit do you find in using `bool` aside from just using an `int`? I've thought through that a number of times and there is no guaranteed size for `bool` that would save a couple of bytes over int (though many compilers implement it with a type smaller than `int`), then you get the additional processing to handle the shorter type in the register (though that can potentially allow the processor to optimize something else in the same register if the type is smaller). So I've tended to just stick with `int` for flags, any persuasive counter argument I'm missing? Readability? – David C. Rankin Mar 09 '20 at 23:04
  • @DavidC.Rankin Fair points. Benefit: 6.001 vs 1/2 dozen of the other. IMO, for demo/learner code and for future trends `bool` is the way to go for clarity and more optimal code. Might `int` end up more optimal? Perhaps, one could profile, but not that important for such a minor linear potential gain. IAC, for speed, I would have compared a before/after `p` pointers and dropped the `digit_found`, yet I found this more clear. – chux - Reinstate Monica Mar 10 '20 at 02:06
  • From a general standpoint, I do think it adds readability as it clearly designates that variable as a `true/false` indication rather than it simply being an `int`. I guess it's also a grey-brain thing with there being no `bool` until C99 [Is bool a native C type?](https://stackoverflow.com/questions/1608318/is-bool-a-native-c-type/1608350), so it makes your thought on future trends, for the new programmers going forward, a good point. – David C. Rankin Mar 10 '20 at 02:31