
I know this is a very trivial question but I just need quick help. I have been trying to figure this out for a while now. All I am trying to do is read only integers from a text file that has the form

 8 blah blah
 10 blah blah
 2 blah blah
 3 blah blah

I ultimately want to take the numbers only, store them in an array and put those numbers in a BST. My BST works fine when I have a file with just numbers, but not with the specified file format.

It doesn't matter what blah is; I just want to get the numbers and store them in an array. I can do this if I take out the blahs. Using fscanf, I got my code to store the first number, which is 8, but it stops there. In this example there are four lines, but the number of lines in the file is not fixed; it could be 12 or 6. How can I do this properly? Below is my poor attempt to solve this.

 fscanf(instructionFile, "%d", &num);

I also tried doing something like

 while(!feof(instructionFile)){
  fscanf("%d %s %s", &num, string1, string2);
 }

to store everything and use only the integers, but my BST doesn't work when I do something like that.

Chibuikem
  • Stop using fscanf. Use `fgets`, and then parse with `strtol` – William Pursell Mar 24 '17 at 00:57
  • Also, see http://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong – William Pursell Mar 24 '17 at 00:57
  • Are the numbers always at the start of a line? Is there always a number in the line? Can there be many numbers on a line? Can the number be surrounded by letters or punctuation? Do you have to worry about signs? – Jonathan Leffler Mar 24 '17 at 01:04
  • @JonathanLeffler I got it to work. We can assume the numbers are always at the start of the line, we do not have to worry about signs, and there is only one number per line. I can imagine how complex this would be if there were multiple numbers per line at different places – Chibuikem Mar 24 '17 at 20:14
  • Glad you got it solved. Yes, those issues I mentioned in my previous comment all make it harder to deal with. It depends on the context, and it sounds like you've got a fairly benign set of data to work with. Don't forget to accept the most helpful answer, assuming at least one of them was helpful. That lets others know that the question has been resolved. – Jonathan Leffler Mar 24 '17 at 21:04

3 Answers


Use fgets() to fetch a line of input, and sscanf() to get the integer. In your example use of fscanf(), the first call would read an int, and the next calls would fail since the next item in the input stream is not an int. After each failure, the bad input is left in the input stream. By getting a line at a time, you can scan the line at your leisure, before fetching another line of input.

Here is an example of how you might do this. And note that you should not use feof() to control the read loop; instead, use the return value from fgets(). This code assumes that the first entry on a line is the data you want, perhaps with leading whitespace. The format string can be modified for slightly more complex circumstances. You can also use strtok() if you need finer control over parsing of the lines.

#include <stdio.h>
#include <stdlib.h>

#define MAX_LINES  100

int main(void)
{
    FILE *fp = fopen("data.txt", "r");
    if (fp == NULL) {
        fprintf(stderr, "Unable to open file\n");
        exit(EXIT_FAILURE);
    }

    char buffer[1000];
    int arr[MAX_LINES];
    size_t line = 0;

    /* Stop at MAX_LINES so a longer file cannot overflow arr[] */
    while (line < MAX_LINES && fgets(buffer, sizeof buffer, fp) != NULL) {
        if (sscanf(buffer, "%d", &arr[line]) != 1) {
            fprintf(stderr, "Line formatting error\n");
            exit(EXIT_FAILURE);
        }
        ++line;
    }

    for (size_t i = 0; i < line; i++) {
        printf("%5d\n", arr[i]);
    }

    fclose(fp);

    return 0;
}

It would be good to add a check for empty lines before the call to sscanf(); right now an empty line is considered badly formatted data.

Output for your example file:

    8
   10
    2
    3
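
The empty-line check suggested above could be sketched roughly as follows. The helper names `is_blank` and `parse_leading_int` are hypothetical (they are not part of the program above), and the three-way return value is just one possible convention: 1 for a parsed integer, 0 for a blank line to skip, -1 for a non-blank line that does not start with an integer.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if the line holds only whitespace
 * (including a lone newline or an empty string), 0 otherwise. */
static int is_blank(const char *s)
{
    assert(s != NULL);
    while (*s != '\0') {
        if (!isspace((unsigned char)*s)) {
            return 0;
        }
        ++s;
    }
    return 1;
}

/* Hypothetical helper: returns 1 and stores the leading integer in *out,
 * 0 if the line is blank (the caller can simply skip it), or
 * -1 if the line is non-blank but does not start with an integer. */
static int parse_leading_int(const char *line, int *out)
{
    assert(out != NULL);
    if (is_blank(line)) {
        return 0;
    }
    return sscanf(line, "%d", out) == 1 ? 1 : -1;
}
```

In the read loop, one might call `parse_leading_int(buffer, &arr[line])`, increment `line` only when it returns 1, continue silently on 0, and report a formatting error on -1.
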
ad absurdum

If you want to pick out only the integers from a mess of a file, then you need to work through each line you read with a pointer, identifying each beginning digit (or beginning - sign for negative numbers) and converting each integer found one at a time. You can do this with a pointer and sscanf, or with strtol, making use of the endptr parameter to move to the character following each successful conversion. You can also use character-oriented input (e.g. getchar or fgetc) and perform the digit identification and conversion manually if you like.

Given you started with the fgets and sscanf approach, the following continues with it. Whether you use sscanf or strtol, the whole key is to advance the start of your next read to the character following each integer found, e.g.

#include <stdio.h>
#include <stdlib.h>

#define MAXC 256

int main (int argc, char **argv) {

    char buf[MAXC] = "";    /* buffer to hold MAXC chars at a time */
    int nval = 0;           /* total number of integers found */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
        return 1;
    }

    while (fgets (buf, MAXC, fp)) {

        char *p = buf;      /* pointer to line */
        int val,            /* int val parsed */
            nchars = 0;     /* number of chars read */

        /* while chars remain in buf and a valid conversion to int takes place
         * output the integer found and update p to point to the start of the
         * next digit.
         */
        while (*p) {
            if (sscanf (p, "%d%n", &val, &nchars) == 1) {
                printf (" %d", val);
                if (++nval % 10 == 0)     /* output 10 int per line */
                    putchar ('\n');
            }
            p += nchars;        /* move p nchars forward in buf */

            /* find next number in buf */
            for (; *p; p++) {
                if (*p >= '0' && *p <= '9') /* positive value */
                    break;
                if (*p == '-' && *(p+1) >= '0' && *(p+1) <= '9') /* negative */
                    break;
            }
        }
    }
    printf ("\n %d integers found.\n", nval);

    if (fp != stdin) fclose (fp);     /* close file if not stdin */

    return 0;
}

Example Input

The following two input files illustrate picking only integers out of mixed input. Your file:

$ cat dat/blah.txt
 8 blah blah
 10 blah blah
 2 blah blah
 3 blah blah

A really messy file

$ cat ../dat/10intmess.txt
8572,;a -2213,;--a 6434,;
a- 16330,;a

- The Quick
Brown%3034 Fox
12346Jumps Over
A
4855,;*;Lazy 16985/,;a
Dog.
11250
1495

Example Use/Output

In your case:

$ ./bin/fgets_sscanf_int_any_ex < dat/blah.txt
 8 10 2 3
 4 integers found.

With the really messy file:

$ ./bin/fgets_sscanf_int_any_ex <dat/10intmess.txt
 8572 -2213 6434 16330 3034 12346 4855 16985 11250 1495

 10 integers found.
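
For comparison, the strtol alternative mentioned at the top of this answer might look something like the sketch below. `extract_ints` is a hypothetical helper name, and storing the values as `long` in a caller-supplied array is an assumption made for illustration; the scanning rule (a digit, or a `-` immediately followed by a digit, starts a number) mirrors the sscanf version.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: scans `line`, stores up to `max` integers in out[],
 * and returns how many were stored. */
static size_t extract_ints(const char *line, long *out, size_t max)
{
    assert(line != NULL && out != NULL);
    size_t n = 0;
    const char *p = line;

    while (*p != '\0' && n < max) {
        /* A number starts at a digit, or at '-' directly followed by a digit. */
        int starts_number =
            isdigit((unsigned char)*p) ||
            (*p == '-' && isdigit((unsigned char)p[1]));

        if (!starts_number) {
            ++p;                /* not a number: advance one character */
            continue;
        }

        char *end;
        errno = 0;
        long val = strtol(p, &end, 10);
        if (errno != ERANGE) {  /* drop values that do not fit in a long */
            out[n++] = val;
        }
        p = end;                /* resume just past the digits consumed */
    }
    return n;
}
```

Calling `extract_ints` on each line returned by fgets would give the same sequence of values as the sscanf loop for the two sample files, since strtol's `endptr` plays the role that `%n` plays above.
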

Look things over and let me know if you have any questions.

David C. Rankin

A simple way to "read only integers" is to use fscanf(file_pointer, "%d", ...), and to call fgetc() when that fails:

int x;
int count;
while ((count = fscanf(file_pointer, "%d", &x)) != EOF) {
  if (count == 1) { 
    // Use the `int` in some fashion (store them in an array)
    printf("Success, an int was read %d\n", x);
  } else {
    fgetc(file_pointer); // Quietly consume 1 non-numeric character
  }
}

"I got my code to store the first number which is 8, but it stops there."

That is because the offending non-numeric input remains in the FILE stream. That text needs to be consumed in some other way. Calling fscanf(instructionFile, "%d", &num); again simply results in the same problem: fscanf() fails because the next input is still non-numeric.
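
If, as OP's comment says, each line starts with exactly one number, another way to consume the leftover text is to discard the rest of the line with an assignment-suppressed scanset. The sketch below is only an illustration under that assumption; `read_leading_ints` is a hypothetical name, and anything after the first token on a line is thrown away.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads at most `max` integers, one from the start
 * of each line of `fp`, into out[]; returns how many were stored. */
static size_t read_leading_ints(FILE *fp, int *out, size_t max)
{
    assert(fp != NULL && out != NULL);
    size_t n = 0;
    int x;
    int rc;

    while (n < max && (rc = fscanf(fp, "%d", &x)) != EOF) {
        if (rc == 1) {
            out[n++] = x;       /* the line started with an integer */
        }
        /* Discard whatever remains on the current line. The newline itself
         * is left in the stream; the next "%d" skips it as whitespace. */
        (void)fscanf(fp, "%*[^\n]");
    }
    return n;
}
```

Lines that do not start with an integer are skipped whole rather than character by character.
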


Note: OP's second snippet is missing the FILE pointer argument:

// fscanf(????, "%d %s %s", &num, string1, string2);    
chux - Reinstate Monica