I am reading a file that contains both text and integers. I need to extract only the integers and skip the text.

I have implemented code that reads the integers, but how do I skip the text that appears in between and continue reading integers?

Input:

01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
some text
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000

#include<bits/stdc++.h>
using namespace std;

int main(void){
    unsigned int number1;

    FILE* in_file = fopen("example.txt", "r");
    FILE* in_file1 = fopen("wrte.txt", "w");
    if (!in_file)
    {
        printf("oops, file can't be read\n");
        exit(-1);
    }

    // attempt to read the next number and store
    // the value in the "number1" variable
    while (fscanf(in_file, "%08x", &number1) == 1){
        fprintf(in_file1, "%08x\n", number1);
    }
    fclose(in_file1);
    fclose(in_file);
    return 0;
}

Expected output: each 01000000 on its own line, with no text in the output.

Jonathan Mee
aks
    (OT: [Why should I not #include <bits/stdc++.h>?](https://stackoverflow.com/questions/31816095/why-should-i-not-include-bits-stdc-h)) – Biffen May 24 '19 at 14:50

5 Answers

I can think of a couple simple ways to do this.

You can read everything as a string and then discard anything that cannot be converted to an integer.

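// filein is assumed to be an already opened std::ifstream
std::string token;
while (filein >> token) // read whitespace-delimited token, exit on failure
{
    try
    {
        unsigned long value = std::stoul(token, nullptr, 16); // try to turn string into integer
        // use value
    }
    catch (const std::invalid_argument &) // couldn't convert. do nothing
    {
    }
    catch (const std::out_of_range &) // number too large for unsigned long. do nothing
    {
    }
}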

Documentation for std::stoul.

If non-numeric data is frequent (as in it's not exceptional), you may not want to use exceptions. In this case, look into using strtoul and performing the error handling yourself.

Documentation for strtoul.
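
As a rough illustration of the strtoul route, here is a minimal sketch. The helper names parse_hex and read_hex_values are hypothetical, and the sketch assumes the numbers are whitespace-delimited hexadecimal tokens, as the "%08x" format in the question suggests. It rejects a token unless strtoul consumes all of it.

```cpp
#include <cerrno>
#include <cstdlib>
#include <istream>
#include <string>
#include <vector>

// Hypothetical helper: returns true and stores the value only if the whole
// token is a hexadecimal number that fits in an unsigned long.
bool parse_hex(const std::string &token, unsigned long &value)
{
    if (token.empty() || token[0] == '-' || token[0] == '+')
        return false;               // strtoul would accept a sign; reject it here
    errno = 0;
    char *end = nullptr;
    unsigned long parsed = std::strtoul(token.c_str(), &end, 16);
    if (end == token.c_str())       // no characters were converted
        return false;
    if (*end != '\0')               // trailing junk, e.g. "12xyz"
        return false;
    if (errno == ERANGE)            // value does not fit in unsigned long
        return false;
    value = parsed;
    return true;
}

// Hypothetical helper: collects every hex number in the stream and
// silently skips every token that is not one.
std::vector<unsigned long> read_hex_values(std::istream &in)
{
    std::vector<unsigned long> values;
    std::string token;
    while (in >> token) {           // read one whitespace-delimited token
        unsigned long value = 0;
        if (parse_hex(token, value))
            values.push_back(value);
    }
    return values;
}
```

Calling read_hex_values on an opened std::ifstream should return the numbers in file order. Note that a word made up only of hex digits, such as "add", would be accepted as a number, so stricter rules may be needed for other inputs.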

You can also use >> to read integers as integers and check for success. When the read fails, clear the fail bit and read as a string. If you can't read a string, the file is broken or completely consumed, so stop reading. If you can read a string, throw it out and go back to reading integers.

// filein is assumed to be an already opened std::ifstream
while (true)
{
    unsigned int value;
    if (filein >> std::hex >> value) // read a number
    {
        // use value
    }
    else // failed to read number
    {
        filein.clear(); // clear fail bit
        std::string junk;
        if (!(filein >> junk)) // read a string
        {
            break; // no more readable data. exit loop
        }
        // do nothing with junk.
    }
}

You can improve on this with ignore and by checking for end of file and exiting before trying to read the string.
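
One possible shape of that improvement is sketched below. The function name read_hex_skipping_lines is hypothetical, and the sketch assumes that stray text occupies the rest of its line, as "some text" does in the sample input. Any number that follows text on the same line would be discarded too.

```cpp
#include <ios>
#include <istream>
#include <limits>
#include <sstream>   // std::istringstream, handy for trying the helper without a file
#include <vector>

// Hypothetical helper: reads hex numbers with >> and, on a failed read that
// is not caused by end of file, throws away the rest of the current line.
std::vector<unsigned int> read_hex_skipping_lines(std::istream &in)
{
    std::vector<unsigned int> values;
    unsigned int value;
    while (true) {
        if (in >> std::hex >> value) {
            values.push_back(value);            // good read
        } else if (in.eof() || in.bad()) {
            break;                              // nothing left, or unrecoverable error
        } else {
            in.clear();                         // clear the fail bit
            // assumption: the junk runs to the end of the line
            in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
        }
    }
    return values;
}
```

Checking eof() before clearing means the loop exits as soon as the data runs out, without the extra string read.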

user4581301

I pretty much never rely on things like fscanf.

I would read lines of text and parse them intelligently. If you know the line is space-delimited, you can split at spaces and then look at each chunk individually: if (isdigit(first character in chunk)) then int value = atoi(chunk).

You can even be more careful and make sure the entire chunk represents a legal number before calling atoi.
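
A minimal sketch of this line-by-line approach follows. The names is_hex_chunk and parse_lines are hypothetical. The sketch also swaps atoi for strtoul with base 16 and isdigit for isxdigit, on the assumption that the values are hexadecimal as the "%08x" in the question implies. That substitution is not part of the suggestion above.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: true if the chunk is non-empty and consists only of
// hexadecimal digits, i.e. the entire chunk is a legal number.
bool is_hex_chunk(const std::string &chunk)
{
    return !chunk.empty() &&
           std::all_of(chunk.begin(), chunk.end(),
                       [](unsigned char ch) { return std::isxdigit(ch) != 0; });
}

// Hypothetical helper: reads the stream line by line, splits each line at
// whitespace, and converts only the chunks that pass validation.
std::vector<unsigned long> parse_lines(std::istream &in)
{
    std::vector<unsigned long> values;
    std::string line;
    while (std::getline(in, line)) {            // one line at a time
        std::istringstream chunks(line);
        std::string chunk;
        while (chunks >> chunk) {               // split at whitespace
            if (is_hex_chunk(chunk))
                values.push_back(std::strtoul(chunk.c_str(), nullptr, 16));
        }
    }
    return values;
}
```

Because the whole chunk is validated first, strtoul never sees a partially numeric token such as "12xyz". A word spelled only with hex digits (for example "add") would still pass, so a length check may be worth adding for other data.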

Joseph Larson

While you would generally use the C++ iostream library for file I/O, there is nothing that says you can't use the C cstdio functions such as fscanf -- as long as you use them correctly (and often they will be faster than the iostream approach).

In your case, you have a lot of numbers with some text in the middle that you are attempting to read with fscanf in a loop. That's fine, that's simple enough to do, but ... you must correctly handle the matching failure case which will occur when you attempt to read 's' with the "%08x" conversion specifier.

When a matching failure occurs, character extraction from the stream stops at the point of failure, leaving everything beginning with the character causing the failure (and what follows it) unread in the input buffer. Unless you properly extract the characters causing the matching failure from the input stream, you will likely encounter an endless loop as the characters causing the failure remain unread, just waiting to bite you again on the next attempted read.

So, how to handle the matching failure? The cctype header provides the isdigit function, which lets you test whether a character is a digit. You obtain the next character from the input stream by reading it with fgetc (or getc - same thing but often implemented as a macro) and then test it with isdigit, e.g.

            int c = fgetc(in_file);             /* read next char */
            while (c != EOF && !isdigit(c))     /* check EOF and isdigit */
                c = fgetc(in_file);             /* get next char */

Above you read the next character, then enter a loop validating that you haven't reached EOF and checking whether c is not a digit. While both conditions hold, you get the next character and check again, until you reach EOF or find the next digit in the input stream. But now you have a problem: you have already read the digit from the stream, so how is fscanf going to be able to read it as part of the next integer?

Simple -- put it back in the input stream:

            if (c != EOF)                       /* if not EOF, then digit */
                ungetc (c, in_file);            /* put back for next read */

Now you are in a position to read all 64 integer values from in_file with a simple loop, e.g.

    while (1) { /* loop continually until EOF */
        int rtn = fscanf (in_file,"%08x", &number1);    /* validate return */
        if (rtn == EOF)         /* if EOF, break loop */
            break;
        else if (rtn == 0) {    /* handle matching failure */
            int c = fgetc(in_file);             /* read next char */
            while (c != EOF && !isdigit(c))     /* check EOF and isdigit */
                c = fgetc(in_file);             /* get next char */
            if (c != EOF)                       /* if not EOF, then digit */
                ungetc (c, in_file);            /* put back for next read */
        }
        else    /* good read, output number */
            fprintf (out_file, "%08x\n", number1); 
    }

(note: your output file has been renamed from in_file1 to out_file -- always use meaningful variable names)

Now some clean up. When you open in_file, you validate the file is open for reading. Fine, but for the error condition you call exit (-1). Don't return negative values to the shell. You have two constants to indicate success/failure, named EXIT_SUCCESS (0) and EXIT_FAILURE (typically 1, not -1).

While you did check that in_file was open for reading, you wholly failed to check whether your output file was open for writing. Always validate the return of all input/output stream and I/O functions. Otherwise, if fopen failed, writing through the resulting null FILE pointer invokes Undefined Behavior.

Putting it all together, you could do:

#include <cstdio>
#include <cstdlib>
#include <cctype>

using namespace std;

int main (void) {

    unsigned int number1; 

    FILE* in_file = fopen ("example.txt", "r"); 
    FILE* out_file = fopen ("wrte.txt", "w"); 

    if (!in_file) {     /* validate file open for reading */
        printf ("oops, file can't be read\n"); 
        exit (1);       /* don't return negative values to the shell */
    }
    if (!out_file) {    /* validate file open for writing */
        printf ("oops, file can't be written\n"); 
        exit (1);       /* don't return negative values to the shell */
    }

    while (1) { /* loop continually until EOF */
        int rtn = fscanf (in_file,"%08x", &number1);    /* validate return */
        if (rtn == EOF)         /* if EOF, break loop */
            break;
        else if (rtn == 0) {    /* handle matching failure */
            int c = fgetc(in_file);             /* read next char */
            while (c != EOF && !isdigit(c))     /* check EOF and isdigit */
                c = fgetc(in_file);             /* get next char */
            if (c != EOF)                       /* if not EOF, then digit */
                ungetc (c, in_file);            /* put back for next read */
        }
        else    /* good read, output number */
            fprintf (out_file, "%08x\n", number1); 
    }
    fclose (in_file);
    fclose (out_file);
}

Example Output File

$ cat wrte.txt
01000000
01000000
01000000
01000000
...
01000000

All 64 values are written which you can confirm with wc -l, e.g.

$ wc -l < wrte.txt
64

Look things over and let me know if you have further questions. The same logic would apply if you were using the iostream library, the function names are slightly different (some identical) but are implemented as member-functions instead.
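
For comparison, here is a rough sketch of how the same skip-to-next-digit logic might look with iostream member functions. The function name read_hex_skip_non_digits is hypothetical. peek() takes the place of the fgetc/ungetc pair because it inspects the next character without removing it from the stream.

```cpp
#include <cctype>
#include <ios>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying the helper without a file
#include <vector>

// Hypothetical helper: mirrors the fscanf loop above with iostream calls.
std::vector<unsigned int> read_hex_skip_non_digits(std::istream &in)
{
    const int eof = std::istream::traits_type::eof();
    std::vector<unsigned int> values;
    unsigned int value;
    in >> std::hex;                         // read numbers as hexadecimal
    while (true) {
        if (in >> value) {                  // good read, keep the number
            values.push_back(value);
            continue;
        }
        if (in.eof() || in.bad())           // end of input or I/O error
            break;
        in.clear();                         // matching failure: clear fail bit
        int c = in.peek();                  // look at next char without extracting
        while (c != eof && !std::isdigit(c)) {
            in.get();                       // discard the offending character
            c = in.peek();
        }
    }
    return values;
}
```

With the sample input this should yield the same 64 values as the fscanf version. As in that version, only characters that fail isdigit are skipped, so text that contains digits would leave partial numbers behind.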

David C. Rankin

Sure, that's a pretty simple task. You'll just need to modify your loop a little and use feof as your loop condition:

while(feof(in_file) == 0) {
    if(fscanf(in_file, " %8x ", &number1) > 0) {
        fprintf(in_file1,"%08x\n", number1);
    } else {
        fscanf(in_file, " %*s ");
    }
}

I'd also like to suggest that you abandon FILE* and start using fstreams from <fstream>, but that's just a convenience suggestion.

Jonathan Mee

Here is a readable solution with emphasis on simplicity:

// -*- compile-command: "g++ data.cpp; ./a.out "; -*-
#include <fstream>
#include <iomanip>
#include <iterator>
#include <string>

int main()
{
  std::ifstream fin("data.in");
  std::ofstream fout("data.out");

  auto fin_iter = std::istream_iterator<std::string>(fin);
  const auto fin_iter_end = std::istream_iterator<std::string>();

  while (fin_iter != fin_iter_end)
  {
    try
    {
      fout << std::setfill('0') << std::setw(8) << std::stoul(*fin_iter) << " ";
    }
    catch (...)
    {
    }
    ++fin_iter;
  }

  fin.close();
  fout.close();

  return 0;
}

Here is the "idea":

To answer your question, you can "skip text coming in between integers" because stoul throws an exception when a token cannot be converted. If we catch an exception, we do nothing; otherwise we write the converted integer into the output file.
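
One caveat worth sketching: std::stoul only throws when no conversion at all is possible, so a token such as "12abc" converts to 12 without an exception. The hypothetical helper below, whole_token_to_ulong, uses the second parameter of std::stoul to check that the entire token was consumed. It keeps base 10, as in the code above.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts the token only if std::stoul consumes all
// of it; returns false for text, partial numbers, and out-of-range values.
bool whole_token_to_ulong(const std::string &token, unsigned long &value)
{
    try {
        std::size_t used = 0;                               // characters consumed
        unsigned long parsed = std::stoul(token, &used);    // base 10
        if (used != token.size())
            return false;       // e.g. "12abc": stoul stops at 'a' without throwing
        value = parsed;
        return true;
    } catch (const std::invalid_argument &) {
        return false;           // no digits at all, e.g. "some"
    } catch (const std::out_of_range &) {
        return false;           // does not fit in unsigned long
    }
}
```

In the loop above, the call to std::stoul(*fin_iter) could be replaced by this check, writing the value only when it returns true.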

data.in

01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
some text
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000
01000000 01000000 01000000 01000000

data.out

01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000 01000000
Picaud Vincent