1

So I have data in a text file like this:

Alaska         200 500
New Jersey     400 300
.
.

And I am using ifstream to open it.

This is part of a course assignment. We are not allowed to read in the whole line all at once and parse it into the various pieces, so I am trying to figure out how to read each part of every line.

Using >> will only read in "New" for "New Jersey" because of the blank in the middle of that state name. I have tried a number of different things like .get(), .read(), and .getline(), but I have not been able to read in the whole state name and then read in the remaining numeric data for a given line.

I am wondering whether it is possible to read the whole line directly into a structure. Of course, structure is a new thing we are learning...

Any suggestions?

Kevin Anderson
James Kunz
  • ***we don't read in the whole line all at once and parse it into the various pieces*** Do you mean you are not permitted to read a whole line at a time? – drescherjm Oct 20 '17 at 22:10
  • reading in the whole line and parsing with sprintf() etc hasn't been taught, so have to stick with what we have learned in class so far .. :( – James Kunz Oct 20 '17 at 22:22
  • There are better options than `sscanf()` in C++. Read a line of data to a string (something you've already tried, given your mention of using `getline()`), and then use an `istringstream` to parse it. – Peter Oct 20 '17 at 22:47
  • @Peter, `istringstream` and `cin` are the same `basic_istream`.... Parsing a `string` of an entire line might be more useful than working with a stream. – Daniel Trugman Oct 20 '17 at 22:49
  • @DanielTrugman - for fixed style of input like this, a string stream is easily sufficient for parsing a string once it is read. And I'm saying to parse a line using stringstream, not the whole file. – Peter Oct 20 '17 at 22:57
  • If your example input has no tabs, and has consistent spacing, then you might consider binary reading using std::basic_istream& read( char_type* s, std::streamsize count ) and getting only text strings by performing a) read 1st 14 chars into state, b) read 4 chars, and converting to integer, c) read 3 chars and converting to an integer, d) and then finish the line (read char by char until eoln) – 2785528 Oct 20 '17 at 23:02

4 Answers

0

Here is a line-by-line parsing solution that doesn't use any C-style parsing methods:

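std::string line;
// ss is the input stream, e.g. the std::ifstream opened on the data file
while (std::getline(ss, line) && !line.empty()) {
    size_t startOfNumbers = line.find_first_of("0123456789");
    if (startOfNumbers == std::string::npos || startOfNumbers == 0)
        continue; // Skip lines without a name followed by numbers
    // Search backwards from just before the first digit for the end of the name
    size_t endOfName = line.find_last_not_of(" ", startOfNumbers - 1);
    std::string name = line.substr(0, endOfName + 1); // Extract name without trailing spaces

    std::stringstream nums(line.substr(startOfNumbers)); // Get rest of the line
    int num1, num2;
    nums >> num1 >> num2; // Read numbers

    std::cout << name << " " << num1 << " " << num2 << std::endl;
}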
Daniel Trugman
  • Oops, the \t was probably me typing up the info for the post; the actual data does not have tabs but instead spaces between the columns of data .. sorry about that – James Kunz Oct 20 '17 at 22:20
  • Mixing styles of input on one stream (e.g. reading a line using `getline()`, and then using `>>` for streaming) is a really bad idea - it's a good way to either unintentionally skip/discard input or (if you do the sequence repeatedly in a loop) to get an infinite loop. – Peter Oct 20 '17 at 22:50
  • @Peter, what are you talking about? The stream becomes invalid and any sane loop will exit safely. And how would it unintentionally skip/discard inputs? Where will they go? You just have to use them carefully, as any other API in programming... – Daniel Trugman Oct 20 '17 at 22:54
  • There are plenty of Q&A on here with problems that result from mixing styles of input. – Peter Oct 20 '17 at 22:59
  • @JamesKunz, updated the answer to work without the `\t` – Daniel Trugman Oct 20 '17 at 23:10
0

Can't you just read the state name in a loop?

Read a string from the stream with >>: if the first character of the string is numeric, then you've reached the next field and you can exit the loop. Otherwise just append it to the state name and loop again.

GrahamS
0

If you can't use getline, do it yourself: Read and store in a buffer until you find '\n'. In this case you probably also cannot use all the groovy stuff in std::string and algorithm and might as well use good ol' C programming at that point.

Once you have grabbed a line, read your way backwards from the end of the line and

  1. Discard all whitespace until you find non-whitespace.
  2. Gather the characters found into token 3 until you find whitespace again.
  3. Read and discard the whitespace until you find the end of token 2.
  4. Gather token 2 until you find more whitespace.
  5. Discard the whitespace until you find the end of token 1. The rest of the line is all token 1.
  6. Convert token 2 and token 3 into numbers. I like to use strtol for this.
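The steps above might look something like this once the line is already in a buffer. This is a sketch, not the only way to do it: it uses a `std::string` for brevity rather than a raw C buffer, and the `StateRecord` struct and the helper names (`skipSpaceBackward`, `skipTokenBackward`, `parseFromEnd`) are invented for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical record type; the field names are assumptions.
struct StateRecord {
    std::string name;
    int first = 0;
    int second = 0;
};

// Moves pos left past whitespace. pos is one past the last character kept.
static std::size_t skipSpaceBackward(const std::string& s, std::size_t pos) {
    while (pos > 0 && std::isspace(static_cast<unsigned char>(s[pos - 1]))) {
        --pos;
    }
    return pos;
}

// Moves pos left past non-whitespace, returning the index where the token starts.
static std::size_t skipTokenBackward(const std::string& s, std::size_t pos) {
    while (pos > 0 && !std::isspace(static_cast<unsigned char>(s[pos - 1]))) {
        --pos;
    }
    return pos;
}

// Splits "name  num  num" by walking backwards from the end of the line.
bool parseFromEnd(const std::string& line, StateRecord& rec) {
    std::size_t end3 = skipSpaceBackward(line, line.size()); // step 1
    std::size_t begin3 = skipTokenBackward(line, end3);      // step 2
    std::size_t end2 = skipSpaceBackward(line, begin3);      // step 3
    std::size_t begin2 = skipTokenBackward(line, end2);      // step 4
    std::size_t end1 = skipSpaceBackward(line, begin2);      // step 5
    if (end1 == 0) {
        return false; // fewer than three tokens on the line
    }
    rec.name = line.substr(0, end1);
    // Step 6: strtol stops at the first character that is not part of the number.
    rec.first = static_cast<int>(std::strtol(line.c_str() + begin2, nullptr, 10));
    rec.second = static_cast<int>(std::strtol(line.c_str() + begin3, nullptr, 10));
    return true;
}
```

Working from the end means the number of words in the state name does not matter, since the last two tokens are always the numbers.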

You can build all of the above or Daniel's answer (use his answer if at all possible) into an overload of operator>>. This lets you

mystruct temp;
while (filein >> temp)
{
    // do something with temp. Stick it in a vector, whatever
}

The code to do this looks something like (Stealing wholesale from What are the basic rules and idioms for operator overloading? <-- Read this. It could save your life one day)

std::istream& operator>>(std::istream& is, mystruct & obj)
{
  // read obj from stream

  if( /* no valid object of T found in stream */ )
    is.setstate(std::ios::failbit);

  return is;
}
user4581301
-2

Here's another example that reads the file word by word. I edited it to remove the version that used an eof check as the while loop condition. I also included a struct, since you mentioned that's what you just learned. I'm not sure how you're supposed to use your struct, so I kept it simple: it contains three variables, a string and two ints. To verify that it reads correctly, the program prints the contents of the struct after each record is read, including "New Jersey" as a single state name.

#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib> // for atoi
using namespace std;

// Not sure how you're supposed to use the struct you mentioned.  But for this example it'll just contain 3 variables to store the data read in from each line
struct tempVariables
{
        std::string state;
        int number1;
        int number2;
};

// This checks the characters read and returns true if they form a number, or false if they are just string text
bool is_number(const std::string& s)
{
    return !s.empty() && s.find_first_not_of("0123456789") == std::string::npos;
}


int main()
{
tempVariables temp;

ifstream file;
file.open("readme.txt");
std::string word;
std::string state;
bool stateComplete = false;
bool num1Read = false;
bool num2Read = false;
if(file.is_open())

{
  while (file >> word)
  {
      // Check if text read in is a number or not
      if(is_number(word))
      {
        // Here set the word (which is the number) to an int that is part of your struct
        if(!num1Read)
        {
           // if code gets here we know it finished reading the "string text" of the line
           stateComplete = true;
           temp.number1 = atoi(word.c_str());
           num1Read = true;  // won't read the next text in to number1 var until after it reads a state again on next line
        }
        else if(!num2Read)
        {

           temp.number2 = atoi(word.c_str());
           num2Read = true; // won't read the next text in to number2 var until after it reads a state again on next line
        }

      }
      else
      {
        // reads in the state text, joining multi-word names with a single space
        if (!temp.state.empty()) temp.state += " ";
        temp.state += word;
      }

      if(stateComplete)
      {
        cout<<"State is: " << temp.state <<endl;
        temp.state = "";
        stateComplete = false;
      }
      if(num1Read && num2Read)
      {
        cout<<"num 1: "<<temp.number1<<endl;
        cout<<"num 2: "<<temp.number2<<endl;
        num1Read = false;
        num2Read = false;
      }
  }
}

return 0;

}
CAMD_3441
  • Yep, I found all kinds of elegant solutions reading the whole line and parsing it into the pieces .. but not something "we have learned" yet :( Unless I could define the pieces of data as a structure and read it all directly into a structure ..?? Structures are part of the latest chapter we are doing ... – James Kunz Oct 20 '17 at 22:30
  • @Daniel Trugman - But it doesn't read the whole line at once, it's reading in each set of alphanumeric characters individually. – CAMD_3441 Oct 20 '17 at 22:30
  • So what functions "have you learned"? just the ones you mentioned in your original post? – CAMD_3441 Oct 20 '17 at 22:31
  • What do you mean read into pieces of a structure? Do you mean detect if what's read in is a string or a number? – CAMD_3441 Oct 20 '17 at 22:37
  • [Why is iostream::eof inside a loop condition considered wrong?](https://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-considered-wrong) – user4581301 Oct 20 '17 at 23:33