15

I have formatted data like the following:

Words          5
AnotherWord    4
SomeWord       6

It's in a text file and I'm using ifstream to read it, but how do I separate the number and the word? The word consists only of letters, and an unknown number of spaces or tabs separates the word from the number.

gsamaras
TheOnly92
  • I DO NOT KNOW if it is spaces or tabs between the words and the number, there will not be spaces within the word. – TheOnly92 Aug 24 '10 at 11:35
  • if your file format gets more complicated, you might want to try regular expressions for each line. Boost provides a lib for that. – Tobias Langner Aug 24 '10 at 13:06

4 Answers

22

Assuming there is no whitespace within the "word" (otherwise it would not really be one word), here is a sample of how to read up to the end of the file:

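#include <fstream>
#include <iostream>
#include <string>

int main() {
    std::ifstream file("file.txt");
    std::string str;
    int i;
    // operator>> skips any run of spaces, tabs, or newlines before each field
    while (file >> str >> i)
        std::cout << str << ' ' << i << std::endl;
}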
Matthieu M.
Donotalo
3

The >> operator is overloaded for std::string and uses whitespace as a separator

so

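std::ifstream f("file.txt");

std::string str;
int i;
// Test the extraction itself: looping on !f.eof() would run the body
// one extra time after the last record has been read.
while ( f >> str >> i )
{
  // do work
}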
mmmmmm
3

sscanf is good for that:

#include <cstdio>
#include <cstdlib>

int main ()
{
  char sentence []="Words          5";
  char str [100];
  int i;

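  // The space in the format matches any run of whitespace; an extra %*s
  // in between would swallow the number. %99s keeps str from overflowing.
  sscanf (sentence,"%99s %d",str,&i);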
  printf ("%s -> %d\n",str,i);

  return EXIT_SUCCESS;
}
Stefan Steiger
2

It's actually very easy; you can find the reference here.
If you are using tabs as delimiters, you can use getline instead and set the delim argument to '\t'. A longer example would be:

#include <cstddef>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

struct Line {
    std::string text;
    int number;
};

int main(){
    std::ifstream is("myfile.txt");
    std::vector<Line> lines;
    while (is){
        Line line;
        // std::ws discards the newline left over from the previous record
        std::getline(is >> std::ws, line.text, '\t');
        is >> line.number;
        if (is){
            lines.push_back(line);
        }
    }
    for (std::size_t i = 0 ; i < lines.size() ; ++i){
        std::cout << "Line " << i << " text:  \"" << lines[i].text
                  << "\", number: " << lines[i].number << std::endl;
    }
}
sbi
default
  • @Donatalo: if you include yes. Although, you need to include string if you want to use getline as well, so you have a valid point :) Edited my answer – default Aug 24 '10 at 11:43
  • This will actually even read strings with whitespaces (other than `'\t'`) in it. I have a few issues with it, though: 1. You need to check `is` immediately before pushing onto the vector. 2. The error check before the loop must return an `int` (and is unneeded anyway). Assuming you'll fix these, I've up-voted your answer despite them. – sbi Aug 24 '10 at 12:05
  • @sbi: thanks, edited. I guess that's where the check for `is` should be (to make sure that the number can be read). It was some time ago I worked with iostreams.. – default Aug 24 '10 at 12:19
  • @Michael: No, that's still wrong. What happens if reading the integer fails? With stream input, you either want to check _every_ input operation or you rely on `operator>>()` being a no-op when the stream is in a bad state and check _after the last_ input. – sbi Aug 24 '10 at 13:37
  • @sbi: did you remove your vote or did someone else downvote me? – default Aug 24 '10 at 14:38
  • @Michael: No, I didn't, but someone down-voted you. (You should see it on the "Reputation" tab on your "Recent Activity" page.) I thought it's considered bad around here to down-vote without leaving a comment as to the reasons, but... – sbi Aug 24 '10 at 19:59
  • @Michael: `` Now you have one unneeded test in there. If you don't mind I'll go in there and fix it. If you happen to disagree, you can always change it back later, Ok? – sbi Aug 24 '10 at 20:00
  • 1
    Ok, I also added a few [missing `std::`](http://stackoverflow.com/questions/2879555/c-stl-how-to-write-wrappers-for-cout-cerr-cin-and-endl/2880136#2880136) and wrapped a line so that you don't need to scroll to read the code. Hope that's alright with you. If not, feel free to change it back. I didn't compile the code, so I hope it's as good as it looks to me... – sbi Aug 24 '10 at 20:05
  • @sbi: sorry :) thanks though. one comment: if `is` is Ok at the end of the loop, it is checked once again at the start of the loop (where we should know if it is ok, right?) also, what happens if getline fails? Or we don't need to check that? Also, I agree with you about commenting downvotes.. I thought it was a pretty good answer :( perhaps they'll upvote it now that you corrected it :) – default Aug 25 '10 at 06:40
  • @Michael: Yes, I suppose if you break out of the loop on error, you could change it back to an endless loop. Once `is` goes into a bad state, all further input operations will fail. So if `std::getline()` fails, `is >> line.number` will fail, too, and you still have `is` in a fail state. (That's why I wrote "you rely on `operator>>()` being a no-op when the stream is in a bad state and check _after the last_ input" up there.) That's also the reason you don't need to check the stream up-front. If it fails to open, input will fail and you will fall out of the loop with an error. – sbi Aug 25 '10 at 08:45
  • @sbi. It's logical, although I've never thought of it. Thanks for the lesson :) – default Aug 25 '10 at 08:57