As others have already pointed out (see for example herohuyongtao's answer), the loop condition and how you put str1
into the istringstream
must be fixed.
However, there is an important issue here that everybody has missed so far: you don't need the istringstream
at all!
vec.reserve(the_number_of_words_you_exptect_at_least);
while (infile >> str1) {
vec.push_back(str1);
}
It gets rid of the inner loop which you didn't need in the first place, and doesn't create an istringstream
in each iteration.
If you need to parse further each line and you do need an istringstream
, create it outside of the loop and set its string buffer via istringstream::str(const string& s)
.
I can easily imagine your loops being very slow: Heap allocation on Windows is outrageously slow (compared to Linux); I got bitten once.
Andrei Alexandrescu presents (in some sense) a similar example in his talk Writing Quick Code in C++, Quickly. The surprising thing is that doing unnecessary heap allocations in a tight loop like the one above can be slower than the actual file IO. I was surprised to see that.
You didn't tag your question as C++11 but here is what I would do in C++11.
while (infile >> str1) {
vec.emplace_back(std::move(str1));
}
This move constructs the string at the back of the vector, without copying in. We can do it because we don't need the contents of str1
after we have put it into the vector. In other words, there is no need to copy it into a brand new string at the back of the vector, it is sufficient to just move its contents there. The first loop with the vec.push_back(str1);
may potentially copy the contents of str1
which is really unnecessary.
The string implementation in gcc 4.7.2 is currently copy on write, so the two loops have identical performance; it doesn't matter which one you use. For now.
Unfortunately, copy on write strings are now forbidden by the standard. I don't know when the gcc developers are going to change the implementation. If the implementation changes, it may make a difference in performance whether you move (emplace_back(std::move(s))
) or you copy (push_back(s)
).
If C++98 compatibility is important for you, then go with push_back()
. Even if the worst thing happens in the future and your string is copied (it isn't copied now), that copy can be turned into a memmove()
/ memcpy()
which is blazing fast, most likely faster than reading the contents of the file from the hard disk so file IO will most likely remain the bottleneck.