I'd suggest using the <regex> library if your compiler supports C++11.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <regex>
#include <string>

// Delimiter: a single ':' or a run of one or more spaces.
const std::regex ws_re(":| +");

void printTokens(const std::string& input)
{
    // Submatch index -1 selects the text between delimiter matches.
    std::copy(std::sregex_token_iterator(input.begin(), input.end(), ws_re, -1),
              std::sregex_token_iterator(),
              std::ostream_iterator<std::string>(std::cout, "\n"));
}

int main()
{
    const std::string text1 = "...:---:...";
    std::cout << "no whitespace:\n";
    printTokens(text1);

    std::cout << "single whitespace:\n";
    const std::string text2 = "..:---:... ..:---:...";
    printTokens(text2);

    std::cout << "multiple whitespaces:\n";
    const std::string text3 = "..:---:...   ..:---:...";  // three spaces in the middle
    printTokens(text3);
}
The library is documented on cppreference. If you are not familiar with regular expressions, the pattern in the line const std::regex ws_re(":| +"); from the code above matches either a single ':' character or a run of one or more spaces. The pipe symbol '|' means 'or', and '+' means 'one or more of the element that stands before the plus sign'. Passing -1 as the last argument to std::sregex_token_iterator makes it return the pieces of text between the matches, so this regular expression can be used to tokenize any input. For cases more complex than whitespace, there is the wonderful regex101.com for trying out patterns.
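If you would rather collect the tokens than print them, the same iterator can fill a container. Here is a minimal sketch of that idea; the helper name splitByRegex is made up for illustration and is not part of the standard library.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (name is an assumption): returns the pieces of
// `input` that lie between matches of the delimiter regex `delim`.
std::vector<std::string> splitByRegex(const std::string& input,
                                      const std::regex& delim)
{
    std::vector<std::string> tokens;
    // -1 selects the text between matches rather than the matches themselves.
    std::sregex_token_iterator it(input.begin(), input.end(), delim, -1);
    const std::sregex_token_iterator end;
    for (; it != end; ++it) {
        tokens.push_back(it->str());
    }
    return tokens;
}
```

Called with the ws_re pattern from above, this should give the same tokens that printTokens writes to std::cout. Note that an input starting with a delimiter produces an empty first token, so you may want to filter those out.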
The only disadvantage I can think of is that a regex engine is likely to be slower than a simple handwritten tokenizer.
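For comparison, a handwritten tokenizer for this case could look roughly like the sketch below. The name splitManual and the default delimiter set are assumptions of mine, and the behaviour is not identical to the regex: any run of ':' and ' ' characters is treated as one separator, so empty tokens are never produced.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical handwritten tokenizer (name and defaults are assumptions):
// splits `input` on any character found in `delims`, skipping empty tokens.
std::vector<std::string> splitManual(const std::string& input,
                                     const std::string& delims = ": ")
{
    std::vector<std::string> tokens;
    // Skip leading delimiters to find the start of the first token.
    std::size_t start = input.find_first_not_of(delims);
    while (start != std::string::npos) {
        // The token ends at the next delimiter or at the end of the string.
        const std::size_t end = input.find_first_of(delims, start);
        tokens.push_back(input.substr(start, end - start));
        if (end == std::string::npos) {
            break;
        }
        // Skip the delimiter run to find the start of the next token.
        start = input.find_first_not_of(delims, end);
    }
    return tokens;
}
```

I have not benchmarked this against the regex version, so measure before deciding that the difference matters for your input sizes.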