9

How do you split a string into tokens in C++?

Qix - MONICA WAS MISTREATED
BobS
  • 1
    dup of http://stackoverflow.com/questions/236129/c-how-to-split-a-string – ididak Nov 09 '08 at 07:07
  • 16
    How about some of the examples from the following: http://www.codeproject.com/KB/recipes/Tokenizer.aspx They are very efficient and somewhat elegant. The String Toolkit Library makes complex string processing in C++ simple and easy. –  Dec 08 '10 at 05:23

6 Answers

16

This works nicely for me :). It puts the results in elems, and delim can be any char.

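#include <sstream>
#include <string>
#include <vector>

// Splits s on delim and appends each field to elems; returns elems for chaining.
std::vector<std::string> &split(const std::string &s, char delim, std::vector<std::string> &elems) {
    std::stringstream ss(s);
    std::string item;
    while (std::getline(ss, item, delim)) {
        elems.push_back(item);
    }
    return elems;
}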
Evan Teran
  • 2
    Why return elems. When it is passed into the function as reference parameter? – Martin York Nov 09 '08 at 02:47
  • 1
    oh, just for convenience. So if you need you can do something like: split(line, ',', elems).at(2); it's entirely unnecessary to return it. – Evan Teran Nov 09 '08 at 04:39
  • 1
    This does not handle empty delimited strings correctly, e.g. split(",", ',') should return two empty strings, but the code above just returns one. This can be solved by initializing ss with "s + delim" and handling the special case that an empty string should return an empty list (rather than a list with one empty string). – Johannes Overmann Nov 11 '21 at 23:52
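The fix suggested in the last comment could look roughly like the following. This is only a minimal sketch: the name split_keep_empty is hypothetical, and returning an empty vector for an empty input is an assumption taken from the comment, not standard behaviour.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: like split() above, but keeps a trailing empty field.
std::vector<std::string> split_keep_empty(const std::string &s, char delim) {
    std::vector<std::string> elems;
    if (s.empty()) {
        return elems;  // assumption: an empty input means no fields at all
    }
    // Appending one delimiter makes getline report the final field
    // even when that field is empty.
    std::stringstream ss(s + delim);
    std::string item;
    while (std::getline(ss, item, delim)) {
        elems.push_back(item);
    }
    return elems;
}
```

With this version, split_keep_empty(",", ',') yields two empty strings, while a string with no delimiter still yields a single field.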
5

With this Mingw distro that includes Boost:

#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <ostream>
#include <algorithm>
#include <boost/algorithm/string.hpp>
using namespace std;
using namespace boost;

int main() {
    vector<string> v;
    split(v, "1=2&3=4&5=6", is_any_of("=&"));
    copy(v.begin(), v.end(), ostream_iterator<string>(cout, "\n"));
}
Shadow2531
4

You can use the C function strtok:

/* strtok example */
#include <stdio.h>
#include <string.h>

int main ()
{
  char str[] ="- This, a sample string.";
  char * pch;
  printf ("Splitting string \"%s\" into tokens:\n",str);
  pch = strtok (str," ,.-");
  while (pch != NULL)
  {
    printf ("%s\n",pch);
    pch = strtok (NULL, " ,.-");
  }
  return 0;
}

The Boost Tokenizer will also do the job:

#include<iostream>
#include<boost/tokenizer.hpp>
#include<string>

int main(){
   using namespace std;
   using namespace boost;
   string s = "This is,  a test";
   tokenizer<> tok(s);
   for(tokenizer<>::iterator beg=tok.begin(); beg!=tok.end();++beg){
       cout << *beg << "\n";
   }
}
Imbue
3

Try using stringstream:

#include <sstream>
#include <string>

std::string line("A line of tokens");
std::stringstream lineStream(line);

std::string token;
while (lineStream >> token)
{
    // token now holds the next whitespace-separated word
}
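For instance, the loop might collect the tokens into a vector. A minimal sketch of that idea; the function name split_whitespace is hypothetical and not part of the answer:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits on runs of whitespace using operator>>.
std::vector<std::string> split_whitespace(const std::string &line) {
    std::istringstream lineStream(line);
    std::vector<std::string> tokens;
    std::string token;
    while (lineStream >> token) {
        tokens.push_back(token);
    }
    return tokens;
}
```

Because operator>> skips leading whitespace, consecutive spaces or tabs never produce empty tokens here, which differs from the getline-based approach.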

Check out my answer to your last question:
C++ Reading file Tokens

Community
Martin York
3

See also boost::split from the Boost String Algorithms library:

string str1("hello abc-*-ABC-*-aBc goodbye");
vector<string> tokens;
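boost::split(tokens, str1, boost::is_any_of("-*"), boost::token_compress_on);
// tokens == { "hello abc","ABC","aBc goodbye" }
// (without token_compress_on, adjacent delimiters would also produce empty tokens)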

Sergey Skoblikov
1

It depends on how complex the token delimiter is and whether there is more than one. For simple cases, just use std::istringstream and std::getline. For more complex tasks, or if you want to iterate over the tokens in an STL-compliant way, use Boost's Tokenizer. Another possibility (although messier than either of these two) is to set up a while loop that calls std::string::find and uses the position of the last delimiter found as the starting point for the next search. But this is probably the most bug-prone of the three options.
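The std::string::find loop described above might be sketched as follows. The name split_find is hypothetical, and keeping empty fields (including a trailing one) is a design assumption rather than something the answer specifies.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: splits s on every occurrence of a
// (possibly multi-character) delimiter.
std::vector<std::string> split_find(const std::string &s, const std::string &delim) {
    assert(!delim.empty());  // an empty delimiter would never advance the search
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    std::string::size_type pos;
    while ((pos = s.find(delim, start)) != std::string::npos) {
        tokens.push_back(s.substr(start, pos - start));
        start = pos + delim.size();  // resume just past the delimiter
    }
    tokens.push_back(s.substr(start));  // text after the last delimiter (may be empty)
    return tokens;
}
```

The usual pitfalls are forgetting the final push_back after the loop and advancing by one character instead of by the delimiter's length.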

Michel