1

Here's the code. I have no idea why it doesn't recognize that it needs to copy the memory, and I can't force it to.

 string message="The quick brown fox jumped over the lazy dog.";

  vector<char*> words;

  while(message.length()>0){
    char wtf[message.substr(0,message.find(" ")).c_str().length()]=message.substr(0,message.find(" ")).c_str();
    words.push_back(wtf);
    message=message.substr(message.find(" ")+1);
  }

I see that there are similar threads, but none on this. Also, it seems like a shortcoming that C++ can't deal with this that easily.

Cat Plus Plus
Josh
  • 1
    Whoa. Can you explain what you are trying to do, and what is going wrong? If you're getting a compiler error, why not post it? – John Zwinck Jul 17 '11 at 20:15
  • 2
    http://stackoverflow.com/questions/236129/how-to-split-a-string-in-c –  Jul 17 '11 at 20:18
  • 2
    Regarding "it seems like a shortcoming that C++ can't deal with this that easily": it is true that text processing is not one of C++'s strong suits, but you're the one writing code that calls `find` three times with the same arguments. PEBCAK. – John Zwinck Jul 17 '11 at 20:20
  • Sorry. The error changes depending on what workaround I attempt, but it always comes down to the inability to convert const char* to char*. That's about it. It's part of a program to obscure messages input to it. – Josh Jul 17 '11 at 20:20
  • Can you explain a little more what the actual purpose or goal of the code is? – John Zwinck Jul 17 '11 at 20:21
  • In addition to the other problems people have pointed out, you can't execute .length() on the const char* that c_str() returns. Instead, you can use strlen() on it (but with the c_str() as the argument, not the object). – Ben Hocking Jul 17 '11 at 20:22
  • @Josh: That's the entire point of `const`. – Cat Plus Plus Jul 17 '11 at 20:23
  • Half your problem is mixing two different languages together into one piece of code. Treat C and C++ as completely different languages. – Martin York Jul 17 '11 at 21:26

6 Answers

6

How to break text into words (the easy way)

#include <string>
#include <sstream>
#include <vector>
#include <algorithm>
#include <iterator>

int main()
{
    std::string message="The quick brown fox jumped over the lazy dog.";

    std::vector<std::string>    words;

    std::stringstream   stream(message);                  // 1: Create a stream
    std::copy(std::istream_iterator<std::string>(stream), // 2: Copy words from the stream
              std::istream_iterator<std::string>(),
              std::back_inserter(words));                 //    into the back of the vector.
}

A breakdown of how it works (for 12-year-olds learning to program)

  • The operator >> when applied to a stream (and a string) reads a single (white) space separated word.

    std::string message="The quick brown fox jumped over the lazy dog.";
    std::stringstream   stream(message);
    
    std::string  word;
    stream >> word; // Reads "The"
    stream >> word; // Reads "quick"
    stream >> word; // Reads "brown" etc...
    
  • The istream_iterator is an adapter for streams that makes them look like containers.
    It reads items of type 'T' from the stream using the operator >>

    std::stringstream   stream("The quick brown fox jumped over the lazy dog.");
    std::istream_iterator<std::string> i(stream);
    
    std::string  word;
    
    word = *i;      // de-reference the iterator to get the object.   Reads "The"
    ++i;
    word = *i; ++i; // Reads "quick"
    word = *i; ++i; // Reads "brown" etc
    
    // Works for any type that uses >> to read from the stream 
    std::stringstream   intstream("99 40 32 64 20 10 9 8 102");
    std::istream_iterator<int> j(intstream);  // Notice the type is int here
    
    int   number;
    
    number = *j;      // de-reference the iterator to get the object.   Reads 99
    ++j;
    number = *j; ++j; // Reads 40
    number = *j; ++j; // Reads 32 etc
    
  • The standard algorithms all work on iterators.
    std::copy iterates over the source and places each item in the destination:

    int    src[] = { 1, 2, 3, 4, 5, 6 };
    int    dst[] = { 0, 0, 0, 0, 0, 0 };
    
    std::copy(src, src+6, dst); // copies src into dst
                                // It assumes there is enough space in dst
    
  • The back_inserter is an adapter that uses push_back to add items to the container.
    We could have made sure that the destination vector was the correct size. But it is easier to use the back_inserter to make sure the vector is dynamically sized.

    int    src[] = { 1, 2, 3, 4, 5, 6 };
    std::vector<int> dst; // Currently has zero size
    
    std::copy(src, src+6, std::back_inserter(dst)); // copies src into dst
                                                    // back_inserter expands dst to fit
                                                    // by using push_back
    
  • Putting it all back together:

    // Create a stream from the string
    std::stringstream   stream(message);
    
    // Use std::copy to copy from the stream.
    //     The stream is iterated over a word at a time,
    //     because the istream iterator is using std::string type.
    //
    //     The istream_iterator with no parameters is equivalent to end.
    //
    //     The destination appends the word to the vector words.
    std::copy(std::istream_iterator<std::string>(stream), // 2: Copy words from the stream
              std::istream_iterator<std::string>(),
              std::back_inserter(words));                 //    into the back of the vector.
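
The pieces above can be wrapped into a small helper. This is a minimal sketch, not part of the original answer: the name `split_words` is made up for illustration, and it uses the vector's iterator-range constructor instead of `std::copy` with `back_inserter`, which should give the same result here.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits text on whitespace using operator>>.
std::vector<std::string> split_words(const std::string& text)
{
    std::istringstream stream(text);
    // Iterator pair: [first word in the stream, end-of-stream)
    std::istream_iterator<std::string> first(stream), last;
    // The range constructor reads every word and stores a copy of each.
    return std::vector<std::string>(first, last);
}
```

Calling `split_words(message)` with the sentence from the question should return nine strings, from "The" to "dog." (the full stop stays attached to the last word because only whitespace separates words).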
    
Martin York
  • 1
    O, crap. I'm twelve, and what is this? Ummmm, WAY above my level. Let me look these up... – Josh Jul 17 '11 at 21:40
5

You want to split a string into tokens by space. You should use a proper tool — Boost.Tokenizer for example. Your code is wrong in several ways:

  1. You cannot define an array like that; the dimension must be a compile-time constant expression.
  2. Arrays are not pointers, and you cannot assign to an array.
  3. c_str returns a pointer that is valid only as long as the string object exists and is not modified. Here the string is a temporary that is destroyed at the end of the statement.
  4. Don't use char* unless you need to. Make that vector hold std::string.
  5. c_str returns a const char*, which doesn't have a length member function, or any member function for that matter.

This indicates that you lack some fundamental knowledge of C++. You should perhaps read a good book on C++. So, no, it's not a shortcoming of C++.

Really, just use Boost.Tokenizer. It has an example of splitting by space in the docs.
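
For comparison, the points in the list can also be applied directly to the loop from the question, without Boost. This is only a sketch under some assumptions: the helper name `split_on_spaces` is invented here, words are separated by single spaces, and two spaces in a row would produce an empty word.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits on single spaces, storing each word by value.
std::vector<std::string> split_on_spaces(const std::string& message)
{
    std::vector<std::string> words;
    std::string::size_type start = 0;
    for (;;) {
        // Call find once per word and reuse the result.
        const std::string::size_type space = message.find(' ', start);
        if (space == std::string::npos) {
            words.push_back(message.substr(start)); // last word
            break;
        }
        // substr returns a temporary; push_back stores a copy the vector owns.
        words.push_back(message.substr(start, space - start));
        start = space + 1;
    }
    return words;
}
```

Because the vector holds `std::string` values, there is no array to size, no `c_str` pointer to keep alive, and nothing to delete.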

Community
Cat Plus Plus
1

string's substr method returns a new temporary string, while the c_str method returns a pointer right into the memory of that temporary string. The temporary is destroyed at the end of the statement, so using a pointer into its buffer afterwards results in undefined behavior. If you want to keep sub-strings, use the string class instead (i.e. vector<string>).
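
A small sketch of the safe alternative, assuming you sometimes still need a writable `char*` for a C-style API. Both helper names (`first_word`, `to_c_buffer`) are hypothetical and only serve the illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the first space-delimited word as an owned std::string.
std::string first_word(const std::string& message)
{
    // substr returns a temporary; returning it by value keeps the characters alive.
    // Storing message.substr(...).c_str() instead would leave a dangling pointer
    // as soon as the temporary is destroyed.
    return message.substr(0, message.find(' '));
}

// Hypothetical helper: copies a word into a buffer the caller owns.
std::vector<char> to_c_buffer(const std::string& word)
{
    std::vector<char> buffer(word.begin(), word.end());
    buffer.push_back('\0'); // null terminator expected by C string functions
    return buffer;          // &buffer[0] is a valid char* while the vector lives
}
```

The buffer is released automatically when the vector goes out of scope, so no manual `delete` is involved.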

Shiroko
  • Wait, that sounds like bad behavior of the garbage collector. – Josh Jul 17 '11 at 20:29
  • C++ doesn't have a garbage collector. If you use `char*` or other raw pointers it's *your* job to manage the memory, which is precisely why you should avoid using them if possible. –  Jul 17 '11 at 20:35
  • @Josh: Look into exception safety, RAII, and smart pointers while you're at it :) C++ is quite needy... – Merlyn Morgan-Graham Jul 17 '11 at 21:03
  • @Josh: Don't learn delete. Learn the library classes that will delete for you. It's much, much safer, and much faster to learn and use too. – Puppy Jul 17 '11 at 22:08
0

Do you want copies of the words in the array, or just pointers into the original?

If you want copies, then you'll need a vector of strings.

This code will create copies of the words:

vector<string> words;
string::iterator it1 = message.begin();
string::iterator it2 = find(message.begin(), message.end(), ' ');
while (it2 != message.end())
{
    words.push_back(string(it1, it2));
    it1 = ++it2;
    it2 = find(it2, message.end(), ' ');
}
words.push_back(string(it1, it2));

This code will give you pointers into the original words:

vector<char*> words;
string::iterator it1 = message.begin();
string::iterator it2 = find(message.begin(), message.end(), ' ');
while (it2 != message.end())
{
    words.push_back(&*it1);
    it1 = ++it2;
    it2 = find(it2, message.end(), ' ');
}
words.push_back(&*it1);
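
One caveat with the pointer version: each pointer marks where a word starts, but the word is not null-terminated, so treating the pointer as a C string yields the rest of the message. A possible workaround, sketched here with made-up names (`WordRef`, `word_refs`), is to store the length next to each pointer. The pointers stay valid only while the original string is neither modified nor destroyed.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type: a word described by where it starts and how long it is.
typedef std::pair<const char*, std::size_t> WordRef;

// Hypothetical helper: records (start pointer, length) for each space-separated word.
std::vector<WordRef> word_refs(const std::string& message)
{
    std::vector<WordRef> refs;
    const char* const base = message.c_str();
    std::string::size_type start = 0;
    while (start < message.size()) {
        std::string::size_type space = message.find(' ', start);
        if (space == std::string::npos)
            space = message.size(); // last word runs to the end
        refs.push_back(WordRef(base + start, space - start));
        start = space + 1;
    }
    return refs;
}
```

A copy of any word can be made later with `std::string(ref.first, ref.second)`.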
Peter Alexander
  • Give me time to process this one. New stuff. – Josh Jul 17 '11 at 20:30
  • I want the data to persist until the end of main() at the moment, so which would you suggest? They need to persist to a later loop in main where they get scattered/randomized. – Josh Jul 17 '11 at 20:36
  • the original pointer option gives me this error: corruptionEncryption.cpp:36:38: error: conversion from ‘std::vector::iterator’ to non-scalar type ‘std::basic_string::iterator’ requested corruptionEncryption.cpp:37:62: error: no matching function for call to ‘find(std::vector::iterator, std::vector::iterator, char)’ corruptionEncryption.cpp:38:27: error: no match for ‘operator!=’ in ‘it2 != words.std::vector<_Tp, _Alloc>::end [with _Tp = char*, _Alloc = std::allocator, std::vector<_Tp, _Alloc>::iterator = __gnu_cxx::__norma... – Josh Jul 17 '11 at 20:43
  • Apologies: most of those `words` are meant to be `message`. I'll fix it [edit: done] – Peter Alexander Jul 17 '11 at 20:45
  • I was lazy and copy/paste'd them. – Josh Jul 17 '11 at 20:48
0

OMG

const char *message="The quick brown fox jumped over the lazy dog."; // string literals are const

vector<const char*> words;

int p=0;
while(message[p])
{
 words.push_back(&message[p++]);
}

//if you Must use a string, change to 
words.push_back(&message.c_str()[p++]);

This is, supposing you want Pointers to each character (I don't know why you would like that but that is what you have coded).

What are you trying to do BTW?

Valmond
0

Typically, you would just use vector<string>. As a general rule in C++, raw pointers are a huge warning sign: either use smart pointers and allocate the memory dynamically, or use value types such as std::string.

Puppy
  • I had to use them for an artificial neural network I spent 6 months on in high school, so they're familiar. – Josh Jul 17 '11 at 20:29
  • @Josh: That's probably the *problem*. Nobody who knows C++ would write anything like the above code. – Puppy Jul 17 '11 at 20:52
  • Well, I did it and it worked. I think I broke it to the point where I got around const by having it point to something already initialized... It was a while ago and the code died with my last HDD. – Josh Jul 17 '11 at 21:12