0

I have a text file that contains many lines like the following:

data[0]: a=123 b=234 c=3456 d=4567 e=123.45 f=234.56

I am trying to extract the numbers so that I can convert the file to CSV, which Excel can import and recognize.

My logic is to find the " " characters and chop the data out between them, for example between the first " " and the second " ". Is that viable? I have been trying this but have not succeeded.

What I actually want is a CSV file like

a, b, c, d, e, f

123, 234, 3456 .... blablabla

234, 345, 4567 .... blablabla

But this specific task seems quite difficult. Are there any utilities or better methods that could help me do this?

neurothew

2 Answers

1

I suggest you take a look at boost::tokenizer; it is the best approach I have found. You will find several examples on the web. Also have a look at this high-score question.

Steps, for each line:

  1. Cut string in two parts using the : character
  2. Cut the right part into several strings using the space character
  3. Separate the values from their names using the = character, and store the values in a std::vector<std::string>
  4. Put these values in a file.

Last part can be something like:

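std::ofstream f( "myfile.csv" );   // needs #include <fstream>; open once, before looping over lines
for( const auto& s: vstrings )     // vstrings: the values collected in step 3
    f << s << ',';                 // note: leaves a trailing comma on each row
f << "\n";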
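
The four steps above can be sketched end to end as follows. This is a minimal illustration that uses only the standard library in place of boost::tokenizer, so it is not the Boost-based version this answer recommends. The helper names `extract_values` and `write_csv_row` are hypothetical, and the sketch assumes every field has the form name=value with no spaces inside a value.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Steps 1-3: turn "data[0]: a=123 b=234" into {"123", "234"}.
// Hypothetical helper; standard library only, no boost::tokenizer.
std::vector<std::string> extract_values(const std::string& line)
{
    std::vector<std::string> values;

    // Step 1: keep only the part to the right of the first ':'.
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos)
        return values;              // no ':' found: return no values

    // Step 2: split the right part on whitespace.
    std::istringstream rest(line.substr(colon + 1));
    std::string token;
    while (rest >> token)
    {
        // Step 3: keep what follows '=' in each "name=value" token.
        const std::size_t eq = token.find('=');
        if (eq != std::string::npos)
            values.push_back(token.substr(eq + 1));
    }
    return values;
}

// Step 4: write one CSV row, with commas between values only.
void write_csv_row(std::ostream& out, const std::vector<std::string>& values)
{
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        if (i != 0)
            out << ',';
        out << values[i];
    }
    out << '\n';
}
```

To use it, read the input with std::getline in a loop and pass each line through `extract_values`, then hand the result to `write_csv_row` together with a std::ofstream that was opened once before the loop.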
kebs
  • This solution seems viable. I am not a C++ professional, just a beginner. Does it mean I can just download the whole Boost library and put it somewhere (I don't know where), so that I can use those functions? – neurothew Jun 18 '14 at 08:35
  • I am using Visual Studio 2013. I have added the Boost directory to the Additional Include Directories and the #include lines show no errors, but there are several errors like "Error: namespace "boost" has no member "char_separator"". What is happening? – neurothew Jun 18 '14 at 09:01
  • For your second comment: did you check [this](http://www.boost.org/doc/libs/1_55_0/libs/tokenizer/char_separator.htm)? Try to make the example work as a stand-alone program, then build it up to your use case. – kebs Jun 18 '14 at 09:14
1

An easy way that needs no non-Standard libraries is:

// needs <iostream>, <sstream>, <string>; input_stream is any open std::istream
std::string line;
while (getline(input_stream, line))
{
    std::istringstream iss(line);

    std::string word;
    if (iss >> word)   // throw away "data[n]:"
    {
        std::string identifier;
        std::string value;
        // read up to '=' (the name), then the value that follows it
        while (getline(iss, identifier, '=') && iss >> value)
            std::cout << value << ",";
        std::cout << '\n';
    }
}

You can tweak it if the trailing commas cause Excel any trouble, and add more sanity checks (e.g. that each value is numeric, that fields are consistent across all lines), but the basic parsing above is a start.
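
As one possible tweak, the sketch below keeps the same istringstream/getline parsing but writes a header row taken from the first line's field names, omits the trailing comma, and stops if a later line's field names differ. The names `write_row` and `convert_to_csv` are hypothetical, and the sketch assumes values contain no spaces or commas; it does not check that values are numeric.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Join fields with "," between them, so no trailing comma is written.
// (Hypothetical helper name.)
static void write_row(std::ostream& out, const std::vector<std::string>& fields)
{
    for (std::size_t i = 0; i < fields.size(); ++i)
        out << (i ? "," : "") << fields[i];
    out << '\n';
}

// Convert "data[n]: a=1 b=2 ..." lines to CSV with a header row.
// Returns false as soon as a line's field names differ from the first line's.
bool convert_to_csv(std::istream& in, std::ostream& out)
{
    std::vector<std::string> header;
    std::string line;
    while (std::getline(in, line))
    {
        std::istringstream iss(line);
        std::string word;
        if (!(iss >> word))                 // also discards "data[n]:"
            continue;                       // skip blank lines

        std::vector<std::string> names, values;
        std::string name, value;
        // std::ws skips the space before each name; getline stops at '='.
        while (std::getline(iss >> std::ws, name, '=') && iss >> value)
        {
            names.push_back(name);
            values.push_back(value);
        }

        if (header.empty())
        {
            header = names;
            write_row(out, header);         // "a,b,c,..." once, from the first line
        }
        else if (names != header)
            return false;                   // inconsistent fields

        write_row(out, values);
    }
    return true;
}
```

Called with a std::ifstream for the text file and a std::ofstream for the .csv file, this writes one header row followed by one row of values per input line.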

Tony Delroy