Using C++, I would like to split each row of a string (a CSV file in this case) into fields, where some fields may contain delimiters that are escaped by wrapping the field in double quotes ("...") and should be treated as literals. I have looked at the various questions already posed but have not found a direct answer to my problem.
Example of CSV file data:
Header1,Header2,Header3,Header4,Header5
Hello,",,,","world","!,,!,",","
Desired string vector after splitting:
["Hello"],[",,,"],["world"],["!,,!,"],[","]
Note: The CSV is only valid if the number of data columns equals the number of header columns.
I would prefer a solution that does not use Boost or any other third-party library. Efficiency is not a priority.
EDIT: Code below implementing regex from @ClasG at least satisfies the scenario above. I am drafting fringe test cases but would love to hear when / where it breaks down...
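#include <iostream>
#include <regex>
#include <string>

int main()
{
    // Same row as the example data above, with the quotes escaped for C++.
    std::string s = "Hello,\",,,\",\"world\",\"!,,!,\",\",\"";
    // Group 1 is the field: either a quoted run or a run of non-commas.
    std::string rx_string = "(\"[^\"]*\"|[^,]*)(?:,|$)";
    std::regex e(rx_string);
    std::regex_iterator<std::string::iterator> rit(s.begin(), s.end(), e);
    std::regex_iterator<std::string::iterator> rend;
    while (rit != rend)
    {
        std::cout << rit->str() << std::endl; // whole match, including any trailing comma
        ++rit;
    }
}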
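One case where the regex above breaks down is a quoted field that itself contains a quote written as a doubled quote (for example "say ""hi"", ok"): the "[^\"]*" part stops at the first inner quote, so the field is split in the wrong place. A hand-written scanner is one possible alternative that needs only the standard library. The following is a minimal sketch, not a full CSV parser. The names split_csv_row and row_matches_header are hypothetical, and it assumes that a row is a single line, that the delimiter is a comma, that surrounding quotes should be removed from the result, and that "" inside a quoted field means one literal quote.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: split one CSV row into fields.
// Assumptions: ',' separates fields; a field may be wrapped in double quotes,
// in which case commas inside it are literal and "" means one literal quote.
std::vector<std::string> split_csv_row(const std::string& row)
{
    std::vector<std::string> fields;
    std::string current;
    bool in_quotes = false;

    for (std::size_t i = 0; i < row.size(); ++i) {
        const char c = row[i];
        if (in_quotes) {
            if (c == '"') {
                if (i + 1 < row.size() && row[i + 1] == '"') {
                    current += '"';    // "" inside quotes -> one literal quote
                    ++i;               // skip the second quote of the pair
                } else {
                    in_quotes = false; // closing quote
                }
            } else {
                current += c;          // commas are literal while in quotes
            }
        } else if (c == '"') {
            in_quotes = true;          // opening quote
        } else if (c == ',') {
            fields.push_back(current); // unquoted comma ends the field
            current.clear();
        } else {
            current += c;
        }
    }
    if (in_quotes) {
        throw std::runtime_error("unterminated quoted field");
    }
    fields.push_back(current);         // the last field has no trailing comma
    return fields;
}

// Validity rule from the question: a data row needs as many columns as the header.
bool row_matches_header(const std::string& header, const std::string& row)
{
    return split_csv_row(header).size() == split_csv_row(row).size();
}
```

Calling split_csv_row on the example row should give the five fields Hello / ,,, / world / !,,!, / , with the surrounding quotes removed, and row_matches_header can then be used to reject rows whose column count differs from the header. Quoted fields that span several lines are outside what this sketch handles.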