I'm reading in CSV files and trying to remove the outer quotes. I'm currently using this:

std::string cell = R"("input \"my quoted\" cell")"; // e.g. from `getline()`; inner quotes assumed to be backslash-escaped
std::stringstream cs;
std::string unquoted;
cs << cell;
cs >> std::quoted(unquoted);

This works, but it seems very inefficient, since I have to create a std::stringstream for every cell. Is there a direct way of removing the outer quotes (and unescaping the inner ones) without a stream?

Thank you in advance!

Markstar
  • What are you parsing? Some JSON? If yes, then just use a ready-made library. nlohmann/json is nice, RapidJSON too. – Marek R Oct 07 '22 at 09:11
  • I'd do it manually. – HolyBlackCat Oct 07 '22 at 09:13
  • @HolyBlackCat: What do you mean by manually? Going through the string by character? – Markstar Oct 07 '22 at 09:19
  • Yep. Most likely you need to unescape other characters as well (`\n`, etc), which `std::quoted` doesn't do. – HolyBlackCat Oct 07 '22 at 09:20
  • Please explain why you need this. It is highly probable that you are reinventing the wheel. It is good practice to use ready-made solutions. Also remember the corner cases, like quotes inside and escape sequences. Note that `std::quoted` covers those. – Marek R Oct 07 '22 at 09:32
  • @MarekR: As I said above, I'm reading a CSV file (with header data and then reading each line into objects and aborting if there is an error). I'm aware that `std::quoted` escapes nested quotes, that's why I would like to use this instead of reinventing the wheel. But that should not matter for the question, which is simply asking whether or not there is a way to avoid creating a stream for each cell. – Markstar Oct 07 '22 at 10:03
  • So [RapidCSV](https://github.com/d99kris/rapidcsv) or [Boost Tokenizer](https://stackoverflow.com/a/1122720/1387438). – Marek R Oct 07 '22 at 10:10
  • @MarekR: No, I need to read the file row by row, cell by cell and decide individually what to do with that cell's data before proceeding. Thank you for replying, but your replies are not relevant for the question. – Markstar Oct 07 '22 at 10:31
  • Boost Tokenizer lets you do everything you described. You feed a line to it, and it splits the columns and drops the quotes if needed for each column. If you are not happy with it, try to implement this by hand, and then we will help you make it better. – Marek R Oct 07 '22 at 10:39
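
The manual, character-by-character approach suggested in the comments above could look roughly like this. It is only a sketch: `unquote` is a hypothetical helper, not a standard function, and it assumes the same conventions as the defaults of `std::quoted` (`"` as the delimiter, `\` as the escape character) rather than the doubled-quote escaping that many CSV dialects use. It needs C++17 for `std::string_view`.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: strips the outer quotes from `cell` and resolves
// escapes without creating a stream. Assumes std::quoted's default
// conventions: '"' delimits the cell and '\\' escapes the next character.
// Input that does not start with the delimiter is returned unchanged.
std::string unquote(std::string_view cell, char delim = '"', char escape = '\\') {
    if (cell.empty() || cell.front() != delim) {
        return std::string(cell);
    }
    std::string out;
    out.reserve(cell.size());
    for (std::size_t i = 1; i < cell.size(); ++i) {
        const char c = cell[i];
        if (c == escape) {
            // Drop the escape character and keep whatever follows it verbatim.
            if (++i < cell.size()) {
                out += cell[i];
            }
        } else if (c == delim) {
            break;  // unescaped delimiter: this is the closing quote
        } else {
            out += c;
        }
    }
    return out;
}
```

For example, calling `unquote` on the cell `"input \"my quoted\" cell"` yields `input "my quoted" cell`. One difference from `std::quoted`: for input that is not quoted, the stream version reads only up to the first whitespace, while this sketch returns the whole cell.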

1 Answer

No, it's not possible to quote or unquote with std::quoted without using a stream.

Instead, reuse one stream per line rather than creating one per cell, as in the following example. It's fairly efficient.

std::ifstream file_in;
...
for (std::string line; std::getline(file_in, line); ) {
    std::istringstream line_in(line);
    while (...) { // tokenization loop(parse delimiters)
        std::string s;
        line_in >> std::quoted(s);
        ...
    }
}
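
To make the loop above concrete, here is one way the elided tokenization could be filled in. Treat it as a sketch under assumptions, not the only way to do it: `split_line` is a hypothetical name, the delimiter is assumed to be `,`, and inner quotes are assumed to be backslash-escaped (the default of `std::quoted`). Unquoted cells are read with `std::getline`, because `std::quoted` on unquoted input stops at whitespace, not at the comma.

```cpp
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits one CSV line into cells using a single
// std::istringstream for the whole line.
// Assumptions: ',' separates cells, quoted cells start with '"', and inner
// quotes are escaped with '\\'. A trailing empty cell ("a,") is not reported.
std::vector<std::string> split_line(const std::string& line) {
    std::vector<std::string> cells;
    std::istringstream line_in(line);
    std::string cell;
    while (line_in.peek() != std::char_traits<char>::eof()) {
        if (line_in.peek() == '"') {
            // Quoted cell: std::quoted strips the outer quotes and unescapes.
            line_in >> std::quoted(cell);
            // Consume the delimiter that follows the closing quote, if any.
            if (line_in.peek() == ',') {
                line_in.ignore();
            }
        } else {
            // Unquoted cell: read up to (and discard) the next delimiter.
            std::getline(line_in, cell, ',');
        }
        cells.push_back(cell);
    }
    return cells;
}
```

Inside the `std::getline(file_in, line)` loop you would call `split_line(line)` and then decide cell by cell what to do with the result. Only one stream is constructed per line, regardless of how many cells the line contains.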
relent95