
I have been looking around for a solution to escape single quotes in a std::string, without finding a clean way to do it.

This post gives a few solutions, such as this one:

std::wstring regex_escape(const std::wstring& string_to_escape) {
    static const boost::wregex re_boostRegexEscape( _T("[\\^\\.\\$\\|\\(\\)\\[\\]\\*\\+\\?\\/\\\\]") );
    const std::wstring rep( _T("\\\\\\1&") );
    std::wstring result = regex_replace(string_to_escape, re_boostRegexEscape, rep, boost::match_default | boost::format_sed);
    return result;
}

Quite cool, but too complicated for my requirement. Is there an easier, more understandable (and standard) way to solve this problem without hurting performance?

Note: maybe I find the above too complicated because I don't really understand what this line is doing: const std::wstring rep( _T("\\\\\\1&") )

Kam
  • 3
    [Raw string literals.](http://en.cppreference.com/w/cpp/language/string_literal) – David G Oct 25 '14 at 02:36
  • 2
    You can use raw string literals: `std::wstring s = LR"(no need to escape in here)";` [Raw string literals](http://www.stroustrup.com/C++11FAQ.html#raw-strings). – Galik Oct 25 '14 at 02:39
  • Well I learned something today...hadn't heard of raw string literals in C++11. Finally, a way of getting an asymmetric delimiter for string...though it costs 2 characters on the lead and the asymmetric delimiters themselves cost 2 characters. [Rebol and Red](http://blog.hostilefork.com/why-rebol-red-parse-cool/) by using `{` and `}` as string delimiters does it better. *(But hindsight and lack of need to be compatible with something like C syntax allows a lot of leeway in language design, C++ has the whole Frankenstein thing going on...I should be C++ for Halloween.)* – HostileFork says dont trust SE Oct 25 '14 at 02:56
  • 1
    `[\\^\\.\\$\\|\\(\\)\\[\\]\\*\\+\\?\\/\\\\]` can be simplified to `[^.$|()\\[\\]*+?/\\\\]`. But **wait a minute**, that one is for escaping meta characters in regex. Are you sure you have the same requirement? You are talking about `escape single quotes` – nhahtdh Oct 25 '14 at 03:28

1 Answer


I'm quite impressed by the very large number of people who will give an answer using regular expressions to do something as simple as escaping one character in a string. You mentioned performance: using regular expressions is definitely not going to be fast unless you have a rather complicated test to perform before the transformation, or your end users are in control of the transformation (i.e. they have to write the regex).

Frankly, in this case you should just write it with a simple loop:

 std::string result;
 size_t const len(input.length());
 result.reserve(len + 10);  // assume up to 10 single quotes...
 for(size_t idx(0); idx < len; ++idx)
 {
     if(input[idx] == '\'')
     {
          result += "\\\'";
     }
     else
     {
          result += input[idx];
     }
 }

This is likely to give you the best performance. Granted, it is not a single function call. Some people would instead search for '\'' with find(); the scanning cost would be very close to that of this loop, but copying each piece with substr() generally costs more than copying the characters as you scan.
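For comparison, here is a rough sketch of the find() variant mentioned above. The function name `escape_single_quotes_find` is made up for illustration. It uses the `append()` overload that takes a position and a length, so no substr() temporary is created; whether it beats the plain loop depends on the input and the library, so measure before choosing.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: escape ' as \' by jumping from quote to quote with find().
std::string escape_single_quotes_find(const std::string& input)
{
    std::string result;
    result.reserve(input.length() + 10);  // same rough guess as the loop above

    std::size_t start = 0;
    std::size_t pos = input.find('\'', start);
    while (pos != std::string::npos)
    {
        // Append the run before the quote directly, without a substr() temporary.
        result.append(input, start, pos - start);
        result += "\\'";  // backslash followed by the quote
        start = pos + 1;
        pos = input.find('\'', start);
    }
    // Append whatever follows the last quote (the whole string if none was found).
    result.append(input, start, std::string::npos);
    return result;
}
```

Calling `escape_single_quotes_find("it's")` yields the four characters `it\'s` with the backslash inserted before the quote.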

Note that if you are using Boost, it has a replace_all() function which you could use too. It would be cleaner, but you did not mention Boost beyond the regex... The following question has an answer using replace_all() (among other solutions):

How to find and replace string?
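If Boost is not available, a replace-all helper is only a few lines with the standard library. The sketch below is an illustration, not Boost's implementation; the name `replace_all_sketch` and its signature are assumptions. With Boost, the equivalent call should be `boost::replace_all(input, "'", "\\'")` from `<boost/algorithm/string/replace.hpp>`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a replace-all helper; not Boost's implementation.
void replace_all_sketch(std::string& text, const std::string& from, const std::string& to)
{
    if (from.empty())
    {
        return;  // an empty pattern would match everywhere; treat it as a no-op
    }
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos)
    {
        text.replace(pos, from.length(), to);
        pos += to.length();  // skip the replacement so its own quote is not rescanned
    }
}
```

Each `replace()` call shifts the tail of the string, so on inputs with many quotes this is likely slower than building a new string in one pass.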

Alexis Wilke
  • The OP uses Boost Regex. While that does not necessarily mean the whole library is installed, I think it is still OK to offer a Boost solution if it is shorter. I think using a function to do this allows you to think at a higher level and write less string-manipulation code, which I think is the same motivation as for using a regex. – nhahtdh Oct 25 '14 at 04:17
  • I guess it depends how much you worry about the performance aspect of the thing. – Alexis Wilke Oct 25 '14 at 04:29
  • 2
    Part of the performance aspect is that string-escape functions only rarely have to do anything, so they should optimize for the case where there is nothing to escape. `string::find` will quite probably use SSE to do up to 16 comparisons in parallel, so it is likely to reject a string without apostrophes much faster than a test-and-copy loop. In order to properly take advantage of that, though, you need an "in-place" API. The transform isn't really done in place -- the new string is swapped -- but if there is nothing to do, the identity transform is in-place. – rici Oct 25 '14 at 04:54
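The "in-place" API described in the last comment could look roughly like the sketch below. The name `escape_single_quotes_in_place` is hypothetical, and whether `find()` is actually vectorized depends on the standard library implementation. The point is only that a string without quotes returns early, with no allocation and no copy.

```cpp
#include <cstddef>
#include <string>

// Hypothetical "in-place" API: leaves `text` untouched when it has no quotes.
void escape_single_quotes_in_place(std::string& text)
{
    const std::size_t first = text.find('\'');
    if (first == std::string::npos)
    {
        return;  // fast path: nothing to escape, no allocation, no copy
    }

    std::string result;
    result.reserve(text.length() + 10);  // rough guess, as in the answer's loop
    result.append(text, 0, first);       // everything before the first quote is copied as-is
    for (std::size_t idx = first; idx < text.length(); ++idx)
    {
        if (text[idx] == '\'')
        {
            result += "\\'";  // backslash followed by the quote
        }
        else
        {
            result += text[idx];
        }
    }
    text.swap(result);  // the escaped string is swapped in, not edited in place
}
```

A caller passes the string by reference and uses it afterwards; when no quote is present, the function does nothing beyond the single `find()` scan.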