I have a string like this:

access/2/NOTIF/PI/%24cname%3D/bldg/temp/s/2%24

When I run the following code:

size_t found = str.find_first_of("NOTIF");
if (found != std::string::npos) {
    std::cout << "NOTIF found" << " at pos: " << found << std::endl;
    std::string substr = str.substr(found+8, m_name.length());
    std::cout << "SUBSTR: " << substr << std::endl;
}

I correctly get the position of N, which is 9. However, when I try to take a substring starting at '$', which is encoded in the string as %24, it fails. Ideally, I want to extract the substring between $ and $ (i.e. between %24 and %24). Somehow substr does not recognize %24 as $.

What could be the problem here? Do I have to preprocess this before I can call substr?

AnilJ
  • "Do I have to preprocess this before I can call substr?" Yep you do. – Denilson Amorim May 12 '15 at 22:13
  • Why should it not fail? "%24" is definitely different from "$", even in size. If you want to achieve some kind of encoding, you need to say what that encoding is. C++ doesn't guess what kind of encoding you are using. You can also just look for "%24". – senfen May 12 '15 at 22:14
  • `std::string` knows nothing about encoding; it is just a container of chars. So in your string `%24` is 3 chars, and `std::string` does not know that `$` is encoded as `%24`. – gomons May 12 '15 at 22:14
  • Is there any encoding API which can be used as preprocessor to avoid this? – AnilJ May 12 '15 at 22:16
  • Why do you want to preprocess the string? Isn't searching for `%24` enough? – Praetorian May 12 '15 at 22:22
  • Yes, it is. Just curious if it can be made cleaner. It's easier to search for $ than %24, isn't it? – AnilJ May 12 '15 at 22:24
  • If possible, you can use `QString` from Qt. It's encoding aware. C++ alone is not a very good tool for processing strings when encoding matters. You can also look at its source code for ideas if you need to develop your own. – max May 12 '15 at 22:25
  • Preprocessing is also important, and cleaner, when you have other % escapes in the string. For example, in the case above there is also %3D. – AnilJ May 12 '15 at 22:29
  • @AnilJ You're overthinking the whole thing. The presence of `%3D` is not a problem. Maybe you think it is because you were using the wrong member function to search in your example. [`find_first_of`](http://en.cppreference.com/w/cpp/string/basic_string/find_first_of) looks for *any* of the characters in the search string. What you want is [`find`](http://en.cppreference.com/w/cpp/string/basic_string/find). – Praetorian May 12 '15 at 22:31
  • related URL decoding question: http://stackoverflow.com/q/2673207/103167 – Ben Voigt May 12 '15 at 22:52
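
To illustrate the `find_first_of` versus `find` point from the comments, here is a minimal sketch. The wrapper names `first_of_any` and `first_whole` are hypothetical and exist only to make the two calls easy to compare. On the string from the question both calls happen to return 9, because `N` is the first character from the set to appear, which is why the bug goes unnoticed there.

```cpp
#include <cstddef>
#include <string>

// Hypothetical wrapper: position of the first character in `s` that matches
// ANY single character of `chars` (this is what find_first_of does).
std::size_t first_of_any(const std::string& s, const std::string& chars) {
    return s.find_first_of(chars);
}

// Hypothetical wrapper: position where `needle` occurs as a whole substring
// (this is what find does).
std::size_t first_whole(const std::string& s, const std::string& needle) {
    return s.find(needle);
}
```

For example, with the input `"FIT/NOTIF"`, `first_of_any(s, "NOTIF")` matches the leading `F` at position 0, while `first_whole(s, "NOTIF")` returns 4.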

1 Answer

Ideally, I am looking to extract a sub string between $ and $ (i.e. between %24 and %24)

Then search for %24 directly; don't bother passing the string through some API to convert it back to $.

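auto first = s.find("%24");               // Look for first %24
auto second = s.find("%24", first + 3);   // Look for second %24, starting past the first
if (first != std::string::npos && second != std::string::npos)   // Both delimiters were found
    std::cout << s.substr(first + 3, second - (first + 3)); // This is the substring you're looking for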
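
The two-step search above could be wrapped in a small helper that reports failure when either delimiter is missing. This is only a sketch: the function name `between` and its out-parameter interface are assumptions made for illustration, not part of the original answer.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: stores in `out` the text between the first two
// occurrences of `delim` in `s`. Returns false (leaving `out` untouched)
// if `delim` is empty or occurs fewer than two times.
bool between(const std::string& s, const std::string& delim, std::string& out) {
    if (delim.empty()) return false;

    const std::size_t first = s.find(delim);
    if (first == std::string::npos) return false;

    // Start the second search just past the first delimiter.
    const std::size_t start = first + delim.size();
    const std::size_t second = s.find(delim, start);
    if (second == std::string::npos) return false;

    out = s.substr(start, second - start);
    return true;
}
```

Called with the string from the question and `"%24"` as the delimiter, this yields `cname%3D/bldg/temp/s/2`. Note that the `%3D` escape is left as-is, since nothing here decodes the string.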

Praetorian
  • Canonicalizing first is probably better, since not all programs (browsers, mostly) encode exactly the same set of characters in URLs. – Ben Voigt May 12 '15 at 22:50
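
If you do want to canonicalize first, as the comment above suggests, a percent-decoding pass might look roughly like the following. This is a minimal sketch, not a full URL parser: `hex_value` and `percent_decode` are hypothetical names, `+` is not treated as a space, and malformed escapes are copied through unchanged. Those are all assumptions chosen for simplicity.

```cpp
#include <cstddef>
#include <string>

// Value of one hexadecimal digit, or -1 if `c` is not a hex digit.
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: replaces each well-formed %XX escape with the byte it
// encodes. Malformed escapes (e.g. a trailing "%2" or "%zz") are kept as-is.
std::string percent_decode(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()) {
            const int hi = hex_value(in[i + 1]);
            const int lo = hex_value(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);
                i += 2;  // Skip the two hex digits just consumed.
                continue;
            }
        }
        out += in[i];
    }
    return out;
}
```

After decoding, the string from the question becomes `access/2/NOTIF/PI/$cname=/bldg/temp/s/2$`, and you can search for `$` directly. One trade-off to keep in mind: once decoded, a `$` that was part of the data can no longer be told apart from one that was meant as a delimiter.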