
I want to use a regex on the reverse of a string.

I can do the following but all my sub_matches are reversed:

string foo("lorem ipsum");
match_results<string::reverse_iterator> sm;

if (regex_match(foo.rbegin(), foo.rend(), sm, regex("(\\w+)\\s+(\\w+)"))) {
    cout << sm[1] << ' ' << sm[2] << endl;
}
else {
    cout << "bad\n";
}

[Live example]

What I want is to get out:

ipsum lorem

Is there any provision for getting the sub-matches that are not reversed? That is, any provision beyond reversing the strings after they're matched like this:

string first(sm[1]);
string second(sm[2]);

reverse(first.begin(), first.end());
reverse(second.begin(), second.end());

cout << first << ' ' << second << endl;

EDIT:

It has been suggested that I update the question to clarify what I want:

Running the regex backwards on the string is not about reversing the order in which the matches are found. The regex is far more complex than would be valuable to post here, but running it backwards saves me from needing a look ahead. This question is about the handling of sub-matches obtained from a match_results<string::reverse_iterator>. I need to be able to get them out as they were in the input, here foo. I don't want to have to construct a temporary string and run reverse on it for each sub-match. How can I avoid doing this?

Jonathan Mee
  • So you want to find a match, then reverse the word order of the matching fragments, then write it to cout/wherever? – Michael McPherson Aug 04 '15 at 17:03
  • @MichaelMcPherson Well I want to run the `regex` on the reverse of the string, yes. But I don't want my matches to be reversed... if that's possible. – Jonathan Mee Aug 04 '15 at 17:15
  • Okay, so you want to run a regex on a string that has been reversed character for character, and then output regular (non-reversed) words? – Michael McPherson Aug 04 '15 at 17:17
  • @MichaelMcPherson Right, as I mention [here](http://stackoverflow.com/questions/31815075/how-can-i-use-a-regex-on-the-reverse-of-a-string?noredirect=1#comment51556409_31815298) we're talking about a `regex` that is significantly more complex than this example, but it can be simplified by working in reverse. However, constructing `string`s and reversing them for each sub-match seems an unnecessary burden. – Jonathan Mee Aug 04 '15 at 17:20
  • What's the expected load? Doing individual reversing is definitely a pain, but if your problem set is small enough, it might still be the most efficient in terms of time (your time). – Michael McPherson Aug 04 '15 at 17:32
  • @MichaelMcPherson I'm not sure what you mean by "load"? You're right that this probably isn't tremendously significant. I just wanted to do it a better way if possible. – Jonathan Mee Aug 04 '15 at 17:39
  • By load, I was referring to how often this activity needs to happen per program iteration. If you're doing it twice, probably the string-and-reverse method is fine. If you're doing it a half a million times then yes, some sort of optimization is necessary. – Michael McPherson Aug 04 '15 at 17:56
  • Either way, this seems to be a bound up problem. If there was more information about the regex or the target string, maybe people could give you a better answer. But for the data presented here ("A string has been reversed. I have n matches in reversed form. How do I un-reverse them?") I think running reverse is the only actual answer. – Michael McPherson Aug 04 '15 at 17:59
  • If you are running reverse iterators on the input string you are going to end up with reversed string matches. The way I see it you have two options. a) reverse the matches. b) use reverse iterators from the matches in whatever it is you want to do with them. – Galik Aug 04 '15 at 18:07

2 Answers


You could just reverse the order in which you use the results:

#include <regex>
#include <string>
#include <iostream>

using namespace std;

int main()
{
    string foo("lorem ipsum");
    smatch sm;

    if (regex_match(foo, sm, regex("(\\w+)\\s+(\\w+)"))) {
        cout << sm[2] << ' ' << sm[1] << endl; // use second as first
    }
    else {
        cout << "bad\n";
    }
}

Output:

ipsum lorem
Galik
  • This does not accomplish what I want. The regex is significantly more complex than the one that I have posted here in the example. In fact there is no way to accomplish this regex without look ahead support in the regex library. I am able to get around some of the complexity by iterating backwards, but I do not want to have to then construct `string`s and `reverse` them for each sub-match. – Jonathan Mee Aug 04 '15 at 17:18
  • @JonathanMee Well maybe you could try posting the actual regex? Or at least something that possesses the same intractability as your problem? – Galik Aug 04 '15 at 17:24

This is absolutely possible! The key is that a sub_match inherits from pair<BidirIt, BidirIt>. Since the sub_matches here are obtained from match_results<string::reverse_iterator> sm, the first and second members of each sub_match are string::reverse_iterators.

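So for any given sub_match from sm you can get the forward range from its second.base() to its first.base(). This works because base() on a reverse iterator returns the forward iterator one past the element it refers to, so swapping the two ends delimits the same characters in their original order. You don't have to construct strings to stream ranges, but you will need to construct an ostream_iterator: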

ostream_iterator<char> output(cout);

copy(sm[1].second.base(), sm[1].first.base(), output);
output = ' ';
copy(sm[2].second.base(), sm[2].first.base(), output);

Take heart though, there is a better solution on the horizon! This answer discusses string_literals. As of right now no action has been taken on them, but they have made it into the "Evolution Subgroup".

Jonathan Mee