In Perl, I can replace characters with their lowercase version like so:

my $str = "HELLO WORLD HOW ARE YOU TODAY";
$str =~ s/([AEIOU])/\L$1/g;
print $str;  # HeLLo WoRLD HoW aRe You ToDaY

How can I do this with a C++ std::regex_replace? Can I flip it into some sort of mode that activates magic features such as this?

(The real search pattern is more complex, otherwise I'd just do it by hand without a regex!)

Lightness Races in Orbit
  • Why not use `boost::regex`? It supports case change operators in the replacement pattern. For `std::regex`, you will need to add the callback functionality. – Wiktor Stribiżew Nov 09 '18 at 13:30
  • @WiktorStribiżew How can I do that? – Lightness Races in Orbit Nov 09 '18 at 13:33
  • See https://stackoverflow.com/questions/46909477/selectively-replace-doublequotes-in-a-stdstring-in-c/46909674#46909674 or https://stackoverflow.com/a/38292474/3832970 – Wiktor Stribiżew Nov 09 '18 at 13:35
  • related: [`[re]/table 26`](http://eel.is/c++draft/re#tab:re:matchflag). According to the `match_flag_type` chosen, the regex replace function conforms with [ECMAScript regex](https://tc39.github.io/ecma262/#sec-string.prototype.replace) or [POSIX regex](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html). – YSC Nov 09 '18 at 13:38
  • [Looks like you can't](https://stackoverflow.com/questions/53112726/how-do-i-use-stdregex-replace-to-replace-string-into-lowercase). Not sure if you want to close as a dupe or not. – NathanOliver Nov 09 '18 at 14:00
  • @NathanOliver Yeah that's the badger - thanks. Can't dupeclose though cos no accepted answer; figures. – Lightness Races in Orbit Nov 09 '18 at 14:35
  • Ugh, it doesn't do lookbehind either? FML! – Lightness Races in Orbit Nov 09 '18 at 14:38
  • From what I have seen C++'s regex is limited in what it supports. It's strange that we only have a partial regex library. Maybe one day it will evolve into something better :-) – NathanOliver Nov 09 '18 at 14:40
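To make the comment about `match_flag_type` concrete, here is a minimal sketch of the two replacement syntaxes that `std::regex_replace` accepts. The helper names `WrapVowelsEcmaScript` and `WrapVowelsSed` are made up for illustration. Both syntaxes can refer back to capture groups, but neither defines a case-changing escape like Perl's `\L`.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: wraps each vowel in angle brackets using the default
// ECMAScript replacement syntax, where $1 is the first capture group.
std::string WrapVowelsEcmaScript(const std::string& source)
{
    static const std::regex re{"([AEIOU])"};
    return std::regex_replace(source, re, "<$1>");
}

// Hypothetical helper: the same replacement in sed syntax, selected with
// format_sed, where \1 is the first capture group.
std::string WrapVowelsSed(const std::string& source)
{
    static const std::regex re{"([AEIOU])"};
    return std::regex_replace(source, re, R"(<\1>)",
                              std::regex_constants::format_sed);
}
```

Calling either helper on "HELLO" should give "H<E>LL<O>"; only the spelling of the group reference differs.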

1 Answer

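No, there isn't: std::regex_replace has no mode that enables case-changing escapes such as \L in the replacement string.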

In the end I adapted various bits of code from SO into the following:

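#include <algorithm>  // std::for_each, std::transform
#include <cctype>     // std::tolower
#include <cstddef>    // size_t
#include <iostream>
#include <iterator>   // std::advance, std::begin, std::end
#include <regex>
#include <string>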

// Regex replacement with callback.
// f should be compatible with std::string(const std::smatch&).
// Adapted from code by John Martin (CC-by-SA 3.0).
//   (https://stackoverflow.com/a/37516316/560648)
template <typename UnaryFunction>
inline std::string RegexReplace(const std::string& source, const char* const re_str, UnaryFunction f, size_t* numMatches = nullptr)
{
    try
    {
        std::string s;

        std::smatch::difference_type positionOfLastMatch = 0;
        std::string::const_iterator first = std::begin(source), last = std::end(source);
        auto endOfLastMatch = first;

        if (numMatches)
            *numMatches = 0;

        auto callback = [&](const std::smatch& match)
        {
            auto positionOfThisMatch = match.position(0);
            auto diff = positionOfThisMatch - positionOfLastMatch;

            auto startOfThisMatch = endOfLastMatch;
            std::advance(startOfThisMatch, diff);

            s.append(endOfLastMatch, startOfThisMatch);
            s.append(f(match));

            auto lengthOfMatch = match.length(0);

            positionOfLastMatch = positionOfThisMatch + lengthOfMatch;

            endOfLastMatch = startOfThisMatch;
            std::advance(endOfLastMatch, lengthOfMatch);

            if (numMatches)
                (*numMatches)++;
        };

        std::regex re{re_str};
        std::sregex_iterator begin(first, last, re), end;
        std::for_each(begin, end, callback);

        s.append(endOfLastMatch, last);
        return s;
    }
    catch (const std::regex_error&)
    {
        return "";
    }
}

int main()
{
    // Change all letters but the first to lower case, but only in
    // words starting with a vowel

    const std::string str = "HELLO WORLD HOW ARE YOU TODAY";
    auto lowercaseSecondSubmatch = [](const std::smatch& match) -> std::string
    {
        if (match.size() != 3)
            return "WTF";

        std::string result = match.str(0);

        // Lowercase the second submatch
        const auto pos = match.position(2) - match.position(0);
        const auto len = match.length(2);
        const auto start = result.begin() + pos;
        const auto end   = result.begin() + pos + len;
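        // Go through unsigned char: passing a negative char to tolower is undefined
        std::transform(start, end, start,
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });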

        return result;
    };

    size_t n = 0;
    std::cout << RegexReplace(str, R"REGEX(\b([AEIOUY])([A-Z]+))REGEX", lowercaseSecondSubmatch, &n) << '\n';
    std::cout << n << " matches\n";
}

(live demo)

It's certainly not perfect, but it works in a pinch.
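For the simpler case in the question, where the whole match is lowercased and no per-group callback is needed, a shorter loop over `std::sregex_iterator` may be enough. This is only a sketch under that assumption; `LowercaseMatches` is a hypothetical name, not part of the code above or of the standard library.

```cpp
#include <cctype>
#include <regex>
#include <string>

// Hypothetical helper: returns a copy of `source` in which every match of
// `re` has been lowercased; text outside the matches is copied unchanged.
std::string LowercaseMatches(const std::string& source, const std::regex& re)
{
    std::string result;
    auto tail = source.begin();  // start of the text not yet copied

    for (std::sregex_iterator it(source.begin(), source.end(), re), end;
         it != end; ++it)
    {
        const std::smatch& match = *it;

        // Copy the unmatched text between the previous match and this one.
        result.append(tail, match[0].first);

        // Append the matched text, lowercased one character at a time.
        for (auto p = match[0].first; p != match[0].second; ++p)
            result += static_cast<char>(
                std::tolower(static_cast<unsigned char>(*p)));

        tail = match[0].second;
    }

    // Copy whatever follows the final match.
    result.append(tail, source.end());
    return result;
}
```

With the question's input and the pattern `[AEIOU]`, this should reproduce the Perl result. Unlike RegexReplace above, it takes an already-built `std::regex`, so a malformed pattern throws `std::regex_error` at the call site instead of yielding an empty string.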

Lightness Races in Orbit