0

I have a map that stores an id-to-value mapping, and an input string can contain a bunch of ids. I need to replace those ids with their corresponding values. For example:

string = "I am in #1 city, it is now #2 time" // (#1 and #2 are ids here)
id_to_val_map = {1 => "New York", 2 => "summer"}

Desired output:

"I am in New York city, it is now summer time"

Is there a way to pass a callback function (one that takes the matched string and returns the string to be used as the replacement)? std::regex_replace doesn't seem to support that.

The alternative is to find all the matches, compute their replacement values, and then perform the actual replacement, which won't be that efficient.

Jarod42
  • You can use `std::regex_replace`, but it seems like you have to use it multiple times. Honestly, I would parse string by hand if the holder value is actually `#1` and there can be no `#` anywhere else in the text. – Viacheslav Kroilov Oct 25 '19 at 22:16
  • regex then `std::map` – doug Oct 25 '19 at 22:17
  • Are you going to make replacements multiple times on the same template string (`string` in your example), or just one time per template? – BeeOnRope Oct 26 '19 at 01:41
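
The hand-parsing route suggested in the first comment could look roughly like the sketch below. It is only an illustration: the function name `expand_ids_by_hand` is made up, and it assumes ids are short decimal numbers that fit in an `int` and that a `#` without a known id should be left in the text as-is.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: single pass over the input, no regex involved.
std::string expand_ids_by_hand(const std::string& input,
                               const std::map<int, std::string>& values)
{
    std::string out;
    out.reserve(input.size());
    std::size_t i = 0;
    while (i < input.size()) {
        if (input[i] != '#') {          // ordinary character: copy it
            out += input[i++];
            continue;
        }
        std::size_t j = i + 1;          // first character after '#'
        int id = 0;
        while (j < input.size() &&
               std::isdigit(static_cast<unsigned char>(input[j]))) {
            // Assumption: ids are small enough not to overflow int.
            id = id * 10 + (input[j] - '0');
            ++j;
        }
        // Only look up the id if at least one digit followed the '#'.
        const auto it = (j > i + 1) ? values.find(id) : values.end();
        if (it != values.end()) {
            out += it->second;
        } else {
            out.append(input, i, j - i); // bare '#' or unknown id: keep text
        }
        i = j;
    }
    return out;
}
```

Called with the string and map from the question, this should give back the desired sentence; a placeholder such as `#10` is read as id 10, not as id 1 followed by `0`.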

2 Answers

3

You might do:

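#include <iostream>
#include <map>
#include <regex>
#include <string>

const std::map<int, std::string> m = {{1, "New York"}, {2, "summer"}};
std::string s = "I am in #1 city, it is now #2 time";

for (const auto& [id, value] : m) {
    // (?!\d) stops "#1" from also matching the start of "#10".
    s = std::regex_replace(s, std::regex("#" + std::to_string(id) + "(?!\\d)"), value);
}
std::cout << s << std::endl;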


Jarod42
0

A homegrown way is to use a while loop with regex_search() and
build the output string as you go.

This is essentially what regex_replace() does in a single pass.

There is no need for a separate regex per map item: that approach
reassigns the string for every item ( s = regex_replace() ) and rescans
the same text on every pass.

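Something like this regex (written out with spacing and comments for readability; std::regex supports neither free-spacing nor the (?s) flag, so the code uses the compact form, with [\s\S] standing in for a dot that matches newlines)

 ( [\s\S]*? )                  # (1) text before the next id
 \#
 ( \d+ )                       # (2) the id digits

with this code

#include <map>
#include <regex>
#include <string>

// Assumes: std::string oldstr; std::map<int, std::string> mymap;
const std::regex rx( "([\\s\\S]*?)#(\\d+)" );

std::string::const_iterator start = oldstr.cbegin();
const std::string::const_iterator end = oldstr.cend();
std::smatch m;

std::string newstr;

while ( std::regex_search( start, end, m, rx ) )
{
    newstr.append( m[1].first, m[1].second );   // text before the id

    // look up the id; do further error checking here as needed
    const int ndx = std::stoi( m[2].str() );
    const auto it = mymap.find( ndx );
    if ( it != mymap.end() )
        newstr.append( it->second );
    else
        newstr.append( m[1].second, m[0].second );   // unknown id: keep "#N"

    start = m[0].second;
}

// copy whatever follows the last id; matching an empty '$' inside the
// loop instead would never advance 'start' and loop forever
newstr.append( start, end );

// assign the old string with new string if need be
oldstr = newstr;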
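
The same single-pass idea can be wrapped in a small helper that takes a callback, which is what the question asks for. The sketch below uses std::sregex_iterator; the names `regex_replace_cb` and `expand_ids` are made up for illustration and are not standard library functions. It assumes that ids missing from the map should stay in the text unchanged and that ids fit in an `int` (std::stoi throws std::out_of_range otherwise).

```cpp
#include <map>
#include <regex>
#include <string>

// Hypothetical helper: replaces every match of `rx` in `input` with
// whatever `callback` returns for that match, in a single pass.
template <typename Callback>
std::string regex_replace_cb(const std::string& input, const std::regex& rx,
                             Callback callback)
{
    std::string out;
    auto last = input.cbegin();          // end of the previous match
    for (std::sregex_iterator it(input.cbegin(), input.cend(), rx), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        out.append(last, m[0].first);    // text between matches, verbatim
        out.append(callback(m));         // replacement chosen by the caller
        last = m[0].second;
    }
    out.append(last, input.cend());      // text after the last match
    return out;
}

// Example use for the question's "#<id>" placeholders.
std::string expand_ids(const std::string& input,
                       const std::map<int, std::string>& values)
{
    static const std::regex id_rx("#(\\d+)");
    return regex_replace_cb(input, id_rx, [&values](const std::smatch& m) {
        const auto it = values.find(std::stoi(m[1].str()));
        // Assumption: unknown ids are left in the text unchanged.
        return it != values.end() ? it->second : m[0].str();
    });
}
```

Because the callback receives the whole std::smatch, it can look at capture groups as well as the full match, and the replacement text is appended literally, so a `$` in a value is not interpreted as a format specifier.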
  • instead of `std::atoi( m[2].str() );` use `std::stoi( m[2] );` which is much better. See [Why shouldn't I use atoi()?](https://stackoverflow.com/q/17710018/995714) – phuclv Oct 13 '20 at 09:55
  • @phuclv The only speed benefit I can see is that, since `std::stoi()` takes a reference to a string, it knows the length ahead of time and can use it as a loop counter, whereas atoi() checks the contents for NULL or '\0' to find the end of the string. They both use a pointer and arithmetic to access the array of chars (i.e. char*). And since `m[n].str()` will always return a fully qualified null-terminated char ptr, the only error would be "not a number", where the return value is always 0. Nothing to see here. –  Oct 14 '20 at 17:56
  • I also noticed that `std::stol()` is mostly replaced with `std::stoi()`, as most compilers target 64-bit registers, making it 63 bits + sign. If you have a number that needs to be 64 bits, it's for display purposes, not for calculation. –  Oct 14 '20 at 18:08
  • @Maxt8r not only do you have a known length, you also avoid the undefined behavior that `atoi` exhibits. The code is also shorter and cleaner. And if you want unsigned, there are also `stoul`/`stoull` – phuclv Oct 15 '20 at 00:36
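
To make the difference discussed in these comments concrete: `std::atoi` returns 0 for text that is not a number and has undefined behavior when the value does not fit in an `int`, while `std::stoi` reports both cases by throwing. A minimal sketch of a checked conversion follows; the helper name `parse_id` is hypothetical, and rejecting trailing characters is a choice made here, not something `std::stoi` does by itself.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns true and stores the parsed id in `out`
// only if the whole of `text` is a valid int.
bool parse_id(const std::string& text, int& out)
{
    try {
        std::size_t used = 0;                 // number of characters consumed
        const int value = std::stoi(text, &used);
        if (used != text.size())
            return false;                     // trailing junk such as "12x"
        out = value;
        return true;
    } catch (const std::invalid_argument&) {
        return false;                         // no digits at all, e.g. "abc"
    } catch (const std::out_of_range&) {
        return false;                         // value does not fit in an int
    }
}
```

With ids captured by `\d+` the invalid_argument case cannot occur, but a long run of digits can still trigger out_of_range, which is the case `atoi` handles silently and unpredictably.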