
I'm trying to replace "100¢" with "100 cents" using Regex. The pattern I'm using is (\d+)(¢). I also want to make other replacements, so I need a Dictionary that holds the regex patterns as keys and the replacement strings as values.

The code I have so far is this:

        var replacementsMap = new Dictionary<string, string>()
        {
            {@"(\d+)(¢)", "$1 cents"}
        };

There would be more entries in the dictionary, but to keep it simple I'll show just that one pattern-value pair. The replacement uses a back-reference ($1) so that the first capturing group is followed by "cents" instead of the symbol.

Ex: 5¢ -> 5 cents
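
For reference, a minimal sketch of that single pattern applied on its own with `Regex.Replace`, with no dictionary involved. The input string here is made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

// One pattern, one replacement string: "$1" stands for the digits captured by (\d+).
string single = Regex.Replace("5¢ and 100¢", @"(\d+)(¢)", "$1 cents");

Console.WriteLine(single); // 5 cents and 100 cents
```

So the pattern and the `$1` substitution behave as intended in isolation; the trouble only starts once several patterns are combined.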

To replace, I'm doing it like this:

        string input = "100¢";
        Console.WriteLine(input); //showing original input


        var regex = new Regex(String.Join("|",replacementsMap.Keys));

        var newStr = regex.Replace(input, m => replacementsMap[m.Value]);
        Console.WriteLine(newStr); //showing new input

The error I'm getting is this, and I'm not really sure where I'm going wrong with my implementation:

Unhandled exception. System.Collections.Generic.KeyNotFoundException: The given key '100¢' was not present in the dictionary.
   at System.Collections.Generic.Dictionary`2.get_Item(TKey key)
   at Program.<>c__DisplayClass1_0.<Main>b__0(Match m) in Program.cs:line 79
   at System.Text.RegularExpressions.Regex.<>c.<Replace>b__99_0(ValueTuple`5& state, Match match)
   at System.Text.RegularExpressions.Regex.RunAllMatchesWithCallback[TState](String inputString, ReadOnlySpan`1 inputSpan, Int32 startat, TState& state, MatchCallback`1 callback, RegexRunnerMode mode, Boolean reuseMatchObject)
   at System.Text.RegularExpressions.Regex.RunAllMatchesWithCallback[TState](String input, Int32 startat, TState& state, MatchCallback`1 callback, RegexRunnerMode mode, Boolean reuseMatchObject)   
   at System.Text.RegularExpressions.Regex.Replace(MatchEvaluator evaluator, Regex regex, String input, Int32 count, Int32 startat)
   at System.Text.RegularExpressions.Regex.Replace(String input, MatchEvaluator evaluator)
   at Program.Main() in Program.cs:line 79
Wiktor Stribiżew
  • Have you set breakpoints to look at things like `m.Value`? – madreflection Dec 08 '22 at 23:41
  • Yeah, m.Value = "100¢". Here's a link of me hovering over the breakpoint: https://imgur.com/a/ieGdf8R – chadveloper Dec 08 '22 at 23:46
  • Shouldn't this portion `m => replacementsMap[m.Value]);` be `m => replacementsMap[m.Key]);`? (replacing .Value with .Key) – hijinxbassist Dec 08 '22 at 23:50
  • @hijinxbassist I don't think so, I get an error when I do that: https://imgur.com/a/KZJarww – chadveloper Dec 08 '22 at 23:56
  • My mistake, I misunderstood what m was in that context. – hijinxbassist Dec 09 '22 at 00:02
  • That error is because you are looking up your dictionary item by a value and not a key; the value will not be in the dictionary's key list. In your dictionary the key is the regular expression, but when you look up the item you're passing the value and not the key. Basically it looks like this line, `var newStr = regex.Replace(input, m => replacementsMap[m.Value]);` should actually be `var newStr = regex.Replace(input, m => replacementsMap[m.Key]);`. – quaabaam Dec 09 '22 at 00:23
  • @quaabaam: hijinxbassist already suggested that and was proven wrong. `m` is a `Match` object. The problem is that `m.Value` contains the actual matched value, whereas the dictionary contains patterns. – madreflection Dec 09 '22 at 00:25
  • @madreflection Do you have any suggestions? I'd like to be able to use multiple regex patterns that are tied to specific values, which is why I went with a dictionary to store it all. I know I could do stuff like `var replacementsMap = new Dictionary<string, string>() { {"¢", "cents"} };` but I need the replacement to handle those complex patterns – chadveloper Dec 09 '22 at 00:28
  • If performance is not a concern, it may be easiest for you to just loop through the items in your dictionary and perform the replacements one-by-one. This way you always have the pattern (the Key) on hand to retrieve the replacement Value. – Seanharrs Dec 09 '22 at 00:36
  • Related: [Regex replace multiple groups](https://stackoverflow.com/questions/7351031/regex-replace-multiple-groups) – Theodor Zoulias Dec 09 '22 at 04:42

1 Answer


The problem is that a match does not record which of the original patterns produced it. `m.Value` holds the matched text (for example "100¢"), while the dictionary keys are the patterns, so the lookup has nothing valid to look up.
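
A small sketch that makes the mismatch visible, reusing the dictionary from the question:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

var replacementsMap = new Dictionary<string, string>
{
    { @"(\d+)(¢)", "$1 cents" },
};

Match m = Regex.Match("100¢", @"(\d+)(¢)");

// m.Value is the text that was matched, not the pattern that matched it.
Console.WriteLine(m.Value);                                  // 100¢
Console.WriteLine(replacementsMap.ContainsKey(m.Value));     // False
Console.WriteLine(replacementsMap.ContainsKey(@"(\d+)(¢)")); // True
```

Indexing the dictionary with `m.Value` therefore throws `KeyNotFoundException`, which is the exception shown in the question.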

Solution: When combining the patterns into one, surround each pattern with a named capturing group. Base the name on the index of the pattern in your list of patterns.

You can then get that name from the match information, retrieve the original pattern and the replacement pattern from the list using the auto-generated name and apply the individual pattern to the matched value.

Sample code:

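using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

string input = "I have 5$ and 4€ and 6¢";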

// Use a List instead of Dictionary so we can retrieve the entries by index
List<(string pattern, string replacement)> replacementInstructions = new List<(string pattern, string replacement)> {
    (@"(\d+)(¢)", "$1 cents"),
    (@"(\d+)(€)", "$1 euros"),
    (@"(\d+)(\$)", "$1 dollars"),
};

// Create combined pattern with auto-named groups
StringBuilder builder = new StringBuilder();

for(int i=0; i < replacementInstructions.Count; i++)
{
    if(i > 0) builder.Append("|");

    var (pattern, _) = replacementInstructions[i];

    string groupName = "GN" + i;
    builder.Append("(?<" + groupName + ">" + pattern + ")");
}

string combinedPattern = builder.ToString();
Console.WriteLine("Combined Pattern: " + combinedPattern);

// Declare callback that will do the replacement
MatchEvaluator evaluator = (Match match) =>
{
    // Get named group that matched and its name
    Group group = (from Group g in match.Groups
                   where g.Success &&
                   g.Name.StartsWith("GN")
                   select g).First();
    string groupName = group.Name;

    // Get number from groupname 
    // and then entry from replacementInstructions based on number
    string numberString = groupName.Substring(2);
    int number = int.Parse(numberString);
    var (pattern, replacement) = replacementInstructions[number];

    // Apply replacement pattern on match
    return Regex.Replace(match.Value, pattern, replacement);
};


// Replace
string result = Regex.Replace(input, combinedPattern, evaluator);

Console.WriteLine("Result: " + result);

Output:

Combined Pattern: (?<GN0>(\d+)(¢))|(?<GN1>(\d+)(€))|(?<GN2>(\d+)(\$))
Result: I have 5 dollars and 4 euros and 6 cents
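
One reason the evaluator re-applies the individual pattern rather than using `$1` against the combined pattern: in .NET, unnamed groups are numbered left to right across the whole combined pattern, and named groups are numbered after them. The sketch below uses a shortened two-alternative version of the combined pattern to illustrate this; the variable names are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

string combined = @"(?<GN0>(\d+)(¢))|(?<GN1>(\d+)(€))";
Match euroMatch = Regex.Match("4€", combined);

// Group 1 belongs to the ¢ alternative, which did not take part in this match.
Console.WriteLine(euroMatch.Groups[1].Success);   // False
// The digits of the € alternative ended up in group 3.
Console.WriteLine(euroMatch.Groups[3].Value);     // 4
// The named wrapper group identifies which alternative matched.
Console.WriteLine(euroMatch.Groups["GN1"].Value); // 4€

// Using "$1" directly against the combined pattern substitutes an empty string.
string naive = Regex.Replace("4€", combined, "$1 euros");
Console.WriteLine("[" + naive + "]");             // [ euros]
```

Running the original pattern against `match.Value` again restores the numbering that each replacement string was written for.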
NineBerry