3

I have a (large) template and want to replace multiple values. The replacement needs to be case-insensitive. It must also be possible to pass keys that do not exist in the template.

For example:

[TestMethod]
public void ReplaceMultipleWithIgnoreCaseText()
{
    const string template = "My name is @Name@ and I like to read about @SUBJECT@ on @website@, tag @subject@";
    const string expected = "My name is Alex and I like to read about C# on stackoverflow.com, tag C#";
    var replaceParameters = new List<KeyValuePair<string, string>>
    {
        new KeyValuePair<string, string>("@name@","Alex"),
        new KeyValuePair<string, string>("@subject@","C#"),
        new KeyValuePair<string, string>("@website@","stackoverflow.com"),
        // Note: The next key does not exist in template 
        new KeyValuePair<string, string>("@country@","The Netherlands"), 
    };
    var actual = ReplaceMultiple(template, replaceParameters);
    Assert.AreEqual(expected, actual);
}

public string ReplaceMultiple(
                  string template, 
                  IEnumerable<KeyValuePair<string, string>> replaceParameters)
{
    throw new NotImplementedException(
                  "Implementation needed for many parameters and long text.");
}

Note that if I have a list of 30 parameters and a large template, I do not want 30 large strings in memory. Using a StringBuilder seems to be an option, but other solutions are also welcome.

A solution I tried that did not work

The solution found here (C# String replace with dictionary) throws an exception when a key is not in the collection, but our users make mistakes, and in that case I want to just leave the wrong key in the text. Example:

static readonly Regex re = new Regex(@"\$(\w+)\$", RegexOptions.Compiled);
static void Main2()
{
    // "Name" is accidentally typed by a user as "nam". 
    string input = @"Dear $nam$, as of $date$ your balance is $amount$"; 

    var args = new Dictionary<string, string>(
        StringComparer.OrdinalIgnoreCase)
    {
        {"name", "Mr Smith"},
        {"date", "05 Aug 2009"},
        {"amount", "GBP200"}
    };

    // Works, but not case insensitive and 
    // uses a lot of memory when using a large template
    // ReplaceWithDictionary many args
    string output1 = input;
    foreach (var arg in args)
    {
        output1 = output1.Replace("$" + arg.Key +"$", arg.Value);
    }

    // Throws a KeyNotFoundException + Only works when data is tokenized
    string output2 = re.Replace(input, match => args[match.Groups[1].Value]);
}
Alex Siepman
  • What have you tried? Seems like it's a bad idea to just ask other people to write the code for you (unless you're offering to pay) – mason Aug 29 '14 at 19:19
  • [You might want to check this answer: Regex replacements inside a StringBuilder](http://stackoverflow.com/a/3504888/342740) – Prix Aug 29 '14 at 19:21
  • @mason. I used a string, but that costs too much memory. I find it hard to do a multiple replace using a StringBuilder. I just couldn't get this to work when it should be case-insensitive. – Alex Siepman Aug 29 '14 at 19:24
  • @AlexSiepman [A very good example](http://stackoverflow.com/a/1231815/342740), [another example here](http://stackoverflow.com/a/12007487/342740) and [yet another example over here.](http://stackoverflow.com/a/6524918/342740) Might not be optimal for your situation but would give you some ideas to test and start with. – Prix Aug 29 '14 at 19:45
  • @Prix the "very good example" solves my problem. It is not case-sensitive and it seems to be efficient too. Thank you very much! – Alex Siepman Aug 29 '14 at 20:32
  • Alex Siepman, abandon my "solution"; there is a major bug in it. Will fix tomorrow. – CSharpie Aug 29 '14 at 21:13
  • @CSharpie Nice that you want to fix that bug, because you were the only one with a solution that uses a StringBuilder. Other solutions take much testing to see how memory efficient they are. Memory efficiency is more important than speed in my project. – Alex Siepman Aug 30 '14 at 05:53
  • It's not that trivial though, might take some time – CSharpie Aug 30 '14 at 07:36
  • @CSharpie, I understand. This problem is much harder than it seems when you see it for the first time. Still curious... – Alex Siepman Aug 30 '14 at 07:41
  • Check now. I undeleted my answer. – CSharpie Aug 30 '14 at 08:31
  • @Prix The solution you told me about gave an exception in a real-world scenario. See the extra example I added to the post. – Alex Siepman Aug 30 '14 at 13:20
  • @AlexSiepman of course, it doesn't check for existing keys; you will have to do that yourself. You can easily use `ContainsKey`, for example, and if there is nothing to replace it should give you the string back as is. – Prix Aug 30 '14 at 14:40
  • @Prix, you are right, I did something like that in my own answer. I didn't accept that because it always needs tokens. And the solution of CSharpie is a true replacement for string.Replace(). – Alex Siepman Aug 30 '14 at 15:06

5 Answers

4

Using a StringBuilder seems to be an option, but other solutions are also welcome.

Since you want case insensitive, I'd suggest (non StringBuilder):

public static string ReplaceMultiple(
              string template, 
              IEnumerable<KeyValuePair<string, string>> replaceParameters)
{
    var result = template;

    foreach(var replace in replaceParameters)
    {
        // Escape the key so regex metacharacters in it are matched literally
        var templateSplit = Regex.Split(result, 
                                        Regex.Escape(replace.Key), 
                                        RegexOptions.IgnoreCase);
        result = string.Join(replace.Value, templateSplit);
    }

    return result;
}

DotNetFiddle Example
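
The loop above builds one intermediate string per key. As a rough sketch of how the same idea could be done in a single pass (this is not part of the answer above; `TemplateReplacer` and `ReplaceMultipleSinglePass` are hypothetical names), all keys can be combined into one escaped alternation so the template is scanned only once. Memory use has not been measured here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TemplateReplacer
{
    // Hypothetical helper: replaces every key in one regex pass.
    public static string ReplaceMultipleSinglePass(
        string template,
        IEnumerable<KeyValuePair<string, string>> replaceParameters)
    {
        // Case-insensitive lookup from key to replacement value.
        var lookup = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var pair in replaceParameters)
        {
            if (!string.IsNullOrEmpty(pair.Key))
            {
                lookup[pair.Key] = pair.Value;
            }
        }
        if (lookup.Count == 0)
        {
            return template;
        }

        // Longest keys first, so a key that is a prefix of another key
        // does not shadow the longer one in the alternation.
        var pattern = string.Join("|",
            lookup.Keys.OrderByDescending(k => k.Length)
                       .Select(k => Regex.Escape(k)));

        return Regex.Replace(
            template,
            pattern,
            match =>
            {
                string value;
                // Fall back to the matched text if the lookup disagrees
                // with the regex about case folding.
                return lookup.TryGetValue(match.Value, out value) ? value : match.Value;
            },
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    }
}
```

Keys that never occur in the template simply never match, so they are ignored without any extra check.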

Erik Philips
  • Thanks, but this creates multiple long strings when the template itself is a long string. – Alex Siepman Aug 29 '14 at 20:21
  • Certainly does! There are very few options with immutable strings along with case-insensitivity. Consider MVC does something similar but because it is case-sensitive, the performance suffered is minimal. – Erik Philips Aug 29 '14 at 20:24
  • +1 Because it performed much better than I expected. – Alex Siepman Aug 30 '14 at 15:14
3

This is based on Marc's answer; the only real changes are the check during the replacement and the word-boundary regex rule:

static readonly Regex re = new Regex(@"\b(\w+)\b", RegexOptions.Compiled);
static void Main(string[] args)
{
    string input = @"Dear Name, as of dAte your balance is amounT!";
    var replacements = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
    {
        {"name", "Mr Smith"},
        {"date", "05 Aug 2009"},
        {"amount", "GBP200"}
    };
    string output = re.Replace(input, match => replacements.ContainsKey(match.Groups[1].Value) ? replacements[match.Groups[1].Value] : match.Groups[1].Value);
}

And here is a benchmark over 5000 iterations; I have not looked at memory or anything else.

[Image: test]

"Replacement function" is the one you have marked as the accepted answer.

Prix
  • This might be even better if you use TryGetValue instead of ContainsKey. – CSharpie Aug 30 '14 at 16:25
  • @CSharpie well, you would have to handle the null case for no matches and convert it into a delegate, and since you would have to output the value to a string, you would generate more overhead than using what you already have pre-loaded. Not so sure it would be better. – Prix Aug 30 '14 at 16:29
  • This looks like the ultimate solution! – Alex Siepman Aug 30 '14 at 17:31
  • @AlexSiepman how was it memory wise compared to the other method? – Prix Aug 30 '14 at 17:32
  • It is 2 times slower than the original Marc Gravell solution and uses 3 times more memory. I do not understand why it takes more memory. Also, TryGetValue is actually slower than your solution. I didn't expect this! – Alex Siepman Aug 30 '14 at 18:18
  • @AlexSiepman probably from the `ContainsKey`, and yes, I was expecting it to be slower than the original as it has to verify whether the key exists. So overall it's slightly better in memory management than what you had accepted and a lot better in speed compared to that 40 times slower one. Later on I will play with it some more and see if there is anything that can improve the memory management. Could you post the text you were using, or what was its size? – Prix Aug 30 '14 at 18:37
  • I used a Lorem Ipsum text of 50,000 Unicode chars = 100,000 bytes (above the large object heap threshold). I replaced the words: lorem, ut, ante, magna, curabitur. You can generate the text yourself from: [link](http://www.lipsum.com/) – Alex Siepman Aug 30 '14 at 20:49
  • @AlexSiepman nice, I will use those as a sample ;), ***are you able to read it from a file, or do you have to have the text in memory?*** Just wondering, because you could further reduce memory usage if the data is read piece by piece. – Prix Aug 30 '14 at 21:02
  • For the Stack Overflow test, I read it from a resource file compiled into the application. I have considered reading the template (in the real-world scenario) from a file using a stream. But replacing in a stream is hard to do and makes the code hard to read for future maintenance. – Alex Siepman Aug 31 '14 at 07:58
1

Here is an extension method that I created to solve this:

using System;
using System.Text;

public class Program
{
    public static void Main()
    {
        StringBuilder sb = new StringBuilder("This should show case {SensitiveOrInsensitive} tokenization");

        sb.ReplaceCaseInsensitive("{sensitiveorinsensitive}", "Insensitive");

        Console.WriteLine(sb.ToString());//output: "This should show case Insensitive tokenization"
    }
}

public static class SystemTextExtentionMethods
{
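    public static StringBuilder ReplaceCaseInsensitive(this StringBuilder builder, string oldValue, string newValue)
    {
        if (string.IsNullOrEmpty(oldValue)) return builder;

        // Replace case-insensitively, resuming the search after each replacement
        // so inserted text is never rescanned (this avoids an endless loop
        // when newValue itself contains oldValue).
        int index = builder.ToString().IndexOf(oldValue, 0, StringComparison.OrdinalIgnoreCase);
        while (index != -1)
        {
            builder.Remove(index, oldValue.Length).Insert(index, newValue);
            int startIndex = index + (newValue ?? string.Empty).Length;
            index = builder.ToString().IndexOf(oldValue, startIndex, StringComparison.OrdinalIgnoreCase);
        }
        return builder;
    }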
}
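
The extension method calls `ToString()` on the builder for every search, which copies the whole text each time. A possible alternative, sketched here under the assumption that the template is already available as a single string, is to scan that string once and append the pieces to a `StringBuilder`. `SinglePassReplacer` is a hypothetical name, and the memory behaviour has not been measured.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class SinglePassReplacer
{
    // Hypothetical helper: copies the template into one StringBuilder,
    // substituting keys as they are found from left to right.
    public static string ReplaceMultiple(
        string template,
        IEnumerable<KeyValuePair<string, string>> replaceParameters)
    {
        var pairs = replaceParameters
            .Where(p => !string.IsNullOrEmpty(p.Key))
            .ToList();
        var result = new StringBuilder(template.Length);
        int position = 0;

        while (position < template.Length)
        {
            // Find the key whose next occurrence is closest to 'position'.
            int bestIndex = -1;
            var best = default(KeyValuePair<string, string>);
            foreach (var pair in pairs)
            {
                int index = template.IndexOf(
                    pair.Key, position, StringComparison.OrdinalIgnoreCase);
                if (index >= 0 && (bestIndex < 0 || index < bestIndex))
                {
                    bestIndex = index;
                    best = pair;
                }
            }
            if (bestIndex < 0)
            {
                break; // no key occurs in the rest of the template
            }

            // Copy the text before the match, then the replacement value.
            result.Append(template, position, bestIndex - position);
            result.Append(best.Value);
            position = bestIndex + best.Key.Length;
        }

        // Copy whatever is left after the last match.
        result.Append(template, position, template.Length - position);
        return result.ToString();
    }
}
```

This searches for every key again after each match, so it trades speed for a single output buffer; with many keys and many matches, a regex-based approach may well be faster.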
0

I think I might have something you could try. I used something similar to it for email templates

    public string replace()
    {
        string appPath = Request.PhysicalApplicationPath;
        string template;

        // Dispose the reader as soon as the template has been read
        using (StreamReader sr = new StreamReader(appPath + "EmailTemplates/NewMember.txt"))
        {
            template = sr.ReadToEnd();
        }

        template = template.Replace("<%Client_Name%>",
            first_name.Text + " " + middle_initial.Text + " " + last_name.Text);

        //Add Customer data
        template = template.Replace("<%Client_First_Name%>", first_name.Text);
        template = template.Replace("<%Client_MI%>", middle_initial.Text);
        template = template.Replace("<%Client_Last_Name%>", last_name.Text);
        template = template.Replace("<%Client_DOB%>", dob.Text);

        return template;

    }

Inside your template you can have tags such as <% %> as placeholders for the values you want.

Hope this helps!

waltmagic
0

The answer of Marc Gravell (C# String replace with dictionary) can be changed a little bit so it does not throw an exception when the match cannot be found. In that case it simply does not replace the match.

If the strings to be replaced are tokenized, this is the solution:

static readonly Regex RegExInstance = new Regex(@"\$(\w+)\$", RegexOptions.Compiled);
public string ReplaceWithRegEx(string template, Dictionary<string, string> parameters)
{
    return RegExInstance.Replace(template, match => GetNewValue(parameters, match));
}

private string GetNewValue(Dictionary<string, string> parameters, Match match)
{
    var oldValue = match.Groups[1].Value;
    string newValue;
    var found = parameters.TryGetValue(oldValue, out newValue);
    if (found)
    {
        return newValue;
    }
    var originalValue = match.Groups[0].Value;
    return originalValue;
}

I have tested the solution with a 100,000-byte string, 7 keys and hundreds of replacements. It uses 7 times more memory than the length of the string, and it took only 0.002 seconds.
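
Whether the lookup is case-insensitive depends on how the dictionary is constructed, not on the regex. A minimal sketch of that, assuming the same `$token$` format (`TokenReplacer` and `Demo` are hypothetical names used only for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenReplacer
{
    static readonly Regex TokenRegex =
        new Regex(@"\$(\w+)\$", RegexOptions.Compiled);

    public static string Replace(
        string template, IDictionary<string, string> parameters)
    {
        return TokenRegex.Replace(template, match =>
        {
            string value;
            // Unknown tokens are returned unchanged, including the $ signs.
            return parameters.TryGetValue(match.Groups[1].Value, out value)
                ? value
                : match.Value;
        });
    }

    public static string Demo()
    {
        // The comparer makes "NAME", "Name" and "name" the same key.
        var parameters = new Dictionary<string, string>(
            StringComparer.OrdinalIgnoreCase)
        {
            {"name", "Mr Smith"},
            {"date", "05 Aug 2009"}
        };
        return Replace(
            "Dear $NAME$, as of $Date$ your balance is $amount$", parameters);
    }
}
```

With this setup, `Demo()` should leave `$amount$` in place because no such key was supplied, while `$NAME$` and `$Date$` are replaced regardless of their casing.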

Alex Siepman