I'm using a regex to replace values within some HTML. It correctly matches every instance, but when I use Regex.Replace() with backreferences, the backreferences are not substituted.

For example:

html = "<td>[element]elementreference='oldvalue';[/element]</td>";

html = Regex.Replace(html, @"(['""#(=])" + elementReference.Key + @"(['""#)];|&)", "$1" + elementReference.Value + "$2", RegexOptions.IgnoreCase);

results in:

"<td>[element]elementreference=$1newvalue'[/element]</td>"

but if I use

html = "<td>[element]elementreference='oldvalue';[/element]</td>";

var regex = new Regex(@"(['""#(=])" + elementReference.Key + @"(['""#)];|&)", RegexOptions.IgnoreCase);
foreach (Match match in regex.Matches(html))
{
    html = html.Replace(match.Value, match.Groups[1] + elementReference.Value + match.Groups[2]);
}

the result is

"<td>[element]elementreference='newvalue'[/element]</td>"

which is what I expected.

Can anyone explain why using Regex.Replace() did not work?

EDIT

I am not attempting to replace the inner HTML. I am attempting to replace the 'oldvalue' part of [element]elementreference='oldvalue'[/element], which just happens to be inside an HTML tag. My problem is that I am trying to re-insert the apostrophe around the text by using a backreference. That delimiter could be any of several characters, which is why I am using a backreference.

Lisa Young

1 Answer

When I try your code, neither version replaces anything, because there is no semicolon after the value that you are trying to replace.

If you remove the semicolon from the regular expression, both work:

html = Regex.Replace(html, @"(['""#(=])" + "oldvalue" + @"(['""#)]|&)", "$1" + "asdf" + "$2", RegexOptions.IgnoreCase);

does the same as:

var regex = new Regex(@"(['""#(=])" + "oldvalue" + @"(['""#)]|&)", RegexOptions.IgnoreCase);
foreach (Match match in regex.Matches(html))
{
    html = html.Replace(match.Value, match.Groups[1] + "asdf" + match.Groups[2]);
}

Edit:

When I try the updated code from the question, it works fine:

string html;
KeyValuePair<string, string> elementReference = new KeyValuePair<string, string>("oldvalue", "newvalue");

html = "<td>[element]elementreference='oldvalue';[/element]</td>";

html = Regex.Replace(html, @"(['""#(=])" + elementReference.Key + @"(['""#)];|&)", "$1" + elementReference.Value + "$2", RegexOptions.IgnoreCase);

Console.WriteLine(html);

html = "<td>[element]elementreference='oldvalue';[/element]</td>";

var regex = new Regex(@"(['""#(=])" + elementReference.Key + @"(['""#)];|&)", RegexOptions.IgnoreCase);
foreach (Match match in regex.Matches(html)) {
  html = html.Replace(match.Value, match.Groups[1] + elementReference.Value + match.Groups[2]);
}

Console.WriteLine(html);

Output:

<td>[element]elementreference='newvalue';[/element]</td>
<td>[element]elementreference='newvalue';[/element]</td>
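
For what it's worth, here is one scenario in which Regex.Replace() can leave a literal $1 in the output. This is only a guess, since nothing in the question confirms it: if the real replacement value begins with a digit, then "$1" + value is read as a reference to a larger group number that does not exist. The sketch below uses a hypothetical value "1newvalue" to show this, along with the ${1} form that ends the group number explicitly. The Regex.Escape() call and the "$$" escaping are extra precautions I added, not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

class BackreferenceDemo
{
    static void Main()
    {
        string html = "<td>[element]elementreference='oldvalue';[/element]</td>";
        string key = "oldvalue";     // stands in for elementReference.Key
        string value = "1newvalue";  // hypothetical value that starts with a digit

        // Escape the key so that any regex metacharacters in it are matched literally.
        string pattern = @"(['""#(=])" + Regex.Escape(key) + @"(['""#)];|&)";

        // "$1" + "1newvalue" yields "$11newvalue". The pattern has no group 11,
        // so the "$11" is not substituted and stays in the output as text.
        string ambiguous = Regex.Replace(html, pattern, "$1" + value + "$2", RegexOptions.IgnoreCase);
        Console.WriteLine(ambiguous);

        // "${1}" delimits the group number, so a following digit cannot extend it.
        // Doubling '$' in the value keeps it from being read as a substitution.
        string safe = Regex.Replace(html, pattern, "${1}" + value.Replace("$", "$$") + "${2}", RegexOptions.IgnoreCase);
        Console.WriteLine(safe);
    }
}
```

If the actual value never starts with a digit, this does not apply and the original code should behave as shown above.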
Guffa
  • I missed the semicolon when editing the line in my question. The regex needs the semicolon. – Lisa Young Mar 01 '13 at 16:06
  • @LisaYoung: If you add the semicolon to the original value, then the code works just fine as it is. – Guffa Mar 01 '13 at 16:18
  • Every time I've run the code using Regex.Replace(), the output contains $1 instead of the backreference. – Lisa Young Mar 01 '13 at 16:38
  • @LisaYoung: I can't reproduce that. When I try the code, pasted directly from the question with no modifications at all, it works fine. – Guffa Mar 01 '13 at 16:50