I'm trying to replace the <header> element inside some HTML with another string. My HTML looks like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>aboutus</title>
<header id="headerfasdfasdfasdf">
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer pulvinar commodo lorem, sit amet malesuada.</p>
</header>
<!-- #include virtual="/html/US/global_header.html" --><script type="text/javascript">
var header = document.getElementsByTagName("header");
var len = header.length
if(len > 1)
{
header[0].style.display = "none";
}
</script>
<!--ls:begin[component-1400226725207]-->
<!-- OTHER PART IS CUT FOR BREVITY -->
</html>
I tried to match it with the regex <header(.|\n|\r)*<\/header>, but it runs really slowly until I remove the |\r part from it.
I have also noticed that the original regex works fine with HTML that doesn't contain comments like <!--ls:begin[component-1400226725207]-->.
Note that I'm using the .NET regex engine with C#, and my replace code looks like this:
var regex = @"<header(.|\n|\r)*<\/header>";
var result = Regex.Replace(input, regex, to, RegexOptions.IgnoreCase);
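For comparison, here is a minimal self-contained sketch of a variant without the alternation. It assumes RegexOptions.Singleline (so that . also matches newlines) plus a lazy quantifier; note that, unlike my greedy pattern, it stops at the first </header>. The ReplaceHeader helper, the sample input, the replacement string, and the 2-second timeout are all hypothetical and not part of my real code:

```csharp
using System;
using System.Text.RegularExpressions;

static class HeaderReplaceSketch
{
    // Hypothetical helper: replaces the first <header>...</header> block in input.
    public static string ReplaceHeader(string input, string to)
    {
        // Singleline lets '.' match '\n' as well, so the (.|\n|\r) group is not needed.
        // '*?' is lazy, so the match ends at the first closing </header> tag.
        var regex = new Regex(
            @"<header\b.*?</header>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline,
            TimeSpan.FromSeconds(2)); // assumed timeout; throws RegexMatchTimeoutException if exceeded

        // Replace only the first match (count = 1).
        return regex.Replace(input, to, 1);
    }

    static void Main()
    {
        // Shortened, made-up stand-in for the real page.
        var input = "<html><header id=\"h\">\r\n<p>Lorem ipsum</p>\r\n</header>\r\n"
                  + "<!--ls:begin[component-1400226725207]-->\r\n</html>";

        Console.WriteLine(ReplaceHeader(input, "<!-- replaced -->"));
    }
}
```

This only shows the form of pattern I am comparing against. It does not explain why the original pattern is slow.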
Please help me understand why I have this issue.