How can I replace the HTML inside a pre tag? I would prefer to do that with a regex.
<html>
<head></head>
<body>
<div>
<pre>
<html>
<body>
-----> hello! ----<
</body>
</html
</pre>
</div>
</body>
EDIT: As indicated by another answer, regular expressions cannot fully parse HTML or XHTML, so you will be better off using an HTML parser instead. I'm leaving my answer here for reference though.
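To make that limitation concrete, here is a minimal sketch in Python (chosen only because it is easy to run; the thread itself uses PHP and C#). It applies the same kind of pattern used in this answer to a hypothetical input whose attribute value contains a ">" character. The helper name pre_bodies and both sample strings are assumptions made up for illustration.

```python
import re

# Same style of pattern as in this answer; re.DOTALL plays the role of the "s" modifier.
PRE_BLOCK = re.compile(r"<pre[^<>]*>(.*?)</pre>", re.DOTALL)

def pre_bodies(text):
    """Return what the regex believes is the content of each <pre> element."""
    return PRE_BLOCK.findall(text)

# Well-behaved input: the body is extracted as expected.
print(pre_bodies('<pre class="x">hello</pre>'))

# Hypothetical input: ">" inside a quoted attribute value is allowed in HTML,
# but the regex ends the opening tag there and leaks attribute text into the body.
print(pre_bodies('<pre title="a > b">hello</pre>'))
```

The second call reports ' b">hello' as the content, which is the kind of silent mismatch an HTML parser would avoid.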
What do you want to replace the content inside the pre-tags with?
I'm not familiar with the specific C# syntax, but provided C# uses Perl-style regexes, the following PHP-snippet might be helpful. The code below will replace the content inside the pre-tags with the string "(pre tag content was here)" (just tested with the command line PHP client):
<?php
$html = "<html><head></head><body><div><pre class=\"some-css-class\">
<html><body>
-----> hello! ----<
</body></html
</pre></div></body>"; // Compacting things here, for brevity
$newHTML = preg_replace("/(.*?)<pre[^<>]*>(.*?)<\/pre>(.*)/us", "$1(pre tag content was here)$3", $html);
echo $newHTML;
?>
The ? after .* makes the match non-greedy (it stops at the first occurrence of what comes after). The u and s modifiers enable Unicode (UTF-8) support and single-line mode, respectively. The latter is important because it makes . match newlines as well. Note that the Unicode modifier is the lowercase u; in PHP an uppercase U would instead invert the greediness of the quantifiers. The [^<>]* part supports attributes in the pre tag, such as <pre class="some-css-class"> (it matches any number of characters except < or >).
UPDATE: As indicated by Martinho Fernandes in the comments below, the C# syntax for the above regex should be something like:
new Regex(@"(.*?)<pre[^<>]*>(.*?)<\/pre>(.*)", RegexOptions.Singleline)
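For experimenting with the pattern outside PHP or C#, the following is a rough Python sketch of the same idea, not a port of either snippet. It assumes you want to keep the pre tags themselves and replace only what is between them (the PHP version above drops the tags as well), and it replaces every pre block rather than just the first. The function name replace_pre_content is made up for this example.

```python
import re

# Sample input mirroring the question (the inner HTML is deliberately malformed).
html = (
    '<html><head></head><body><div><pre class="some-css-class">\n'
    "<html><body>\n"
    "-----> hello! ----<\n"
    "</body></html\n"
    "</pre></div></body>"
)

# [^<>]* allows attributes on the opening tag; .*? is non-greedy, so it stops
# at the first </pre>; re.DOTALL lets . match newlines.
PRE_BLOCK = re.compile(r"(<pre[^<>]*>).*?(</pre>)", re.DOTALL)

def replace_pre_content(text, replacement):
    """Replace the body of every <pre>...</pre> block, keeping the tags."""
    # A function replacement is inserted literally, so backslashes or
    # group references in `replacement` are not interpreted.
    return PRE_BLOCK.sub(lambda m: m.group(1) + replacement + m.group(2), text)

print(replace_pre_content(html, "(pre tag content was here)"))
```

Like the original pattern, this is only suitable for simple, predictable markup; the caveat about using an HTML parser still applies.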
foo`. Don't know if it matters for the OP, though. – R. Martinho Fernandes Feb 16 '11 at 12:39
FAIL"">`. Seriously, [stop trying](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). – R. Martinho Fernandes Feb 16 '11 at 12:53
]*>(.*?)<\/pre>(.*)", RegexOptions.SingleLine)` (.NET regexes/strings are Unicode already). – R. Martinho Fernandes Feb 16 '11 at 14:35
Thank you martinho fernandes