0

Consider this input string:

 <h1 class="h1class"><span style="font-weight: bold;">abc</span></h1>

I want a regex to remove style="font-weight: bold;" from the above string.

I tried the regex style=[\s\w\W]*" , but the " at the end of the expression is not accepted. I also cannot use \W, because then the rest of the line gets selected (style="font-weight: bold;">abc</span></h1>), whereas I only want style="font-weight: bold;".

Can anyone help me get the expected result?

Thanks in advance!

  • Use a negated character class, something like `[^"]*?"`. I would also make your quantifier lazy using `?`, so that the first occurrence of `"` gets matched. – bro Aug 12 '15 at 11:37
  • For all but the simplest strings a parser like Agility Pack is probably a better approach; http://stackoverflow.com/questions/13441470/htmlagilitypack-remove-script-and-style – Alex K. Aug 12 '15 at 11:39
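
A minimal sketch of the negated character class suggested in the first comment, applied to the question's input. The leading `\s*` is an addition here (not from the comment) so that the space before the attribute is removed as well. The pattern assumes double-quoted attribute values and would also match inside a longer attribute name ending in `style`, so treat it as an illustration rather than a general HTML solution.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        string input = @"<h1 class=""h1class""><span style=""font-weight: bold;"">abc</span></h1>";

        // [^"]* matches any run of characters except a double quote,
        // so the match cannot run past the attribute's closing quote.
        // \s* (an assumption added for this sketch) also removes the preceding space.
        string output = Regex.Replace(input, @"\s*style=""[^""]*""", "");

        Console.WriteLine(output); // <h1 class="h1class"><span>abc</span></h1>
    }
}
```

Because `[^"]*` cannot cross a quote, the quantifier does not need to be lazy here; greedy and lazy versions match the same text.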

4 Answers

2

I want a regex to remove style="font-weight: bold;" from the above string.

Why would you want to use regular expressions for a fixed string replacement? Is String.Replace not enough for you?

// Strings are immutable: Replace returns a new string, so keep the result.
string output = input.Replace(@"style=""font-weight: bold;""", "");

That being said, you really should not work on HTML with string methods. Use a parser for any work that is even slightly more complex than the above.

Tomalak
0
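string input = @"<h1 class=""h1class""><span style=""font-weight: bold; "">abc</span></h1>";
// Matches style=" and then the shortest run of characters (lazy .+?) up to the next ".
// The space before the attribute is not part of the match, so the result contains <span >.
var output = Regex.Replace(input, @"style=\"".+?\""", "");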
Eser
  • While this code may answer the question, it would be better to include some context and explain how it works. Especially since regexes are not very human-readable. – ryanyuyu Aug 12 '15 at 13:13
0

Here is how to achieve the desired result using HtmlAgilityPack:

var html= "<h1 class=\"h1class\"><span style=\"font-weight: bold;\">abc</span></h1>";
HtmlAgilityPack.HtmlDocument hap = new HtmlAgilityPack.HtmlDocument();
hap.LoadHtml(html);
var nodes = hap.DocumentNode.Descendants("span");
if (nodes != null)
    foreach (var node in nodes)
       if (!string.IsNullOrEmpty((node.GetAttributeValue("style", string.Empty))))
           node.Attributes["style"].Remove();
Console.WriteLine(hap.DocumentNode.OuterHtml);

Output:

<h1 class="h1class"><span>abc</span></h1>

You can further adjust this according to your requirements.

Wiktor Stribiżew
0

How about this Regex:

<([^>]*)(\sstyle=\".+?\"(\s|))(.*?)>

Replace pattern:

<$1$3>
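
A rough sketch of how this pattern and replacement could be applied with `Regex.Replace` in C#. The second input, with an `id` attribute after `style`, is a hypothetical example that is not in the question. It suggests that `<$1$3>` discards whatever group 4 captured, and that `<$1$3$4>` may be the safer replacement when other attributes can follow.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // The pattern from this answer, written as a C# verbatim string.
        const string pattern = @"<([^>]*)(\sstyle="".+?""(\s|))(.*?)>";

        // Input from the question: nothing follows the style attribute.
        string simple = @"<h1 class=""h1class""><span style=""font-weight: bold;"">abc</span></h1>";
        Console.WriteLine(Regex.Replace(simple, pattern, "<$1$3>"));
        // <h1 class="h1class"><span>abc</span></h1>

        // Hypothetical input: an attribute follows style and is captured by group 4.
        string extra = @"<span style=""font-weight: bold;"" id=""x"">abc</span>";
        Console.WriteLine(Regex.Replace(extra, pattern, "<$1$3>"));   // <span >abc</span>
        Console.WriteLine(Regex.Replace(extra, pattern, "<$1$3$4>")); // <span id="x">abc</span>
    }
}
```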

See this Match.Replace demo on systemtextregularexpressions.com.


GRUNGER