I have some HTML like the following:

<img src="/web/20110208042711im_/http://coralifeaqualight.com/wp-content/themes/xtheme/images/coralife-aqualight-pro.png" alt="">

I want to use a regex to remove any HTML attributes that have an empty value, in this case alt="". I cannot figure out how to match an attribute name that is preceded by a space and followed by ="", which would do it for me. Does anyone know how?

Aaron Gibson

2 Answers

Parsing HTML with regex is generally considered a bad idea because there are too many edge cases. Read for yourself: http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html

A favored solution is the HTML Agility Pack, an HTML parser library for .NET.

Also see this Stack Overflow question about using regex for HTML: here
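To illustrate the parser-based route, here is a minimal sketch. It does not use HTML Agility Pack (a .NET library); it swaps in Python's standard-library html.parser purely to show the idea. The names EmptyAttrStripper and strip_empty_attributes_with_parser are hypothetical, made up for this example. Note that re-emitting the markup normalizes it: tag names come out lowercased and attribute values are re-quoted with double quotes.

```python
from html import escape
from html.parser import HTMLParser


class EmptyAttrStripper(HTMLParser):
    """Re-emits HTML while dropping attributes whose value is the empty string."""

    def __init__(self):
        # Keep entity and character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def _open_tag(self, tag, attrs, end=">"):
        rendered = []
        for name, value in attrs:
            if value == "":
                # Empty value such as alt="": drop the attribute.
                continue
            if value is None:
                # Bare attribute such as `disabled`: keep it as-is.
                rendered.append(name)
            else:
                rendered.append(f'{name}="{escape(value, quote=True)}"')
        self.parts.append("<" + " ".join([tag] + rendered) + end)

    def handle_starttag(self, tag, attrs):
        self._open_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._open_tag(tag, attrs, end=" />")

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")

    def handle_comment(self, data):
        self.parts.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.parts.append(f"<!{decl}>")


def strip_empty_attributes_with_parser(html_text: str) -> str:
    """Parse html_text and return it with empty-valued attributes removed."""
    parser = EmptyAttrStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)


print(strip_empty_attributes_with_parser('<img src="a.png" alt="">'))
```

Because the parser decides what is an attribute and what is text, content that merely looks like name="" inside a text node or another attribute's value is left alone, which is the main advantage over a regex.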

The Muffin Man

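I'm no regex genius, but I believe Regex.Replace(html, @"\s\w+=""""", String.Empty) would do it, if you've got that whole tag in a string (called html here). Note that String.Replace will not work for this, because it matches literal text rather than a regex pattern. Also, \w+ does not match attribute names that contain a hyphen, such as data-foo; [\w-]+ would cover those.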
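As a rough illustration of the same pattern, here is a small Python sketch using the standard re module. The names EMPTY_ATTR and strip_empty_attributes are hypothetical, and the pattern is slightly widened (assumption: hyphens and colons are allowed in attribute names, and optional whitespace around the equals sign). It only handles double-quoted empty values and assumes the input is a single tag string.

```python
import re

# Whitespace, an attribute name, then ="" (an empty double-quoted value).
EMPTY_ATTR = re.compile(r'\s+[\w:-]+\s*=\s*""')


def strip_empty_attributes(tag: str) -> str:
    """Remove attributes with an empty double-quoted value from a tag string."""
    return EMPTY_ATTR.sub("", tag)


tag = ('<img src="/web/20110208042711im_/http://coralifeaqualight.com/'
       'wp-content/themes/xtheme/images/coralife-aqualight-pro.png" alt="">')
print(strip_empty_attributes(tag))
```

This is fine for a single, well-formed tag like the one in the question. It will also remove anything that merely looks like name="" inside text or inside another attribute's value, which is one of the edge cases that makes regex risky on whole documents.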

RedBrogdon