
I used <a[^>]*?title="([^"]*?)"[^>]*?> and found all links that have a title attribute. How can I find all links that have no title attribute, or an empty one? And how can I find all images whose alt attribute is empty or missing?

Jared Farrish
    What happens if I define the title with the use of an SGML parsed entity? Huh? Huh? (HTML is far nastier than it appears to be at first glance, as Fredrik points out. **Use a proper parser.** We mean it.) – Donal Fellows May 27 '11 at 20:39

2 Answers


See this classic post on SO: "RegEx match open tags except XHTML self-contained tags".

Community
Fredrik Pihl

@Fredrik has you covered pretty well, I think, but here's an alternate general method for this kind of sophisticated find/replace in markup.

Since I'm no regex guru, I like to use jQuery + browser debugger tools + copy/pasting for this kind of thing. I view the page in Firefox (Chrome's dev tools work great, too), open up the Firebug console, and do the work with jQuery, something like this:

$('a').each(function () {
  var $link = $(this);
  if (!$link.is('[title]')) {
    // no title attribute at all
  } else if ($link.attr('title') === '') {
    // title attribute is present but empty
  }
  // etc.
});

// repeat pattern for imgs...
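
For the images, the same three-way check (missing, empty, present) can be sketched with the plain DOM API instead of jQuery. This is only a rough sketch: classifyAttr and groupByAttr are hypothetical helper names, not part of any library, and treating a whitespace-only value as empty is my assumption.

```javascript
// Hypothetical helper: classify one attribute on an element.
// Works with anything exposing hasAttribute/getAttribute, as real DOM elements do.
function classifyAttr(el, name) {
  if (!el.hasAttribute(name)) return 'missing';
  // Assumption: a whitespace-only value counts as empty.
  return el.getAttribute(name).trim() === '' ? 'empty' : 'present';
}

// Hypothetical helper: group elements by the state of the given attribute.
function groupByAttr(elements, name) {
  const groups = { missing: [], empty: [], present: [] };
  for (const el of elements) {
    groups[classifyAttr(el, name)].push(el);
  }
  return groups;
}

// Only runs where a DOM exists, e.g. pasted into the browser console.
if (typeof document !== 'undefined') {
  console.log(groupByAttr(document.querySelectorAll('img'), 'alt'));
  console.log(groupByAttr(document.querySelectorAll('a'), 'title'));
}
```

The missing and empty arrays hold the actual elements, so you can expand them in the console and jump to each one in the inspector.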

When you're done with your manipulations, copy the relevant section from the debugger (or just grab the whole <body>) and paste it back into your editor.

I find this method much easier to understand than regexes, but that's just because I'm not too bright. HTH.

peteorpeter