I used <a[^>]*?title=\"([^\"]*?)\"[^>]*?>
and found all links that have a title attribute. How can I find all links with no title attribute, or with an empty one? And how can I find all images whose alt attribute is empty or missing?

What happens if I define the title with the use of an SGML parsed entity? Huh? Huh? (HTML is far nastier than it appears to be at first glance, as Fredrik points out. **Use a proper parser.** We mean it.) – Donal Fellows May 27 '11 at 20:39
2 Answers
See this classic post on SO: "RegEx match open tags except XHTML self-contained tags".

Yikes! And what if the structure changes or goes multiline? Go for an HTML parser before you hurt yourself – Fredrik Pihl May 27 '11 at 19:55
@Fredrik has you covered pretty well, I think, but here's an alternate general method for this kind of sophisticated find/replace in markup.
Since I'm no regex guru, I like to use jQuery + browser debugger tools + copy/pasting for this kind of thing. I view the page in Firefox (Chrome's dev tools work great, too), open up the Firebug console, and run some jQuery against the page, something like this:
$('a').each(function () {
    var $link = $(this);
    if (!$link.is('[title]')) {
        // no title attribute at all
    } else if ($link.attr('title') === '') {
        // title attribute is present but empty
    }
    // etc.
});
// repeat the pattern for imgs...
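For the images (and as a jQuery-free variant of the loop above), here is a minimal sketch of the same check in plain JavaScript. The helper names `attrState` and `findLacking` are made up for illustration, and treating a whitespace-only value as empty is my own assumption; drop the `trim()` if you only care about a truly empty string.

```javascript
// Classify an element by the state of one attribute.
// Relies only on the standard DOM Element methods
// hasAttribute() and getAttribute().
function attrState(el, name) {
  if (!el.hasAttribute(name)) {
    return 'missing';
  }
  // Assumption: a value made only of whitespace counts as empty.
  return el.getAttribute(name).trim() === '' ? 'empty' : 'present';
}

// Hypothetical helper: keep the elements whose attribute is missing or empty.
function findLacking(elements, name) {
  return Array.from(elements).filter(function (el) {
    return attrState(el, name) !== 'present';
  });
}

// This part only runs where `document` exists, e.g. in the browser console.
if (typeof document !== 'undefined') {
  console.log(findLacking(document.querySelectorAll('img'), 'alt'));
  console.log(findLacking(document.querySelectorAll('a'), 'title'));
}
```

Pasted into the console, this logs the images with no usable alt text and the links with no usable title, so you can inspect or fix them from there.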
When you're done with your manipulations, copy the relevant section from the debugger (or just grab the whole <body>) and paste it back into your editor.
I find this method much easier to understand than regexes, but that's just because I'm not too bright. HTH.
