
I have the following string:

<img alt="over 40 world famous brandedWATCHES BRANDs to choose from
" src="http://www.fastblings.com/images/logo.jpg"></strong></a><br>

I want to define a regex pattern like <img alt="(.+?)" src="http://(.+?).(jpg|gif)">, but as you can see, the target string has a line break in the alt attribute. How can I incorporate this? The rule should be "anything in the alt attribute, including line breaks".

Mark Rushakoff
Fuxi
  • 2
    **DO NOT PARSE HTML USING Regular Expressions**! http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – SLaks Apr 07 '10 at 23:51
  • 1
    Ah, the `question` tag. In case we get some questions that aren't questions... – SLaks Apr 07 '10 at 23:51
  • The answer you selected is in fact incorrect. See my answer. – cletus Apr 08 '10 at 00:20

1 Answer


By default, the . wildcard does not match line terminators (\n, \r). Many other languages have a DOTALL mode (sometimes called single-line mode) that makes . match anything. JavaScript did not have one when this was written; ES2018 later added the s (dotAll) flag. For an equivalent that works everywhere, use [\s\S], which means "any character that is whitespace or is not whitespace", i.e. any character at all, so:

/<img alt="([\s\S]+?)" src="http:\/\/(.+?)\.(jpg|gif)">/

See Javascript regex multiline flag doesn’t work.

I also escaped the . before jpg|gif; unescaped, it would match any character rather than the literal . that you intend.
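As a quick check of the point above, here is a minimal sketch that runs both patterns against the string from the question. The variable names (html, dotPattern, anyCharPattern) are illustrative only, and the line break is assumed to be a single \n.

```javascript
// Sample input from the question; the alt attribute contains a line break
// (assumed here to be a single "\n").
const html =
  '<img alt="over 40 world famous brandedWATCHES BRANDs to choose from\n' +
  '" src="http://www.fastblings.com/images/logo.jpg"></strong></a><br>';

// "." does not match the newline, so this pattern cannot get past the alt text.
const dotPattern = /<img alt="(.+?)" src="http:\/\/(.+?)\.(jpg|gif)">/;

// [\s\S] matches any character, newlines included.
const anyCharPattern = /<img alt="([\s\S]+?)" src="http:\/\/(.+?)\.(jpg|gif)">/;

const dotMatch = html.match(dotPattern);
const anyCharMatch = html.match(anyCharPattern);

console.log(dotMatch);                         // null
console.log(JSON.stringify(anyCharMatch[1]));  // alt text, ending in "\n"
console.log(anyCharMatch[2]);                  // www.fastblings.com/images/logo
console.log(anyCharMatch[3]);                  // jpg
```

Note that the captured alt text keeps the trailing newline, so you may want to call trim() on it before using it.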

That being said, parsing HTML with regexes is a really bad idea. What's more, unless there is relevant detail missing from your question, you can do this easily with jQuery attribute selectors:

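// Attribute selectors do not accept regexes; ^= matches a prefix and $= a suffix.
$("img[src^='http://'][src$='.jpg'], img[src^='http://'][src$='.gif']").each(function() {
  var alt = $(this).attr("alt");
  var src = $(this).attr("src");
  ...
});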

Or if you want there to be an alt attribute:

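// [alt] requires the attribute to be present; ^= and $= match the src prefix and suffix.
$("img[alt][src^='http://'][src$='.jpg'], img[alt][src^='http://'][src$='.gif']").each(function() {
  var alt = $(this).attr("alt");
  var src = $(this).attr("src");
  ...
});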
Community
cletus
  • Deleted my comment as I thought it works nevertheless as the accepted answer uses the same approach. Anyway good to hear that I didn't miss some obvious point ;) Btw +1 from me. – Felix Kling Apr 08 '10 at 00:13