To match plainly on the img src, you can do:
\<img src\=\"(\w+\.(gif|jpg|png))\"
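As a rough illustration, here is how that first pattern could be tried out in Python. This is only a sketch: the variable names are mine, the backslashes that Python does not need are dropped, and the closing quote sits outside the capture group so that group 1 holds just the filename.

```python
import re

# First pattern from the answer, written for Python's re module.
# Assumption: the closing quote is kept outside group 1 so the captured
# value is the bare filename.
IMG_SRC = re.compile(r'<img src="(\w+\.(gif|jpg|png))"')

html = '<a href="myfile.htm"><img src="picture.gif"></a>'

m = IMG_SRC.search(html)
whole_match = m.group(0)  # the tag prefix plus the quoted value
filename = m.group(1)     # only the filename
extension = m.group(2)    # only the extension

print(whole_match)
print(filename, extension)
```

Group 0 is the whole match, so if you only care about the filename you would read group 1.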
And then if you only want the value that's in the img src, you can match any filename ending in a picture extension (but this may get you false positives, since it does not check where the filename appears):
\w+\.(gif|jpg|png)
But to match just the value while skipping anything that comes before the img src, you can use a negative lookahead, which rejects any position that still has <img src=" ahead of it on the same line (note that I added a capturing group there):
(?!.*\<img src\=\")(\w+\.(gif|jpg|png))
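To see what the lookahead actually buys you, here is a small Python sketch that runs it next to a positive lookbehind. The lookbehind pattern is my own comparison and an assumption, not something this answer relies on, and the helper name first is made up for the example.

```python
import re

# The negative-lookahead pattern from the answer (unneeded escapes dropped).
LOOKAHEAD = re.compile(r'(?!.*<img src=")(\w+\.(gif|jpg|png))')

# Hypothetical alternative for comparison: a positive lookbehind that
# requires <img src=" immediately before the value.
LOOKBEHIND = re.compile(r'(?<=<img src=")(\w+\.(gif|jpg|png))')

plain = '<a href="index.htm"><img src="pic859.jpg"></a>'
extra_attrs = '<img id="test1" class="answer1" src="text.jpg">'


def first(pattern, text):
    """Return group 1 of the first match, or None when nothing matches."""
    m = pattern.search(text)
    return m.group(1) if m else None


ahead_plain = first(LOOKAHEAD, plain)
behind_plain = first(LOOKBEHIND, plain)
ahead_extra = first(LOOKAHEAD, extra_attrs)
behind_extra = first(LOOKBEHIND, extra_attrs)

print(ahead_plain, behind_plain)
print(ahead_extra, behind_extra)
```

Both patterns agree on the plain tag. On the tag with extra attributes, the lookahead still returns the filename because no <img src=" appears anywhere on that line, while the lookbehind returns nothing. Which behavior you want depends on your data.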
Now to allow for paths and full URLs in your image source:
(?!.*\<img src\=\")([\/\.\-\:\w]+\.(gif|jpg|png)?[\?\w+\%]+)
And then let's remove the false positives we get by fixing that optional quantifier (the ?) after (gif|jpg|png), moving it to after the next set (which matches data you may get in a JS link, etc.), and making sure we have an end quote:
(?!.*\<img src\=\")([\/\.\-\:\w]+\.(gif|jpg|png)([\?\w+\%]+)?)(?=\")
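A quick way to check that last pattern is to run it over the sample data from this answer. The sketch below does that in Python; the pattern is split across several string pieces only so each part can carry a comment, and the names FINAL, sample and sources are just for the example.

```python
import re

FINAL = re.compile(
    r'(?!.*<img src=")'            # no further <img src=" later on this line
    r'([/.\-:\w]+\.(gif|jpg|png)'  # path or URL ending in an image extension
    r'([?\w+%]+)?)'                # optional trailing query-string characters
    r'(?=")'                       # must be followed by the closing quote
)

sample = '''<a href="myfile.htm"><img src="picture.gif"></a>
<a href="index.htm"><img src="pic859.jpg"></a>
<a href="page-57.htm"><img src="859.png"></a>
<img id="test1" class="answer1" src="text.jpg">
<img src="http://media.site.com/media/img/staff/2013/ROTHBARD-350_s90x126.jpg?e3e29f4a7131cd3bc7c4bf334be801215db5e3c2%22%3E">
<img src="yahoo.com/images/imagename.gif">'''

# Group 1 is the full src value, including any query string.
sources = [m.group(1) for m in FINAL.finditer(sample)]

for src in sources:
    print(src)
```

Since . does not match a newline by default, the lookahead only inspects the rest of the current line, so each line of the sample is handled on its own.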
Note: These patterns match the sample data below, but regular expressions don't parse HTML, and I personally don't recommend using regular expressions to look through HTML data unless you're doing it on a case-by-case basis. If you want to do some URL/image scraping via a script, look into an XML/HTML parser.
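For the parser route, here is a minimal sketch using Python's standard-library html.parser module. The class name ImgSrcCollector is made up for this example, and a real scraper would likely want error handling and URL resolution on top of it.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


collector = ImgSrcCollector()
collector.feed(
    '<a href="index.htm"><img src="pic859.jpg"></a>'
    '<img id="test1" class="answer1" src="text.jpg">'
)
collector.close()

print(collector.sources)
```

Because the parser looks at attributes by name, the order of attributes and any extra ones (id, class) do not matter, which is exactly where the regex versions start to get fragile.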
Sample data:
<a href="myfile.htm"><img src="picture.gif"></a>
<a href="index.htm"><img src="pic859.jpg"></a>
<a href="page-57.htm"><img src="859.png"></a>
<img id="test1" class="answer1" src="text.jpg">
<img src="http://media.site.com/media/img/staff/2013/ROTHBARD-350_s90x126.jpg?e3e29f4a7131cd3bc7c4bf334be801215db5e3c2%22%3E">
<img src="yahoo.com/images/imagename.gif">