
Take the following code as an example:

<a style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;" href="http://3.bp.blogspot.com/xxxxxxx.JPG"><img src="xxxxxxxxxxxxxx" alt="" width="200" height="150" border="0" /></a>

How can I create a regular expression that strips out any link tag containing the domain 'blogspot.com' from around the img tag?

In the end, I want this:

<img src="xxxxxxxxxxxxxx" alt="" width="200" height="150" border="0" />

Thanks in advance.

Luis
  • [Don't parse HTML with regex!](http://stackoverflow.com/a/1732454/418066) – Biffen May 21 '15 at 17:34
  • *Please* read this: http://stackoverflow.com/a/1732454/62576 – Ken White May 21 '15 at 17:39
  • I need to do this because I've imported an entire Blogspot site into WordPress and the posts have image links to Blogspot. Is there an alternative you could point me to? – Luis May 21 '15 at 17:55

1 Answer


First of all, I suggest you read this. If you still want a regex, see below.

You can use the following to match:

<[^>]*?href\s*=\s*"[^>]*?blogspot\.com[^>]*>(<img[^>]*?\/>)<\/[^>]*>

Then replace the whole match with $1 (the captured img tag), or extract $1 directly.

See DEMO
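
As a rough illustration, here is how the pattern above could be applied in Python. This is only a sketch: the question does not say which language is in use, and the names `BLOGSPOT_LINK` and `strip_blogspot_links` are made up for this example. Python writes the replacement `$1` as `\1`.

```python
import re

# Pattern from the answer above: an opening tag whose href contains
# blogspot.com, directly wrapping one self-closing <img ... /> tag.
# (BLOGSPOT_LINK and strip_blogspot_links are hypothetical names.)
BLOGSPOT_LINK = re.compile(
    r'<[^>]*?href\s*=\s*"[^>]*?blogspot\.com[^>]*>(<img[^>]*?/>)</[^>]*>'
)


def strip_blogspot_links(html: str) -> str:
    """Replace each blogspot-linked image with the bare <img> tag."""
    # \1 is Python's spelling of the $1 backreference.
    return BLOGSPOT_LINK.sub(r"\1", html)


# The markup from the question, split across lines for readability only.
sample = (
    '<a style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;" '
    'href="http://3.bp.blogspot.com/xxxxxxx.JPG">'
    '<img src="xxxxxxxxxxxxxx" alt="" width="200" height="150" border="0" /></a>'
)

print(strip_blogspot_links(sample))
```

Note that the pattern only matches when the img tag follows the opening link tag immediately, with no whitespace or other markup in between, and when the img tag is self-closing (`/>`). Links to other domains are left untouched.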

karthik manchala