I currently have this function to grab the image src field from CDATA in an RSS feed.

function echo_url($string) {
  preg_match_all('#\bhttps?://[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))#', $string, $match);
  return $match[0][0];
}

The only problem is that it picks up the first URL that appears in the CDATA instead of the image src itself. Is there any way I can return just the image src instead of the first URL?

Here's an example of the CDATA I get.

![CDATA[Duration: 357 sec<br>URL: http://www.test.com/videos/999682/test-test-video<br><img src="http://test.com/images/test.jpg"><br><iframe src="http://embeds.test.com/embed/999682" frameborder=0 width=650 height=518 scrolling=no></iframe>]]

All I'm after is the 'img src'.

Any help would be brilliant (I'm a bit of a beginner at PHP).

Thanks

Jonathan
  • http://test.com/320x240/463/1060103/23.jpg%3C/a%3E%3C/li%3E%3C/ul%3E%3Ch4%3EHere%20are%20the%20above%20URLs,%20in%20%3Cimg%3E%20tags.%20%3C/h4%3E%3Cimg%20src= As I'm hooking this into an RSS importer 'All Import WP' it seems to be adding in a lot of stuff after the initial img tag? – Jonathan Jul 23 '15 at 14:38
  • Rather than using a regex, you could parse the inner content as an HTML document, and then use DOM or SimpleXML to find the right element and attribute, as described here: http://stackoverflow.com/a/15850774/157957 – IMSoP Jul 23 '15 at 15:04
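The DOM approach suggested in the comment above could look roughly like the sketch below. It assumes the CDATA wrapper has already been stripped by the feed parser and that PHP's DOM extension is available; the function name firstImgSrc is hypothetical and not taken from the linked answer.

```php
<?php
/**
 * Return the src of the first <img> in an HTML fragment, or null if none.
 * firstImgSrc is an illustrative name, not part of any library.
 */
function firstImgSrc(string $html): ?string
{
    $doc = new DOMDocument();

    // Feed content is rarely valid HTML, so collect parser warnings
    // internally instead of letting them be emitted.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Walk the <img> elements in document order and take the first src.
    foreach ($doc->getElementsByTagName('img') as $img) {
        if ($img->hasAttribute('src')) {
            return $img->getAttribute('src');
        }
    }

    return null;
}

// Sample content from the question, without the CDATA wrapper.
$cdata = 'Duration: 357 sec<br>URL: http://www.test.com/videos/999682/test-test-video<br>'
       . '<img src="http://test.com/images/test.jpg"><br>'
       . '<iframe src="http://embeds.test.com/embed/999682" frameborder=0 width=650 height=518 scrolling=no></iframe>';

echo firstImgSrc($cdata), "\n";
```

Because only img elements are inspected, the plain-text URL and the iframe src in the sample are ignored, which is the behaviour the regex in the question lacks.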

1 Answer

If you simply want to return the image source, then this function will work:

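/**
 * Extract the src attribute of the first img tag in a CDATA string.
 *
 * @param string $cdata
 * @return string|false The image URL, or false if no img src was found
 */
function getImgSrc($cdata)
{
    // Capture everything between the double quotes that follow img src=
    $found = preg_match("/img src=\"([^\"]+)\"/", $cdata, $match);
    if ($found)
    {
        return $match[1];
    }
    else
    {
        return false;
    }
}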
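
The pattern above expects src to follow img directly and to use double quotes. If the feed may place other attributes first or use single quotes, a more tolerant variant might look like the sketch below. The name getImgSrcLoose and its pattern are assumptions for illustration only, and a regex like this is still not a full HTML parser.

```php
<?php
/**
 * Variant that allows other attributes before src and either quote style.
 * getImgSrcLoose is a hypothetical name used only for this sketch.
 *
 * @param string $cdata
 * @return string|false
 */
function getImgSrcLoose($cdata)
{
    // <img, then any attributes inside the same tag, then src= with a
    // quote character that must be matched again at the end (\1).
    $pattern = '/<img\b[^>]*?\bsrc\s*=\s*(["\'])(.*?)\1/i';

    if (preg_match($pattern, $cdata, $match)) {
        return $match[2];
    }

    return false;
}

// Sample content from the question.
$cdata = 'Duration: 357 sec<br>URL: http://www.test.com/videos/999682/test-test-video<br>'
       . '<img src="http://test.com/images/test.jpg"><br>'
       . '<iframe src="http://embeds.test.com/embed/999682" frameborder=0 width=650 height=518 scrolling=no></iframe>';

$src = getImgSrcLoose($cdata);
echo $src === false ? 'no image found' : $src, "\n";
```

Checking the result against false before using it avoids treating a missing image as an empty URL.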
Tony