0

I have the source code of a web page that I want to use in my project. I want to extract an image link from this code using regex in PHP.

Here is the line:

img src="http://imagelinkhere.com" class="image"

There is only one line like this. My logic is to get the string between

="

and

" class="image"

characters.

How can I do that with REGEX? Thank you very much.

alper_k
  • Don't use regex to parse HTML: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 –  Dec 13 '12 at 09:13

6 Answers

3

Don't use regex for HTML; try DOMDocument:

$html = '<html><img src="http://imagelinkhere.com" class="image" /></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$img = $dom->getElementsByTagName("img");

foreach ( $img as $v ) {
    if ($v->getAttribute("class") == "image")
        print($v->getAttribute("src"));
}

Output

http://imagelinkhere.com
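
As a variation on the loop above, the same lookup can be sketched with DOMXPath, which selects the matching attribute directly. This is only a sketch under assumptions: the sample markup and the example.com URL are made up, and the call to libxml_use_internal_errors is there because real pages are rarely valid HTML and would otherwise trigger parser warnings.

```php
<?php
// Hypothetical sample markup; the real page is assumed to contain many <img> tags.
$html = '<html><body>'
      . '<img src="http://example.com/a.png" />'
      . '<img src="http://imagelinkhere.com" class="image" />'
      . '</body></html>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// Select the src attribute of every <img> whose class attribute is exactly "image".
$nodes = $xpath->query('//img[@class="image"]/@src');

// Take the first hit, or null when no such image exists.
$src = $nodes->length > 0 ? $nodes->item(0)->nodeValue : null;
echo $src, "\n";
```

The XPath test @class="image" is an exact comparison, like the == check in the loop, so an image with several classes would need a different expression.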
Baba
  • The problem is, there are 30 or 40 images in this code. I want to use one of them, not all of them. The others don't have " class="image" at the end; only one has it, and that's the one I want. That's why I said between =" and " class="image", and wanted to use regex. – alper_k Dec 13 '12 at 09:22
  • Can you add the full HTML to http://pastbin.com so that I can get an idea of what you want? I still believe it can be done with DOMDocument. – Baba Dec 13 '12 at 09:24
  • Sure, here's the code: http://pastebin.com/qBN7K7xe And this is the link I want to get: http://i.milliyet.com.tr/YeniAnaResim/2012/12/12/ruzgar-enerjisiyle-mayini-imha-ediyor-2869627.Jpeg – alper_k Dec 13 '12 at 09:27
  • Just add `if ($v->getAttribute("class") == "image")` to the code. – Baba Dec 13 '12 at 09:37
  • Thanks a lot, that's much simpler than the regex headache. – alper_k Dec 13 '12 at 09:40
1

Using

.*="(.*)?" .*

with preg_replace leaves only the URL, taken from the first capture group (\1).

The complete code would look like this:

$str='img src="http://imagelinkhere.com" class="image"';
// PCRE patterns need delimiters; slashes are used here.
$str=preg_replace('/.*="(.*)?" .*/','$1',$str);
echo $str;

-->

http://imagelinkhere.com

Edit: Or just follow Baba's advice and use a DOM parser. I'll remember that parsing HTML with regex will give you headaches.
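
If regex is used anyway, the rule from the question (take the text between =" and " class="image") can be sketched with preg_match, which also makes a missing match detectable. The helper name extractImageSrc is hypothetical, and the type hints assume PHP 7.1 or later.

```php
<?php
// Returns the URL that sits between =" and " class="image", or null if absent.
// extractImageSrc is a hypothetical helper name used only for this sketch.
function extractImageSrc(string $html): ?string
{
    // [^"]* stops at the first closing quote, so the capture cannot run past the URL.
    if (preg_match('/="([^"]*)" class="image"/', $html, $m) === 1) {
        return $m[1];
    }
    return null;
}

$str = 'img src="http://imagelinkhere.com" class="image"';
echo extractImageSrc($str), "\n";
```

Unlike preg_replace, which hands back the input unchanged when nothing matches, this version returns null in that case. It still depends on class="image" following the src attribute directly, which is exactly the kind of fragility the DOM approach avoids.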

cb0
1
// Use # as the delimiter so the slashes in http:// need no escaping.
preg_match('#(http://.*?)"#', $text, $matches);
var_dump($matches);

The link would be in $matches[1].

Oldskool
DevMetal91
0

There are several ways to do so:

1. You can use Simple HTML DOM Parser, which I prefer for simple HTML.

2. You can also use preg_match:

$foo = '<img class="foo bar test" title="test image" src="http://example.com/img/image.jpg" alt="test image" class="image" />';
$array = array();
preg_match( '/src="([^"]*)"/i', $foo, $array ) ;

see this thread

Mina Kolta
0

I can hear the sound of hooves, so I have gone with DOM parsing instead of regex.

$dom = new DOMDocument();
$dom->loadHTMLFile('path/to/your/file.html');
foreach ($dom->getElementsByTagName('img') as $img)
{
    if ($img->hasAttribute('class') && $img->getAttribute('class') == 'image')
    {
        echo $img->getAttribute('src');
    }
}

This will echo only the src attribute of an img tag with class="image".

Dale
-1

Try using preg_match_all, like this:

preg_match_all('/img src="([^"]*)"/', $source, $images);

That should put the URLs of all the images in $images[1] ($images[0] holds the full matches). The regex finds every img src occurrence in the code and captures the text between the quotes.
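
To make the shape of the result concrete, here is a small sketch that runs the same pattern on made-up input. The $source string and its example.com URLs are assumptions for illustration only.

```php
<?php
// Hypothetical input with two images; only the second carries class="image".
$source = '<img src="http://example.com/one.png" /> '
        . '<img src="http://example.com/two.png" class="image" />';

preg_match_all('/img src="([^"]*)"/', $source, $images);

// $images[0] holds the full matches, $images[1] holds the first capture group.
print_r($images[1]);
```

Because every img src is collected, the class="image" restriction from the question would still have to be applied separately.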

Oldskool