
Here is the regex I use to get an image URL from a page.

<?php
    $url  = $_POST['url'];
    $data = file_get_contents($url);
    $logo = get_logo($data);

    function get_logo($html)
    {
        preg_match_all('/\bhttps?:\/\/\S+(?:png|jpg)\b/', $html, $matches);
        //echo "match : $matches[0][0]";
        return $matches[0][0];
    }
?>

Is anything missing in the regex? For some URLs it does not return an image URL, even though the page contains images.

For example: http://www.milanart.in/

It does not return any image from that page.

Please, no DOM. I cannot use it.

user123
  • possible duplicate of [How do you parse and process HTML/XML in PHP?](http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php) – Quentin Dec 12 '13 at 12:33

2 Answers

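<?php
    $url  = "http://www.milanart.in";
    $data = file_get_contents($url);
    $logo = get_logo($data);

    // Returns the src of the first tag written exactly as <img src="...">,
    // or null when the page contains no such tag.
    function get_logo($html)
    {
        if (preg_match("/<img src=\"(.*?)\"/", $html, $matches)) {
            return $matches[1];
        }
        return null;
    }

    if ($logo !== null) {
        echo 'logo path : '.$logo;
        // Assumes $logo is a path relative to the site root;
        // an absolute URL must not be prefixed with $url.
        echo '<img src="'.$url.'/'.$logo.'" />';
    }
?>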
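The snippet above only handles a relative `src` on the first tag. A rough sketch of a more general variant follows. It returns every `<img>` source and only prefixes the base URL when the source is not already absolute. The function name `get_image_urls` and its `$baseUrl` parameter are made up for illustration, and the sketch assumes that relative paths are relative to the site root, which will not hold for every page.

```php
<?php
// Hypothetical helper: returns every <img> src in $html as an absolute URL.
// Assumption: relative paths are resolved against $baseUrl as the site root.
function get_image_urls(string $html, string $baseUrl): array
{
    // Allow other attributes before src, and single or double quotes.
    preg_match_all('/<img\b[^>]*?\ssrc\s*=\s*(["\'])(.*?)\1/i', $html, $matches);

    $urls = [];
    foreach ($matches[2] as $src) {
        if (preg_match('#^https?://#i', $src)) {
            $urls[] = $src;                 // already absolute
        } elseif (strpos($src, '//') === 0) {
            $urls[] = 'http:' . $src;       // protocol-relative; http is assumed
        } else {
            $urls[] = rtrim($baseUrl, '/') . '/' . ltrim($src, '/');
        }
    }
    return $urls;
}
```

It could be called as `get_image_urls(file_get_contents($url), $url)`, after which the caller picks whichever entry looks like the logo.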
keegzer
  • Thanks, but this solution does not work in all cases. The regex should be able to scrape the image URL on its own, whatever the page. Try your code above with 'http://www.metacritic.com/movie/walter-lessons-from-the-worlds-oldest-people/critic-reviews' – user123 Dec 12 '13 at 13:36
  • It's an example that works for one case. You can return an array with all the matches and check whether each string contains 'http' or not ... You need to adapt it to your code – keegzer Dec 12 '13 at 13:42

Use PHP's DOM classes to get all images:

  1. Search for image files in the CSS: url(imagefilename.extension)
  2. Search for image files in the HTML ......
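The two steps above might look roughly like the sketch below. It assumes the DOM extension is enabled. The function name `find_image_refs` is made up for illustration, and the CSS step only sees `url(...)` values that appear in the page source itself (inline styles and `<style>` blocks), not in external stylesheets.

```php
<?php
// Hypothetical helper: collects image references from <img> tags and from
// CSS url(...) values found in the page source.
function find_image_refs(string $html): array
{
    if ($html === '') {
        return [];                           // loadHTML() rejects empty input
    }

    // HTML: let DOMDocument parse the markup and read each <img> src.
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate invalid markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $refs = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src !== '') {
            $refs[] = $src;
        }
    }

    // CSS: url(...) values ending in a common image extension.
    preg_match_all(
        '/url\(\s*["\']?([^"\')]+\.(?:png|jpe?g|gif))["\']?\s*\)/i',
        $html,
        $matches
    );

    return array_values(array_unique(array_merge($refs, $matches[1])));
}
```

The returned values are whatever the page wrote, so relative paths still have to be resolved against the page URL before they can be fetched.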
Andrea