
I am using PHP to save an image ($img) from an image URL ($imgURL) using

file_put_contents($img, file_get_contents($imgURL));

The $imgURL links are extracted from the user-submitted URL ($webURL). This is working fine. However, a site may have many image URLs. How do I extract the logo image used by the site ($webURL)? In other words, how do I find out which $imgURL contains the logo image?

1 Answer


You need to know the structure of the HTML you are parsing. Then you can use some attribute, such as class, name, or src, to get the information you want.

You can do what this post already shows: getting image src in php

// $url is the page to scan (the user-submitted $webURL in the question)
$doc = new DOMDocument();
$doc->loadHTMLFile($url);
$xpath = new DOMXPath($doc);
// select every <img> element in the document
$imgs = $xpath->query("//img");
foreach ($imgs as $img) {
    $src = $img->getAttribute("src");
    // do something with $src,
    // e.g. if this img contains your logo data, return $src
}
Felippe Duarte
  • Thanks, it is working fine. The point is: if the user-submitted URL ($webURL) has many links to images ($imgURL), how can I automatically find out which image contains the logo, or a very relevant image of the site the URL belongs to? For example, http://en.unesco.org/ has links to many (>50) images. Is there a way to find a relevant image (e.g. the logo of UNESCO, or an image describing UNESCO)? – FreeBenzine May 18 '16 at 21:10
  • You can try to find images with `logo` in the name, but it will not work every time. There is no way to always know which image is the logo, because HTML doesn't provide this information. What you can do, if you don't find any logo, is show all images of that page, ask the user to choose which one is the logo, and then save that choice. – Felippe Duarte May 18 '16 at 21:16
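
The "look for `logo` in the name" idea from the last comment could be sketched roughly as follows. This is only a heuristic under stated assumptions, not a reliable detector: the helper name `findLogoCandidates`, the choice to also check the `alt`, `class`, and `id` attributes, and the sample markup are all made up for illustration.

```php
<?php
// Hypothetical helper: returns the src of every <img> whose src, alt,
// class, or id mentions "logo" (case-insensitive). Heuristic only;
// it may return nothing, or several candidates.
function findLogoCandidates(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them,
    // since real-world HTML is often malformed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $candidates = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src === '') {
            continue; // nothing to download for this element
        }
        // Search all attributes that commonly carry the word "logo".
        $haystack = implode(' ', [
            $src,
            $img->getAttribute('alt'),
            $img->getAttribute('class'),
            $img->getAttribute('id'),
        ]);
        if (stripos($haystack, 'logo') !== false) {
            $candidates[] = $src;
        }
    }
    return $candidates;
}

// Sample markup, invented for this example.
$html = '<html><body>'
      . '<img src="/img/banner.jpg" alt="Welcome">'
      . '<img src="/img/site-logo.png" alt="Site">'
      . '<img src="/img/a.png" class="header-logo">'
      . '</body></html>';

$logos = findLogoCandidates($html);
```

If `$logos` comes back empty, the fallback described in the comment still applies: show all images from the page and let the user pick. Note that `src` values are often relative, so they may need to be resolved against $webURL before being passed to file_get_contents.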