
I need to query Google with a reverse image lookup to find out more about images on my server whose contents are unknown. I found a good question about this here: php Extract Best guess for this image result from google image search?

I tried implementing the methods listed there, but it seems that these days Google takes your pretty URL and does a 302 redirect to a seemingly randomly generated URL that leads to the image search results. I made sure my code sets CURLOPT_FOLLOWLOCATION to 1 so that redirects are followed, but I still get back the contents of the 302 page. Here's that code:

function fetch_google($terms="sample search",$numpages=1,$user_agent='Mozilla/5.0 (Windows NT 6.1; rv:8.0) Gecko/20100101 Firefox/8.0')
{
    $searched="";
    // Loop exactly $numpages times ($i<=$numpages would fetch one extra page)
    for($i=0;$i<$numpages;$i++)
    {
        $ch = curl_init();
        $url="http://www.google.com/searchbyimage?hl=en&image_url=".urlencode($terms);
        curl_setopt ($ch, CURLOPT_URL, $url);
        curl_setopt ($ch, CURLOPT_USERAGENT, $user_agent);
        curl_setopt ($ch, CURLOPT_HEADER, 0);
        // Follow 3xx redirects, up to CURLOPT_MAXREDIRS hops
        curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1);
        curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt ($ch, CURLOPT_REFERER, 'http://www.google.com/');
        curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, 120);
        curl_setopt ($ch, CURLOPT_TIMEOUT, 120);
        curl_setopt ($ch, CURLOPT_MAXREDIRS, 10);
        // Read and store cookies in the same file between requests
        curl_setopt ($ch, CURLOPT_COOKIEFILE, "cookie.txt");
        curl_setopt ($ch, CURLOPT_COOKIEJAR, "cookie.txt");
        $searched=$searched.curl_exec ($ch);
        curl_close ($ch);
    }

    return $searched;
}

$content = fetch_google("http://upload.wikimedia.org/wikipedia/commons/thumb/0/0f/Grosser_Panda.JPG/1280px-Grosser_Panda.JPG",1);
echo $content."<br>";
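
One way to check whether cURL actually followed the redirect is to look at what curl_getinfo() reports after the transfer. The sketch below is only an illustration: summarize_transfer() is a hypothetical helper, not part of the code above, and it assumes the array carries the usual http_code, redirect_count and url keys.

```php
<?php
// Hypothetical helper: format a few fields of the array returned by
// curl_getinfo($ch) so the outcome of a transfer is easy to read in a log.
function summarize_transfer(array $info)
{
    // Missing keys fall back to neutral values instead of raising notices.
    $code      = isset($info['http_code']) ? (int) $info['http_code'] : 0;
    $redirects = isset($info['redirect_count']) ? (int) $info['redirect_count'] : 0;
    $finalUrl  = isset($info['url']) ? $info['url'] : '';

    return sprintf('HTTP %d after %d redirect(s), final URL: %s', $code, $redirects, $finalUrl);
}

// Possible usage inside fetch_google(), after curl_exec() and before curl_close():
//     error_log(summarize_transfer(curl_getinfo($ch)));
```

A 3xx code together with zero redirects would suggest the redirect was never followed, while a 200 with one or more redirects would suggest the returned body really is the final page.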

I also tried another implementation that retrieves just the redirect URL and then makes a second cURL call to it. Same outcome: the contents of the 302 page are returned. Here's the part of that code that gets the URL to pull from:

function get_furl($url)
{
    $furl = false;

    // First check response headers
    $headers = get_headers($url);
    if($headers === false)
    {
        // Request failed; fall back to the original URL
        return $url;
    }

    // Test for 301 or 302
    if(preg_match('/^HTTP\/\d\.\d\s+(301|302)/',$headers[0]))
    {
        foreach($headers as $value)
        {
            if(substr(strtolower($value), 0, 9) == "location:")
            {
                // Keep the last Location header seen
                $furl = trim(substr($value, 9));
            }
        }
    }

    // Set final URL
    $furl = ($furl) ? $furl : $url;

    return $furl;
}
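
To see where a redirect chain ends up, it can help to split the header lines into one entry per response. This is a minimal sketch, assuming the header array lists the lines of every hop in order, as get_headers() does when it follows redirects; trace_redirects() is a hypothetical name introduced only for this example.

```php
<?php
// Hypothetical helper: group raw header lines (as returned by get_headers())
// into one entry per HTTP response, recording its status and Location target.
function trace_redirects(array $headers)
{
    $hops = array();
    foreach ($headers as $line) {
        if (preg_match('#^HTTP/\d(?:\.\d)?\s+(\d{3})#', $line, $m)) {
            // A status line starts a new hop.
            $hops[] = array('status' => (int) $m[1], 'location' => null);
        } elseif ($hops && stripos($line, 'location:') === 0) {
            // Header names are case-insensitive; attach to the current hop.
            $hops[count($hops) - 1]['location'] = trim(substr($line, 9));
        }
    }
    return $hops;
}

// Possible usage (performs a network request):
//     print_r(trace_redirects(get_headers($url)));
```

If the last hop still has a 3xx status, the chain stopped on a redirect rather than on the results page.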

Any ideas would be greatly appreciated!

carbide20
  • Google does not want you to scrape their web interface for the search, that’s likely why you are getting a 302. If you want to query Google for anything automatically, you should do it using the APIs they offer for that. – CBroe Jan 25 '15 at 04:34
  • Thank you kindly. I read the TOS and understand the issue now. – carbide20 Jan 25 '15 at 07:25
  • Can you please provide a link to the section of the Google Custom Search API documentation that deals with a reverse image search for posterity? I'm on the Custom Search documentation, but am unable to find anything related to searching by image. https://developers.google.com/custom-search/docs/xml_results#results_xml_tag_R – carbide20 Jan 25 '15 at 07:32
  • @Kiko Thank you, your answer is very helpful. However, I haven't been able to find a parameter that looks like it could be used for a reverse image search. I still have been unable to verify if this is in fact possible. Have you found something in the documentation for the API that leads you to believe it is? – carbide20 Jan 27 '15 at 00:22

2 Answers


TinEye has an API you can use for reverse image search.

http://services.tineye.com/TinEyeAPI

Edit: Here is a solution for creating your own image search engine, written in Python with Flask.

https://github.com/realpython/flask-image-search
http://www.pyimagesearch.com/2014/12/08/adding-web-interface-image-search-engine-flask/

I know this has nothing to do with Google, but TinEye is a better solution than Google in this regard. Maybe Google should buy them, and then they will be Google. Haha.

ART GALLERY
  • Seeing as how Google offers no legitimate way to do this, using an alternative is the answer. I've been using TinEye, and am having even more luck with Imagga: http://imagga.com/ It seems to have cropped up in 2014, has a really beautiful website, and gives you a good batch of tags for images along with confidence levels for the tags, which my robots love. – carbide20 Jan 29 '15 at 06:17

A link to the complete API, which can be used in PHP, is:

https://developers.google.com/image-search/v1/jsondevguide

The code example is:

$url = "https://ajax.googleapis.com/ajax/services/search/images?" .
       "v=1.0&q=barack%20obama&userip=INSERT-USER-IP";

// sendRequest
// note how referer is set manually
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, "INSERT-YOUR-SITE-URL"); // Enter the URL of your site here
$body = curl_exec($ch);
curl_close($ch);

// now, process the JSON string
$json = json_decode($body);
// now have some fun with the results...
KIKO Software
  • This is the deprecated API. The new Custom Search API does not appear to have the ability to do reverse image searches. – carbide20 Jan 27 '15 at 18:22
  • Ah, yes, sorry. I should read better next time. I can't find a replacement. There are enough search engine APIs out there, but none that lets you search with an image. I also noted that Google doesn't want you to use its engine for anything other than user searches. I guess the simple answer is that you cannot, or may not, do what you want to do. – KIKO Software Jan 27 '15 at 20:10