0

How do I properly grab an image that is displayed in an HTML document and feed it to PHP to be read as binary image data? I do not have direct access to the image file. The image I am trying to grab is served to the client by PHP inside an HTML page, which uses an <img> tag to display it. The src is just a link to the same page I am currently on. The link is a GET request.

The link looks like this:

GETIMAGE.php?type=small&path=/path/to/image.png

This does not return the actual image with an image MIME type, but rather an HTML page displaying the image.

I do not have access to the source code of GETIMAGE.php. It is encrypted, as I am using a licensed portal solution.

This is the source that is returned from the GETIMAGE.php script:

<html>
<head>
    <meta name="viewport" content="width=device-width">
    <title>GETIMAGE.php (80×112)</title>
    <style type="text/css"></style>
</head>
<body style="margin: 0px;">
    <img style="-webkit-user-select: none" src="http://portal.craftnordic.com/PORTAL/GETIMAGE.php?type=small&amp;path=Path/To/Image.png">
</body>

Alexander Johansen
  • Post your GETIMAGE.php script. – Lee Nov 06 '13 at 14:26
  • @Lee I do not have access to the GETIMAGE.php code. It is encrypted as it is part of a licensed application called Xinet WebNative Portal. – Alexander Johansen Nov 06 '13 at 14:29
  • Possible duplicate of [How do you parse and process HTML/XML in PHP?](http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php) – Quentin Nov 06 '13 at 14:32
  • In that case, can you post a "view source" of the GETIMAGE.php output so we know exactly what data you're working with? I have a feeling the GETIMAGE script has a referrer check on it, and will only output the raw image data if the script is called from itself (based on your comments on previous answers). – Lee Nov 06 '13 at 14:41
  • @Lee Added the source that is returned from the GETIMAGE.php file. – Alexander Johansen Nov 06 '13 at 14:51
  • OK, the only thing I can think of is to use the curl library in PHP, set the `CURLOPT_REFERER` option to the same value as the image URL, and see what comes back. I can't really help you any further than that without a working link I can debug myself, as it's a fairly specific problem with the Xinet portal rather than a general programming question. – Lee Nov 06 '13 at 14:56
  • @Lee Using `cURL` gave me positive results, but I was not able to use it properly. It returns an image with the image/jpeg MIME type in the header, but the output includes more than just the image: it also contains headers such as `HTTP/1.1 200 OK`. I have moved from `cURL` to `fsockopen` because I needed authentication and could not figure out how to do it with `cURL`. I have opened a new question with more details [here](http://stackoverflow.com/questions/19834971/php-get-image-with-fsockopen/19835074?noredirect=1#comment29494924_19835074) – Alexander Johansen Nov 07 '13 at 11:57

3 Answers

1

Without seeing your script, it is hard to figure out what you are looking for. Let's assume the page generates output like this:

<img src="http://imgplacewhatever.com/lskjdflksdjf.png" />

Using a DOM parsing library such as PHP Simple HTML DOM Parser (which provides file_get_html), we can do something like this:

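// Assumes the PHP Simple HTML DOM Parser library file is available locally.
include 'simple_html_dom.php';

$html = file_get_html('GETIMAGE.php?type=small&path=/path/to/image.png');
$pictures = array();
// Collect the src attribute of every <img> element on the page.
foreach ($html->find('img') as $element) {
   $pictures[] = $element->src;
}

foreach ($pictures as $picture) {
   // Fetch the contents behind each image URL.
   $data = file_get_contents($picture);
   // Do something with the data.
}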

Then you will have an array of all image URLs in $pictures, and the contents behind each URL in $data as you loop over them.

Good luck.

David
  • I have also tried this. It returns the links of all images in the HTML. The link is just the same link that I am reading the HTML document from. This will not give me raw image data (which is what I need). – Alexander Johansen Nov 06 '13 at 14:36
  • You can use `file_get_contents` to get the stream. The answer was updated. – David Nov 06 '13 at 14:39
  • This will not work (already tried). It reads in the HTML that the image is displayed on and not the raw image data. – Alexander Johansen Nov 06 '13 at 14:41
  • I understand. If you use the DOM parser and get the absolute URLs of the image, you can use `file_get_contents()` to get the data stream. – David Nov 06 '13 at 14:42
  • I have already tried the DOM parser library. It didn't get me anywhere. There is no direct link to the images, only a PHP script that prints them out in an HTML document. The images themselves are stored in a protected folder on a different server. I work with two servers: one is the data server that holds all the data, and the other displays it to the user. I am currently on the server that displays the data. – Alexander Johansen Nov 06 '13 at 14:58
  • That doesn't make any sense. Please post a truncated version of the HTML page you are working with. – David Nov 06 '13 at 14:59
0

You can use the file_get_contents() function to get the data.

For example:

$filePath=$_GET['path'];
$imageData=file_get_contents($filePath);
Pavan Kumar
  • I have already tried this method. This will read the HTML document the image is displayed on, and not the raw image data. I need it to be raw image data. – Alexander Johansen Nov 06 '13 at 14:31
  • Sorry, I didn't follow. For example, if I use $imageData=file_get_contents('https://www.google.co.in/images/srpr/logo11w.png'); then that will load the binary data of the Google logo into $imageData. – Pavan Kumar Nov 06 '13 at 14:42
  • Try to specify the exact path of the image like this. – Pavan Kumar Nov 06 '13 at 14:44
  • I do not have the exact image path, nor do I have direct access to it. The image lives on a data server, which passes the content to the other server, which in turn sends it to the client. – Alexander Johansen Nov 06 '13 at 15:00
0

Don't know if you ever found an answer, but I finally did. The data received by file_get_contents, or by any cURL method, was actually gzip-compressed. When I saved the output to a file and extracted it as a gzip archive, the image was there.
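
For anyone landing here later, here is a minimal sketch of how that could look in PHP. It is not tested against the portal itself. The helper names `decodeIfGzipped` and `fetchImage` are made up for illustration, and it assumes the zlib and cURL extensions are enabled and that the portal accepts HTTP authentication via `CURLOPT_USERPWD`, which may not be the case. Setting `CURLOPT_ENCODING` to an empty string asks cURL to decode compressed responses by itself; the magic-byte check is a fallback for a body that is gzipped without being labelled as such.

```php
<?php
// Hypothetical helper: if $body starts with the gzip magic bytes (0x1f 0x8b),
// return the decompressed data; otherwise return $body unchanged.
function decodeIfGzipped(string $body): string
{
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = @gzdecode($body);
        if ($decoded !== false) {
            return $decoded;
        }
    }
    return $body;
}

// Hypothetical helper: fetch $url with cURL and return only the response body.
// $userPwd is an optional "user:password" string (assumes HTTP authentication).
function fetchImage(string $url, ?string $userPwd = null): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_HEADER         => false, // keep "HTTP/1.1 200 OK" etc. out of the body
        CURLOPT_ENCODING       => '',    // accept and auto-decode gzip/deflate
        CURLOPT_REFERER        => $url,  // referer set to the image URL, as suggested in the comments
    ]);
    if ($userPwd !== null) {
        curl_setopt($ch, CURLOPT_USERPWD, $userPwd);
    }
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return decodeIfGzipped($body);
}
```

Usage would be something like `$bytes = fetchImage($url, 'user:password');` followed by `file_put_contents('image.png', $bytes);`, where the URL and credentials are placeholders for your own.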