
I have a process producing very high resolution 600+ MP images. These images are about 2GB when loaded into RAM (40MB highly compressed). I am indexing them and making them available via a PHP web app.

I have data that tells me the areas of interest in units of pixels, so I am curious whether there is a method where I could read a particular area of the image without loading the entire thing into memory. Sort of like moving through a file pointer, but choosing when to read. The goal would be to create a small picture of the area of interest.

I know there are a few image processing libraries in PHP, and more than a few for Python, but I don't really know what is the correct question to be asking with regard to the libraries.

I am really looking for a solution in PHP or Python specifically.

Kenneth
  • See if this helps. https://stackoverflow.com/questions/19695249/load-just-part-of-an-image-in-python/19772446 – Rockybilly Feb 21 '18 at 23:55
  • That method requires images in an uncompressed BMP format. My images are jpeg compressed and fully extracting them is a daunting task. Knowing what little I know of compression, I am afraid someone will tell me I must uncompress the entire image to work with it. – Kenneth Feb 22 '18 at 00:04
  • On average you're probably going to be uncompressing 50% of the image each time, assuming you find a way to do the thing you're thinking of. It's probably more efficient, both in terms of resource usage and time spent coding, to make an uncompressed BMP copy and use simple file operations to extract the portions you're interested in. – Sammitch Feb 22 '18 at 01:54
  • Better yet, break the image down into a number of uncompressed tiles, and when you get a request for an image portion load in and reassemble the necessary tiles, trim, and export. Sort of a bastardized version of how google maps loads its data. Bear in mind that disk space is far cheaper than RAM and CPU. – Sammitch Feb 22 '18 at 01:55
  • Possible duplicate of [Extracting a region of interest from an image file without reading the entire image](https://stackoverflow.com/questions/48281086/extracting-a-region-of-interest-from-an-image-file-without-reading-the-entire-im) – Cris Luengo Feb 22 '18 at 03:16
  • @CrisLuengo That answer does not provide a solution in PHP nor Python. Thanks though. – Kenneth Feb 22 '18 at 04:57
  • @Kenneth: I linked the question because I think that it explains very clearly what you need to do: store your images as tiled TIFF. Python has functionality to read individual tiles from a TIFF file, and I'm sure PHP does too. That part is trivial. Any other image file type that I know of will require reading the whole thing into memory (or most of it anyway) to extract a small region. You really want to store your data in the right way first. – Cris Luengo Feb 22 '18 at 06:43
  • @Kenneth the solution Cris linked works in both Python and PHP -- there's a Python example near the bottom of the accepted answer. I'll add a PHP version here. – jcupitt Feb 24 '18 at 15:28
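
The "uncompressed copy plus simple file operations" idea from the comments above can be sketched in Python as follows. This is a minimal illustration, not a BMP reader: it assumes a hypothetical headerless raw file with pixels stored row by row, top to bottom, one byte per pixel (a real BMP has a header and usually stores rows bottom-up). The function name `read_region` and its parameters are made up for this example.

```python
import os
import tempfile


def read_region(path, image_width, left, top, width, height, bytes_per_pixel=1):
    """Read a rectangle from a headerless, row-major raw pixel file.

    Only the bytes inside the rectangle are read; the file pointer is
    moved with seek() to the start of each row segment.
    """
    row_stride = image_width * bytes_per_pixel
    rows = []
    with open(path, "rb") as f:
        for y in range(top, top + height):
            f.seek(y * row_stride + left * bytes_per_pixel)
            rows.append(f.read(width * bytes_per_pixel))
    return rows


# Demo on a tiny synthetic 8x4 image where each pixel value equals its byte offset.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(32)))
demo_rows = read_region(tmp.name, image_width=8, left=2, top=1, width=3, height=2)
os.remove(tmp.name)
print([list(row) for row in demo_rows])
```

The cost is disk space: a 600 MP single-band image stored this way takes roughly 600 MB, but each read touches only the rows of the requested rectangle.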

2 Answers

You may want to take a look at ImageMagick, which I have used from Java with great success to do something very similar.

There is a bit of a learning curve, but it's tremendously powerful, and I believe the command line example at "Selecting an Image Region" on https://www.imagemagick.org/script/command-line-processing.php illustrates what you're describing (extracting a small known AoI from within a much larger image).

Timothy Johns

php-vips will read just the part you need, when it can. It's typically 3x to 5x faster than imagemagick and needs much less memory.

Many image formats do not allow random access. JPEG, PNG, GIF and many others force you to decompress at least every pixel that comes before the region you want, which will be very slow for huge images of the sort you are handling.

One solution would be to switch to JPEG-compressed tiled TIFF. This format breaks the image into (by default) 256x256 pixel tiles and compresses each one separately. The tiles are stored in a TIFF file with an index, so you can pull out individual tiles very quickly.
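
To see why tiling helps, here is a small Python sketch of the arithmetic a reader has to do: work out which tiles overlap the requested region, then decompress only those. The function name `tiles_for_region` is hypothetical, and the 256-pixel tile size is simply the default mentioned above; a real TIFF reader would look the size up in the file.

```python
def tiles_for_region(left, top, width, height, tile_size=256):
    """Return (column, row) indices of every tile that overlaps the region."""
    if width <= 0 or height <= 0:
        raise ValueError("region must have positive width and height")
    first_col = left // tile_size
    first_row = top // tile_size
    # The last pixel inside the region decides the last tile touched.
    last_col = (left + width - 1) // tile_size
    last_row = (top + height - 1) // tile_size
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


# A 100x100 region at left/top 200,200 crosses one tile boundary in each direction.
needed = tiles_for_region(200, 200, 100, 100)
print(needed)
```

A 100x100 region can never touch more than four 256x256 tiles, no matter how large the full image is, which is why extraction time stays small.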

For example, you can convert a huge JPEG image to a JPEG-compressed tiled TIFF with libvips like this:

$ time vips copy wac_nearside.jpg wac_nearside.tif[tile,compression=jpeg]
real    0m3.891s
user    0m6.332s
sys     0m0.198s
peak RES 40mb

The index makes the image a little larger, but it's not too bad:

$ ls -l wac_nearside.* 
-rw-r--r-- 1 john john 74661771 May  7  2015 wac_nearside.jpg
-rw-r--r-- 1 john john 76049323 Feb 24 15:39 wac_nearside.tif
$ vipsheader wac_nearside.jpg
wac_nearside.jpg: 24000x24000 uchar, 1 band, b-w, jpegload

You can read out random regions from it in PHP like this:

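#!/usr/bin/env php
<?php

require __DIR__ . '/vendor/autoload.php';

use Jcupitt\Vips;

// Open the image named on the command line; libvips loads pixels on demand.
$image = Vips\Image::newFromFile($argv[1]);

// Size of each region to extract, in pixels.
$region_width = 100;
$region_height = 100;

// Cut 100 regions at random positions and save each one as a JPEG.
for ($i = 0; $i < 100; $i++) {
    // Pick a top-left corner that keeps the whole region inside the image.
    $left = rand(0, $image->width - $region_width - 1);
    $top = rand(0, $image->height - $region_height - 1);
    $region = $image->crop($left, $top, $region_width, $region_height);
    $region->writeToFile($i . ".jpg");
}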

I can run that program like this:

$ time ./crop.php ~/pics/wac_nearside.tif 
real    0m0.207s
user    0m0.181s
sys     0m0.042s
peak RES 36mb

So on this elderly laptop it's reading out (and creating) 100 JPEG files in just over 0.2s.

jcupitt
  • I had a heck of a time trying to get the Vips package to work with Python in Windows. I gave up! Do you know of any resources to help with getting it to work with PHP in Windows? – Kenneth Feb 27 '18 at 02:17
  • Oh dear, it should be easy. Open an issue on the pyvips tracker and I'll help there: https://github.com/jcupitt/pyvips/issues ... it's supposed to be: download the shared library, download a 64-bit Python (32-bit will not work), set your PATH to include the shared library DLL folder, `pip install pyvips`. – jcupitt Feb 27 '18 at 07:36
  • For PHP on Windows, it's a native extension, so you'll need a working compiler. pyvips on Windows is easier: it's pure Python and will work without a compiler, though it's slower. – jcupitt Feb 27 '18 at 07:41
  • You could also just use the vips command-line and invoke it from python/php. `vips crop huge.tif small.jpg 10 10 100 100` will very quickly crop a 100x100 pixel area at left/top 10x10 from `huge.tif` and write it to `small.jpg`. – jcupitt Feb 27 '18 at 07:44
  • @user894736 I have been working with PHP for about 15 years, but never dove into extensions or compiling anything. I found this resource https://wiki.php.net/internals/windows/stepbystepbuild about setting up a compiler (I have used VS for C++ projects before). As I understand it, because it's a native extension, it must be compiled into PHP? Is this the only way to add this type of functionality into PHP? I am using WAMP server, so I'm sure adding a custom build of PHP would get quite complicated... – Kenneth Mar 01 '18 at 16:42
  • It's one of those things that's easy on linux, but tricky on Windows, because there's no standard C compiler. Yes, native extensions are the only way you can call into external libraries from PHP (there's no FFI system). If you are not a compiler guru, I would just use shell_exec() to run the vips command-line. You'll need to make sure the vips bin directory is on your PATH. – jcupitt Mar 01 '18 at 16:57
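
For anyone taking the command-line route from Python rather than PHP, a minimal sketch might look like the following. It assumes the `vips` executable is on your PATH, uses the same `vips crop` argument order quoted in the comments above (input, output, left, top, width, height), and the helper names `vips_crop_command` and `crop_with_vips` are invented for this example.

```python
import subprocess


def vips_crop_command(source, target, left, top, width, height):
    """Build the argument list for `vips crop`, as quoted in the comments above."""
    return ["vips", "crop", source, target,
            str(left), str(top), str(width), str(height)]


def crop_with_vips(source, target, left, top, width, height):
    """Run vips to cut one region out of `source` and write it to `target`.

    Assumes the vips binary is installed and on PATH; raises
    subprocess.CalledProcessError if vips exits with an error.
    """
    command = vips_crop_command(source, target, left, top, width, height)
    subprocess.run(command, check=True)


# Show the command that would be run for the example from the comments.
print(" ".join(vips_crop_command("huge.tif", "small.jpg", 10, 10, 100, 100)))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with unusual file names. The same concern applies to `shell_exec()` in PHP, where arguments should be escaped before being placed in the command string.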