I downloaded a 150 GB satellite .jp2 image, of which I only need a small section at a time. How would I go about tiling the image into manageable chunks? Just extracting a part of the image would also be enough.

As I'm only somewhat familiar with Python, I looked at the Pillow and OpenCV libraries, but without success, as the image resolution exceeds their limits. I also looked into OpenSlide for Python, but couldn't get rid of an error (`Could not find module 'libopenslide-0.dll'`).

Edigorin
  • If you drop the `large-data` tag and add tag `vips` instead you might get an answer using `vips` which is very frugal on resources. Also, you might look at `stream` from **ImageMagick**, example here https://stackoverflow.com/a/70461501/2836621 – Mark Setchell Mar 01 '22 at 16:25

2 Answers

libvips can process huge images efficiently.

For example, with this 2.8 GB test image:

    $ vipsheader 9235.jp2
    9235.jp2: 107568x79650 uchar, 4 bands, srgb, jp2kload
    $ ls -l 9235.jp2
    -rw-r--r-- 1 john john 2881486848 Mar  1 22:37 9235.jp2

I see:

    $ /usr/bin/time -f %M:%e \
        vips crop 9235.jp2 x.jpg 10000 10000 1000 1000
    190848:0.45

So it takes a 1,000 x 1,000 pixel chunk out of a roughly 110,000 x 80,000 pixel jp2 image in about 0.5 s and needs under 200 MB of memory.
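If the `190848:0.45` line looks cryptic: with GNU `time`, `%M` is the peak resident set size in kilobytes and `%e` is the elapsed wall-clock time in seconds. A small sketch for converting such a line (the helper name `parse_time_output` is made up for illustration):

```python
def parse_time_output(line):
    """Split a '%M:%e' line from GNU time into (peak MiB, elapsed seconds)."""
    max_rss_kib, elapsed_s = line.strip().split(":")
    # %M is reported in kilobytes, so divide by 1024 to get MiB
    return int(max_rss_kib) / 1024, float(elapsed_s)


peak_mib, seconds = parse_time_output("190848:0.45")
print(f"{peak_mib:.0f} MiB peak, {seconds} s elapsed")
```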

There are bindings for Python, Ruby, Node, etc., so you don't have to use the CLI.

In Python you could write:

    import pyvips

    # Open the file; pixels are only decoded when they are needed
    image = pyvips.Image.new_from_file("9235.jp2")
    # crop(left, top, width, height)
    tile = image.crop(10000, 10000, 1000, 1000)
    tile.write_to_file("x.jpg")
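To tile the whole image rather than pull out one region, the same `crop` call can be run in a loop. This is only a rough sketch: the helper names `tile_boxes` and `save_tiles`, the 4096 pixel tile size, and the output file naming are all made up for illustration, and I have not timed it on a file of your size.

```python
from pathlib import Path


def tile_boxes(width, height, tile_size):
    """Yield (left, top, tile_width, tile_height) boxes covering the image."""
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            # Tiles on the right and bottom edges are clipped to the image
            yield (left, top,
                   min(tile_size, width - left),
                   min(tile_size, height - top))


def save_tiles(path, out_dir, tile_size=4096):
    """Write every tile of the image at `path` into `out_dir` as JPEG."""
    import pyvips  # imported here so tile_boxes() is usable without libvips

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    image = pyvips.Image.new_from_file(path)
    for left, top, w, h in tile_boxes(image.width, image.height, tile_size):
        tile = image.crop(left, top, w, h)
        tile.write_to_file(str(out / f"tile_{left}_{top}.jpg"))
```

Calling `save_tiles("9235.jp2", "tiles")` would then write one JPEG per tile, named after its top-left corner.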

It does depend a bit on the jp2 image you are reading. Some are untiled (!!), and reading a section out of those can be very slow.

There are Windows binaries as well; check the "download" page.

If you're on Linux, vipsdisp can view huge images like this very quickly.

jcupitt
  • Thanks! Using pyvips I also finally found the resolution of my image (573107x818593). I executed your Python code without errors, but the output image I get is completely black. Any idea? – Edigorin Mar 02 '22 at 12:32
  • I don't really know how to get vipsheader output, but here's what I found: Width: 580000 Height 825000 Bands: 3 Format: uchar Interpretation: srgb Coding: none Filename: Luxembourg-2020_ortho10cm_RVB_LUREF.jp2 XYOffset: 0 0 XYRes: 1.0 1.0 n-pages: 15. I tried the same code on a smaller sample jp2 image and that worked as expected. – Edigorin Mar 02 '22 at 14:46
  • The exported JPGs have a bit depth of 24. I don't know how to get the bit depth of a jp2 file. – Edigorin Mar 02 '22 at 14:54
  • Thanks for all the help. I think my image was not a standard .jp2 image, because it was also completely black when opened with vipsdisp. As it is a satellite image, I found out later that I can open it directly with QGIS. – Edigorin Mar 04 '22 at 09:11
  • [Satellite image of Luxembourg 2020 - 25 GB](https://s3.eu-central-1.amazonaws.com/download.data.public.lu/resources/orthophoto-officelle-du-grand-duche-de-luxembourg-edition-2020/20210602-110516/Luxembourg-2020_ortho10cm_RVB_LUREF.jp2) This one always exported black. – Edigorin Mar 04 '22 at 13:09
  • [Satellite image of Luxembourg 2021 - 150 GB](https://s3.eu-central-1.amazonaws.com/download.data.public.lu/resources/orthophoto-officielle-du-grand-duche-de-luxembourg-edition-2021/20220125-181252/ortho_2021_rgb.jp2) This one always exported transparent. – Edigorin Mar 04 '22 at 13:10
  • I found the error in openjpeg: https://github.com/uclouvain/openjpeg/blob/master/src/lib/openjp2/jp2.c#L501 It looks like it's just a limit of their decoder. I doubt they'll ever support images like this. I would write to the people who made the image and ask them to recode it in a more widely supported JPEG 2000 variant. I'll open an issue on openjpeg with an explanation. – jcupitt Mar 05 '22 at 11:18
  • openjpeg issue for reference: https://github.com/uclouvain/openjpeg/issues/1414 – jcupitt Mar 05 '22 at 11:30
  • A warning about the 25 GB map of Luxembourg: it was created by ECW 3.3, a notoriously buggy version. I can verify that the PLT markers in the file header, used for random access into the image, are corrupt and unusable. – Jacko Mar 09 '22 at 00:54
  • Ah, perhaps that's why openjpeg chokes. Thank you for the heads up. @Edigorin, perhaps you should ask public.lu to recode it. – jcupitt Mar 09 '22 at 10:38
  • @jcupitt openjpeg doesn't make use of PLT markers, so that's probably not it. The problem there is probably that the library doesn't handle single-tile images over 2^32 bytes in size. If the image was also encoded with tiling, then it should be tractable. – Jacko Mar 09 '22 at 20:05

The Grok JPEG 2000 toolkit can decompress regions of extremely large images such as the 150 GB image you linked to.

Sample command:

    grk_decompress -i FOO.jp2 -o FOO.tif -d 10000,10000,15000,15000 -v

This decompresses the region (10000,10000,15000,15000) to TIFF format.
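If you want to drive this from Python, here is a minimal sketch that builds the same command. It assumes the four numbers after `-d` are two corners (x0,y0,x1,y1), as in openjpeg's `opj_decompress`, which this answer does not spell out, so please check `grk_decompress -h` first. The helper names `crop_to_region` and `grk_command` are made up for illustration.

```python
def crop_to_region(left, top, width, height):
    """Turn a (left, top, width, height) box into an 'x0,y0,x1,y1' string.

    Assumption: -d expects two corners, not a corner plus a size.
    """
    return f"{left},{top},{left + width},{top + height}"


def grk_command(src, dst, left, top, width, height):
    """Build the argument list for the sample grk_decompress call."""
    region = crop_to_region(left, top, width, height)
    return ["grk_decompress", "-i", src, "-o", dst, "-d", region, "-v"]


cmd = grk_command("FOO.jp2", "FOO.tif", 10000, 10000, 5000, 5000)
print(" ".join(cmd))
```

Under that assumption, the sample command extracts a 5000 x 5000 pixel block. The list can be passed to `subprocess.run(cmd, check=True)` once Grok is installed.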

Jacko
  • Huh, I'd not seen this jp2k library, it looks very nice! We'd love to use it for libvips and throw out openjpeg, but I think the AGPL licence would make it impossible, sadly. I don't suppose you've considered other licences? – jcupitt Apr 16 '22 at 10:55
  • Thanks - yes, I have considered it at various times. A more permissive license would create a larger user base, but abandon the "free as in free speech" copyright. In the end, I am sticking with the current license. – Jacko Apr 17 '22 at 13:19