
I'm working on image segmentation with large satellite .JP2 images.

  • image shape: (10000, 10000, 13), because of 13 bands (13 different wavelength observations of the same area)

  • dtype: uint32

I want to build the most efficient TensorFlow pipeline, but I don't have much experience.

I want to be able to easily tune the number of bands used for training (RGB for the first training, then I'll try adding more bands to see if they increase performance).

I have two different pipelines in mind:

  • I transform my .JP2 into a (10000 x 10000 x 13) numpy array. Then the pipeline is fed with the desired slices (e.g. 128 x 128 x 3 if I want an RGB image).

  • Or, I preprocess my large image into 13 different folders (one per band). The input pipeline then uses the desired datasets to build the 128 x 128 x (1-13) input image.

Taking the big image and slicing it as I want, directly inside the TensorFlow pipeline, is more convenient because I just need a 10000 x 10000 x 13 numpy array as the training set. But I don't know if it is relevant/optimized/possible...
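For illustration, here is roughly what I have in mind for the first option (the file name and band indices below are just placeholders):

import numpy as np

full_image = np.load("full_image.npy")  # (10000, 10000, 13) array built from the .JP2
band_indices = [3, 2, 1]  # e.g. the RGB bands; which indices map to RGB depends on the sensor
patch = full_image[0:128, 0:128, band_indices]  # one (128, 128, 3) training patch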

What is the most optimized way to solve my problem? (I have a 1080 GPU with 11 GB.)

  • What exactly do you mean by 'RGB first, then I'll add IR'? Do you want different inputs at different stages of learning, or just to be able to change the shape of the input? I'd stick to the latter; you'll be able to test different combinations. Second question: what are you going to do with the labels if you slice the original image? And what net architecture are you going to use? – Sharky Apr 13 '19 at 14:03
  • I just want to be able to easily change my training data (using more or fewer bands). I'm going to use a U-Net model with a (128 x 128 x nb_desired_bands) input shape. – antoine Mathu Apr 13 '19 at 14:15

1 Answer


The most efficient approach is, almost always, the product of iterative improvement, so let's start with a solid example. For demonstration purposes I used a toy array filled with random values, split it into 13 bands and concatenated just the first 3. The first dimension is added for the batch_size.

import numpy as np

init_image = np.random.randint(0, 255, (1, 4, 4, 13))  # toy "image": 4x4 pixels, 13 bands
bands = np.split(init_image, 13, axis=3)  # 13 arrays of shape (1, 4, 4, 1)
image = np.concatenate((bands[0], bands[1], bands[2]), axis=3)  # keep 3 bands -> (1, 4, 4, 3)
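As a side note, if you later want an arbitrary subset of the 13 bands rather than the first three, NumPy fancy indexing does the same job as the split/concatenate above (the indices here are just an example):

band_indices = [0, 1, 2]  # swap in whichever bands you want to train on
image = init_image[..., band_indices]  # shape (1, 4, 4, len(band_indices))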

First, we create a dataset from this single large image, so that we can extract patches from it.

dataset = tf.data.Dataset.from_tensor_slices(image)
dataset = dataset.batch(1)
# If evaluated inside a session, this outputs an array of shape (1, 4, 4, 3)

Then we apply a map function to extract the patches. This is done with tf.image.extract_image_patches; its parameters ksizes, strides and rates define the geometric properties of a patch. You can find an excellent explanation here. In this case we take patches of size 2x2, directly adjacent to each other, so 4 patches in total. extract_image_patches places all patches into the last dimension, so a reshape is applied to get the desired output: 4 patches of shape 2x2 with 3 channels.

def parse_func(image):
    ksizes = [1, 2, 2, 1]   # patch size: 2x2
    strides = [1, 2, 2, 1]  # step of 2 between patches, so they don't overlap
    rates = [1, 1, 1, 1]    # no dilation within a patch
    patches = tf.image.extract_image_patches(image, ksizes, strides, rates, 'SAME')
    # Patches are flattened into the last dimension; reshape back into 4 patches of 2x2x3
    image_tensor = tf.reshape(patches, [-1, 2, 2, 3])
    return image_tensor
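To make the shapes concrete, here is the trace for the toy example above (assuming 'SAME' padding):

# (1, 4, 4, 3)   input image
# (1, 2, 2, 12)  after extract_image_patches: a 2x2 grid of patches,
#                each flattened to 2*2*3 = 12 values
# (4, 2, 2, 3)   after tf.reshape: 4 separate 3-channel patches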

Then we apply this function to the dataset and unbatch the output so that we can shuffle it and build a new batch out of the patches. In this case both the batch size and the shuffle buffer size are equal to the number of patches.

dataset = dataset.map(parse_func)
dataset = dataset.apply(tf.data.experimental.unbatch())
dataset = dataset.shuffle(4).batch(4)

This will output a batch of shape (4, 2, 2, 3). As you can see, the output consists of 4 patches of shape (2, 2, 3). If shuffle isn't applied, they will come in order from the top left corner to the bottom right.
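If you want to check this yourself, here is a minimal way to pull one batch out of the dataset (TF 1.x style, to match the code above):

iterator = dataset.make_one_shot_iterator()
next_batch = iterator.get_next()

with tf.Session() as sess:
    print(sess.run(next_batch).shape)  # (4, 2, 2, 3)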


Also, take a look at the official input pipeline performance guide.
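For instance, two of the guide's main recommendations, parallel map and prefetching, apply directly to this pipeline (the values below are illustrative):

dataset = dataset.map(parse_func, num_parallel_calls=4)  # parallelize patch extraction
dataset = dataset.prefetch(1)  # overlap preprocessing with training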

– Sharky