
I am trying to train a CNN where the input images have three channels (RGB) and the label (target) images are grayscale (1 channel). Both the input and label images are float32 TIFF files.

I got the list of image and label tile pairs as below:

import glob
import os

def get_train_test_lists(imdir, lbldir):
    imgs = glob.glob(imdir+"/*.tif")
    dset_list = []
    for img in imgs:
        filename_split = os.path.splitext(img) 
        filename_zero, fileext = filename_split 
        basename = os.path.basename(filename_zero) 
        dset_list.append(basename)
    
    x_filenames = []
    y_filenames = []
    for img_id in dset_list:
        x_filenames.append(os.path.join(imdir, "{}.tif".format(img_id)))
        y_filenames.append(os.path.join(lbldir, "{}.tif".format(img_id)))
    
    print("number of images: ", len(dset_list))
    return dset_list, x_filenames, y_filenames

train_list, x_train_filenames, y_train_filenames = get_train_test_lists(img_dir, label_dir)
test_list, x_test_filenames, y_test_filenames = get_train_test_lists(test_img_dir, test_label_dir)

from sklearn.model_selection import train_test_split
x_train_filenames, x_val_filenames, y_train_filenames, y_val_filenames = train_test_split(
    x_train_filenames, y_train_filenames, test_size=0.1, random_state=42)

num_train_examples = len(x_train_filenames)
num_val_examples = len(x_val_filenames)
num_test_examples = len(x_test_filenames)

To read the tiles into tensors, I first defined the image dimensions and batch size:

img_shape = (128, 128, 3)
batch_size = 2

Based on this link, I noticed that TensorFlow has no built-in decoder for TIFF images. tfio.experimental.image.decode_tiff can be used, but it decodes to a uint8 tensor.

Here is sample code for PNG images:

import tensorflow as tf

def _process_pathnames(fname, label_path):
  # We map this function onto each pathname pair  
  img_str = tf.io.read_file(fname)
  img = tf.image.decode_png(img_str, channels=3)

  label_img_str = tf.io.read_file(label_path)

  # decode_png returns a tensor of shape (h, w, c)
  label_img = tf.image.decode_png(label_img_str, channels=1)
  # The label image should have values between 0 and 9, indicating the pixel-wise
  # crop type class or background (0). We take the first channel only.
  label_img = label_img[:, :, 0]
  label_img = tf.expand_dims(label_img, axis=-1)
  return img, label_img

Is it possible to modify this code, using tf.convert_to_tensor or any other option, to get a float32 tensor from TIFF images? (I asked this question before, but I don't know how to integrate tf.convert_to_tensor with the code above.)

Sam

1 Answer


You can read almost any image format and convert it to a numpy array with the Pillow image package:

from PIL import Image
import numpy as np

img = Image.open("image.tiff")
img = np.array(img)

print(img.shape, img.dtype)
# (986, 1853, 4) uint8

You can integrate this into your code, then convert the numpy array to a TensorFlow tensor and apply the appropriate image conversions.
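
One possible way to do that integration is to wrap a plain-Python loader with tf.numpy_function inside a tf.data pipeline. The sketch below is only an illustration under assumptions: make_tif_dataset and load_fn are hypothetical names that do not appear in the question, and load_fn stands for whatever reads one file into a NumPy array (the Pillow call above, OpenCV, or another reader). Whether a given reader handles three-channel float32 TIFF files is something to check on your own data, which is why the loader is left pluggable.

```python
import numpy as np
import tensorflow as tf


def make_tif_dataset(x_paths, y_paths, load_fn, img_shape, batch_size):
    """Build a tf.data pipeline that reads image/label pairs with a Python loader.

    load_fn is a hypothetical callable: it takes a path string and returns a
    NumPy array, e.g. lambda p: np.array(Image.open(p)).
    """
    height, width, channels = img_shape

    def _load_pair(x_path, y_path):
        # tf.numpy_function hands string tensors to Python as bytes objects.
        img = np.asarray(load_fn(x_path.decode("utf-8")), dtype=np.float32)
        lbl = np.asarray(load_fn(y_path.decode("utf-8")), dtype=np.float32)
        if lbl.ndim == 2:
            lbl = lbl[..., np.newaxis]  # (h, w) -> (h, w, 1)
        return img, lbl

    def _process_pathnames(x_path, y_path):
        img, lbl = tf.numpy_function(
            _load_pair, [x_path, y_path], [tf.float32, tf.float32]
        )
        # numpy_function drops static shape information, so restore it here.
        img.set_shape((height, width, channels))
        lbl.set_shape((height, width, 1))
        return img, lbl

    # str() lets the function accept pathlib.Path objects as well as strings.
    ds = tf.data.Dataset.from_tensor_slices(
        ([str(p) for p in x_paths], [str(p) for p in y_paths])
    )
    return ds.map(_process_pathnames).batch(batch_size)
```

With the variables from the question, this would be called as make_tif_dataset(x_train_filenames, y_train_filenames, load_fn, img_shape, batch_size). The tensors keep the float32 values returned by the loader, so no uint8 round trip is involved.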


Side note: you can simplify your get_train_test_lists function considerably with the pathlib package (which, like os, is part of the Python 3 standard library but is much simpler to use).

from pathlib import Path

def get_train_test_lists(imdir, lbldir):
    x_filenames = list(Path(imdir).glob("*.tif"))
    y_filenames = [Path(lbldir) / f.name for f in x_filenames]
    dset_list = [f.stem for f in x_filenames]
    return dset_list, x_filenames, y_filenames

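Note that x_filenames and y_filenames are now lists of Path objects rather than strings. This shouldn't be an issue in most of your code, but if a later step needs plain strings (for example when building a TensorFlow dataset from them), convert each one with str().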

Louis Lac
  • Thanks for your reply. I knew from this link [https://stackoverflow.com/questions/67093424/how-to-decode-float32-tif-images-to-float32-tensor-in-tensoflow/67093521#67093521] that OpenCV or Pillow can be used to read the images, but I have a large number of images and I am not sure where in the code they should be called. In addition, I am not sure how `tf.convert_to_tensor` should be added to the code. – Sam Apr 15 '21 at 10:43
  • What is your actual issue then? Image conversion from `uint8` to `float32`? How and where to handle database reading? – Louis Lac Apr 15 '21 at 12:15
  • `uint8` is an integer type covering the range 0 to 255 inclusive. CNNs usually accept tensors of type `float32` normalized to the range 0 to 1, optionally with ImageNet normalization. To convert a `uint8` image to a normalized `float32` one, divide it by 255. – Louis Lac Apr 15 '21 at 12:22
  • My first issue is related to handling the images. The images that I am using are originally `float32` and contain decimal values. They are grayscale images with continuous values: [0 to 790.65], [-2.74174 to 2.4126], [150.87 to 260.45], [-32.927 to 69.333]. The first three are the inputs forming the `rgb` image and the last one is the target. I am confused about how they should be normalized and fed into the CNN. Is it necessary to convert them to `uint8`? – Sam Apr 16 '21 at 12:00
  • I plan to divide the images into 128x128 tiles, but first I am not sure how to normalize them since they are already float32. I would really appreciate it if you could point me in the right direction. – Sam Apr 16 '21 at 12:03
  • Your RGB value ranges are quite unusual (the standard range is [0, 255], or [0, 1] if normalized). If your network is not pre-trained or you are not doing any transfer learning, you can keep the original `float32` values of your images without normalizing. – Louis Lac Apr 16 '21 at 14:06
  • `uint8` (integers in the range [0, 255]) is usually the data format used to represent the RGB values of images. The CNN operates on `float32` values only ("decimal" values, as you say), sometimes in a normalized range (for instance [0, 1]), so the conversion goes: `uint8 → float32 → float32 in [0, 1]`. – Louis Lac Apr 16 '21 at 14:11
  • If your images are not `uint8`, you must find out the exact format of their values. – Louis Lac Apr 16 '21 at 14:12
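
If normalization does turn out to be wanted for float32 inputs like these, dividing by 255 would not apply, since the channels are not in [0, 255]. One common alternative is per-channel min-max scaling. The sketch below is only an illustration: minmax_per_channel is a hypothetical helper, not something from the thread, and it assumes the per-channel bounds are known in advance (here the three input ranges quoted in the comments above).

```python
import numpy as np


def minmax_per_channel(img, mins, maxs):
    """Rescale each channel of an (h, w, c) float image to the range [0, 1].

    mins and maxs hold one bound per channel. They are assumed to be computed
    once over the whole dataset rather than separately for each tile.
    """
    img = np.asarray(img, dtype=np.float32)
    mins = np.asarray(mins, dtype=np.float32)
    maxs = np.asarray(maxs, dtype=np.float32)
    # Broadcasting applies each channel's bounds along the last axis.
    return (img - mins) / (maxs - mins)


# Input channel ranges as quoted in the comments.
channel_mins = [0.0, -2.74174, 150.87]
channel_maxs = [790.65, 2.4126, 260.45]

tile = np.zeros((128, 128, 3), dtype=np.float32)
scaled = minmax_per_channel(tile, channel_mins, channel_maxs)
```

The result stays float32 throughout, so no conversion to uint8 is needed at any point.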