Unable to run CycleGAN example from TensorFlow outside Google Colab

Question

I'm following the CycleGAN tutorial on TensorFlow's website. It works fine when I run the code in Colab, but not when I download the Jupyter notebook and convert it with jupyter nbconvert:

jupyter nbconvert --to script cyclegan.ipynb

When I run the code with python cyclegan.py, I get an error:

File "C:\Users\myname\Desktop\PROJECT\GanTutorial\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 549, in rename_v2
    compat.path_to_bytes(src), compat.path_to_bytes(dst), overwrite)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc5 in position 167: invalid continuation byte

I can't get rid of this error. Has anyone successfully run the example outside Google Colab?
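For readers who hit the same traceback: the message means that some bytes which are not valid UTF-8 were decoded as UTF-8. Byte 0xc5 is, for example, how the Windows code page cp1252 encodes "Å", so one possible trigger (an assumption, not confirmed anywhere in this thread) is a non-ASCII character in a file path. The sketch below reproduces the same kind of error text with a made-up path; it does not use TensorFlow and is not the asker's code.

```python
# Hypothetical path containing a non-ASCII character ("\u00c5" is "Å").
# This is an illustration only, not the asker's real path.
path = "C:/Users/\u00c5sa/tensorflow_datasets"

# Encode the path the way a legacy Windows code page (cp1252) would.
raw = path.encode("cp1252")

try:
    # Decoding those bytes as UTF-8 fails: 0xc5 announces a two-byte
    # UTF-8 sequence, but the next byte ("s") is not a continuation byte.
    raw.decode("utf-8")
    message = ""
except UnicodeDecodeError as err:
    message = str(err)

print(message)
```

If the message printed here resembles the one in the traceback, it may be worth checking whether any directory involved (user profile, project folder, dataset cache) contains such a character.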

UPDATE: After trying to use the trained model on some of my own files, I got this error message:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected 'tf.Tensor(False, shape=(), dtype=bool)' to be true. Summarized data: b'No files matched pattern:

Here is my complete code:

#!/usr/bin/env python
import subprocess
subprocess.run(["pip", "install", "git+https://github.com/tensorflow/examples.git"])


import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow_examples.models.pix2pix import pix2pix

import os
import time
import matplotlib.pyplot as plt
from IPython.display import clear_output

AUTOTUNE = tf.data.AUTOTUNE


GPUS = tf.config.experimental.list_physical_devices('GPU')
if GPUS:
    try:
        for GPU in GPUS:
            tf.config.experimental.set_memory_growth(GPU, True)
            logical_gpus = tf.config.experimental.list_logical_devices('GPU')
            print(len(GPUS), "Physical GPUs,", len(logical_gpus), "Logical GPUs") 
    except RuntimeError as  RE:
        print(RE)

dataset, metadata = tfds.load('cycle_gan/horse2zebra',
                              with_info=True, as_supervised=True)

train_horses, train_zebras = dataset['trainA'], dataset['trainB']
test_horses, test_zebras = dataset['testA'], dataset['testB']


BUFFER_SIZE = 256
BATCH_SIZE = 1
IMG_WIDTH = 256
IMG_HEIGHT = 256


def random_crop(image):
  cropped_image = tf.image.random_crop(
      image, size=[IMG_HEIGHT, IMG_WIDTH, 3])

  return cropped_image

# normalizing the images to [-1, 1]
def normalize(image):
  image = tf.cast(image, tf.float32)
  image = (image / 127.5) - 1
  return image

def random_jitter(image):
  # resizing to 286 x 286 x 3
  image = tf.image.resize(image, [286, 286],
                          method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)

  # randomly cropping to 256 x 256 x 3
  image = random_crop(image)

  # random mirroring
  image = tf.image.random_flip_left_right(image)

  return image


def preprocess_image_train(image, label):
  image = random_jitter(image)
  image = normalize(image)
  return image


def preprocess_image_test(image, label):
  image = normalize(image)
  return image


train_horses = train_horses.map(
    preprocess_image_train, num_parallel_calls=AUTOTUNE).cache().shuffle(
    BUFFER_SIZE).batch(1)

train_zebras = train_zebras.map(
    preprocess_image_train, num_parallel_calls=AUTOTUNE).cache().shuffle(
    BUFFER_SIZE).batch(1)

test_horses = test_horses.map(
    preprocess_image_test, num_parallel_calls=AUTOTUNE).cache().shuffle(
    BUFFER_SIZE).batch(1)

test_zebras = test_zebras.map(
    preprocess_image_test, num_parallel_calls=AUTOTUNE).cache().shuffle(
    BUFFER_SIZE).batch(1)


sample_horse = next(iter(train_horses))
sample_zebra = next(iter(train_zebras))


plt.subplot(121)
plt.title('Horse')
plt.imshow(sample_horse[0] * 0.5 + 0.5)

plt.subplot(122)
plt.title('Horse with random jitter')
plt.imshow(random_jitter(sample_horse[0]) * 0.5 + 0.5)


plt.subplot(121)
plt.title('Zebra')
plt.imshow(sample_zebra[0] * 0.5 + 0.5)

plt.subplot(122)
plt.title('Zebra with random jitter')
plt.imshow(random_jitter(sample_zebra[0]) * 0.5 + 0.5)



OUTPUT_CHANNELS = 3

generator_g = pix2pix.unet_generator(OUTPUT_CHANNELS, norm_type='instancenorm')
generator_f = pix2pix.unet_generator(OUTPUT_CHANNELS, norm_type='instancenorm')

discriminator_x = pix2pix.discriminator(norm_type='instancenorm', target=False)
discriminator_y = pix2pix.discriminator(norm_type='instancenorm', target=False)


to_zebra = generator_g(sample_horse)
to_horse = generator_f(sample_zebra)
plt.figure(figsize=(8, 8))
contrast = 8

imgs = [sample_horse, to_zebra, sample_zebra, to_horse]
title = ['Horse', 'To Zebra', 'Zebra', 'To Horse']

for i in range(len(imgs)):
  plt.subplot(2, 2, i+1)
  plt.title(title[i])
  if i % 2 == 0:
    plt.imshow(imgs[i][0] * 0.5 + 0.5)
  else:
    plt.imshow(imgs[i][0] * 0.5 * contrast + 0.5)
plt.show()


plt.figure(figsize=(8, 8))

plt.subplot(121)
plt.title('Is a real zebra?')
plt.imshow(discriminator_y(sample_zebra)[0, ..., -1], cmap='RdBu_r')

plt.subplot(122)
plt.title('Is a real horse?')
plt.imshow(discriminator_x(sample_horse)[0, ..., -1], cmap='RdBu_r')

plt.show()


LAMBDA = 10

loss_obj = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss(real, generated):
  real_loss = loss_obj(tf.ones_like(real), real)

  generated_loss = loss_obj(tf.zeros_like(generated), generated)

  total_disc_loss = real_loss + generated_loss

  return total_disc_loss * 0.5

def generator_loss(generated):
  return loss_obj(tf.ones_like(generated), generated)

def calc_cycle_loss(real_image, cycled_image):
  loss1 = tf.reduce_mean(tf.abs(real_image - cycled_image))
  
  return LAMBDA * loss1

def identity_loss(real_image, same_image):
  loss = tf.reduce_mean(tf.abs(real_image - same_image))
  return LAMBDA * 0.5 * loss

generator_g_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
generator_f_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)

discriminator_x_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
discriminator_y_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)


checkpoint_path = "./checkpoints/train"

ckpt = tf.train.Checkpoint(generator_g=generator_g,
                           generator_f=generator_f,
                           discriminator_x=discriminator_x,
                           discriminator_y=discriminator_y,
                           generator_g_optimizer=generator_g_optimizer,
                           generator_f_optimizer=generator_f_optimizer,
                           discriminator_x_optimizer=discriminator_x_optimizer,
                           discriminator_y_optimizer=discriminator_y_optimizer)

ckpt_manager = tf.train.CheckpointManager(ckpt, checkpoint_path, max_to_keep=5)

# if a checkpoint exists, restore the latest checkpoint.
if ckpt_manager.latest_checkpoint:
  ckpt.restore(ckpt_manager.latest_checkpoint)
  print ('Latest checkpoint restored!!')


# ## Training
# 
EPOCHS = 4
def generate_images(model, test_input):
  prediction = model(test_input)
    
  plt.figure(figsize=(12, 12))

  display_list = [test_input[0], prediction[0]]
  title = ['Input Image', 'Predicted Image']

  for i in range(2):
    plt.subplot(1, 2, i+1)
    plt.title(title[i])
    # getting the pixel values between [0, 1] to plot it.
    plt.imshow(display_list[i] * 0.5 + 0.5)
    plt.axis('off')
  plt.show()

@tf.function
def train_step(real_x, real_y):
  # persistent is set to True because the tape is used more than
  # once to calculate the gradients.
  with tf.GradientTape(persistent=True) as tape:
    # Generator G translates X -> Y
    # Generator F translates Y -> X.
    
    fake_y = generator_g(real_x, training=True)
    cycled_x = generator_f(fake_y, training=True)

    fake_x = generator_f(real_y, training=True)
    cycled_y = generator_g(fake_x, training=True)

    # same_x and same_y are used for identity loss.
    same_x = generator_f(real_x, training=True)
    same_y = generator_g(real_y, training=True)

    disc_real_x = discriminator_x(real_x, training=True)
    disc_real_y = discriminator_y(real_y, training=True)

    disc_fake_x = discriminator_x(fake_x, training=True)
    disc_fake_y = discriminator_y(fake_y, training=True)

    # calculate the loss
    gen_g_loss = generator_loss(disc_fake_y)
    gen_f_loss = generator_loss(disc_fake_x)
    
    total_cycle_loss = calc_cycle_loss(real_x, cycled_x) + calc_cycle_loss(real_y, cycled_y)
    
    # Total generator loss = adversarial loss + cycle loss
    total_gen_g_loss = gen_g_loss + total_cycle_loss + identity_loss(real_y, same_y)
    total_gen_f_loss = gen_f_loss + total_cycle_loss + identity_loss(real_x, same_x)

    disc_x_loss = discriminator_loss(disc_real_x, disc_fake_x)
    disc_y_loss = discriminator_loss(disc_real_y, disc_fake_y)
  
  # Calculate the gradients for generator and discriminator
  generator_g_gradients = tape.gradient(total_gen_g_loss, 
                                        generator_g.trainable_variables)
  generator_f_gradients = tape.gradient(total_gen_f_loss, 
                                        generator_f.trainable_variables)
  
  discriminator_x_gradients = tape.gradient(disc_x_loss, 
                                            discriminator_x.trainable_variables)
  discriminator_y_gradients = tape.gradient(disc_y_loss, 
                                            discriminator_y.trainable_variables)
  
  # Apply the gradients to the optimizer
  generator_g_optimizer.apply_gradients(zip(generator_g_gradients, 
                                            generator_g.trainable_variables))

  generator_f_optimizer.apply_gradients(zip(generator_f_gradients, 
                                            generator_f.trainable_variables))
  
  discriminator_x_optimizer.apply_gradients(zip(discriminator_x_gradients,
                                                discriminator_x.trainable_variables))
  
  discriminator_y_optimizer.apply_gradients(zip(discriminator_y_gradients,
                                                discriminator_y.trainable_variables))


for epoch in range(EPOCHS):
  start = time.time()

  n = 0
  for image_x, image_y in tf.data.Dataset.zip((train_horses, train_zebras)):
    train_step(image_x, image_y)
    if n % 10 == 0:
      print ('.', end='')
    n += 1

  clear_output(wait=True)
  # Using a consistent image (sample_horse) so that the progress of the model
  # is clearly visible.
  generate_images(generator_g, sample_horse)

  if (epoch + 1) % 5 == 0:
    ckpt_save_path = ckpt_manager.save()
    print ('Saving checkpoint for epoch {} at {}'.format(epoch+1,
                                                         ckpt_save_path))

  print ('Time taken for epoch {} is {} sec\n'.format(epoch + 1,
                                                      time.time()-start))


# ## Generate using test dataset

# Run the trained model on the test dataset
# for inp in test_horses.take(5):
#   generate_images(generator_g, inp)



import subprocess
subprocess.run(["pip", "install", "git+https://github.com/tensorflow/examples.git"])

import tensorflow as tf


import tensorflow_datasets as tfds
from tensorflow_examples.models.pix2pix import pix2pix

from glob import glob
import os 

image_path_list = glob('/content/horse/*.jpg')
horse_img = tf.data.Dataset.list_files(image_path_list)

for i in horse_img:
    print(i)

tf.Tensor(b'/content/horse/horse4.jpg', shape=(), dtype=string)
tf.Tensor(b'/content/horse/horse3.jpg', shape=(), dtype=string)
tf.Tensor(b'/content/horse/horse2.jpg', shape=(), dtype=string)
tf.Tensor(b'/content/horse/horse1.jpg', shape=(), dtype=string)


def normalize(image):
  image = tf.cast(image, tf.float32)
  image = (image / 127.5) - 1
  return image

def load_images(path):
    image = tf.io.read_file(path)
    image = tf.io.decode_image(image, expand_animations = False)
    return image

def preprocess_image_test(image):
    image = tf.image.resize(image, [256, 256])
    image = normalize(image)
    return image

horse_img = horse_img.map(load_images)
horse_img = horse_img.map(
    preprocess_image_test, num_parallel_calls=AUTOTUNE).cache().shuffle(
    BUFFER_SIZE).batch(1)

for i in horse_img:
    print(i.shape)

(1, 256, 256, 3)
(1, 256, 256, 3)
(1, 256, 256, 3)
(1, 256, 256, 3)

for inp in horse_img.take(4):
  generate_images(generator_g, inp)

Innat · Answer 1 · 2021-05-07T07:37:39.567

Update

As mentioned in the comments, here are the working files, both the notebook and the Python script. The notebook was tested on Colab and the Python file on a local machine. FYI, the error you ran into most likely occurs because no image files were loaded. So make sure the files are loaded properly before processing.

You don't need any conversion tools here. Download the notebook from the example page, open it, and save it in .py format:

File -> Download -> Python (.py)

This saves a .py file. Open it; you may need to change a few things. At the beginning, do the following:

# not needed, so comment this out
# get_ipython().system('pip install -q git+https://github.com/tensorflow/examples.git')

# needed, so add this
import subprocess
subprocess.run(["pip", "install", "git+https://github.com/tensorflow/examples.git"])
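A slightly more defensive variant, offered as a suggestion rather than as part of the original answer: calling pip through sys.executable installs the package into the same interpreter (for example, the virtual environment) that runs the script. The names pip_install_command and install are hypothetical helpers.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command that targets the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package]


def install(package):
    """Run pip; check=True raises CalledProcessError if the install fails."""
    subprocess.run(pip_install_command(package), check=True)
```

Calling install("git+https://github.com/tensorflow/examples.git") at the top of the script would then take the place of the plain ["pip", "install", ...] call.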

From here on, you can run this .py script as usual. One thing to keep in mind: because the script contains plotting calls (which are meant to run cell by cell in a notebook), external windows will pop up. You need to close each one for the script to continue; it will not run further until you do.

Here are a few additional things I had to change on my local machine. I'm using TensorFlow 2.4.1 on Windows with an RTX 2070 8GB GPU. I changed BUFFER_SIZE from 1000 directly to 256. I also ran into the following issue, though it didn't cause any training problems:

CUBLAS_STATUS_ALLOC_FAILED

Luckily the program didn't crash, but to be safe I added the following piece of code at the very beginning of the .py file.

GPUS = tf.config.experimental.list_physical_devices('GPU')
if GPUS:
    try:
        # Memory growth must be set before the GPUs are initialized.
        for GPU in GPUS:
            tf.config.experimental.set_memory_growth(GPU, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(GPUS), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as RE:
        # Raised if the GPUs have already been initialized.
        print(RE)

Jokes aside, this is the very first time I've seen my CUDA cores used at a constant 96%: the first stage of happiness, followed by getting better results. >^-^<

Here are some predictions after 4 epochs of training.
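Since the fix reported in the comments was upgrading tensorflow-datasets (the version used for this answer was TF DS 4.2.0), it can help to print the installed versions before debugging further. This is a minimal sketch that relies only on the standard library and assumes Python 3.8 or newer; installed_version is a made-up helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as published on PyPI.
for name in ("tensorflow", "tensorflow-datasets"):
    print(name, installed_version(name))
```

If the reported tensorflow-datasets version is old, pip install --upgrade tensorflow-datasets is the usual way to update it.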

Inference / Prediction

Let's try it on some samples. We scraped some horse images from here, and we will pass them to the generator.

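from glob import glob

# Collect the local test images. list_files fails with "No files matched
# pattern" if this list is empty, so make sure the directory contains .jpg files.
# AUTOTUNE, BUFFER_SIZE, generate_images and generator_g come from the training script.
image_path_list = glob('/content/horse/*.jpg')
horse_img = tf.data.Dataset.list_files(image_path_list)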

for i in horse_img:
    print(i)

tf.Tensor(b'/content/horse/2.jpg', shape=(), dtype=string)
tf.Tensor(b'/content/horse/3.jpg', shape=(), dtype=string)
tf.Tensor(b'/content/horse/4.jpg', shape=(), dtype=string)
tf.Tensor(b'/content/horse/1.jpg', shape=(), dtype=string)

def normalize(image):
  image = tf.cast(image, tf.float32)
  image = (image / 127.5) - 1
  return image

def load_images(path):
    image = tf.io.read_file(path)
    image = tf.io.decode_image(image, expand_animations = False)
    return image

def preprocess_image_test(image):
    image = tf.image.resize(image, [256, 256])
    image = normalize(image)
    return image

horse_img = horse_img.map(load_images)
horse_img = horse_img.map(
    preprocess_image_test, num_parallel_calls=AUTOTUNE).cache().shuffle(
    BUFFER_SIZE).batch(1)

for i in horse_img:
    print(i.shape)

(1, 256, 256, 3)
(1, 256, 256, 3)
(1, 256, 256, 3)
(1, 256, 256, 3)

for inp in horse_img.take(4):
  generate_images(generator_g, inp)

edited May 07 '21 at 07:37

answered Mar 29 '21 at 04:06

Innat


Thanks! I downloaded the .py file and that worked without conversion. However, shortly after I ran the script I got the exact same error message as in my first post – acroscene Mar 29 '21 at 06:49
Let me know if you've any queries. – Innat Mar 29 '21 at 07:02
hmm. what do you mean? :) – acroscene Mar 29 '21 at 07:08
which error message have you encountered now? `CUBLAS_STATUS_ALLOC_FAILED`? – Innat Mar 29 '21 at 07:12
no the same as above: `File "C:\Users\myname\Desktop\PROJECT\GanTutorial\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 549, in rename_v2 compat.path_to_bytes(src), compat.path_to_bytes(dst), overwrite) UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc5 in position 167: invalid continuation byte ` – acroscene Mar 29 '21 at 07:54
Please run these [scripts](https://drive.google.com/file/d/1_K7voxd_RynoSwE81Id9jw5wUf5_6CyD/view?usp=sharing), no need to change anything, just `python cyclegan.py`. It should work there as it worked here. Otherwise there is something else wrong in your environment. – Innat Mar 29 '21 at 07:58
same error. I guess there is something in my environment as you said. hm. – acroscene Mar 29 '21 at 08:39
Oh. You can try it on another PC/laptop, if available, to make sure of that. – Innat Mar 29 '21 at 08:47
And just so you know, you can check your tensorflow-datasets version; mine is `TF DS 4.2.0` – Innat Mar 29 '21 at 08:48
Thanks. I updated tensorflow-datasets and now it works! Thanks for your help – acroscene Mar 29 '21 at 09:31
Glad to know. If this answer satisfies you, please mark it as the right answer. -) – Innat Mar 29 '21 at 09:35
One additional question. I have now trained the model and got a predicted image. I found the input image in the trainA folder of horse2zebra. Is it possible to pick a specific input image and process just that one? – acroscene Mar 29 '21 at 11:03
Yes, we can. Please see the updated answer. I added an inference section where we pick samples from the local machine, preprocess them, and pass them to the generator for prediction. Hope that helps. – Innat Mar 29 '21 at 22:42
@acroscene let me know if there's anything I need to add. Thanks. – Innat Mar 31 '21 at 02:57
Thanks. I haven't had time to try it out yet, but I will implement it next week. Thanks – acroscene Apr 01 '21 at 07:36
So basically I save a couple of images, link them with `image_path_list = glob('/content/horse/*.jpg')`, and put the code below all the rest? It takes a really long time per epoch, but I guess that is all right :) I just want to make sure I'm doing it right so I don't have to wait for the wrong result. – acroscene Apr 14 '21 at 21:35
Yes. `/content/horse` is a directory where I put some `jpg` images. However, you can also test it on Colab, where it should run faster. – Innat Apr 14 '21 at 21:41
Hm. After a long wait for the training to finish, I got this error message: `tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected 'tf.Tensor(False, shape=(), dtype=bool)' to be true. Summarized data: b'No files matched pattern:` I have pasted my complete code above in my first post. Do I need to repeat the training process every time I try to generate new images, or can I save the trained model so I don't have to do it again? – acroscene Apr 19 '21 at 13:11
Get the files from [here](https://drive.google.com/drive/folders/11T85GuIYkYnQpZNHwebu0fWPILKnFLNZ?usp=sharing). I've tested the notebook file on Colab and the Python file on my local machine. Everything works without any issue, and nothing has changed from my answer. The error you ran into is most likely because the input images were not loaded properly at inference time. So please make sure they're loaded. – Innat May 06 '21 at 13:10
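On the question of whether training has to be repeated every time: the script in the question already creates a tf.train.CheckpointManager, but it only saves when (epoch + 1) % 5 == 0, and with EPOCHS = 4 that condition is never met. The arithmetic can be checked in plain Python; epochs_that_save is a hypothetical helper written for this illustration only.

```python
def epochs_that_save(total_epochs, save_every):
    """List the 1-based epochs at which (epoch + 1) % save_every == 0 holds."""
    return [epoch + 1 for epoch in range(total_epochs)
            if (epoch + 1) % save_every == 0]


print(epochs_that_save(4, 5))   # settings from the question's script: []
print(epochs_that_save(10, 5))  # a longer run: [5, 10]
```

With no checkpoint written, the ckpt_manager.latest_checkpoint test at startup finds nothing to restore, so every run starts from scratch. Lowering the save interval, or calling ckpt_manager.save() once after the training loop, would likely leave a checkpoint that a later run can restore before doing inference.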
