I'm trying to create a Dataset with multiple inputs that can be fed into a model with multiple inputs. With a single input it works fine: I can set the shape of the tensor with set_shape inside my Dataset.map function. With two inputs, I don't know which shape I should set the tensor to.
Related information:
x_img1 is a 10-dimensional 1024 by 1024 image
x_img2 is a 2D 32 by 64 image
Here is my code:
def read_image(path, rel):
    # blah blah blah, read somehow
    return image

def read_image1(path1, filter0):
    # blah blah blah, read somehow
    return image

def preprocess(x, y):
    def func(x, y):
        x = json.loads(x)
        x_img1 = read_image(x['path'], x['rel'])  # 3D image
        x_img2 = read_image(x['path-fork'], x['filter'])  # 2D image with different shape
        # image resizing will lose data
        y = tf.keras.utils.to_categorical(y, num_classes=len(set(df['label'].values)))  # todo: yeah, i'll optimize it never
        return (x_img1, x_img2), y

    _x, _y = tf.numpy_function(func, [x, y], [tf.float32, tf.float32])
    # _x.set_shape([256, 256, 3]) <--- previously i used to do this
    _y.set_shape([10])
    return _x, _y

# here `x` is an array of strings, and those strings are actually json/dictionary
def tf_dataset(x, y, batch=16):
    dataset = tf.data.Dataset.from_tensor_slices((x, y))
    dataset = dataset.shuffle(buffer_size=1000)
    dataset = dataset.map(preprocess)
    dataset = dataset.batch(batch)
    dataset = dataset.prefetch(16)
    return dataset
Now it's throwing an error saying:
InternalError: Graph execution error:
Unsupported object type numpy.ndarray
[[{{node PyFunc}}]]
[[IteratorGetNext]] [Op:__inference_train_function_30745]
Here is the notebook and the data (the same data, replicated): https://gist.github.com/maifeeulasad/25975541c888aa9bf865ad1827010907
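One direction that might resolve this, sketched under assumptions rather than verified against the notebook: as far as I can tell, func returns a nested tuple ((x_img1, x_img2), y) while Tout declares only two outputs, so the two differently shaped images would have to be packed into a single tensor. Returning three flat arrays, declaring three output dtypes, calling set_shape on each output, and regrouping them afterwards avoids that. In the sketch below, fake_read is a hypothetical stand-in for read_image, the shape (1024, 1024, 10) assumes that "10-dimensional" means 10 channels, and NUM_CLASSES = 10 is taken from the existing set_shape([10]) call.

```python
import json

import numpy as np
import tensorflow as tf

NUM_CLASSES = 10               # assumption: taken from _y.set_shape([10])
IMG1_SHAPE = (1024, 1024, 10)  # assumption: "10-dimensional" means 10 channels
IMG2_SHAPE = (32, 64)


def fake_read(path, shape):
    # Hypothetical stand-in for read_image: returns zeros of the given shape.
    return np.zeros(shape, dtype=np.float32)


def load_pair(x, y):
    # Inside tf.numpy_function, x arrives as bytes and y as a NumPy integer.
    meta = json.loads(x)
    img1 = fake_read(meta["path"], IMG1_SHAPE)
    img2 = fake_read(meta["path-fork"], IMG2_SHAPE)
    label = np.eye(NUM_CLASSES, dtype=np.float32)[int(y)]  # one-hot label
    # Flat tuple: one array per entry in Tout, no nesting.
    return img1, img2, label


def preprocess(x, y):
    img1, img2, label = tf.numpy_function(
        load_pair, [x, y], [tf.float32, tf.float32, tf.float32]
    )
    # numpy_function outputs have unknown static shape, so set each one.
    img1.set_shape(IMG1_SHAPE)
    img2.set_shape(IMG2_SHAPE)
    label.set_shape([NUM_CLASSES])
    # Regroup into the ((input_1, input_2), target) structure Keras expects.
    return (img1, img2), label


def tf_dataset(x, y, batch=16):
    dataset = tf.data.Dataset.from_tensor_slices((x, y))
    dataset = dataset.shuffle(buffer_size=1000)
    dataset = dataset.map(preprocess)
    dataset = dataset.batch(batch)
    dataset = dataset.prefetch(tf.data.AUTOTUNE)
    return dataset
```

With this structure each input keeps its own static shape, so there is no single shape to pick for a combined tensor; the order of the tuple (img1, img2) would need to match the order of the model's inputs.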