I am using a TensorFlow Dataset to read data from my hard drive. The data is stored in NumPy arrays, and the paths to those arrays are stored in a text file. When creating the dataset, I use the dataset.map() function to map each path to a NumPy array.
Here are the relevant parts of my code:
def parser(path):
    x = np.load(path)
    return x
paths = ['data1.npy', 'data2.npy', 'data3.npy', 'data4.npy', ... ]
dataset = tf.data.Dataset.from_tensor_slices((paths))
dataset = dataset.map(map_func=parser)
However, this gives the following error:
AttributeError: 'Tensor' object has no attribute 'read'
The error refers to the line x = np.load(path). So it seems that I cannot load a NumPy array this way in my parser function, because path is not actually a string but a Tensor.
What is the correct way to do this? I want to avoid using TFRecords if possible.
I have also tried wrapping the load function as follows:
x = tf.py_func(np.load(path))
But this gives me the same error on that line:
AttributeError: 'Tensor' object has no attribute 'read'
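For reference, here is a minimal sketch of one way the wrapping could work. It is not a definitive fix. It assumes TensorFlow 2.x with eager execution, where tf.numpy_function is available; in older 1.x code, tf.py_func takes the same three arguments. The wrapper expects the Python function itself, a list of input tensors, and the output dtype as separate arguments. Writing tf.py_func(np.load(path)) calls np.load on the Tensor before the wrapper ever runs, which is why the same error appears. The helper name load_npy, the temporary sample files, the float32 dtype, and the (2, 2) shape are all assumptions made only so the example is self-contained.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical sample files, created only so this sketch can run on its own.
tmp_dir = tempfile.mkdtemp()
paths = []
for i in range(3):
    p = os.path.join(tmp_dir, "data%d.npy" % i)
    np.save(p, np.full((2, 2), i, dtype=np.float32))
    paths.append(p)


def load_npy(path):
    # Runs as ordinary Python: `path` arrives here as bytes, not as a Tensor.
    return np.load(path.decode("utf-8"))


def parser(path):
    # Pass the function, its inputs, and the output dtype separately;
    # do not call np.load(path) on the symbolic tensor.
    x = tf.numpy_function(load_npy, [path], tf.float32)
    # The static shape is unknown after numpy_function, so restore it (assumed here).
    x.set_shape([2, 2])
    return x


dataset = tf.data.Dataset.from_tensor_slices(paths)
dataset = dataset.map(map_func=parser)

# Eager iteration (TF 2.x) to materialize the loaded arrays.
arrays = [x.numpy() for x in dataset]
```

The output dtype given to tf.numpy_function has to match what np.load actually returns, so data saved as float64 would need tf.float64 instead.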