TensorFlow Dataset: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray)

Question

I am aware that similar questions have been asked previously, but none of the proposed solutions seem to work for me. I have the following pandas DataFrame:

	Title	Author	Target	Tag0	Tag1	Tag2	Tag3	Tag4	Tag5	Tag6	Tag7	Tag8	Tag9
0	Says Ron Johnson referred to "The Lego Movie" as an "insidious anti-business conspiracy."	0	0	30	0	36	35	nan	nan	nan	nan	nan	nan
1	"Forty percent of the Fortune 500 were started either by immigrants or children of immigrants."	1	0	9	21	5	28	nan	nan	nan	nan	nan	nan

I vectorised the Title column with the Keras TextVectorization layer, which gives the following DataFrame:

	Title	Author	Target	Tag0	Tag1	Tag2	Tag3	Tag4	Tag5	Tag6	Tag7	Tag8	Tag9
0	[9415, 19483, 9066, 16820, 20256, 6959, 6931,...,0 ]	0	0	3213	3829	223	3140	nan	nan	nan	nan	nan	nan

I want to transform this Pandas dataframe to a TensorFlow dataset. I have tried to achieve this using the following code:

dataset = tf.data.Dataset.from_tensor_slices((data.values, target.values))

Here is the error I am getting:

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).

Removing the Title column makes the error go away, so Title is the column causing it. Title looks like this:

print(data["Title"].values)

array([array([ 9415., 19483.,  9066., 16820., 20256.,  6959.,  6931.,  8539.,
       10705.,  1342.,  1896.,  4353., 14143.,     0.,     0.,     0.,
           0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.,     0.,     0.,     0.,     0.,     0.],
       ...,
       array([17497., 20189.,  4280.,  3460., 20256., 15754.,  9178.,  1114.,
       19441., 18731., 13875., 14018.,  5789.,  6959.,  8740., 13042.,
         929.,  9541.,   773., 19384.,  5659., 13042., 14578.,  2813.,
       17452.,   888.,  6206.,  6959., 14540.,     0.,     0.,     0.,
           0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.,     0.,     0.,     0.,     0.,     0.],
      dtype=float32)], dtype=object)

My question is: What is wrong with `Title`? What should I change?

I assume it is related to the data type of the outer numpy.ndarray that holds each title's numpy.ndarray: as can be seen above, it has dtype=object. But I am not really sure.

Thank you in advance!

Each cell of the `Title` column is an array. `values` is then an array of arrays. Try `np.stack(data["Title"].values)`. If it raises an error, those nested arrays differ in shape, and cannot be made into a 2d numeric array (which `tensorflow` can use). — hpaulj, Jan 13 '21 at 20:53
Great that solved my problem **but** partially. As you can see in the code above I pass the dataframe not only `Titles`. If I do what you suggested, `tf.data.Dataset.from_tensor_slices((np.stack(data["Title"].values), target.values))` the `TensorFlow` dataset is created. But how can I include the remaining columns? — GGS, Jan 13 '21 at 21:05
Other answers here : https://stackoverflow.com/questions/58636087/tensorflow-valueerror-failed-to-convert-a-numpy-array-to-a-tensor-unsupporte/75139312 — Skippy le Grand Gourou, Jan 16 '23 at 20:24
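To make the `np.stack` suggestion concrete, and to show one way of keeping the remaining columns, here is a minimal sketch. It assumes every title vector has the same length and that the other columns are numeric. The toy values, the `features` dict with its `"title"`/`"other"` keys, and the `to_dataset` helper are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Toy frame mimicking the question's layout (values are made up).
data = pd.DataFrame({
    "Title": [np.array([9415., 19483., 0.], dtype=np.float32),
              np.array([17497., 20189., 4280.], dtype=np.float32)],
    "Author": [0, 1],
    "Target": [0, 0],
    "Tag0": [30.0, 9.0],
    "Tag1": [0.0, np.nan],
})

target = data.pop("Target").to_numpy()

# Each Title cell is its own array, so .values is a 1-D object array.
print(data["Title"].values.dtype)  # object

# np.stack turns the equally sized rows into one 2-D float32 array.
title = np.stack(data["Title"].values)

# The remaining columns are plain numbers and convert directly.
other = data.drop(columns=["Title"]).to_numpy(dtype=np.float32)

# Hypothetical layout: one entry per input, keyed by name.
features = {"title": title, "other": other}


def to_dataset(features, target):
    """Build a tf.data.Dataset from a dict of arrays plus the labels."""
    import tensorflow as tf  # imported here so the NumPy part runs on its own
    return tf.data.Dataset.from_tensor_slices((features, target))
```

Calling `to_dataset(features, target)` should then give a dataset whose elements are `(dict, label)` pairs. Missing tags stay as float NaN in `other`, so they probably need filling or masking before training.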

score 1 · Accepted Answer · answered Jan 16 '21 at 12:08

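I found a workaround for this issue by simply converting the DataFrame to a NumPy ndarray.

# Separate the target first so it does not end up among the features
target = data.pop("Target")

# Convert to NumPy; "<U43" casts every cell to a string of at most 43 characters
numpy_dataset = data.to_numpy(dtype="<U43")

# Build the TensorFlow dataset
dataset = tf.data.Dataset.from_tensor_slices((numpy_dataset, target.values))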

score 1 · Answer 2 · answered Mar 19 '21 at 09:51

I ran into the same error when trying the tf feature_columns.ipynb demo. The data contained null values; after dropping those rows, the code worked.

    # drop rows that contain null data
    dataframe = dataframe.dropna(axis=0, how='any')
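
A small sketch of what that `dropna` call does, with `fillna` as one possible alternative for frames where nearly every row has a missing tag (as in the question, where dropping any-null rows would leave nothing). The column values and the `-1` sentinel are assumptions for illustration, not part of this answer.

```python
import numpy as np
import pandas as pd

# Toy frame with missing tags (values are made up).
dataframe = pd.DataFrame({
    "Author": [0, 1, 2],
    "Tag0": [30.0, 9.0, np.nan],
    "Tag1": [0.0, np.nan, np.nan],
})

# how='any' drops every row that has at least one missing value.
dropped = dataframe.dropna(axis=0, how="any")
print(len(dropped))  # 1

# Alternative: keep all rows and replace missing tags with a sentinel.
# -1 is an arbitrary choice for this sketch.
filled = dataframe.fillna(-1)
print(len(filled))  # 3
```

Which of the two fits depends on whether a missing tag carries meaning for the model.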

answered Mar 19 '21 at 09:51

sybil wu
