
I've just trained a classifier with BERT that has 3 output classes. The prediction comes out as an array, as you can see in the picture I've attached.

[Image: prediction result]

I've tried to turn it into a DataFrame using this code:

data_result = pd.DataFrame(predictions)

But it gives me a warning like this:

/usr/local/lib/python3.7/dist-packages/pandas/core/internals/construction.py:305: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  values = np.array([convert(v) for v in values])

And the result when I save it as a CSV isn't what I want: it doesn't have three columns.

Hope anyone can help. Thank you.

1 Answer


Update

From the comments, on reshaping the array: the initial data has shape (2, 1, 3), as in x below, but it is a list of NumPy arrays rather than a single array. Concatenating the arrays along the outermost axis (axis=0) merges them into a single array, which turns (2, 1, 3) into (2, 3).

Indexing with np.newaxis then adds one more leading dimension to the merged array, giving (1, 2, 3).

>>> from numpy import array, float32
>>> import numpy as np
>>> x = [ array([[ 1.9392334, -2.4614801, 1.1337504]], dtype=float32), array([[-2.705459 , 3.260675 , -0.9435711]], dtype=float32)]
>>> np.array(x).shape  # x is a plain list, so it has no .shape of its own
(2, 1, 3)
>>> np.concatenate(x, axis=0)
array([[ 1.9392334, -2.4614801,  1.1337504],
       [-2.705459 ,  3.260675 , -0.9435711]], dtype=float32)
>>> y = np.concatenate(x, axis=0)
>>> y.shape
(2, 3)
>>> z = y[np.newaxis, :]
>>> z.shape
(1, 2, 3)
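
Putting the steps above together, here is a minimal sketch of going from the list of per-sample arrays to a three-column DataFrame and CSV. It assumes predictions is a list of (1, 3) arrays as described in the comments; the two sample rows stand in for the real model output, and the column names class_0, class_1, class_2 are hypothetical placeholders for your actual labels.

```python
import numpy as np
import pandas as pd

# Stand-in for the model output: one (1, 3) array per sample (batch size 1).
predictions = [
    np.array([[1.9392334, -2.4614801, 1.1337504]], dtype=np.float32),
    np.array([[-2.705459, 3.260675, -0.9435711]], dtype=np.float32),
]

# Merge along axis 0: N arrays of shape (1, 3) become one (N, 3) array.
stacked = np.concatenate(predictions, axis=0)

# Column names are hypothetical; replace them with the real class labels.
data_result = pd.DataFrame(stacked, columns=["class_0", "class_1", "class_2"])

# With no path argument, to_csv returns the CSV text instead of writing a file.
# index=False keeps the row index out of the output.
csv_text = data_result.to_csv(index=False)
print(data_result.shape)
```

To write a file instead, pass a path, for example data_result.to_csv("predictions.csv", index=False), where the file name is again only an example.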

The warning means the entries in predictions are not of uniform shape.

Re-inspect (or share) the shapes of the arrays in predictions and confirm whether they are all the same. I was able to create an example that gives the same warning, which should get you started. As you can see, the first entry has shape (2, 3) while the second has shape (1, 3):

>>> pd.DataFrame([np.array([[1.,2.,3.], [1.,3.,4.]], dtype=np.float32), np.array([[2,3.,4.]], dtype=np.float32)])
/home/lol/anaconda3/lib/python3.8/site-packages/pandas/core/internals/construction.py:305: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  values = np.array([convert(v) for v in values])
                                    0
0  [[1.0, 2.0, 3.0], [1.0, 3.0, 4.0]]
1                   [[2.0, 3.0, 4.0]]

Also, the problems will not stop there, since the data is effectively a 3-D array. Depending on how you want the DataFrame laid out, "How to transform a 3d arrays into a dataframe in python" will be helpful.
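
If the ragged shapes come only from batching (for example, a final batch smaller than the others), one option is to concatenate the batches along axis 0, since only the first dimension differs. This is a rough sketch under that assumption; batches is a hypothetical stand-in for the prediction list, reusing the values from the example above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for batched predictions with an uneven last batch.
batches = [
    np.array([[1., 2., 3.], [1., 3., 4.]], dtype=np.float32),  # shape (2, 3)
    np.array([[2., 3., 4.]], dtype=np.float32),                # shape (1, 3)
]

# Check the assumption: every batch is 2-D with exactly 3 columns.
shapes = [b.shape for b in batches]
if not all(len(s) == 2 and s[1] == 3 for s in shapes):
    raise ValueError(f"unexpected batch shapes: {shapes}")

# Only the row counts differ, so concatenating along axis 0 is valid.
flat = np.concatenate(batches, axis=0)  # shape (3, 3)
df = pd.DataFrame(flat)
print(df.shape)
```

Each row of df is then one sample, regardless of which batch it came from.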

eroot163pi
  • Thank you so much, can I ask more about it? Actually it is because of the batching; this is not my code, so I am trying to understand it. I changed the batch size to 1, so the predictions became like this: array([[ 1.9392334, -2.4614801, 1.1337504]], dtype=float32), array([[-2.705459 , 3.260675 , -0.9435711]], dtype=float32), and the result has shape (1131, 1, 3). How can I make it like this? array([[4.3806553e-02, 4.7469378e-02, 7.5160474e-01], [2.6762217e-02, 9.2843503e-01, 4.9445182e-02], [3.3696175e-02, 5.1952463e-01, 3.0997264e-01] The shape may be something like (1, 1131, 3) – Eva Agustine Jul 22 '21 at 14:00
  • @EvaAgustine I added the reshaping explanation – eroot163pi Jul 22 '21 at 15:05
  • Thank you very much, it works and turned out the way I wanted! – Eva Agustine Jul 23 '21 at 02:43
  • @EvaAgustine can you accept and upvote the answer? – eroot163pi Jul 23 '21 at 06:49
  • Thank you, this answer helped me a lot. But it says I can't vote yet; I need more than 15 reputation or something. Sorry, I am new. Could you tell me what you mean by accepting? – Eva Agustine Jul 24 '21 at 07:05