
I have loaded a BERT model that was pre-trained for a specific task, and I am using it to process new data. The model returns predictions, which are concatenated into a NumPy array:

flat_predictions = np.concatenate(predictions, axis=0)

I have to perform calculations on the results to set up a threshold, so I want to convert my NumPy array into a DataFrame.

              # Change to DataFrame
              results = np.array(flat_predictions)
              numpy_to_df = pd.DataFrame(results)
              numpy_to_df.head()

Then I get the error ValueError: Must pass 2-d input, shape=(8102, 256, 768), and I can't transform the NumPy array into a DataFrame. The shape contains these three numbers, which clearly refer to:

  1. 8102 - number of rows to process
  2. 256 - the number of batches
  3. 768 - hidden layers of BERT
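
For reference, a 3-dimensional result like this is what np.concatenate along axis 0 produces whenever each batch is itself 3-dimensional: only the first axis grows, and the other two are carried over unchanged. Below is a minimal sketch with small made-up sizes; the batch contents and dimensions are hypothetical stand-ins, not the real model outputs. If the batches hold per-token hidden states, the middle axis would be the sequence length rather than a batch count, but that is an assumption about what the model returns.

```python
import numpy as np

# Hypothetical stand-ins for model outputs: three batches shaped
# (batch_size, middle_axis, last_axis). All sizes are made up.
predictions = [
    np.zeros((4, 5, 6)),
    np.zeros((4, 5, 6)),
    np.zeros((2, 5, 6)),  # a smaller final batch
]

# Concatenating along axis 0 stacks the batches; axes 1 and 2 are unchanged.
flat_predictions = np.concatenate(predictions, axis=0)

print(flat_predictions.shape)
print(flat_predictions.ndim)
```

Printing the shape of a single element of predictions in the real code would show which output of the model is actually being collected.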

When I print just the predictions, it seems that they are not being properly concatenated, just split into these batches.

I have no idea why this issue occurred or why I can't concatenate the predictions across batches into one array and then just convert it to a DataFrame. Previously, the exact same code worked perfectly. Any ideas?

Aleksandra
  • As the NumPy array is 3-dimensional, you may have to reshape it in order to convert it to a DataFrame. Hope this will help: https://stackoverflow.com/questions/36235180/efficiently-creating-a-pandas-dataframe-from-a-numpy-3d-array#:~:text=Efficiently%20Creating%20A%20Pandas%20DataFrame%20From%20A%20Numpy%203d%20array,-numpy%20pandas%20multidimensional&text=The%20idea%20is%20to%20have,dimensions%20in%20the%20original%20array. – Rishin Rahim May 04 '21 at 11:35
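
The reshaping suggested in the comment could look roughly like the sketch below. It uses a small hypothetical array in place of the real (8102, 256, 768) predictions, and the names wide, long_df, and first are illustrative only. Which layout is appropriate depends on what the three axes actually mean, so treat these as options to try rather than a definitive fix.

```python
import numpy as np
import pandas as pd

# Hypothetical small array standing in for the (8102, 256, 768) predictions.
arr = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
n_items, n_pos, n_feat = arr.shape

# Option 1: one row per first-axis item, the other two axes flattened into columns.
wide = pd.DataFrame(arr.reshape(n_items, -1))

# Option 2: one row per (item, position) pair, labelled with a MultiIndex.
index = pd.MultiIndex.from_product(
    [range(n_items), range(n_pos)], names=["item", "position"]
)
long_df = pd.DataFrame(arr.reshape(n_items * n_pos, n_feat), index=index)

# Option 3: keep a single slice along the middle axis (here position 0),
# assuming only that slice is needed for the threshold calculation.
first = pd.DataFrame(arr[:, 0, :])

print(wide.shape, long_df.shape, first.shape)
```

At the original size the array holds about 1.6 billion values, so Options 1 and 2 would produce a very large DataFrame; reducing the array first, as in Option 3, keeps it manageable.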

0 Answers