I'm trying to do some NLP on Stack Overflow posts to predict tags from the content of the title.
One constraint is that I have to embed my sentences using the Sentence Transformers framework.
The idea is to embed the sentences and use the embeddings as input to a neural network I built.
I'm not an expert in neural networks, so there are probably a lot of things I'm missing.
The problem is that the input fails to convert to a tensor when I fit the model. I tried the fix from this post on SO, but I still get the same error.
Below is my code:
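# Imports assumed here (not shown in the original post); tensorflow.keras is a guess
import numpy as np
import pandas as pd
from sentence_transformers import SentenceTransformer
from sklearn.preprocessing import MultiLabelBinarizer
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam
title_list = df.Title.tolist()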
model = SentenceTransformer('paraphrase-distilroberta-base-v1')
embeddings = model.encode(title_list)
embeddings_list = [elem for elem in embeddings]
df_embed = df
df_embed['Embeddings'] = embeddings_list
df_embed.Embeddings = [np.asarray(x).astype('float32') for x in df_embed.Embeddings]
X = df_embed['Embeddings'].values
y = df_embed.Tags
mlb = MultiLabelBinarizer(classes=top_tags)
y_mlb = pd.DataFrame(mlb.fit_transform(y),columns=mlb.classes_, index=y.index)
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(X, y_mlb, test_size = 0.3, random_state = 0)
X_val, X_test, y_val, y_test = train_test_split(X_val, y_val, test_size = 0.4, random_state = 0)
model = Sequential()
# Input - Layer
model.add(Dense(100, activation = "relu"))
# Hidden - Layers
model.add(Dropout(0.3, noise_shape=None, seed=None))
# Output- Layer
model.add(Dense(50, activation = "sigmoid"))
model.compile(loss='binary_crossentropy',
              optimizer=Adam(0.01),
              metrics=['accuracy'])
hist = model.fit(X_train, y_train, batch_size=8, epochs=10, validation_split=0.1)
I got this error:
ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).
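
A minimal sketch of the shape issue that may be behind this error, using only NumPy. It assumes the cause is that `df_embed['Embeddings'].values` is a 1-D array of dtype `object` (one small array per row) rather than a 2-D float array. The data is random and hypothetical, the 768-column width is an assumption about the model's output size, and the object array is rebuilt by hand to mimic what the DataFrame column would return.

```python
import numpy as np

# Stand-in for the real embeddings: 4 titles, 768 dimensions each.
# (Hypothetical data; 768 is an assumed width, the real one depends on the model.)
rng = np.random.default_rng(0)
embeddings = rng.random((4, 768)).astype("float32")

# Storing one array per row in a DataFrame column and calling `.values`
# yields a 1-D array of dtype=object whose elements are arrays.
# That layout is rebuilt by hand here so the sketch needs only NumPy.
X_object = np.empty(len(embeddings), dtype=object)
for i, row in enumerate(embeddings):
    X_object[i] = row

print(X_object.dtype, X_object.shape)   # object (4,)

# Stacking the rows gives one contiguous 2-D float array.
X_dense = np.stack(X_object)
print(X_dense.dtype, X_dense.shape)     # float32 (4, 768)
```

If this is indeed the cause, building `X` with `np.stack(df_embed['Embeddings'].values)`, or using the array returned by `model.encode(title_list)` directly, may avoid the conversion error.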