I want to use BERT word-vector embeddings in the embedding layer of an LSTM instead of the usual default embedding layer. Is there any way I can do that?
- Does [this](https://stackoverflow.com/questions/55669695/how-to-feed-bert-embeddings-to-lstm) answer your question? – Parth Shah Jul 07 '20 at 09:17
- It will work if I have word embeddings from a single sentence. What if I have an embedding matrix made out of several sentences? – PeakyBlinder Jul 08 '20 at 09:54
- Well, it depends on what you're trying to do with the network, I guess. – Parth Shah Jul 08 '20 at 12:29
- Excuse me, did you solve it? – user1 Jan 10 '22 at 06:47
- @user1 you may refer to this solution: https://stackoverflow.com/a/62466528/10097229 – PeakyBlinder Jan 10 '22 at 11:57
- Thanks for replying. Did you get the word embeddings first and then apply them as the weights in this line: `Embedding=(vocab_size,output=..)`? – user1 Jan 13 '22 at 06:28
- @user1 yes, I got the embeddings first – PeakyBlinder Jan 16 '22 at 12:08
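For reference, a minimal sketch of the approach described in the last two comments above: compute the BERT vectors first and load them as the frozen weights of a Keras Embedding layer in front of an LSTM. All names and sizes here (vocab_size, embedding_matrix, the layer widths) are illustrative placeholders, not part of the original thread:

import numpy as np
import tensorflow as tf

# Illustrative sizes; in practice vocab_size and max_len come from your own tokenizer/dataset.
vocab_size, embed_dim, max_len = 10000, 768, 128
embedding_matrix = np.random.rand(vocab_size, embed_dim)  # placeholder; fill each row with the BERT vector for that token

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embed_dim,
                              embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
                              input_length=max_len,
                              trainable=False),           # keep the pre-computed vectors frozen
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(6, activation='softmax'),
])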
1 Answer
Hope these links are of help:
Huggingface transformers with TF 2.0 (TPU training) with embeddings: https://www.kaggle.com/abhilash1910/nlp-workshop-2-ml-india
Contextual similarity with BERT embeddings (PyTorch): https://github.com/abhilash1910/BERTSimilarity
For generating sentence embeddings with BERT or BERT variants, it is important to select the right layers. In many cases the following pattern (TF 2.0/Keras) can be used to obtain the embeddings:
import tensorflow as tf
import transformers

transformer_model = transformers.TFBertModel.from_pretrained('bert-large-uncased')
input_ids = tf.keras.layers.Input(shape=(128,), name='input_token', dtype='int32')
input_masks_ids = tf.keras.layers.Input(shape=(128,), name='masked_token', dtype='int32')
X = transformer_model(input_ids, attention_mask=input_masks_ids)[0]  # token-level embeddings
X = tf.keras.layers.Dropout(0.2)(X)
X = tf.keras.layers.Dense(6, activation='softmax')(X)
model = tf.keras.Model(inputs=[input_ids, input_masks_ids], outputs=X)
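Since the question asks specifically about an LSTM, a possible variant of the snippet above (a sketch under the same assumptions, not part of the original answer) replaces the Dense head with an LSTM running over BERT's token-level outputs; the layer sizes are illustrative:

import tensorflow as tf
import transformers

# Variant (illustrative): run an LSTM over BERT's token-level outputs instead of a Dense head.
transformer_model = transformers.TFBertModel.from_pretrained('bert-large-uncased')
transformer_model.trainable = False                         # optionally freeze BERT and train only the LSTM head
input_ids = tf.keras.layers.Input(shape=(128,), name='input_token', dtype='int32')
input_masks_ids = tf.keras.layers.Input(shape=(128,), name='masked_token', dtype='int32')
sequence_output = transformer_model(input_ids, attention_mask=input_masks_ids)[0]  # (batch, 128, hidden_size)
x = tf.keras.layers.LSTM(64)(sequence_output)               # LSTM consumes the BERT vectors directly
x = tf.keras.layers.Dropout(0.2)(x)
outputs = tf.keras.layers.Dense(6, activation='softmax')(x)
model = tf.keras.Model(inputs=[input_ids, input_masks_ids], outputs=outputs)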
- If this does not work, please refer to "Feature Extraction" in the Hugging Face pipelines documentation to get embeddings: https://huggingface.co/transformers/main_classes/pipelines.html. A sample is provided:
import numpy as np
from scipy.spatial.distance import cosine
from transformers import AutoTokenizer, pipeline, TFDistilBertModel

def transformer_embedding(name, inp, model_class):
    # Build a feature-extraction pipeline and return the token-level embeddings for `inp`.
    model = model_class.from_pretrained(name)
    tokenizer = AutoTokenizer.from_pretrained(name)
    pipe = pipeline('feature-extraction', model=model, tokenizer=tokenizer)
    features = pipe(inp)
    features = np.squeeze(features)   # shape: (sequence_length, hidden_size)
    return features

z = ['The brown fox jumped over the dog', 'The ship sank in the Atlantic Ocean']
embedding_features1 = transformer_embedding('distilbert-base-uncased', z[0], TFDistilBertModel)
embedding_features2 = transformer_embedding('distilbert-base-uncased', z[1], TFDistilBertModel)

# Cosine similarity between the [CLS] token embeddings of the two sentences
distance = 1 - cosine(embedding_features1[0], embedding_features2[0])
print(distance)
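If the goal is to feed such pre-extracted features into an LSTM, one option, sketched here under the assumption that every sentence is padded or truncated to the same length, is to drop the Embedding layer entirely and pass the (num_sentences, max_len, hidden_size) array straight into the model; all shapes and labels below are placeholders:

import numpy as np
import tensorflow as tf

# Illustrative shapes: 2000 sentences, each padded to 128 tokens of 768-d DistilBERT features.
max_len, embed_dim = 128, 768
features = np.random.rand(2000, max_len, embed_dim).astype('float32')  # placeholder for the extracted features
labels = np.random.randint(0, 6, size=(2000,))                         # placeholder labels

inputs = tf.keras.layers.Input(shape=(max_len, embed_dim))
x = tf.keras.layers.LSTM(64)(inputs)            # the LSTM consumes the BERT vectors directly; no Embedding layer
outputs = tf.keras.layers.Dense(6, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(features, labels, epochs=1, batch_size=32)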
Thanks.

Abhilash Majumder
- I think you are missing part of the code in the second snippet. Could you complete it? – Adelson Araújo Mar 23 '21 at 18:35
- If I am using your second snippet or sentence-transformers to generate BERT embeddings, how should they be applied in a Keras model? What I have in mind is to give an input of shape (number_of_instances, dimensions), e.g. (2000, 768), as a numpy array. – Kavishka Gamage Jan 05 '22 at 03:03