The code is in PyTorch.
I want part of my embedding matrix to be trainable, and the rest to stay frozen, since those rows are pretrained vectors.
My src_weights_matrix is the pretrained embedding matrix of dimension (vocab_size + 4 special tokens) x emb_dim = (49996 + 4) x 300.
The first 49996 rows are for the words in the vocabulary, and the last 4 rows are the special tokens [UNK], [PAD], [START], [STOP]. I have randomly initialized the embedding vectors for these 4 tokens.
So I want to train only these 4 embedding rows and keep all the other rows fixed at their pretrained values.
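For reference, this is roughly how I build src_weights_matrix (simplified; the pretrained_vectors tensor below is just a stand-in for the matrix I actually load from my pretrained file):

import torch

# stand-in for the 49996 x 300 matrix I load from the pretrained embeddings
pretrained_vectors = torch.randn(49996, 300)

# randomly initialized rows for [UNK], [PAD], [START], [STOP]
special_token_vectors = torch.randn(4, 300) * 0.01

# final matrix: [50000, 300]
src_weights_matrix = torch.cat([pretrained_vectors, special_token_vectors], dim=0)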
Here is my code. It is supposed to freeze all embedding weights except the last 4 rows, but I am not sure whether it actually does that:
class Encoder(nn.Module):
    def __init__(self, src_weights_matrix):
        super(Encoder, self).__init__()
        self.embedding = nn.Embedding(config.vocab_size, config.emb_dim)
        # load the pretrained embedding matrix
        self.embedding.load_state_dict({'weight': src_weights_matrix})
        # freeze all embedding weights ...
        self.embedding.weight.requires_grad = False
        # ... then try to unfreeze only the last 4 rows (the special tokens)
        self.embedding.weight[-4:, :].requires_grad = True
        init_wt_normal(self.embedding.weight)

        self.lstm = nn.LSTM(config.emb_dim, config.hidden_dim, num_layers=1,
                            batch_first=True, bidirectional=True)
        init_lstm_wt(self.lstm)

        self.W_h = nn.Linear(config.hidden_dim * 2, config.hidden_dim * 2, bias=False)
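I also thought about a different approach: keep the whole embedding trainable, but zero out the gradients of the pretrained rows with a hook so that only the last 4 rows actually get updated. I am not sure this is the right way either; a rough sketch of what I mean (the mask and the hook are just my own guess at how this would work):

import torch
import torch.nn as nn

class EncoderMasked(nn.Module):
    def __init__(self, src_weights_matrix, vocab_size=50000, emb_dim=300):
        super(EncoderMasked, self).__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # copy the pretrained matrix in without tracking the copy in autograd
        with torch.no_grad():
            self.embedding.weight.copy_(src_weights_matrix)
        # gradient mask: 0 for pretrained rows, 1 for the 4 special-token rows
        grad_mask = torch.zeros(vocab_size, 1)
        grad_mask[-4:] = 1.0
        self.register_buffer('grad_mask', grad_mask)
        # zero the gradient of every row except the last 4 on each backward pass
        self.embedding.weight.register_hook(lambda grad: grad * self.grad_mask)

My worry with this is that an optimizer with weight decay (or already-accumulated momentum) might still move the "frozen" rows even when their gradient is zero.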
Any leads are greatly appreciated. Thanks in advance.