I have seen the terms "contextualized word embedding" and "sentence encoding" used while reading papers about BERT and ELMo, so I wonder whether there is a difference between them.
- A contextualized word embedding is a vector representing a word in a specific context. Traditional word embeddings such as Word2Vec and GloVe generate one vector per word, whereas a contextualized word embedding generates a vector for a word depending on its context. Consider the sentences "The duck is swimming" and "You shall duck when someone shoots at you". With traditional word embeddings, the word vector for "duck" would be the same in both sentences, whereas in the contextualized case it would be different.
- While word embeddings encode words into a vector representation, there is also the question of how to represent a whole sentence in a way a computer can easily work with. Such sentence encodings embed a whole sentence as one vector; doc2vec, for example, generates a vector for a whole sentence or document. BERT also produces a representation of the whole sentence, via the [CLS] token.
So in short, a contextualized word embedding represents a word in a context, whereas a sentence encoding represents a whole sentence. The sketch below illustrates both points with BERT.
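
Here is a minimal sketch using the Hugging Face transformers library (the checkpoint `bert-base-uncased` and the cosine-similarity comparison are illustrative choices, not something prescribed by the papers): it extracts the contextual vector for "duck" in both sentences and the [CLS] vector of each sentence.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Illustrative model choice; any BERT-style checkpoint behaves similarly.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

sentences = [
    "The duck is swimming",
    "You shall duck when someone shoots at you",
]

duck_vectors, sentence_vectors = [], []
with torch.no_grad():
    for sent in sentences:
        inputs = tokenizer(sent, return_tensors="pt")
        hidden = model(**inputs).last_hidden_state[0]       # (seq_len, hidden_dim)
        tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
        duck_vectors.append(hidden[tokens.index("duck")])   # contextualized vector for "duck"
        sentence_vectors.append(hidden[0])                   # [CLS] vector as a sentence encoding

cos = torch.nn.functional.cosine_similarity
# Noticeably below 1.0: the same word gets different vectors in different contexts.
print("duck vs. duck:        ", cos(duck_vectors[0], duck_vectors[1], dim=0).item())
# One vector per sentence, usable as a rough sentence-level representation.
print("sentence vs. sentence:", cos(sentence_vectors[0], sentence_vectors[1], dim=0).item())
```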

chefhose