I have seen the terms "contextualized word embedding" and "sentence encoding" used while reading papers about BERT and ELMo, so I wonder whether there is a difference between them.
- A contextualized word embedding is a vector representing a word in a specific context. Traditional word embeddings such as Word2Vec and GloVe generate one vector per word, whereas a contextualized word embedding generates a vector for a word depending on its context. Consider the sentences "The duck is swimming" and "You shall duck when someone shoots at you". With traditional word embeddings, the word vector for "duck" would be the same in both sentences, whereas in the contextualized case it would be different.
- While word embeddings encode words into a vector representation, there is also the question of how to represent a whole sentence in a way a computer can easily work with. Such sentence encodings embed a whole sentence as one vector; doc2vec, for example, generates a vector for a whole sentence or document. BERT also produces a representation of the whole sentence, via the [CLS] token.
So in short, a contextualized word embedding represents a word in a context, whereas a sentence encoding represents a whole sentence. The sketch below illustrates both points with BERT.
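
Here is a minimal sketch using the Hugging Face transformers library (the checkpoint `bert-base-uncased` and the cosine-similarity comparison are illustrative choices, not something prescribed by the papers): it extracts the contextual vector for "duck" in both sentences and the [CLS] vector of each sentence.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Illustrative model choice; any BERT-style checkpoint behaves similarly.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

sentences = [
    "The duck is swimming",
    "You shall duck when someone shoots at you",
]

duck_vectors, sentence_vectors = [], []
with torch.no_grad():
    for sent in sentences:
        inputs = tokenizer(sent, return_tensors="pt")
        hidden = model(**inputs).last_hidden_state[0]       # (seq_len, hidden_dim)
        tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
        duck_vectors.append(hidden[tokens.index("duck")])   # contextualized vector for "duck"
        sentence_vectors.append(hidden[0])                   # [CLS] vector as a sentence encoding

cos = torch.nn.functional.cosine_similarity
# Noticeably below 1.0: the same word gets different vectors in different contexts.
print("duck vs. duck:        ", cos(duck_vectors[0], duck_vectors[1], dim=0).item())
# One vector per sentence, usable as a rough sentence-level representation.
print("sentence vs. sentence:", cos(sentence_vectors[0], sentence_vectors[1], dim=0).item())
```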

chefhose