
Hi, I'm just getting started with understanding transformer-based models and I can't find how the token embeddings are arrived at. There are multiple tokenization approaches and multiple vocabularies/corpora that LLMs are trained on, so my questions are:

  1. Does each LLM also train its own token embeddings?
  2. How do those pre-trained embeddings work for transfer learning or fine-tuning on custom datasets, where some OOV words may be present or where we have special unique tokens we want to keep? (A sketch of what I mean by adding special tokens is below.)
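For context, this is roughly the kind of thing I mean by keeping special tokens, assuming the Hugging Face transformers API (the checkpoint name and tokens below are just placeholders, not my actual setup):

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Load a pre-trained tokenizer and model (placeholder checkpoint)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Domain words the pre-trained vocabulary would split into sub-word pieces,
# plus a custom marker I want kept as a single token
new_tokens = ["myspecialterm", "[CUSTOM_MARKER]"]
num_added = tokenizer.add_tokens(new_tokens)

# The embedding matrix has to grow to match the enlarged vocabulary;
# the new rows start out randomly initialized and would presumably only
# become useful after fine-tuning on my data
model.resize_token_embeddings(len(tokenizer))

print(tokenizer.tokenize("myspecialterm appears with [CUSTOM_MARKER]"))
```

Is this the right way to think about it, i.e. the new embedding rows are learned from scratch during fine-tuning while the rest are reused?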
dasman
