How text is converted into a format LLMs can understand.
Solving the out-of-vocabulary problem with subword units.
From raw text to a list of integer IDs.
The lookup table that maps token IDs to dense vectors.
How Transformers create context-dependent representations, unlike Word2Vec's static word vectors.
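
The pipeline these topics describe (raw text, subword tokens, integer IDs, embedding lookup) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tiny `VOCAB`, the greedy longest-match splitting rule, the helper names `tokenize`, `encode`, and `embed`, and the made-up two-dimensional `EMBEDDINGS` values. Real tokenizers (for example BPE or WordPiece) learn their vocabularies from data and differ in detail.

```python
# Toy subword vocabulary mapping pieces to integer IDs (hypothetical values).
VOCAB = {"<unk>": 0, "token": 1, "ization": 2, "un": 3, "believ": 4, "able": 5, "s": 6}


def tokenize(word, vocab=VOCAB):
    """Greedily split a word into the longest known subword pieces."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate span until it matches a vocabulary entry.
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:
            # Nothing matches at this position: emit <unk> and skip one character.
            pieces.append("<unk>")
            start += 1
        else:
            pieces.append(word[start:end])
            start = end
    return pieces


def encode(text, vocab=VOCAB):
    """Convert raw text into a flat list of integer token IDs."""
    ids = []
    for word in text.lower().split():
        ids.extend(vocab[piece] for piece in tokenize(word, vocab))
    return ids


# Embedding table: one dense vector per token ID (made-up numbers, dimension 2).
EMBEDDINGS = [[0.1 * i, -0.1 * i] for i in range(len(VOCAB))]


def embed(ids):
    """Look up the dense vector for each token ID."""
    return [EMBEDDINGS[i] for i in ids]


ids = encode("Tokenization unbelievable")
vectors = embed(ids)
print(ids)           # integer IDs, one per subword piece
print(len(vectors))  # one vector per ID
```

Neither "tokenization" nor "unbelievable" is in the vocabulary as a whole word, yet both are encoded without falling back to `<unk>` because they decompose into known pieces. The lookup in `embed` is static: a given ID always returns the same vector, regardless of the surrounding tokens. Context-dependent representations would come from Transformer layers applied on top of these vectors, which this sketch does not implement.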