Sampling from Your Language Model One Byte at a Time
Tokenization is used almost universally by modern language models, enabling efficient text representation using multi-byte or …
Tuning Language Models by Proxy
We develop an algorithm for “tuning” language models at decoding time!