
SuperBPE: Space Travel for Language Models
The assumption across nearly all language model (LM) tokenization schemes is that tokens should be subwords, i.e., contained within …
Tulu 3: Pushing Frontiers in Open Language Model Post-Training
Language model post-training is applied to refine behaviors and unlock new skills across a wide range of recent language models, but …
Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models
Despite their wide adoption, the biases and unintended behaviors of language models remain poorly understood. In this paper, we …
LlamaPIE: Proactive In-Ear Conversation Assistants
We introduce LlamaPIE, the first real-time proactive assistant designed to enhance human conversations through discreet, concise …
Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data?
The pretraining data of today's strongest language models is opaque; in particular, little is known about the proportions of various domains or …
We're Afraid Language Models Aren't Modeling Ambiguity
We build a benchmark to evaluate LM understanding of ambiguity, which is an intrinsic feature of language, and find that the task remains extremely challenging, even for GPT-4.
That was the last straw, we need more: Are Translation Systems Sensitive to Disambiguating Context?
The translation of ambiguous text presents a challenge for translation systems, as it requires using the surrounding context to …
Inverse Scaling: When Bigger Isn't Better
Work on scaling laws has found that large language models (LMs) show predictable improvements to overall loss with increased scale …
How Language Model Hallucinations Can Snowball
A major risk of using language models in practical applications is their tendency to hallucinate incorrect statements. Hallucinations …
Self-Instruct: Aligning Language Models with Self-Generated Instructions
Large “instruction-tuned” language models (i.e., finetuned to respond to instructions) have demonstrated a remarkable ability to …