dataset | Alisa Liu

We're Afraid Language Models Aren't Modeling Ambiguity

We build a benchmark to evaluate how well LMs understand ambiguity, an intrinsic feature of language, and find that the task remains extremely challenging, even for GPT-4.

Alisa Liu, Zhaofeng Wu, Julian Michael, Alane Suhr, Peter West, Alexander Koller, Swabha Swayamdipta, Noah A. Smith, Yejin Choi

That was the last straw, we need more: Are Translation Systems Sensitive to Disambiguating Context?

The translation of ambiguous text presents a challenge for translation systems, as it requires using the surrounding context to …

Jaechan Lee, Alisa Liu, Orevaoghene Ahia, Hila Gonen, Noah A. Smith

Inverse Scaling: When Bigger Isn't Better

Work on scaling laws has found that large language models (LMs) show predictable improvements to overall loss with increased scale …

Ian R. McKenzie, 18 others, Alisa Liu, Jiacheng Liu, Tom Tseng, Tomasz Korbak, Najoung Kim, Samuel R. Bowman, Ethan Perez

Self-Instruct: Aligning Language Models with Self-Generated Instructions

Large “instruction-tuned” language models (i.e., finetuned to respond to instructions) have demonstrated a remarkable ability to …

Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi

WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation

We introduce a paradigm for dataset creation based on human and machine collaboration, and demonstrate its empirical effectiveness by collecting a new large-scale NLI dataset.

Alisa Liu, Swabha Swayamdipta, Noah A. Smith, Yejin Choi

CODAH: An Adversarially-Authored Question Answering Dataset for Common Sense

An adversarially constructed dataset for commonsense QA.

Michael Chen, Mike D’Arcy, Alisa Liu, Jared Fernandez, Doug Downey