We're Afraid Language Models Aren't Modeling Ambiguity
We build a benchmark to evaluate how well language models (LMs) understand ambiguity, an intrinsic feature of language, and find that the task remains extremely challenging, even for GPT-4.
Alisa Liu,
Zhaofeng Wu,
Julian Michael,
Alane Suhr,
Peter West,
Alexander Koller,
Swabha Swayamdipta,
Noah A. Smith,
Yejin Choi