Large Language Models are Zero-Shot Reasoners

Takeshi Kojima, Shixiang Gu, Machel Reid et al.

2022 · arXiv (Cornell University) · 1,108 citations

Pretrained large language models (LLMs) are widely used in many sub-fields of natural language processing (NLP) and are generally known as excellent few-shot learners when given task-specific exemplars. Notably, chain of thought (CoT) prompting, a recent technique for eliciting complex multi-step reasoning through step-by-step answer examples, achieved state-of-the-art performance in arithmetic and symbolic reasoning, difficult system-2 tasks that do not follow the standard scaling laws for LLMs. While these successes are often attributed to LLMs' ability for few-shot learning, we show that LLMs…
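
The few-shot CoT setup the abstract refers to, in which the model sees step-by-step answer examples before the real question, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Exemplar` class, the `build_few_shot_cot_prompt` helper, the "Q:/A:" layout, and the sample arithmetic problem are hypothetical and are not taken from the paper. No model is called; the code only assembles the prompt text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exemplar:
    """One worked example (hypothetical structure, not from the paper)."""
    question: str
    reasoning: str  # step-by-step rationale shown to the model
    answer: str


def build_few_shot_cot_prompt(exemplars, question):
    """Join worked exemplars, then append the new question with an open answer slot."""
    blocks = [
        f"Q: {ex.question}\nA: {ex.reasoning} The answer is {ex.answer}."
        for ex in exemplars
    ]
    # The final block leaves "A:" open so the model continues with its own reasoning.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


# Illustrative exemplar; the problem text is made up for this sketch.
EXEMPLARS = [
    Exemplar(
        question="A shelf holds 3 boxes with 4 pens each. How many pens are there?",
        reasoning="There are 3 boxes and each has 4 pens, so 3 * 4 = 12 pens.",
        answer="12",
    )
]

prompt = build_few_shot_cot_prompt(
    EXEMPLARS, "A crate holds 5 bags with 6 apples each. How many apples are there?"
)
print(prompt)
```

The design point is that the exemplar answers contain the intermediate reasoning, not just the final number, so the model is nudged to produce reasoning of the same shape for the last, unanswered question.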

