Showed that providing chain-of-thought reasoning demonstrations as few-shot exemplars in prompts dramatically improves LLM performance on arithmetic, commonsense, and symbolic reasoning tasks. With PaLM 540B, chain-of-thought prompting achieved state-of-the-art accuracy on the GSM8K math word-problem benchmark.
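A minimal sketch of the technique, using the paper's canonical tennis-ball exemplar; the function name and structure are illustrative, not from the paper:

```python
# Few-shot chain-of-thought prompting: instead of question -> answer pairs,
# each exemplar includes the intermediate reasoning steps before the final
# answer, which the model then imitates on the new question.

COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 "
    "tennis balls. 5 + 6 = 11. The answer is 11.\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend the worked exemplar so the model produces step-by-step reasoning."""
    return COT_EXEMPLAR + f"\nQ: {question}\nA:"

prompt = build_cot_prompt(
    "A juggler can juggle 16 balls. Half of the balls are golf balls. "
    "How many golf balls are there?"
)
print(prompt)
```

The prompt ends with "A:" so the model completes the reasoning chain itself; the final answer is then parsed from its continuation.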

Chain-of-thought prompting is the direct intellectual ancestor of reasoning models (o1, o3, R1, QwQ). The insight that models reason better when they "think step by step" inspired test-time compute scaling and thinking tokens. NeurIPS 2022. 14K+ citations. By Wei, Wang, Schuurmans et al.

Paper

arXiv: 2201.11903

Venue: NeurIPS 2022

foundational, reasoning

Related