Prover-Verifier Games improve legibility of LLM outputs

This paper explores improving the legibility of Large Language Model (LLM) outputs through Prover-Verifier Games. The authors propose a training algorithm that uses small verifiers together with two kinds of provers, "helpful" and "sneaky". Their studies show that legibility training helps humans verify the correctness of solutions more efficiently.

Submitted on 18 Jul 2024
Last revised on 1 Aug 2024
Authors include Jan Hendrik Kirchner and 5 others
Focuses on legibility of LLM outputs
Study involved grade-school math problems
