Anthropic is introducing the ability to improve prompts and manage examples directly in the Anthropic Console. These new features aim to help developers implement prompt engineering best practices and build more reliable AI applications.
The prompt improver enhances existing prompts through several methods: chain-of-thought reasoning for systematic problem-solving, example standardisation using XML format, example enrichment with aligned reasoning, prompt rewriting for clarity, and prefill addition to direct Claude's actions and enforce output formats.
Testing showed significant improvements in performance. Testing with Claude 3 Haiku involved matching article titles to a sentence pulled at random from 500 Wikipedia articles, where the prompt improver increased accuracy by 30% compared to the original prompt. In a separate test using ten Wikipedia articles, adherence to specific word count ranges for summaries reached 100% after using the prompt improver.
The new example management feature allows developers to manage examples in a structured format directly in the Workbench. For prompts without examples, Claude can automatically generate synthetic example inputs and draft outputs. The system also includes an "ideal output" column in the Evaluations tab to help users grade model outputs on a 5-point scale.