How-to guides

Task-oriented recipes for putting the v2 findings to work. Each page is a step-by-step procedure with concrete commands and decision points.

Design a preamble for your system — the six-step procedure derived from the v2 findings.
Interpret CQS-craft effect sizes — when the measured effect matters, and when it doesn't.
A/B test a candidate preamble against a baseline — validation path using your own evaluator.
Extend the rubric for a non-Python domain or different evaluator — add dimensions, calibrate, then run.
Add a new preamble condition to the v2 main run — concrete edits to the main script with line refs.
Run a 3-probe identification test on your own preamble — rule out the rubric-overlap confound before claiming a real-world effect.

For why these recipes work, see the findings and the explanation section. For the canonical metric and judge definitions, see the reference section.