1 issue tagged llm-as-judge
A 7-step guide to building regression tests for production prompts. Catch breakage before deploy with golden datasets, scoring rubrics, and LLM judges.