Evaluation & Testing

3 issues in this category

Issue №12 · MAY 2026 · Evaluation & Testing

Prompt drift: a 2026 detection playbook

Prompt drift is a change in output quality over time even though the prompt itself hasn't changed. Covers the 2026 cadence, three causes, and a four-step playbook.

12 min read

Issue №11 · MAY 2026 · Evaluation & Testing

15 LLM-as-a-judge prompt templates (copy-paste)

15 copy-paste LLM-as-a-judge templates in YAML, organized by dimension: 5 foundations plus 10 specialized rubrics for RAG, code, summarization, and agents.

15 min read

Issue №05 · APR 2026 · Evaluation & Testing

How to set up prompt regression testing

A 7-step guide to building regression tests for production prompts. Catch breakage before deploy with golden datasets, scoring rubrics, and LLM judges.

18 min read