Schedule eval batches without watching the timer
The new Schedule batch action queues eval-suite runs through Anthropic's Batches API so you can walk away. Results post back when the batch returns.
Eval suites used to require you to sit on the run. A 200-case suite against Claude takes long enough that you'd start the run, flip to email, and miss the finish window. Now there's a Schedule batch button on every suite that queues the run through Anthropic's Batches API and reports back when it's done.
What you can do now
Press Schedule batch on any eval suite that uses an Anthropic target model. The run goes into Anthropic's batch queue at 50% off normal pricing and finishes within their 24-hour SLA, often much sooner. The status card on the suite page shows queued / processing / complete and surfaces any per-case failures inline once the batch returns.
There's no separate billing flow. The cost still hits your own Anthropic account directly via your BYOK key. The batch discount applies automatically because the API does it on Anthropic's side.
Why this matters
Long-running evals are the case where it's actually worth giving up sub-second feedback. A regression suite that runs over lunch and posts results to your inbox is more useful than the same suite that ran in ten minutes but tied up your editor. Honest tradeoff.
Related entries
Skills workbench launches · Behavioral Eval · evaluations open on every tier
Author Agent Skills in Prompt Assay, score them with a six-dimension Critique, and run a Behavioral Eval across Claude, GPT, and Gemini. Evaluation suites now open on every tier.
Demo mode, cost drill-down, the GPT-5 lineage, and a much sharper AI pair
A big week of ships: free demo runs for new accounts, per-run cost receipts, GPT-5 and Gemini 2.5 Flash-Lite, shareable Critique and Compare, plus cross-model testing in the Playground.
Convert Prompt to Skill, Skills REST API, public prompt links, kind switcher
Convert turns any prompt into an Agent Skill bundle. The REST API gains skill-file and Skill Report endpoints, prompts get public share links, and the sidebar adds a Prompts and Skills switcher.