Demo mode, cost drill-down, and the GPT-5 lineage
The biggest week of the run-up to launch. New users without a key get five free AI-pair runs. Every run shows what it cost. The model registry now ships GPT-5.x and Gemini 2.5 Flash-Lite.
Three things you'll notice on day one as a new user, plus a stack of cost-transparency and model-lineage work that lets a paying user trust the bill.
Demo mode for new accounts
A new account without a connected provider key now gets a five-call lifetime budget on Critique, Improve, and Rewrite, powered by a Prompt Assay-funded Anthropic key locked to Claude Sonnet 4.6. The point is that you can author your first prompt and watch the AI pair work on it before you've connected anything of your own.
The budget is enforced atomically at the database layer, not in app code. The moment you connect a real BYOK key, a database trigger permanently disables the budget. Deleting the BYOK key afterwards does not revive it. We don't want demo mode to become a back door for free inference, and we want the BYOK posture to stay honest: your provider's bill, your provider's account, no platform middleman, except for these five courtesy calls.
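The mechanics can be sketched in miniature. This is an illustrative model using SQLite, not the production schema: table, trigger, and account names are made up, but the two load-bearing ideas match the description above. The check and the decrement are one conditional UPDATE (so concurrent calls can't overdraw), and the disable is a one-way trigger with no reverse path.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE demo_budget (
    account_id TEXT PRIMARY KEY,
    remaining  INTEGER NOT NULL,
    disabled   INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE byok_keys (
    account_id TEXT NOT NULL,
    provider   TEXT NOT NULL
);
-- Connecting a real key permanently disables the demo budget; there is
-- no reverse trigger, so deleting the key later does not revive it.
CREATE TRIGGER disable_demo AFTER INSERT ON byok_keys
BEGIN
    UPDATE demo_budget SET disabled = 1 WHERE account_id = NEW.account_id;
END;
""")

def try_demo_call(account_id: str) -> bool:
    """Consume one demo call atomically: the check and the decrement are a
    single conditional UPDATE, so concurrent calls cannot overdraw."""
    cur = conn.execute(
        "UPDATE demo_budget SET remaining = remaining - 1 "
        "WHERE account_id = ? AND remaining > 0 AND disabled = 0",
        (account_id,),
    )
    conn.commit()
    return cur.rowcount == 1

conn.execute("INSERT INTO demo_budget (account_id, remaining) VALUES ('acct_demo', 5)")
calls_allowed = sum(try_demo_call("acct_demo") for _ in range(7))  # only 5 succeed

# Connect, then delete, a BYOK key: the budget stays dead.
conn.execute("INSERT INTO demo_budget (account_id, remaining) VALUES ('acct_byok', 5)")
conn.execute("INSERT INTO byok_keys VALUES ('acct_byok', 'anthropic')")
conn.execute("DELETE FROM byok_keys WHERE account_id = 'acct_byok'")
revived = try_demo_call("acct_byok")  # False: disable is permanent
```

Putting the gate in the database rather than app code is what makes "atomically" honest: there is no read-then-write window for two simultaneous requests to slip through.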
Cost drill-down per prompt
Every AI run, on every panel, now shows what the call cost when it finishes. The five panels (Brainstorm, Critique, Improve, Rewrite, Compare) each carry a Cost receipt under the run with input tokens, output tokens, cached-input tokens, and total in dollars. Cents-resolution, calculated against the published per-1M rates for the actual model that ran.
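The arithmetic behind a receipt is simple but worth pinning down. A minimal sketch, with illustrative per-1M rates and a hypothetical `cost_receipt` helper; real rates come from the model registry for whichever model actually ran. Cached input tokens are billed at the cached rate, not the full input rate:

```python
# Illustrative per-1M-token rates; production pulls these from the model
# registry for the model that actually served the run.
RATES = {
    "claude-sonnet-4.6": {"input": 3.00, "cached_input": 0.30, "output": 15.00},
}

def cost_receipt(model: str, input_toks: int, cached_toks: int,
                 output_toks: int) -> dict:
    """Dollar cost of one run against per-1M rates. Cached input tokens
    are carved out of the input total and billed at the cached rate."""
    r = RATES[model]
    billed_input = input_toks - cached_toks  # tokens not served from cache
    total = (
        billed_input * r["input"]
        + cached_toks * r["cached_input"]
        + output_toks * r["output"]
    ) / 1_000_000
    return {
        "input_tokens": input_toks,
        "cached_input_tokens": cached_toks,
        "output_tokens": output_toks,
        "total_usd": round(total, 4),
    }

receipt = cost_receipt("claude-sonnet-4.6", 12_000, 8_000, 1_500)
# 4,000 uncached input + 8,000 cached input + 1,500 output → $0.0369
```

The cached-input carve-out is the line most receipts get wrong; it is also where prompt caching visibly pays for itself.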
The usage dashboard now drills down by prompt and by version, so you can answer questions like which prompts are eating the budget this month and did our last revision get cheaper or more expensive. The dashboard stays fast even on workspaces with hundreds of thousands of runs.
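The two drill-down questions reduce to two aggregations over run records. A toy sketch with in-memory dicts and invented prompt names; the real dashboard does this in the database so it stays fast at scale:

```python
from collections import defaultdict

# Illustrative run records; every field here is made up for the sketch.
runs = [
    {"prompt": "onboarding-email", "version": 3, "cost_usd": 0.021},
    {"prompt": "onboarding-email", "version": 4, "cost_usd": 0.014},
    {"prompt": "support-triage",   "version": 1, "cost_usd": 0.090},
    {"prompt": "onboarding-email", "version": 4, "cost_usd": 0.015},
]

by_prompt = defaultdict(float)
by_version = defaultdict(lambda: {"total": 0.0, "runs": 0})
for r in runs:
    by_prompt[r["prompt"]] += r["cost_usd"]
    key = (r["prompt"], r["version"])
    by_version[key]["total"] += r["cost_usd"]
    by_version[key]["runs"] += 1

# "Which prompts are eating the budget this month?"
top_spender = max(by_prompt, key=by_prompt.get)

# "Did our last revision get cheaper?" Compare mean cost per run, not
# totals, so a version with more traffic doesn't look more expensive.
v3 = by_version[("onboarding-email", 3)]
v4 = by_version[("onboarding-email", 4)]
got_cheaper = v4["total"] / v4["runs"] < v3["total"] / v3["runs"]
```

Comparing mean cost per run across versions is the detail that matters: raw totals conflate a prompt getting cheaper with a prompt getting used less.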
GPT-5 lineage and Gemini 2.5 Flash-Lite
The model registry refresh added the full GPT-5.x family (5.5, 5.5 Pro, 5.4, 5.4 Pro, 5.4 Mini, 5.4 Nano, 5.3 Codex) and Gemini 2.5 Flash-Lite. Live API specs verified at ship time: pricing, context window, prompt-caching support, reasoning-effort levels, and thinking-budget ranges per model. Limited-access models surface a Limited access badge in the dropdown so you don't pick one and bounce off a 403.
The cheap-classifier slot for Generate Intent, and the fallback when no Workbench Model is set, now pick Gemini 2.5 Flash-Lite instead of the previous Flash: Flash-Lite is roughly three times cheaper and fast enough for classifier work. OpenAI's gpt-4.1-nano remains the OpenAI choice for the same slot.
Quietly underneath
Improvements & fixes
The Limited access badge now renders inline in model dropdowns. GPT-5.5 Pro and GPT-5.4 Pro carry an explicit visual badge instead of a text suffix, both in the dropdown row and on the trigger after selection. Saves you from picking a model and bouncing off a 403.
What's next
Next week's likely candidate: a granular reasoning-effort control for the GPT-5.x family. Today the Deep analysis checkbox maps to high or low; users on GPT-5.5 should be able to ask for xhigh. Scope pending.
Related entries
Vote on what we ship next
A two-board feedback system where you can file bugs, request features, and vote on what we ship next. Plus a brainstorm timeout fix you may have hit on long Opus runs.
Schedule eval batches without watching the timer
The new Schedule batch action queues eval-suite runs through Anthropic's Batches API so you can walk away. Results post back when the batch returns.