# Running Playbooks
A playbook execution runs each step sequentially, sending prompts to an AI model and accumulating outputs. Steps can branch, capture named variables, pause at breakpoints, and ask you for input mid-run.
## Execution Lifecycle
Every playbook execution moves through three phases:
- Preflight – You choose a model, fill in input variables, and optionally set breakpoints.
- Streaming – Steps execute one at a time. Output streams token-by-token via Server-Sent Events (SSE).
- Completion – A summary card shows total tokens, cost, and duration. You can download artifacts, inspect per-step metadata, or restart from any step.
An execution can end in one of these statuses:
| Status | Meaning |
|---|---|
| `completed` | All steps finished successfully |
| `failed` | A step returned an error; remaining steps are skipped |
| `cancelled` | You cancelled the run manually |
| `paused` | A breakpoint was hit; waiting for you to resume |
| `awaiting_input` | An `@elicit()` directive is waiting for your response |
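The distinction between terminal and resumable statuses can be sketched as a small helper (a sketch; the names are illustrative, not the product's API):

```python
# Hypothetical helper: which execution statuses can still make progress.
RESUMABLE = {"paused", "awaiting_input"}          # run continues after your action
TERMINAL = {"completed", "failed", "cancelled"}   # run will never advance again

def is_finished(status: str) -> bool:
    """True if the execution will not run any further steps on its own."""
    return status in TERMINAL

def can_resume(status: str) -> bool:
    """True if a user (or MCP tool) action can continue the run."""
    return status in RESUMABLE
```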
## Preflight
Navigate to a playbook and click Run. The preflight page shows:
- Model picker – Select any model you have an API key for (OpenRouter, OpenAI, or Anthropic). A key must be configured in Settings > Connections before you can run a playbook.
- Input variables – One field per `## INPUTS` variable defined in the playbook. Required inputs are marked; optional inputs show their default value as placeholder text.
- Breakpoints – If the playbook has more than one step, you see a checkbox per step. Check a step to pause execution after that step completes.
Click Run Playbook to start execution.
## Streaming Execution
When a playbook starts, you are redirected to the live execution page – a three-column layout:
- Left rail – Step navigation. Each step shows a status circle: gray (pending), orange with pulse (running), green checkmark (completed), red X (failed), or dash (skipped). Click a step to scroll to its output.
- Center pane – The main output area. Each step gets a card. Text streams in token-by-token as the AI generates it.
- Right rail – Variable tracker. Shows the current value of every input and named output variable. Variables pulse when they update.
### How streaming works

- The server creates an execution record and pending step records in the database.
- Your browser opens an SSE connection to the streaming endpoint.
- For each step, the server:
  - Emits a `step-update` event (status: `running`).
  - Calls the AI model with the step's prompt, system prompt, and accumulated context.
  - Streams `step-content` events as tokens arrive.
  - Once the response finishes, emits a `step-rendered` event with the markdown-rendered HTML.
  - Emits a `step-update` event (status: `completed`) with latency.
  - If the step captures a named output (`@output`), emits a `variable-update` event.
- After all steps complete, a `status` event sends the final summary.
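The per-step event sequence above can be sketched as a generator of SSE frames. This is a simplified illustration of the flow, not the actual server code; `run_step` and its arguments are assumptions:

```python
import json

def sse(event: str, payload: dict) -> str:
    """Format one Server-Sent Event frame (event name + JSON data line)."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"

def run_step(step: int, chunks):
    """Yield the documented event sequence for one step.
    `chunks` stands in for the model's streamed tokens."""
    yield sse("step-update", {"step": step, "status": "running"})
    text = ""
    for chunk in chunks:
        text += chunk
        yield sse("step-content", {"step": step, "content": chunk})
    # After the response finishes: rendered HTML, then the completion update.
    yield sse("step-rendered", {"step": step, "html": f"<p>{text}</p>"})
    yield sse("step-update", {"step": step, "status": "completed"})
```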
### Step status progression

Each step progresses through these states:

```
pending --> running --> completed
                   \-> failed          (remaining steps become "skipped")
                   \-> paused          (breakpoint hit)
                   \-> awaiting_input  (elicitation)
                   \-> skipped         (branch condition not met)
```
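One reading of the diagram as a transition table (illustrative; it assumes `skipped` is reached from `pending` when a branch condition rules the step out, and from `failed` runs for the remaining steps):

```python
# Allowed step-state transitions, per the diagram above (an interpretation,
# not an authoritative state machine).
TRANSITIONS = {
    "pending": {"running", "skipped"},
    "running": {"completed", "failed", "paused", "awaiting_input"},
    # paused / awaiting_input resume back into running
    "paused": {"running"},
    "awaiting_input": {"running"},
}

def can_transition(frm: str, to: str) -> bool:
    return to in TRANSITIONS.get(frm, set())
```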
### SSE event reference

| Event | Payload | When |
|---|---|---|
| `step-update` | `{step, html}` | Step state changes (running, completed, failed, skipped) |
| `step-content` | `{step, content}` | Each token chunk from the AI model |
| `step-rendered` | `{step, html}` | Markdown-rendered HTML after a step completes |
| `variable-update` | `{name, value, step}` | A named output variable was captured |
| `step-paused` | `{step, message}` | Execution paused at a breakpoint |
| `elicitation` | `{step, type, prompt, options}` | An `@elicit()` directive needs your input |
| `status` | `{html}` | Final execution result (completed, failed, cancelled) |
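A client consumes these events by dispatching on the event name. A minimal sketch in Python (a browser would use `EventSource` instead; the frame parsing here is deliberately simplistic):

```python
import json

def dispatch(frame: str, handlers: dict) -> None:
    """Parse one SSE frame ('event: X' + 'data: {...}' lines) and call
    the handler registered for that event name, if any."""
    event, data = None, None
    for line in frame.splitlines():
        if line.startswith("event: "):
            event = line[len("event: "):]
        elif line.startswith("data: "):
            data = json.loads(line[len("data: "):])
    if event in handlers:
        handlers[event](data)
```

For example, a handler registered for `variable-update` would receive the decoded `{name, value, step}` payload and refresh the variable tracker in the right rail.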
### Cancellation
You can cancel a running execution at any time using the Cancel button in the execution header. Remaining steps are marked as skipped, and the execution status becomes cancelled.
## Breakpoints
Breakpoints let you pause execution after a step completes so you can inspect its output and the current variable state before continuing.
### Setting breakpoints
On the preflight page, check the box next to any step where you want to pause. You can set multiple breakpoints. Breakpoints are passed to the server as a comma-separated list of step numbers.
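Parsing that comma-separated value and deciding whether to pause after a step might look like this (a sketch; the function names are assumptions):

```python
def parse_breakpoints(raw: str) -> set[int]:
    """Turn the comma-separated form value (e.g. "2,4") into step numbers."""
    return {int(part) for part in raw.split(",") if part.strip()}

def should_pause(step: int, breakpoints: set[int]) -> bool:
    """Pause after `step` completes if it was checked on the preflight page."""
    return step in breakpoints
```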
### What happens when a breakpoint hits
- The step runs to completion – you see its full output.
- Named output variables from the step are captured as normal.
- The execution status changes to `paused`.
- A yellow pause banner appears with step details and a Resume button.
- The left rail shows a yellow pause icon on the paused step.
### Inspecting state at a breakpoint
While paused, you can:
- Read the step’s full output in the center pane.
- Check the variables panel to see all current values.
- View the step’s metadata (model, tokens, cost, latency) via the Meta tab.
### Resuming
Click Resume to continue from the next step. You can also override input variables before resuming – the resume handler accepts `var_NAME` form fields to update values mid-execution.
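Extracting those `var_NAME` overrides from a submitted form could be as simple as the following (an illustrative helper, not the actual resume handler):

```python
def extract_overrides(form: dict) -> dict:
    """Pull variable overrides from `var_NAME` form fields on resume.
    The field-naming convention follows the docs; this helper is a sketch."""
    prefix = "var_"
    return {k[len(prefix):]: v for k, v in form.items() if k.startswith(prefix)}
```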
## Elicitation (Human-in-the-loop)
When a step contains an `@elicit()` directive, the execution pauses and asks you for input before proceeding with the AI call for that step. This lets you inject human decisions into a multi-step workflow.
### Elicitation types

| Type | Directive | UI |
|---|---|---|
| Text input | `@elicit(input, "Your question?")` | Text field with the question as label |
| Confirm | `@elicit(confirm, "Proceed with this?")` | Yes/No buttons |
| Select | `@elicit(select, "Pick one", "A", "B", "C")` | Dropdown with the listed options |
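A rough sketch of parsing such a directive (the real grammar may differ, e.g. around quote escaping; this is not the engine's parser):

```python
import re

def parse_elicit(directive: str):
    """Split an @elicit(type, "prompt", "opt", ...) directive into parts.
    Returns None if the text does not look like an elicitation."""
    m = re.match(r'@elicit\(\s*(\w+)\s*,\s*(.*)\)\s*$', directive)
    if not m:
        return None
    kind = m.group(1)
    args = re.findall(r'"([^"]*)"', m.group(2))  # quoted prompt + options
    return {"type": kind, "prompt": args[0], "options": args[1:]}
```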
### How elicitation works

- The engine encounters an `@elicit()` directive at the beginning of a step.
- If no response has been provided yet, the step returns `awaiting_input` without calling the AI.
- The server emits an `elicitation` SSE event with the step number, type, prompt text, and options (if applicable).
- An inline form appears in the execution UI.
- You submit your response.
- The server stores your response and creates a fresh execution context.
- The SSE stream reconnects and re-executes the elicitation step, now with your response available as context.

Your response is stored as a named output with the key `__elicit_step_N` (where N is the step number). The step's prompt can then reference it, and subsequent steps can build on it. The execution remains in `awaiting_input` status until you respond or the context expires.
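Storing and retrieving a response under that key can be sketched as follows (helper names are illustrative; only the `__elicit_step_N` convention comes from the docs):

```python
def store_response(outputs: dict, step: int, response: str) -> None:
    """Record the human response under the documented key so the
    re-executed step (and later steps) can reference it."""
    outputs[f"__elicit_step_{step}"] = response

def response_for(outputs: dict, step: int):
    """Return the stored response for a step, or None if not answered yet."""
    return outputs.get(f"__elicit_step_{step}")
```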
## Results Page
After an execution completes (or fails), the results page shows:
### Per-step output
Each completed step displays its output rendered as markdown with syntax-highlighted code blocks. Three actions are available via the action bar:

- Copy – Copy the raw output to clipboard.
- Raw – View the unrendered text in a monospace `<pre>` block.
- Meta – Inspect execution metadata for that step.
### Step metadata
The Meta view shows:
- Provider – Which AI provider handled the request
- Model – The model that actually responded (may differ from the requested model)
- Finish reason – Why the model stopped generating (e.g., `stop`, `length`)
- Latency – How long the step took
- Tokens (in/out) – Prompt tokens and completion tokens
- Cached tokens – Tokens served from provider cache (when available)
- Reasoning tokens – Tokens used for chain-of-thought (when available)
- Cost estimate – Estimated cost for this step
- Input preview – Collapsible view of the prompt that was sent to the model
### Artifact download

If the playbook defines an `## ARTIFACTS` section, a download button appears on the results page. It downloads the last completed step's output as a file. The file type depends on the artifact type:

| Artifact type | Extension | Content-Type |
|---|---|---|
| `markdown` | `.md` | `text/markdown` |
| `json` | `.json` | `application/json` |
| `chartjs` | `.json` | `application/json` |
| `mermaid` | `.mmd` | `text/plain` |
| `html_css` | `.html` | `text/html` |
| `javascript` | `.js` | `text/plain` |
| `typescript` | `.ts` | `text/plain` |
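The table above maps directly to a lookup; a sketch (the function name and fallback are assumptions):

```python
# Extension and Content-Type per artifact type, mirroring the table above.
ARTIFACT_TYPES = {
    "markdown":   (".md",   "text/markdown"),
    "json":       (".json", "application/json"),
    "chartjs":    (".json", "application/json"),
    "mermaid":    (".mmd",  "text/plain"),
    "html_css":   (".html", "text/html"),
    "javascript": (".js",   "text/plain"),
    "typescript": (".ts",   "text/plain"),
}

def artifact_filename(name: str, artifact_type: str) -> str:
    """Build the download filename; unknown types fall back to plain text."""
    ext, _content_type = ARTIFACT_TYPES.get(artifact_type, (".txt", "text/plain"))
    return name + ext
```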
### Execution summary
A summary card at the bottom shows:
- Total steps completed vs. total steps
- Total tokens (prompt + completion)
- Total estimated cost
- Total duration
- Final status
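Rolling per-step metadata up into those totals might look like this (a sketch; the field names are assumptions):

```python
def summarize(steps: list[dict]) -> dict:
    """Aggregate per-step metadata into the summary-card totals."""
    done = [s for s in steps if s["status"] == "completed"]
    return {
        "steps_completed": len(done),
        "steps_total": len(steps),
        # Total tokens = prompt tokens + completion tokens across steps.
        "total_tokens": sum(s["tokens_in"] + s["tokens_out"] for s in done),
        "total_cost": sum(s["cost"] for s in done),
    }
```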
## Restart from Step
From the results page, you can restart execution from any step. This creates a new execution that:
- Inherits all completed step outputs from prior steps.
- Re-executes the selected step and all subsequent steps.
- Links back to the original execution via a `parent_execution_id` reference.
Restart is useful when a later step fails or produces unsatisfactory output. You do not need to re-run the entire playbook – just pick up from where you want to redo.
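A sketch of what such a restart record might contain (field names are assumptions; outputs are keyed by step number for brevity):

```python
def restart_from(execution: dict, step: int) -> dict:
    """Build a new execution that keeps outputs from steps before `step`
    and re-runs `step` and everything after it."""
    return {
        "parent_execution_id": execution["id"],
        "inherited_outputs": {
            n: out for n, out in execution["outputs"].items() if n < step
        },
        "start_step": step,
    }
```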
## Execution History
Each playbook’s detail page includes an execution history showing up to 100 past runs, newest first. Each entry shows:
- Execution status (completed, failed, cancelled, paused, awaiting_input)
- Model used
- Step count
- Start and completion times
- Total tokens and estimated cost
- Error message (if failed)
Click an execution to view its full results, per-step output, and metadata.
## Cost Tracking
Every step records a cost estimate based on the model’s per-token pricing. Costs are tracked at two levels:
- Per-step – Visible in each step’s Meta view. Calculated from prompt tokens, completion tokens, and the model’s pricing rates.
- Per-execution – The sum of all step costs, shown in the execution summary and the execution history list.
Cost display adapts to the amount: values under $0.01 show six decimal places (e.g., $0.000342), while larger values show four decimal places (e.g., $0.0512).
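That formatting rule is easy to express directly (a sketch mirroring the thresholds above):

```python
def format_cost(value: float) -> str:
    """Six decimal places under one cent, four otherwise."""
    return f"${value:.6f}" if value < 0.01 else f"${value:.4f}"
```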
## MCP Tools
Playbook execution is fully supported via MCP tools, allowing AI assistants to run playbooks, handle breakpoints, and respond to elicitations programmatically.
| Tool | Description |
|---|---|
| `execute_playbook` | Run a playbook with input values, model, and optional breakpoints |
| `resume_playbook_execution` | Resume a paused execution from the next step after a breakpoint |
| `respond_to_elicitation` | Provide a response to an `@elicit()` pause |
| `get_playbook_execution` | Fetch execution details and step outputs |
| `list_playbook_executions` | List past executions for a playbook |
The `execute_playbook` tool runs all steps and returns the complete result. If a breakpoint or elicitation is hit, the tool returns the execution state with status `paused` or `awaiting_input`, along with instructions to call `resume_playbook_execution` or `respond_to_elicitation`.
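Driving a run to completion over these tools might look like the following loop (a sketch: `call_tool` stands in for a real MCP client, `answer` for your elicitation-handling logic, and the argument shapes are assumptions):

```python
def drive(call_tool, playbook_id: str, inputs: dict, answer) -> dict:
    """Start a run, then resume past breakpoints and answer elicitations
    until the execution reaches a terminal status."""
    result = call_tool("execute_playbook",
                       {"playbook_id": playbook_id, "inputs": inputs})
    while result["status"] in ("paused", "awaiting_input"):
        if result["status"] == "paused":
            result = call_tool("resume_playbook_execution",
                               {"execution_id": result["execution_id"]})
        else:
            result = call_tool("respond_to_elicitation",
                               {"execution_id": result["execution_id"],
                                "response": answer(result)})
    return result
```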
For full MCP tool schemas, see the MCP Reference.