# Running Playbooks
A playbook execution runs each step sequentially, sending prompts to an AI model and accumulating outputs. Steps can branch, capture named variables, pause at breakpoints, and ask you for input mid-run.
## Execution Lifecycle
Every playbook execution moves through three phases:
- Preflight – You choose a model, fill in input variables, and optionally set breakpoints.
- Streaming – Steps execute one at a time. Output streams token-by-token via Server-Sent Events (SSE).
- Completion – A summary card shows total tokens, cost, and duration. You can download artifacts, inspect per-step metadata, or restart from any step.
An execution can end in one of these statuses:
| Status | Meaning |
|---|---|
| `completed` | All steps finished successfully |
| `failed` | A step returned an error; remaining steps are skipped |
| `cancelled` | You cancelled the run manually |
| `paused` | A breakpoint was hit; waiting for you to resume |
| `awaiting_input` | An `@elicit()` directive is waiting for your response |
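The distinction between terminal and resumable statuses can be sketched as a small helper (a sketch; the names are illustrative, not the product's API):

```python
# Hypothetical helper: which execution statuses can still make progress.
RESUMABLE = {"paused", "awaiting_input"}          # run continues after your action
TERMINAL = {"completed", "failed", "cancelled"}   # run will never advance again

def is_finished(status: str) -> bool:
    """True if the execution will not run any further steps on its own."""
    return status in TERMINAL

def can_resume(status: str) -> bool:
    """True if a user (or MCP tool) action can continue the run."""
    return status in RESUMABLE
```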
## Preflight
Navigate to a playbook and click Run. The preflight page shows:
- Model picker – Select any model you have an API key for (OpenRouter, OpenAI, or Anthropic). A key must be configured in Settings > Connections before you can run a playbook.
- Input variables – One field per `## INPUTS` variable defined in the playbook. Required inputs are marked; optional inputs show their default value as placeholder text.
- Breakpoints – If the playbook has more than one step, you see a checkbox per step. Check a step to pause execution after that step completes.
Click Run Playbook to start execution.
## Streaming Execution
When a playbook starts, you are redirected to the live execution page – a three-column layout:
- Left rail – Step navigation. Each step shows a status circle: gray (pending), orange with pulse (running), green checkmark (completed), red X (failed), or dash (skipped). Click a step to scroll to its output.
- Center pane – The main output area. Each step gets a card. Text streams in token-by-token as the AI generates it.
- Right rail – Variable tracker. Shows the current value of every input and named output variable. Variables pulse when they update.
### How streaming works

- The server creates an execution record and pending step records in the database.
- Your browser opens an SSE connection to the streaming endpoint.
- For each step, the server:
  - Emits a `step-update` event (status: `running`).
  - Calls the AI model with the step's prompt, system prompt, and accumulated context.
  - Streams `step-content` events as tokens arrive.
  - Once the response finishes, emits a `step-rendered` event with the markdown-rendered HTML.
  - Emits a `step-update` event (status: `completed`) with latency.
  - If the step captures a named output (`@output`), emits a `variable-update` event.
- After all steps complete, a `status` event sends the final summary.
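The per-step event sequence above can be sketched as a generator of SSE frames. This is a simplified illustration of the flow, not the actual server code; `run_step` and its arguments are assumptions:

```python
import json

def sse(event: str, payload: dict) -> str:
    """Format one Server-Sent Event frame (event name + JSON data line)."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"

def run_step(step: int, chunks):
    """Yield the documented event sequence for one step.
    `chunks` stands in for the model's streamed tokens."""
    yield sse("step-update", {"step": step, "status": "running"})
    text = ""
    for chunk in chunks:
        text += chunk
        yield sse("step-content", {"step": step, "content": chunk})
    # After the response finishes: rendered HTML, then the completion update.
    yield sse("step-rendered", {"step": step, "html": f"<p>{text}</p>"})
    yield sse("step-update", {"step": step, "status": "completed"})
```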
### Step status progression

Each step progresses through these states:

```
pending --> running --> completed
                   \-> failed          (remaining steps become "skipped")
                   \-> paused          (breakpoint hit)
                   \-> awaiting_input  (elicitation)
                   \-> skipped         (branch condition not met)
```
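One reading of the diagram as a transition table (illustrative; it assumes `skipped` is reached from `pending` when a branch condition rules the step out, and from `failed` runs for the remaining steps):

```python
# Allowed step-state transitions, per the diagram above (an interpretation,
# not an authoritative state machine).
TRANSITIONS = {
    "pending": {"running", "skipped"},
    "running": {"completed", "failed", "paused", "awaiting_input"},
    # paused / awaiting_input resume back into running
    "paused": {"running"},
    "awaiting_input": {"running"},
}

def can_transition(frm: str, to: str) -> bool:
    return to in TRANSITIONS.get(frm, set())
```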
### SSE event reference

| Event | Payload | When |
|---|---|---|
| `step-update` | `{step, html}` | Step state changes (running, completed, failed, skipped) |
| `step-content` | `{step, content}` | Each token chunk from the AI model |
| `step-rendered` | `{step, html}` | Markdown-rendered HTML after a step completes |
| `variable-update` | `{name, value, step}` | A named output variable was captured |
| `step-paused` | `{step, message}` | Execution paused at a breakpoint |
| `elicitation` | `{step, type, prompt, options}` | An `@elicit()` directive needs your input |
| `status` | `{html}` | Final execution result (completed, failed, cancelled) |
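A client consumes these events by dispatching on the event name. A minimal sketch in Python (a browser would use `EventSource` instead; the frame parsing here is deliberately simplistic):

```python
import json

def dispatch(frame: str, handlers: dict) -> None:
    """Parse one SSE frame ('event: X' + 'data: {...}' lines) and call
    the handler registered for that event name, if any."""
    event, data = None, None
    for line in frame.splitlines():
        if line.startswith("event: "):
            event = line[len("event: "):]
        elif line.startswith("data: "):
            data = json.loads(line[len("data: "):])
    if event in handlers:
        handlers[event](data)
```

For example, a handler registered for `variable-update` would receive the decoded `{name, value, step}` payload and refresh the variable tracker in the right rail.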
### Cancellation
You can cancel a running execution at any time using the Cancel button in the execution header. Remaining steps are marked as skipped, and the execution status becomes cancelled.
## Breakpoints
Breakpoints let you pause execution after a step completes so you can inspect its output and the current variable state before continuing.
### Setting breakpoints
On the preflight page, check the box next to any step where you want to pause. You can set multiple breakpoints. Breakpoints are passed to the server as a comma-separated list of step numbers.
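Parsing that comma-separated value and deciding whether to pause after a step might look like this (a sketch; the function names are assumptions):

```python
def parse_breakpoints(raw: str) -> set[int]:
    """Turn the comma-separated form value (e.g. "2,4") into step numbers."""
    return {int(part) for part in raw.split(",") if part.strip()}

def should_pause(step: int, breakpoints: set[int]) -> bool:
    """Pause after `step` completes if it was checked on the preflight page."""
    return step in breakpoints
```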
### What happens when a breakpoint hits
- The step runs to completion – you see its full output.
- Named output variables from the step are captured as normal.
- The execution status changes to `paused`.
- A yellow pause banner appears with step details and a Resume button.
- The left rail shows a yellow pause icon on the paused step.
### Inspecting state at a breakpoint
While paused, you can:
- Read the step’s full output in the center pane.
- Check the variables panel to see all current values.
- View the step’s metadata (model, tokens, cost, latency) via the Meta tab.
### Resuming
Click Resume to continue from the next step. You can also override input variables before resuming – the resume handler accepts `var_NAME` form fields to update values mid-execution.
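Extracting those `var_NAME` overrides from a submitted form could be as simple as the following (an illustrative helper, not the actual resume handler):

```python
def extract_overrides(form: dict) -> dict:
    """Pull variable overrides from `var_NAME` form fields on resume.
    The field-naming convention follows the docs; this helper is a sketch."""
    prefix = "var_"
    return {k[len(prefix):]: v for k, v in form.items() if k.startswith(prefix)}
```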
## Elicitation (Human-in-the-loop)
When a step contains an `@elicit()` directive, the execution pauses and asks you for input before proceeding with the AI call for that step. This lets you inject human decisions into a multi-step workflow.
### Elicitation types

| Type | Directive | UI |
|---|---|---|
| Text input | `@elicit(input, "Your question?")` | Text field with the question as label |
| Confirm | `@elicit(confirm, "Proceed with this?")` | Yes/No buttons |
| Select | `@elicit(select, "Pick one", "A", "B", "C")` | Dropdown with the listed options |
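A rough sketch of parsing such a directive (the real grammar may differ, e.g. around quote escaping; this is not the engine's parser):

```python
import re

def parse_elicit(directive: str):
    """Split an @elicit(type, "prompt", "opt", ...) directive into parts.
    Returns None if the text does not look like an elicitation."""
    m = re.match(r'@elicit\(\s*(\w+)\s*,\s*(.*)\)\s*$', directive)
    if not m:
        return None
    kind = m.group(1)
    args = re.findall(r'"([^"]*)"', m.group(2))  # quoted prompt + options
    return {"type": kind, "prompt": args[0], "options": args[1:]}
```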
### How elicitation works

- The engine encounters an `@elicit()` directive at the beginning of a step.
- If no response has been provided yet, the step returns `awaiting_input` without calling the AI.
- The server emits an `elicitation` SSE event with the step number, type, prompt text, and options (if applicable).
- An inline form appears in the execution UI.
- You submit your response.
- The server stores your response and creates a fresh execution context.
- The SSE stream reconnects and re-executes the elicitation step, now with your response available as context.

Your response is stored as a named output with the key `__elicit_step_N` (where N is the step number). The step's prompt can then reference it, and subsequent steps can build on it. The execution remains in `awaiting_input` status until you respond or the context expires.
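Storing and retrieving a response under that key can be sketched as follows (helper names are illustrative; only the `__elicit_step_N` convention comes from the docs):

```python
def store_response(outputs: dict, step: int, response: str) -> None:
    """Record the human response under the documented key so the
    re-executed step (and later steps) can reference it."""
    outputs[f"__elicit_step_{step}"] = response

def response_for(outputs: dict, step: int):
    """Return the stored response for a step, or None if not answered yet."""
    return outputs.get(f"__elicit_step_{step}")
```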
## Results Page
After an execution completes (or fails), the results page shows:
### Per-step output
Each completed step displays its output rendered as markdown with syntax-highlighted code blocks. Three actions are available via the action bar:

- Copy – Copy the raw output to clipboard.
- Raw – View the unrendered text in a monospace `<pre>` block.
- Meta – Inspect execution metadata for that step.
### Step metadata
The Meta view shows:
- Provider – Which AI provider handled the request
- Model – The model that actually responded (may differ from the requested model)
- Finish reason – Why the model stopped generating (e.g., `stop`, `length`)
- Latency – How long the step took
- Tokens (in/out) – Prompt tokens and completion tokens
- Cached tokens – Tokens served from provider cache (when available)
- Reasoning tokens – Tokens used for chain-of-thought (when available)
- Cost estimate – Estimated cost for this step
- Input preview – Collapsible view of the prompt that was sent to the model
### Artifact download

If the playbook defines an `## ARTIFACTS` section, a download button appears on the results page. It downloads the last completed step's output as a file. The file type depends on the artifact type:

| Artifact type | Extension | Content-Type |
|---|---|---|
| `markdown` | `.md` | `text/markdown` |
| `json` | `.json` | `application/json` |
| `chartjs` | `.json` | `application/json` |
| `mermaid` | `.mmd` | `text/plain` |
| `html_css` | `.html` | `text/html` |
| `javascript` | `.js` | `text/plain` |
| `typescript` | `.ts` | `text/plain` |
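The table above maps directly to a lookup; a sketch (the function name and fallback are assumptions):

```python
# Extension and Content-Type per artifact type, mirroring the table above.
ARTIFACT_TYPES = {
    "markdown":   (".md",   "text/markdown"),
    "json":       (".json", "application/json"),
    "chartjs":    (".json", "application/json"),
    "mermaid":    (".mmd",  "text/plain"),
    "html_css":   (".html", "text/html"),
    "javascript": (".js",   "text/plain"),
    "typescript": (".ts",   "text/plain"),
}

def artifact_filename(name: str, artifact_type: str) -> str:
    """Build the download filename; unknown types fall back to plain text."""
    ext, _content_type = ARTIFACT_TYPES.get(artifact_type, (".txt", "text/plain"))
    return name + ext
```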
### Execution summary
A summary card at the bottom shows:
- Total steps completed vs. total steps
- Total tokens (prompt + completion)
- Total estimated cost
- Total duration
- Final status
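Rolling per-step metadata up into those totals might look like this (a sketch; the field names are assumptions):

```python
def summarize(steps: list[dict]) -> dict:
    """Aggregate per-step metadata into the summary-card totals."""
    done = [s for s in steps if s["status"] == "completed"]
    return {
        "steps_completed": len(done),
        "steps_total": len(steps),
        # Total tokens = prompt tokens + completion tokens across steps.
        "total_tokens": sum(s["tokens_in"] + s["tokens_out"] for s in done),
        "total_cost": sum(s["cost"] for s in done),
    }
```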
## Restart from Step
From the results page, you can restart execution from any step. This creates a new execution that:
- Inherits all completed step outputs from prior steps.
- Re-executes the selected step and all subsequent steps.
- Links back to the original execution via a `parent_execution_id` reference.
Restart is useful when a later step fails or produces unsatisfactory output. You do not need to re-run the entire playbook – just pick up from where you want to redo.
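A sketch of what such a restart record might contain (field names are assumptions; outputs are keyed by step number for brevity):

```python
def restart_from(execution: dict, step: int) -> dict:
    """Build a new execution that keeps outputs from steps before `step`
    and re-runs `step` and everything after it."""
    return {
        "parent_execution_id": execution["id"],
        "inherited_outputs": {
            n: out for n, out in execution["outputs"].items() if n < step
        },
        "start_step": step,
    }
```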
## Execution History
Each playbook’s detail page includes an execution history showing up to 100 past runs, newest first. Each entry shows:
- Execution status (completed, failed, cancelled, paused, awaiting_input)
- Model used
- Step count
- Start and completion times
- Total tokens and estimated cost
- Error message (if failed)
Click an execution to view its full results, per-step output, and metadata.
## Cost Tracking
Every step records a cost estimate based on the model’s per-token pricing. Costs are tracked at two levels:
- Per-step – Visible in each step’s Meta view. Calculated from prompt tokens, completion tokens, and the model’s pricing rates.
- Per-execution – The sum of all step costs, shown in the execution summary and the execution history list.
Cost display adapts to the amount: values under $0.01 show six decimal places (e.g., $0.000342), while larger values show four decimal places (e.g., $0.0512).
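That formatting rule is easy to express directly (a sketch mirroring the thresholds above):

```python
def format_cost(value: float) -> str:
    """Six decimal places under one cent, four otherwise."""
    return f"${value:.6f}" if value < 0.01 else f"${value:.4f}"
```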
## MCP Tools
Playbook execution is fully supported via MCP tools, allowing AI assistants to run playbooks, handle breakpoints, and respond to elicitations programmatically.
| Tool | Description |
|---|---|
| `execute_playbook` | Run a playbook with input values, model, and optional breakpoints |
| `resume_playbook_execution` | Resume a paused execution from the next step after a breakpoint |
| `respond_to_elicitation` | Provide a response to an `@elicit()` pause |
| `get_playbook_execution` | Fetch execution details and step outputs |
| `list_playbook_executions` | List past executions for a playbook |
The `execute_playbook` tool runs all steps and returns the complete result. If a breakpoint or elicitation is hit, the tool returns the execution state with status `paused` or `awaiting_input`, along with instructions to call `resume_playbook_execution` or `respond_to_elicitation`.
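Driving a run to completion over these tools might look like the following loop (a sketch: `call_tool` stands in for a real MCP client, `answer` for your elicitation-handling logic, and the argument shapes are assumptions):

```python
def drive(call_tool, playbook_id: str, inputs: dict, answer) -> dict:
    """Start a run, then resume past breakpoints and answer elicitations
    until the execution reaches a terminal status."""
    result = call_tool("execute_playbook",
                       {"playbook_id": playbook_id, "inputs": inputs})
    while result["status"] in ("paused", "awaiting_input"):
        if result["status"] == "paused":
            result = call_tool("resume_playbook_execution",
                               {"execution_id": result["execution_id"]})
        else:
            result = call_tool("respond_to_elicitation",
                               {"execution_id": result["execution_id"],
                                "response": answer(result)})
    return result
```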
For full MCP tool schemas, see the MCP Reference.