roveflow init installs a /roveflow slash command into your Claude Code project. Invoke it in any Claude Code session whose working directory is the Flutter project root.

Prerequisites

  • The app is running in debug mode on a simulator.
  • .mcp.json points at a working flutter_inspector_mcp binary.
  • docs/roveflow/scenarios.md defines a valid cold-setup and every scenario you want to run.

Run configuration

Before the orchestrator relaunches the app, it resolves three values:
| Value           | How it’s resolved                                                                 |
|-----------------|-----------------------------------------------------------------------------------|
| flutter_command | fvm flutter if .fvmrc, .fvm/, or fvm_config.json is present; otherwise flutter.   |
| flavor          | The single subdirectory of ios/config/ if exactly one exists; otherwise omitted.  |
| vm_service_port | Parsed from --dart-vm-port=N in .mcp.json; defaults to 8181.                      |
To override any of these, drop a docs/roveflow/config.yaml (all fields optional):
flutter_command: fvm flutter
flavor: production
vm_service_port: 8181
Use it when auto-detection picks the wrong flavor, when your repo has a non-standard Flutter wrapper, or when you need a non-default port. If you change the port, keep --dart-vm-port in .mcp.json in sync.
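The resolution rules in this section can be sketched in a few lines of Python. This is an illustration only, not Roveflow's actual implementation: the function name resolve_run_config is hypothetical, and the overrides argument stands in for an already-parsed docs/roveflow/config.yaml (parsing YAML is left out so the sketch needs only the standard library).

```python
import re
from pathlib import Path
from typing import Optional


def resolve_run_config(root: Path, overrides: Optional[dict] = None) -> dict:
    """Resolve flutter_command, flavor, and vm_service_port for a project root."""
    # FVM markers: any one of these selects the fvm wrapper.
    fvm_markers = (".fvmrc", ".fvm", "fvm_config.json")
    uses_fvm = any((root / name).exists() for name in fvm_markers)
    config = {"flutter_command": "fvm flutter" if uses_fvm else "flutter"}

    # Flavor: set only when ios/config/ has exactly one subdirectory.
    flavor_dir = root / "ios" / "config"
    flavors = []
    if flavor_dir.is_dir():
        flavors = sorted(p.name for p in flavor_dir.iterdir() if p.is_dir())
    if len(flavors) == 1:
        config["flavor"] = flavors[0]

    # Port: first --dart-vm-port=N found in .mcp.json, else 8181.
    port = 8181
    mcp_file = root / ".mcp.json"
    if mcp_file.is_file():
        match = re.search(r"--dart-vm-port=(\d+)", mcp_file.read_text())
        if match:
            port = int(match.group(1))
    config["vm_service_port"] = port

    # Explicit config values win over auto-detection (assumed precedence).
    if overrides:
        config.update({k: v for k, v in overrides.items() if v is not None})
    return config
```

Calling resolve_run_config(Path("."), {"flavor": "production"}) from the project root would keep the detected command and port but force the flavor.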

Modes

--mode=critical

Default. Runs only scenarios tagged criticality: critical. Use this as the pre-release gate.

--mode=all

Runs every scenario in the inventory. Use for scheduled nightly runs.

--only=<ids>

Runs a specific subset (comma-separated ids). Use when iterating on a single scenario.
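As a rough illustration of how these flags narrow the inventory, here is a minimal Python sketch. The Scenario class and select_scenarios function are hypothetical names, and the sketch assumes that --only takes precedence over --mode; the source does not spell out how the two combine.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Scenario:
    id: str
    criticality: str  # "critical" per the docs; other values are assumed


def select_scenarios(
    scenarios: List[Scenario],
    mode: str = "critical",
    only: Optional[str] = None,
) -> List[Scenario]:
    """Return the scenarios to run, preserving declared order."""
    if only:
        # --only=<ids>: comma-separated ids (assumed to override --mode).
        wanted = {part.strip() for part in only.split(",") if part.strip()}
        return [s for s in scenarios if s.id in wanted]
    if mode == "all":
        return list(scenarios)
    if mode == "critical":
        return [s for s in scenarios if s.criticality == "critical"]
    raise ValueError(f"unknown mode: {mode}")
```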

Recording the simulator

Every run also captures a full-length .mov of the simulator via xcrun simctl io booted recordVideo. It lands at docs/roveflow/runs/<run-id>/recording.mov alongside report.md. Pass --no-record to skip it:
/roveflow --no-record
Good reasons to skip: no simulator GUI is visible, CI runs on a headless sim, or you just want tighter iteration loops.
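For orientation, the recording step amounts to building one simctl command per run. The sketch below only assembles the argument list; the function name recording_command is hypothetical, and how the orchestrator actually starts and stops the process is not described in this page.

```python
from typing import List, Optional


def recording_command(run_id: str, record: bool = True) -> Optional[List[str]]:
    """Build the simctl recording command, or None when --no-record is passed."""
    if not record:
        return None
    # The recording lands next to report.md in the run directory.
    output = f"docs/roveflow/runs/{run_id}/recording.mov"
    return ["xcrun", "simctl", "io", "booted", "recordVideo", output]
```

A caller could pass the returned list to subprocess.Popen and later send the process an interrupt signal, which is the usual way to make simctl finish writing the file.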

Invocation

In Claude Code:
/roveflow
Runs --mode=critical (the default).
/roveflow --mode=all
Runs every scenario.
/roveflow --only=cold-setup
Runs one scenario. Useful while you’re wiring Roveflow into a new app.
/roveflow --only=cold-setup,open-detail
Runs a comma-separated subset.

What happens under the hood

  1. The orchestrator parses scenarios and filters by mode (and --only if present).
  2. It resets the simulator (shut down, erase, boot, relaunch the app in debug mode). That’s the only step that touches your system directly.
  3. It dispatches a roveflow-runner agent (Haiku-class) for cold-setup.
  4. If cold-setup passes, it dispatches one agent per remaining scenario, in declared order, collecting a structured result for each.
  5. Between scenarios it runs reset_to_home; if that fails, it makes one fallback attempt before aborting and marking the remaining scenarios skipped: setup_lost.
  6. It compacts working memory after every 5 scenarios or when the context estimate crosses 60%.
  7. It compiles the results into a markdown report.
See the MCP tools reference for the tools the agent calls, and the Scenario schema for what each run writes to disk.
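The dispatch and reset logic in steps 3 to 5 can be sketched as a small loop. This is a simplified model under stated assumptions, not the orchestrator's real code: run_all and the three injected callables are hypothetical, a reset is attempted only when another scenario is still queued, and context compaction (step 6) is omitted.

```python
from typing import Callable, Dict, List


def run_all(
    scenario_ids: List[str],
    run_scenario: Callable[[str], str],
    reset_to_home: Callable[[], bool],
    fallback_reset: Callable[[], bool],
) -> Dict[str, str]:
    """Run cold-setup first, then the rest in declared order."""
    results: Dict[str, str] = {}
    rest = [sid for sid in scenario_ids if sid != "cold-setup"]

    # cold-setup gates everything else.
    results["cold-setup"] = run_scenario("cold-setup")
    if results["cold-setup"] != "pass":
        for sid in rest:
            results[sid] = "skipped: setup_failed"
        return results

    for index, sid in enumerate(rest):
        results[sid] = run_scenario(sid)
        remaining = rest[index + 1:]
        # Reset between scenarios; one fallback attempt before aborting.
        if remaining and not reset_to_home() and not fallback_reset():
            for later in remaining:
                results[later] = "skipped: setup_lost"
            break
    return results
```

Injecting the runner and reset functions keeps the control flow testable without a simulator.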

The report

Manual mode

The orchestrator prints a markdown table to the chat:
| scenario               | result | waypoints hit | screenshots | notes  |
|------------------------|--------|---------------|-------------|--------|
| cold-setup             | pass   | 1/1           | 2           |        |
| open-detail            | fail   | 1/2           | 3           | ...    |
The table is followed by a numbered list of failures and a prompt asking which ones to file.

Scheduled mode

The same table is written to docs/roveflow/runs/YYYY-MM-DD-HHMM/report.md, with no chat interaction.
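To make the report format concrete, here is a minimal sketch that renders result rows into the table layout shown in this section and derives the scheduled-mode path. The names render_report and report_path are hypothetical, and each row is assumed to be a plain dict keyed by column name.

```python
from datetime import datetime
from typing import Dict, List

COLUMNS = ["scenario", "result", "waypoints hit", "screenshots", "notes"]


def render_report(rows: List[Dict[str, object]]) -> str:
    """Render result rows as a markdown table; missing cells stay empty."""
    lines = [
        "| " + " | ".join(COLUMNS) + " |",
        "|" + "|".join("---" for _ in COLUMNS) + "|",
    ]
    for row in rows:
        cells = [str(row.get(col, "")) for col in COLUMNS]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


def report_path(now: datetime) -> str:
    """Build the scheduled-mode report path from a timestamp."""
    return f"docs/roveflow/runs/{now.strftime('%Y-%m-%d-%H%M')}/report.md"
```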

Reading results

  • pass — the scenario’s pass condition was observable on screen.
  • fail — the scenario’s fail condition fired, or the flow was unreachable (crash, blank screen, unrecoverable error toast).
  • skipped: no_data — environmental gap (e.g., no available slots today).
  • skipped: setup_failed — cold-setup didn’t reach home; every later scenario inherits this.
  • skipped: setup_lost — mid-run reset failed; remaining scenarios inherit it.
A scenario that falls back to tap_at surfaces "fragile": true in its notes. If you see fragility on a hot scenario, consider adding a ValueKey (see Using ValueKeys).
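When triaging a long report, it can help to tally results and list fragile scenarios in one pass. The sketch below is a hypothetical helper, assuming each row is a dict with scenario, result, and notes keys and that fragility appears in notes as the literal text "fragile": true.

```python
from collections import Counter
from typing import Dict, List, Tuple


def summarize(rows: List[Dict[str, str]]) -> Tuple[Counter, List[str]]:
    """Count results by status and collect scenarios flagged as fragile."""
    counts = Counter(row["result"] for row in rows)
    fragile = [
        row["scenario"]
        for row in rows
        if '"fragile": true' in row.get("notes", "")
    ]
    return counts, fragile
```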

Aborts

The orchestrator aborts when:
  • The app fails to boot or the VM service never responds.
  • cold-setup fails.
  • Mid-run reset-to-home fails and the one fallback attempt also fails.
On abort, remaining scenarios are marked in the report as skipped: setup_failed (cold-setup didn’t reach home) or skipped: setup_lost (a mid-run reset failed).

Next

Authoring scenarios

Add more flows to the inventory.

Recording demos

Save polished playbacks of your runs.